Use online CPUs rather than max_ncpus for taskq thread count #10282
Codecov Report
@@            Coverage Diff             @@
##           master   #10282      +/-   ##
==========================================
+ Coverage   79.52%   79.55%   +0.03%
==========================================
  Files         389      389
  Lines      123120   123120
==========================================
+ Hits        97906    97945      +39
+ Misses      25214    25175      -39
I know that some hypervisors also do this, to allow scaling up the number of CPUs without rebooting.
Yes, but this is a bare-metal machine that I am using.
While systems which support hot plugging are rare, they do exist. My concern with this change is that it would cause such a system to panic when adding a CPU, which isn't great. Instead of changing
Due to hotplug support or BIOS bugs sometimes max_ncpus can be an absurdly high value. I have a system with 32 cores/threads but reports max_ncpus == 440. This many threads potentially cripples the system during arc_prune floods for example. boot_ncpus is the number of working CPUs when called so use that instead. Signed-off-by: DHE <git@dehacked.net>
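To check whether a machine shows the same gap between possible and working CPUs, one option on Linux is to compare the sysfs files `/sys/devices/system/cpu/possible` and `/sys/devices/system/cpu/online`, which list CPUs as ranges such as `0-439`. The following is a minimal userspace sketch, not code from this PR; the helper names `count_cpu_list` and `count_cpus_in_file` are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of OpenZFS): count the CPUs named in a
 * Linux cpulist string such as "0-31" or "0-3,8-11".
 * Returns -1 on malformed input.
 */
int
count_cpu_list(const char *s)
{
	int total = 0;

	while (*s != '\0' && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return (-1);	/* no digits where a number was expected */
		if (*end == '-') {
			const char *p = end + 1;

			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return (-1);	/* bad or reversed range */
		}
		total += (int)(hi - lo + 1);
		if (*end == ',')
			end++;		/* step over the separator */
		s = end;
	}
	return (total);
}

/*
 * Hypothetical helper: read the first line of a sysfs cpulist file and
 * count the CPUs it names. Returns -1 if the file cannot be read.
 */
int
count_cpus_in_file(const char *path)
{
	char buf[4096];
	FILE *f = fopen(path, "r");
	int n = -1;

	if (f == NULL)
		return (-1);
	if (fgets(buf, sizeof (buf), f) != NULL)
		n = count_cpu_list(buf);
	fclose(f);
	return (n);
}
```

Calling `count_cpus_in_file()` on the two sysfs paths and comparing the results would indicate whether the possible count is far above the online count, as in the 440 versus 32 case described here.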
I went over the code and see what you mean. Yeah, hotplugging CPUs could cause.. umm.. problems. I went ahead and changed any taskq that references I'm not qualified to provide an upper bound on arc_prune thread counts and just left it at boot_ncpus. At least it's not 14 times too high any more.
Looks good. Let me also refer you to PR #10331 where decreasing the number of prune threads has been proposed.
Due to hotplug support or BIOS bugs sometimes max_ncpus can be an absurdly high value. I have a system with 32 cores/threads but reports max_ncpus == 440. This many threads potentially cripples the system during arc_prune floods for example. boot_ncpus is the number of working CPUs when called so use that instead. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: DHE <git@dehacked.net> Closes openzfs#10282 (cherry picked from commit 57434ab)
Due to hotplug support or BIOS bugs sometimes max_ncpus can be an absurdly high value. I have a system with 32 cores/threads but reports max_ncpus == 440. This many threads potentially cripples the system during arc_prune floods for example. boot_ncpus is the number of working CPUs when called so use that instead. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: DHE <git@dehacked.net> Closes openzfs#10282
Some systems have issues or allow CPU hotplugging that can make
max_ncpus absurdly large. As these systems are rare and the
consequences of too many threads can be debilitating, limit this
to a known good value.
Signed-off-by: DHE <git@dehacked.net>
Motivation and Context
I have a machine where, for some reason, num_possible_cpus() returns 440. The correct number of CPUs/cores/threads in this system is 32. Don't know if that's a BIOS bug or something else, but having 440 instances of arc_prune "working" is a disaster.
I don't know if there's a better fix, but for now I'm drawing attention to what kills my system.
Alternative workaround: kernel commandline option nr_cpus=32 for this system.
Description
Change max_ncpus to be the same as boot_ncpus.
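The intent of this change can be sketched as a small, standalone function. This is only an illustration under stated assumptions: `taskq_thread_count` and its parameters are hypothetical names, not OpenZFS APIs, and the real patch simply passes a different variable when the taskqs are created.

```c
/*
 * Hypothetical illustration (not OpenZFS code): size a thread pool from
 * the number of CPUs that are actually online rather than the number the
 * firmware claims could exist.
 *
 *   possible_cpus: what the platform reports as the maximum (e.g. 440)
 *   online_cpus:   CPUs actually running at the time of the call (e.g. 32)
 */
int
taskq_thread_count(int possible_cpus, int online_cpus)
{
	int n = online_cpus;

	/* Never exceed what the platform says is possible. */
	if (n > possible_cpus)
		n = possible_cpus;
	/* Always keep at least one worker thread. */
	if (n < 1)
		n = 1;
	return (n);
}
```

With the numbers from this report, `taskq_thread_count(440, 32)` yields 32 threads instead of 440. Because the value is fixed when the pool is created, it does not grow if CPUs are hot-added later, which is the trade-off raised in the review comments.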
How Has This Been Tested?
Build tested as a trivial change