Showing posts with label cpufreq. Show all posts

Tuesday, 22 November 2016

linux-4.8-ck8, MuQSS version 0.144

Here's a new release to go along with and commemorate the 4.8.10 stable release (they're releasing stable releases faster than my development code now.)

linux-4.8-ck8 patch:
patch-4.8-ck8.lrz

MuQSS by itself:
4.8-sched-MuQSS_144.patch

There are a small number of updates to MuQSS itself.
Notably there's an improvement in interactive mode when SMT nice is enabled, when realtime tasks are running, or when there are users of CPU affinity. Previously, tasks stuck behind one of those as the highest priority task on a CPU would not be scheduled on that CPU, and the scheduler would transiently refuse to schedule them.
The old hacks for CPU frequency changes from BFS have been removed, leaving the tunables to default as per mainline.
The default of 100Hz has been removed, but in its place a new and recommended 128Hz has been implemented - this is just a silly micro-optimisation to take advantage of the fast shifts that /128 allows on CPUs compared to /100, and it is close enough to 100Hz to behave otherwise the same.
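
To illustrate the point about /128 versus /100, here is a minimal sketch. It is not code from the MuQSS patch, and the helper names are hypothetical. Dividing an unsigned value by a power of two reduces to a single right shift, whereas dividing by 100 needs a real division or a compiler-generated multiply-and-shift sequence.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not taken from MuQSS. */

#define HZ_128_SHIFT 7 /* 128 == 1 << 7 */

/* Whole seconds elapsed after `ticks` ticks at 128Hz: a single shift. */
static inline uint64_t ticks_to_secs_128hz(uint64_t ticks)
{
	return ticks >> HZ_128_SHIFT;
}

/* The same conversion at 100Hz needs a genuine division
 * (or a multiply-and-shift sequence emitted by the compiler). */
static inline uint64_t ticks_to_secs_100hz(uint64_t ticks)
{
	return ticks / 100;
}

/* Tick length in whole microseconds for a given Hz value. */
static inline uint32_t tick_usecs(uint32_t hz)
{
	return 1000000u / hz;
}
```

With these helpers, `tick_usecs(128)` is 7812 and `tick_usecs(100)` is 10000, so the tick length at 128Hz stays close to that of 100Hz.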

For the -ck patch only, I've reinstated updated and improved versions of the high resolution timeouts. These improve the behaviour of userspace that is inappropriately Hz dependent, so that low Hz choices do not affect latency.
Additionally by request I've added a couple of tunables to adjust the behaviour of the high res timers and timeouts.
/proc/sys/kernel/hrtimer_granularity_us
and
/proc/sys/kernel/hrtimeout_min_us

Both of these are in microseconds and can be set from 1-10,000. The first is how accurate high res timers will be in the kernel and is set to 100us by default (on mainline it is Hz accuracy).
The second is how small to make a request for a "minimum timeout" generically in all kernel code. It is set to 1000us by default (on mainline it is one tick).

I doubt you'll find anything useful by tuning these but feel free to go nuts. Decreasing the second tunable much further risks breaking some driver behaviour.
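
For anyone who wants to script changes to these, here is a minimal sketch in C. The two paths and the 1-10,000 range come from the description above. The function names are hypothetical, and writing to the files is assumed to need root on a kernel built with the -ck patch.

```c
#include <stdio.h>

/* Paths as given in the post; they only exist on a -ck patched kernel. */
#define HRTIMER_GRANULARITY_PATH "/proc/sys/kernel/hrtimer_granularity_us"
#define HRTIMEOUT_MIN_PATH       "/proc/sys/kernel/hrtimeout_min_us"

/* Returns 1 if `us` lies within the stated 1-10,000 microsecond range. */
static int hr_tunable_in_range(long us)
{
	return us >= 1 && us <= 10000;
}

/* Hypothetical helper: read a tunable. Returns the value, or -1 on failure. */
static long hr_tunable_read(const char *path)
{
	FILE *f = fopen(path, "r");
	long us = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &us) != 1)
		us = -1;
	fclose(f);
	return us;
}

/* Hypothetical helper: write a tunable. Returns 0 on success, -1 on failure.
 * Out-of-range values are rejected before the file is touched. */
static int hr_tunable_write(const char *path, long us)
{
	FILE *f;
	int ok;

	if (!hr_tunable_in_range(us))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%ld\n", us) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A call such as `hr_tunable_write(HRTIMEOUT_MIN_PATH, 500)` would then halve the default minimum timeout, subject to the driver caveat above.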

Enjoy!
お楽しみ下さい
-ck

Saturday, 29 October 2016

linux-4.8-ck5, MuQSS version 0.120

Announcing a new version of MuQSS and a -ck release to go with it, in concert with mainline releasing 4.8.5.



4.8-ck5 patchset:
http://ck.kolivas.org/patches/4.0/4.8/4.8-ck5/

MuQSS by itself for 4.8:
4.8-sched-MuQSS_120.patch

MuQSS by itself for 4.7:
4.7-sched-MuQSS_120.patch


Git tree:
https://github.com/ckolivas/linux



This is a fairly substantial update to MuQSS which includes bugfixes for the previous version, performance enhancements, new features, and completed documentation. This will likely be the first publicly announced version on LKML.

EDIT: Announce here: LKML

New features:
- MuQSS is now a tickless scheduler. That means it can maintain its guaranteed low latency even in a build configured with a low Hz tick rate. To that end, it is now defaulting to 100Hz, and it is recommended to use this as the default choice for it leads to more throughput and power savings as well.
- Improved performance for single threaded workloads with CPU frequency scaling.
- Full NoHZ now supported. This disables ticks on busy CPUs instead of just idle ones. Unlike mainline, MuQSS can do this virtually all the time, regardless of how many tasks are currently running. However this option is for very specific use cases (compute servers running specific workloads) and not for regular desktops or servers.
- Numerous other configuration options that were previously disabled from mainline are now allowed again (though not recommended for regular users.)
- Completed documentation can now be found in Documentation/scheduler/sched-MuQSS.txt

Bugfixes:
- Fix for the various stalls some people were still experiencing, along with the softirq pending warnings.
- Fix for some loss of CPU for heavily sched_yielding tasks.
- Fix for the BFQ warning (-ck only)

Enjoy!
お楽しみ下さい
-ck

Friday, 23 September 2016

BFS 502, linux-4.7-ck5

With the fix for the last of the freezes in BFS 497 becoming clearer, and with a number of build fixes and other minor improvements accumulating, I'm releasing a new BFS that combines them all. This should be the last of the releases for the 4.7 kernel.

BFS by itself:
4.7-sched-bfs-502.patch

-ck patches with BFS:
4.7-ck5

In addition to the update to BFS, this -ck release is the first in a very long time to include a patch from another developer - the Throttled background buffered writeback v7 patch by Jens Axboe. This makes a massive difference to a system's ability to read files, open new applications etc. under heavy write loads in my testing and is a change which I believe is essential and will eventually make its way into the mainline kernel.

The changes to BFS 502 are as follows:
 
bfs497-build_other_arches.patch
bfs497-no_smtload_avg.patch
bfs497-recognise_nodes2.patch
bfs497-revert-othercpufreq.patch
bfs497-fix_smt_nonice.patch  

  • A build fix for building on other architectures (notably ARM).
  • Simplifying the load measurement on SMT machines reported to cpufreq - trying to account for load on the SMT sibling is unnecessary as each core will run at the speed of the most loaded sibling anyway on any existing hardware.
  • A fix for detecting CPUs on other NUMA nodes and setting their locality correctly.
  • Not trying to signal CPU load to cpufreq on other CPUs when tasks migrate - this was leading to the hangs and there is enough rescheduling for cpufreq to get the load later on.
  • A build fix for when SMT_NICE is not configured.

Enjoy!
お楽しみ下さい
-ck

Tuesday, 13 September 2016

BFS 497, linux-4.7-ck4

For the first time in a very long time, I'm announcing yet another -ck release up to ck4 along with yet more substantial updates for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-497.patch

-ck branded linux-4.7-ck4 patches:
linux-4.7-ck4

Thanks(?) to the massive changes to the mainline kernel I'd been forced to rewrite significant components of BFS to work properly with them, specifically the cpu frequency governors. At the same time I've had quite a bit of energy and enthusiasm for working on BFS in a way I haven't had in a long time. As a result, this updated version not only addresses the remaining cgroup stub patch bug (mentioned on the previous announcement) but implements further improvements and clean ups to go with those improvements.

Alas I still have no explanation for the random lockups some people are seeing, but I have seen reports of it happening on mainline kernels as well now, so while I'm always suspicious of my own code, there is also the chance that BFS exacerbates an issue in mainline. Something that appears common is onboard Intel graphics with the Haswell chipset.

Additionally I had reports of people being unable to suspend with BFS from 4.7 but I haven't heard back from them on later versions.

The short summary of improvements in this version is lower overhead, higher throughput and lower latency.

I've rewritten the skiplist implementation to not require a malloc/free on insertion/removal of a new node which seemed to noticeably improve throughput at high loads.
Now that CPU frequency governors know what the scheduler is doing, the old BFS approach of knowing what the governor was doing and working around it is no longer helpful. I've removed the whole sticky task and offset handling for throttled CPUs, and throughput has actually improved as a result.
I've also added some micro-optimisations and cleanups.
I've added a minor change for offlining CPUs to prevent tasks trying to schedule to them.
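
As a rough illustration of how insertion and removal can avoid malloc/free, one common technique is to embed the list node inside the structure being queued, so the node's storage lives and dies with its owner. The sketch below assumes that approach and uses a plain doubly linked list for brevity rather than a skiplist. All types and names are hypothetical and simplified, not the BFS structures.

```c
#include <stddef.h>

/* Hypothetical, simplified types; the real BFS structures differ. */
struct list_node {
	struct list_node *next, *prev;
};

struct task {
	int pid;
	struct list_node node; /* embedded: no separate allocation needed */
};

/* Recover the owning task from a pointer to its embedded node. */
#define node_to_task(ptr) \
	((struct task *)((char *)(ptr) - offsetof(struct task, node)))

/* An empty list is a head that points at itself. */
static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

/* Queue a task at the tail by linking its own embedded node. */
static void enqueue(struct list_node *head, struct task *t)
{
	t->node.next = head;
	t->node.prev = head->prev;
	head->prev->next = &t->node;
	head->prev = &t->node;
}

/* Unlink a task; nothing needs to be freed. */
static void dequeue(struct task *t)
{
	t->node.prev->next = t->node.next;
	t->node.next->prev = t->node.prev;
	t->node.next = t->node.prev = &t->node;
}
```

The trade-off is that each task carries the node's memory permanently, in exchange for enqueue and dequeue paths that never call the allocator.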

The set of patches in ck4 is the largest in the ck patchset since the early 2.6 patchset days. I've also included the patch from Alfred (thanks!) to fix the warning that happens with suspend which is mostly harmless.

Each patch included has a mini changelog at the top.

I'm also keen to get feedback from people on whether they see any noticeable interactivity/responsiveness regressions when disabling the interactive flag as follows:

echo 0 > /proc/sys/kernel/interactive

Enjoy!
お楽しみ下さい
-ck

Wednesday, 7 September 2016

BFS 490, linux-4.7-ck3

Announcing yet another substantial update for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-490.patch

-ck branded linux-4.7-ck3 patches:
linux-4.7-ck3

Following on from the large update to BFS in 480 to skip lists, numerous regressions became apparent, the bulk of which were related to doing a poor job of signalling CPU load to the various cpufreq governors. Some were affected badly, others less so, but there were plenty of helpful people giving feedback about those regressions, which encouraged me to slowly but surely chip away at the problems. Additionally, there were some minor behavioural regressions which were oversights during the updates to BFS 480. Finally, the rudimentary cgroup stub patch would crash the system.

As the number of patches required to address these issues got larger and larger, it became hard for people on this blog to keep up with the changes so I've released 490 which hopefully should address the bulk of these issues - there are patches in there that haven't been posted on this blog, but I've included all of them with a brief description in the incremental/ directory for your perusal.

Anyway it is much easier for people to grab the latest version which includes all of those changes, including the updated cgroups stub patch.

EDIT: Here's a patch to make cgroup stubs safer: cgroup-stubs-safe2.patch

Enjoy!
お楽しみ下さい
-ck

Friday, 2 September 2016

BFS 480 with skip lists, linux-4.7-ck2

Announcing a major update for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-480.patch

-ck branded linux-4.7-ck2 patches:
linux-4.7-ck2

This is the largest BFS update in a long time. The various problems that had been accumulating forced me to spend a more extended period fixing BFS to work with the latest mainline changes and encouraged me to overhaul some areas that had long been needing it.

The changes are:
  • Fixed the crash when SMT NICE is configured in on a CPU without SMT.
  • Added my skiplist implementation.
  • Converted BFS from its long-standing O(n) lookup to use skiplists.
  • Fix crash when SMT NICE is enabled on some hardware
  • Fix try_preempt missing the locality diff effect in non-interactive mode
  • Ignore busy threads/caches when still on the same core
  • Reworked the testing of idle threads and cores for less overhead and to correctly identify idle siblings
  • Fix the CPU load that's passed to the cpu frequency governor, fixing a crash and non-working schedutil governor.
The short summary is I've fixed a number of showstopper bugs in the last version, and improved throughput.

I decided to actually incorporate the skiplists that I had experimented with a long time ago because I was able to trim the skiplist overhead further and maintain identical semantics for process selection (maintaining interactivity), whereas in the previous experiment I had never completed the work. Throughput testing shows virtually identical performance on normal workloads, and the skiplists would theoretically be helpful in extreme overload cases.

The original post regarding skip lists was here:
bfs-and-skip-lists.html


This now means that BFS is no longer O(n) lookup after O(1) insertion. It is now O(log(n)) insertion, O(1) lookup and O(k) removal where k <= 16, thereby tackling a long-standing criticism of the overall design.
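
A minimal sketch of how those bounds can arise, assuming a 16-level skip list whose nodes are doubly linked on every level and supplied by the caller. The names are hypothetical and this is not the BFS implementation. Insertion walks down the levels (expected O(log n)), the lowest-keyed entry is always the first node on level 0 (O(1)), and removal just unlinks the node on each level it occupies (O(k), k <= 16).

```c
#include <stddef.h>
#include <stdint.h>

#define SKIP_LEVELS 16

struct skip_node {
	uint64_t key;
	void *value;
	int level; /* highest level this node is linked on */
	struct skip_node *next[SKIP_LEVELS];
	struct skip_node *prev[SKIP_LEVELS];
};

struct skiplist {
	struct skip_node header; /* sentinel, circular on every level */
	uint32_t seed;           /* state for the level generator */
};

static void skiplist_init(struct skiplist *l)
{
	for (int i = 0; i < SKIP_LEVELS; i++)
		l->header.next[i] = l->header.prev[i] = &l->header;
	l->header.level = SKIP_LEVELS - 1;
	l->seed = 2463534242u;
}

/* xorshift32 generator; each extra level is taken with probability 1/2. */
static int random_level(struct skiplist *l)
{
	uint32_t x = l->seed;
	int level = 0;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	l->seed = x;
	while ((x & 1) && level < SKIP_LEVELS - 1) {
		level++;
		x >>= 1;
	}
	return level;
}

/* Expected O(log n): descend from the top level, recording predecessors. */
static void skiplist_insert(struct skiplist *l, struct skip_node *n,
			    uint64_t key, void *value)
{
	struct skip_node *update[SKIP_LEVELS];
	struct skip_node *p = &l->header;

	for (int i = SKIP_LEVELS - 1; i >= 0; i--) {
		/* "<=" keeps equal keys in insertion order. */
		while (p->next[i] != &l->header && p->next[i]->key <= key)
			p = p->next[i];
		update[i] = p;
	}
	n->key = key;
	n->value = value;
	n->level = random_level(l);
	for (int i = 0; i <= n->level; i++) {
		n->next[i] = update[i]->next[i];
		n->prev[i] = update[i];
		update[i]->next[i]->prev[i] = n;
		update[i]->next[i] = n;
	}
}

/* O(1): the lowest key is always first on level 0. NULL if empty. */
static struct skip_node *skiplist_first(struct skiplist *l)
{
	struct skip_node *n = l->header.next[0];

	return n == &l->header ? NULL : n;
}

/* O(k), k <= SKIP_LEVELS: unlink on each level the node occupies. */
static void skiplist_remove(struct skip_node *n)
{
	for (int i = 0; i <= n->level; i++) {
		n->prev[i]->next[i] = n->next[i];
		n->next[i]->prev[i] = n->prev[i];
	}
}
```

Because removal never searches, its cost depends only on the node's own height, which is where the fixed upper bound comes from.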

I did not find a specific cause for people's inability to suspend to RAM, so I doubt this has been fixed despite the large code update.


The list of patches making up bfs480 is as follows:

bfs472-fix_set_task_cpu.patch
skiplists.patch
bfs472-skiplist.patch
bfs-delay-smt-siblings.patch
bfs-fix-noninteractive-try-preempt.patch
bfs-ignore_local_busy.patch
bfs-rework-idles.patch
bfs-fix-schedutil.patch
bfs-v480.patch

As always I'm giving this to you not long after I've finished coding it so all the usual warnings apply, especially with an update of this size.

EDIT: Uniprocessor build fix: bfs480-fix-upbuild.patch
EDIT2: Here is a test patch to try and improve cpufreq behaviour: bfs480-rework_cpufreq.patch

Enjoy!
お楽しみ下さい
-ck

Friday, 29 July 2016

BFS 472, linux-4.7-ck1

Announcing an updated BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-472.patch

-ck branded linux-4.7-ck1 patches:
linux-4.7-ck1

This was quite a substantial merge effort this time around, with a fair number of changes in the mainline kernel that affected the patch. Nonetheless everything appears to be working as planned in my limited testing. I'm unsure if the changes will fix the problems people had with suspend during the 4.6-bfs patches, but the new code does touch that area. I was never affected on any of my machines so was unable to reproduce the problem in the first place.

In addition to the resync, a few minor changes have made their way into this release with respect to the way tasks preempt other tasks. See bfs470-updates.patch for details.

One other fairly significant change was properly hooking into the new schedutil parameters that drive cpufreq scaling governors. What I committed into bfs470 would not have been working properly in choosing the correct CPU frequency to run at and may have led to slowdowns and/or more power usage. This should be fixed in 472.

I should also mention that if, like me, you use the evil proprietary nvidia driver, the latest will not build with the current kernel and you'll need a couple of patches to get it working.

Enjoy!
お楽しみ下さい
-ck

EDIT: This patch will fix crashes when configured without SMT_NICE enabled:
bfs472-fix_set_task_cpu.patch
And will be applied to the next BFS release.