4.6 Merge window part 2
The most notable user-visible changes include:
- Support for memory protection keys
has been merged. This is an Intel feature allowing user space to
partition its memory into zones and apply additional access
restrictions to each. The system calls for the manipulation of memory
protection keys have not yet been merged; they are waiting for a bit more
review. But keys will be used by the kernel in 4.6 to implement truly
execute-only
memory that cannot be read by the executing process.
- Control groups are now namespace-aware; there is a new
CLONE_NEWCGROUP flag to clone() to create a process
in a new control-group namespace. See this
patch for documentation on this new feature. The control-group
filesystem can also now be mounted within user namespaces.
- The preadv2() and
pwritev2() system calls, which take an extra "flags"
argument, have finally been merged, allowing for the addition of new
functionality. The first flag is RWF_HIPRI, which enables
the use of polling for a high-priority request.
- Page poisoning has traditionally been a kernel debugging feature; it
fills freed pages with a special pattern that is easy to spot when
looking for things that went wrong. In 4.6, poisoning can be enabled
independently of the debugging options, and the "poison" value can be
set to zero; this results in pages being simply cleared when they are
freed. This behavior, inspired by the grsecurity/PaX patches, reduces
the chances of the kernel leaking sensitive data.
- The memory-management subsystem's thrash-detection code has never
worked properly within control groups; that has been rectified. The
result should be better behavior when specific control groups are
experiencing memory pressure.
- The integrity measurement architecture (IMA) subsystem now requires
that its policy be signed, and the integrity of that policy is
measured prior to loading.
- The ARM64 architecture now supports the "user access override" feature
found in ARMv8.2. It allows user space to be accessed (by the kernel)
using ordinary unprivileged instructions that check the owning
process's
permissions in the normal way. That, in turn, offers extra protection
against the kernel being fooled into accessing memory it shouldn't.
- ARM64 also now supports kernel
address-space layout randomization.
- The kernel's representation of general-purpose I/O (GPIO) devices has
been massively reworked; the gpio_chip structure is a proper
device within the device model now. There is a new ABI for getting
information about the GPIOs on the system, but some work remains to be
done. As Linus Walleij noted:
"
We can now discover GPIOs properly from userspace. We still have not come up with a way to actually *use* GPIOs from userspace.
" See tools/gpio/lsgpio.c for an example of the new ABI; note that the old sysfs-based ABI is now considered obsolete (even though it has not yet been completely replaced). - A process's timer slack value — the amount by which timer requests may
be delayed to cause them to coincide with others — can now be seen and
modified via /proc/PID/timerslack_ns.
- The extended BPF virtual machine now implements per-CPU maps for
high-speed statistics collection. There is also a new map type to
store stack traces.
- There is a new network-control API called "devlink," intended for the
setting of various parameters that are not related to any specific
device class. This protocol is, naturally, undocumented; some
information can be found in this
merge changelog.
- The kernel connection multiplexer,
which allows for certain types of higher-level protocol handling in
the kernel, has been merged.
- A number of network-oriented sysctl knobs (tcp_syn_retries,
tcp_synack_retries,
tcp_syncookies,
tcp_reordering,
tcp_retries1,
tcp_retries2,
tcp_orphan_retries,
tcp_fin_timeout,
tcp_notsent_lowat,
igmp_max_memberships,
igmp_max_msf,
igmp_llm_reports, and
igmp_qrv) have been made network-namespace aware, so
that different namespaces can have different values.
- The "local checksum offload" mechanism (described in this article) has been merged. Local
checksum offload speeds checksum calculations, making tunneled
protocol implementations faster. See
Documentation/networking/checksum-offloads.txt for
more information.
- Netlink support over shared memory segments has been removed; it has
never worked correctly and there does not appear to be any user-space code
using it.
- A couple of new filesystem ioctl() commands
(Q_GETNEXTQUOTA and Q_XGETNEXTQUOTA) have been added
to enable efficient iteration through all of the disk quotas on a
filesystem.
- The Btrfs filesystem has a new mount option, nologreplay,
which prevents the replaying of the log tree; this can be used with
ro to obtain a truly read-only mount. The new mount option
usebackuproot is meant to replace the existing
recovery option.
- New hardware support includes:
- Audio:
Maxim MAX9867 and max98926 codecs,
Realtek RT5514 codecs,
AMD audio coprocessors, and
Allwinner A10 S/PDIF controllers.
- GPIO:
WinSystems WS16C48 GPIO controllers,
ACCES 104-DIO-48E GPIO controllers,
Technologic TS-4800 FPGA GPIO controllers,
TI TPIC2810 8-Bit I2C GPO expanders,
TI TPS65218 GPIO controllers,
TI TPS65086 GPO controllers, and
MEN 16Z127 GPIO controllers.
- Input:
BYD BTP10463 touchpads,
MELFAS MIP4 touchscreens,
Freescale i.MX25 integrated touchscreens, and
numerous devices using the Synaptics "Register Mapped Interface"
protocol.
- Media:
TI "camera adaptation layer" capture engines.
- Miscellaneous:
Xilinx NWL PCIe controllers,
Cavium ThunderX PEM PCIe host controllers,
Microchip PIC32 random number generators,
ST Microelectronics adjunct processors,
Qualcomm HIDMA DMA engines,
Active-semi ACT8945A charger controllers,
NXP LPC18XX EEPROM memory,
BCM2835 auxiliar mini UARTs,
Marvell EBU serial ports,
Moxa SmartIO MUE multiport serial cards,
AT91 SAMA5D2 analog to digital converters (ADCs),
Texas Instruments ADC0831/ADC0832/ADC0834/ADC0838 ADCs,
Texas Instruments ADS1015 ADCs,
Freescale MX25 ADCs,
Analog Devices AD5761/61R/21/21R digital to analog converters,
Freescale MPL115A1 pressure sensors,
Atlas Scientific pH-SM sensors,
TI AFE4404 heart rate and pulse oximeter sensors,
TI AFE4403 heart rate monitors,
TI TPS65086 power management integrated chips,
APM SoC X-Gene SLIMpro mailbox controllers,
Rockchip SoC integrated mailboxes,
Hisilicon Hi6220 mailboxes,
ARM high-definition color LCD controllers,
Microchip PIC32MZDA SDHCI controllers, and
MediaTek M4U I/O memory-management units.
- Network:
MediaTek MT7623 Gigabit Ethernet controllers and
Intel Ethernet X722 iWARP cards.
- USB:
Rockchip EMMC PHYs and
Rockchip DisplayPort PHYs.
- Watchdog: Intel MEI iAMT watchdogs, National Instruments 903x/913x watchdog timers, WinSystems EBC-C384 watchdog timers, and ARM SBSA generic watchdogs.
- Audio:
Maxim MAX9867 and max98926 codecs,
Realtek RT5514 codecs,
AMD audio coprocessors, and
Allwinner A10 S/PDIF controllers.
Changes visible to kernel developers include:
- The compile-time stack validation
patches have been merged, providing a tool that ensures that the
call stack will always be valid. The result will be more reliable
stack traces for developers; this feature is also needed for the
further development of the live-patching mechanism.
- There is a new function for freeing a set of objects:
void kfree_bulk(size_t size, void **objects);
It differs from kmem_cache_free_bulk() in that there is no pointer to a kmem_cache structure, meaning that objects from multiple slabs can be freed together. There is a cost to doing things this way, though, so kmem_cache_free_bulk() is preferred in cases where it is applicable.
- The cpufreq subsystem, charged with setting CPU frequencies to match
the current system load, has seen some significant changes. In
current kernels, it uses timers to periodically sample the load on the
CPU and, perhaps, make changes. As of 4.6, instead, the cpufreq
governors will be called directly from the scheduler when things
change, eliminating the timers. Eventually the governors will also
use the projected load information from the scheduler to make
(hopefully) better decisions, but that is work for a future
development cycle.
- sscanf() now has basic support for matching sets of
characters using the %[ operator (e.g. "%[abc]" to
match any of abc). Only literal sets can be
matched; there is, for example, no special meaning for "-"
within a character set.
- The new dtx_diff tool, in the scripts/dtc directory,
can calculate the differences between device trees in a number of
formats.
- The generic code supporting encrypted filesystems has been moved into
the VFS layer (in fs/crypto) so that it can be used beyond
the ext4 and f2fs filesystems.
- The I2C subsystem has a new pin-controller-based bus demultiplexor
allowing runtime selection between multiple I2C controllers. See i2c-demux-pinctrl.txt for an
overview.
- The "kcov" kernel code-coverage analyzer has been merged; it can be useful to ensure that fuzzing and other testing efforts have exercised as much code as possible. See Documentation/kcov.txt for more information.
At this point, it would appear that the bulk of the changes for this
development cycle have been merged. The merge window will likely stay open
through March 27, though, so one never knows whether something else of
interest might turn up. Next week's Kernel Page will summarize any
significant changes that appear at the tail end of the 4.6 merge window.
Index entries for this article | |
---|---|
Kernel | Releases/4.6 |