5.9 Merge window, part 1
Architecture-specific
- Support for the Unicore architecture, which survived a purge of unused architectures in 2018, has now been removed for real.
- X86 kernels can now be built using Zstandard compression.
- The x86 kernel currently provides a special device that allows access to the CPU's model-specific registers. As described in this commit, though, providing that kind of access is asking for all kinds of trouble. As of 5.9, the kernel can filter MSR writes to an approved subset of registers, or block them entirely (which is the intended final goal).
- Support for the x86 FSGSBASE instructions has finally been merged, bringing a long story to a close. This work allows safe and efficient access to the FS and GS segment base registers from user space.
Core kernel
- The io_uring subsystem now has full support for asynchronous buffered read operations without the need to fall back to using kernel threads. Write support will be forthcoming in a future development cycle.
- The deadline scheduler has gained capacity awareness, meaning that it now makes correct decisions on asymmetric systems. See this documentation patch for details.
- There is a new sysctl knob named sched_uclamp_util_min_rt_default. It controls the default amount of CPU-frequency boosting that is applied when a realtime task runs, with the idea of limiting power usage on mobile systems. This documentation patch has (a bit) more information.
- The kernel's energy model, which until now only described the energy behavior of CPUs, has been extended to cover peripheral devices as well. See this documentation patch for (some) more information.
- The new close_range() system call allows a process to efficiently close a whole range of open file descriptors.
Filesystems and block I/O
- The new Btrfs rescue= mount option is meant to be the future home of all rescue-oriented options; for now it just contains aliases of existing options. The alloc_start and subvolrootid options have been removed.
- Btrfs has also seen some performance improvements, especially around fsync() operations.
- Inline encryption support was added in 5.8; in 5.9 that support will be provided in the ext4 and F2FS filesystems via the new inlinecrypt mount option. This allows these filesystems to make use of encryption support built into block-device controllers.
- The new debugfs= command-line option controls the availability of the debugfs filesystem. Setting it to on enables debugfs normally, while off disables it as if it were configured out of the kernel entirely. The no-mount option leaves debugfs enabled internally, but does not allow it to be mounted.
Hardware support
- Crypto: Silex Insight BA431 random number generators, TI K3 SA2UL crypto accelerators, and Ingenic random number generators.
- Miscellaneous: NVIDIA Tegra210 external memory controllers, Renesas RPC-IF SPI controllers, Corsair Commander Pro controllers, and Microchip Sparx5 SoC temperature sensors.
- Regulator: ChromeOS embedded-controller regulators, Qualcomm USB Vbus regulators, Qualcomm LAB/IBB regulators, Silergy SY8827N regulators, Fairchild FAN53880 regulators, and NXP PCA9450A/PCA9450B/PCA9450C regulators.
- Systems-on-chip: Intel Movidius (A.K.A. "Keem Bay"), Microchip Sparx5, and MStar/Sigmastar Armv7.
- USB: Xilinx ZynqMP PHYs, SAMSUNG SoC series UFS PHYs, Qualcomm IPQ806x DWC3 USB PHYs, and Ingenic X-series PHYs.
Security-related
- Kernels built with Clang can be configured to automatically initialize all on-stack variables to zero. As described in this commit, how Clang will support this feature in the future is unclear, so changes to this option are likely to be needed in response to future Clang releases.
- The seccomp subsystem, when used with user-space notification, now allows the supervisor process to inject file descriptors into the watched process. This enables full emulation of system calls that create new file descriptors.
- The new CAP_CHECKPOINT_RESTORE capability provides access to a number of features needed to checkpoint and restore processes without (other) privileges. This capability was covered here (as CAP_RESTORE) in June. See this merge commit for a description of the actions allowed by CAP_CHECKPOINT_RESTORE.
Internal kernel changes
- The smp_read_barrier_depends() barrier, which was only needed on the Alpha architecture, has been removed in favor of some smarter logic in READ_ONCE().
- GCC 11 provides all of the features needed to support the KCSAN data-race detector, so KCSAN can now be used with GCC-built kernels.
- The uninitialized_var() macro has been removed. Its purpose was to silence compiler warnings about variables that might be used without being initialized; in practice it has often ended up hiding bugs.
- A great deal of cleanup work in the kernel's entry and exit code in 5.8 has been topped off in 5.9 with a new set of generic entry and exit functions. There is necessarily a lot of architecture-specific work to be done on entry into the kernel, but the overall set of tasks — and the ordering relationships between them — are essentially the same. The new code, once adopted by the various architectures, should help to eliminate bugs in this area and prevent their reintroduction. See the commits for the entry, exit, and interrupt handlers for a better idea of how this all works.
The 5.9 merge window can be expected to remain open through August 16.
There are a number of significant repositories yet to be pulled, so chances
are that the second-half summary, to be posted on LWN shortly after that
date, will include many more changes. Stay tuned.
Index entries for this article | |
---|---|
Kernel | Releases/5.9 |
Posted Aug 8, 2020 9:03 UTC (Sat)
by hrw (subscriber, #44826)
[Link]
https://fedora.juszkiewicz.com.pl/syscalls.html
Posted Aug 12, 2020 12:37 UTC (Wed)
by jezuch (subscriber, #52988)
[Link]
Well, what a surprise...
5.9 Merge window, part 1
5.9 Merge window, part 1