5.18 Merge window, part 1
Architecture-specific
- 32-bit Arm systems have gained support for separate interrupt stacks and virtually-mapped kernel stacks.
- Support for older Arm systems (ARMv4 and ARMv5) without a memory-management unit has been removed. MMU-less support for ARMv7-M systems remains, though.
- The arm64 architecture supports the new "QARMA3" pointer-authentication algorithm. This variant of Arm's QARMA is evidently faster while still being sufficiently secure.
- Arm64 systems can be built with shadow-stack support using the GCC 12 compiler release.
- The PA-RISC architecture now has minimal vDSO support which, in turn, enables the system to run with a non-executable stack for the first time. The initial version of this patch was posted in 2006; some things take longer than others to get into the mainline, it seems.
- Support for the Intel "hardware feedback interface" has been added. This mechanism allows the hardware to inform the kernel about the current performance and energy-efficiency capability of each CPU in the system. These capabilities can change over time as the result of, for example, thermal constraints. This documentation patch has some more information.
- Support for the nds32 architecture has been removed. According to the merge changelog:
The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release.
Core kernel
- The io_uring subsystem has seen a number of improvements. The new IORING_SETUP_SUBMIT_ALL option will cause a full batch of requests to be submitted even if an error is encountered partway through. The file descriptors for the ring itself can be registered with the ring, providing a performance improvement for threaded applications; see this changelog for some details. The new IORING_OP_MSG_RING operation allows one ring to signal another. Finally, it is now possible to perform the NAPI busy poll on sockets directly from the ring.
- Support for the a.out executable format is no longer built by default for the alpha and m68k architectures — the last two that were still using it. The a.out code has not actually been removed yet but that is probably coming soon.
- Some tweaks to the restartable-sequences API have been merged in preparation for support in the GNU C Library.
- The DAMON operation schemes (DAMOS) mechanism gives user space more control over memory-management operations (and page reclaim in particular).
- The tracing system now supports "user events", which are essentially dynamic tracepoints in user-space applications. The feature is described in the merge changelog as:
User space can register an event with the kernel describing the format of the event. Then it will receive a byte in a page mapping that it can check against. A privileged task can then enable that event like any other event, which will change the mapped byte to true, telling the user space application to start writing the event to the tracing buffer.
See the commits adding documentation and a sample program for more information.
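The enable-byte handshake described in the user-events changelog can be modeled in a few lines. What follows is only a toy, in-process sketch: the class and method names are hypothetical, an anonymous `mmap` page stands in for the kernel-provided status page, and a Python list stands in for the tracing buffer. It does not use the real kernel interface, which is described in the documentation commit.

```python
import mmap


class UserEventModel:
    """Toy model of the enable-byte handshake; this is NOT the kernel ABI."""

    def __init__(self):
        # Anonymous page standing in for the status page the kernel would map.
        self.status = mmap.mmap(-1, mmap.PAGESIZE)
        self.names = []    # index in this list == byte offset in the page
        self.buffer = []   # stands in for the tracing buffer

    def register(self, name):
        """Register an event; return the index of its status byte."""
        if len(self.names) >= mmap.PAGESIZE:
            raise RuntimeError("status page is full")
        self.names.append(name)
        return len(self.names) - 1

    def set_enabled(self, index, enabled):
        """Privileged side: flip the byte that the application polls."""
        self.status[index] = 1 if enabled else 0

    def emit(self, index, payload):
        """Application fast path: a single byte check before any real work."""
        if self.status[index] == 0:
            return False   # event disabled, nothing is written
        self.buffer.append((self.names[index], payload))
        return True


model = UserEventModel()
idx = model.register("demo_event")
dropped = model.emit(idx, b"before enable")   # byte is still zero
model.set_enabled(idx, True)
written = model.emit(idx, b"after enable")    # byte is now nonzero
```

The point of the design is that a disabled event costs the application only one memory read; no system call is made until somebody has actually asked for the event.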
Filesystems and block I/O
- The inline-encryption capabilities of block request queues can now be viewed in sysfs; see this changelog for details.
- Direct I/O is not normally a possibility for encrypted files, since the data must be buffered through the kernel for encryption or decryption anyway. If the hardware does the crypto work, though, the situation is different. In 5.18, files encrypted with fscrypt can be accessed with direct I/O if inline encryption is in use. This documentation patch contains a little more information.
- The F2FS filesystem has gained support for ID-mapped mounts.
- Support for NFSv3 will always be built into the NFS server if NFS is enabled at all. This is done with the intent of making NFSv3 become the base, "always-supported" version of NFS in preparation for the eventual removal of NFSv2 support.
- There are two new ioctl() operations for Btrfs (BTRFS_IOC_ENCODED_READ and BTRFS_IOC_ENCODED_WRITE) that allow direct reading from and writing to a file's extents. The main use case for these commands is to support newer, more efficient send and receive operations.
Hardware support
- Hardware monitoring: ASUS ACPI embedded controllers, Vicor PLI1209BC digital power supervisors, Aquacomputer Farbwerk 360 RGB controllers, and Texas Instruments TMP464 and TMP468 temperature sensors.
- Media: Microchip CSI2 demux controllers, Hynix Hi-847 sensors, OmniVision OV08D10 and OG01A1B sensors, and Intersil ISL7998x video decoders.
- Miscellaneous: Qualcomm MSM power manager controllers, Xilinx ZynqMP SHA3 accelerators, TI TPS6286x power regulators, Richtek RT5190A power-management ICs, Sunplus SP7021 SPI controllers, LiteX MMC host controllers, and Tesla full-self-driving clock controllers.
- Sound: Texas Instruments TAS5805M speaker amplifiers, AMD PCI audio coprocessors, and Awinic AW8738 audio amplifiers.
Miscellaneous
- New documentation of interest includes some guidelines for researchers studying the kernel community, an overview of the readahead code, how to report regressions, and how developers should handle regressions.
Security-related
- There is a new kernel keyring called machine; it contains the machine-owner keys implemented by the shim bootloader interface. Keys in the machine keyring can be trusted within the kernel and thus used to sign artifacts (such as modules or integrity data) used after the initial boot process.
- Support for asymmetric TPM-backed private keys has been removed. This feature, initially added for the 3.7 release, depends on an obsolete TPM version and had some security issues of its own; it is hoped that nobody is using it.
- The random-number generator has seen a lot of work. The differences between /dev/random and /dev/urandom have been removed (though some of the urandom changes had to be reverted after a regression was reported). There is a new mechanism for the avoidance of random-stream duplication when a virtual machine forks. The BLAKE2s algorithm is now used internally. There is more; see the merge changelog and this page for lots more details.
- The kernel now provides saturating arithmetic helpers for size_t values; these can be used to harden code against integer-overflow bugs. See this commit for more information.
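To see why saturating `size_t` arithmetic helps, here is a small model of the idea. It is a sketch under stated assumptions: a 64-bit `size_t`, inputs already in the range 0 to `SIZE_MAX`, and hypothetical function names that are not the kernel's own helpers. The choice to saturate on subtraction underflow is likewise an assumption of this sketch; the commit is the authority on the real semantics.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def wrapping_mul(a, b):
    """What plain unsigned C arithmetic does: silently wrap modulo 2**64."""
    return (a * b) & SIZE_MAX


def sat_add(a, b):
    """Add, clamping to SIZE_MAX instead of wrapping."""
    return min(a + b, SIZE_MAX)


def sat_mul(a, b):
    """Multiply, clamping to SIZE_MAX instead of wrapping."""
    return min(a * b, SIZE_MAX)


def sat_sub(a, b):
    """Subtract; a saturated input or an underflow yields SIZE_MAX
    (an assumption of this sketch), so errors stay visible downstream."""
    if a == SIZE_MAX or b == SIZE_MAX or a < b:
        return SIZE_MAX
    return a - b


# An attacker-influenced element count that overflows when scaled by 8.
count = 2**61 + 1
elem_size = 8
wrapped = wrapping_mul(count, elem_size)   # tiny value: allocation "succeeds"
saturated = sat_mul(count, elem_size)      # SIZE_MAX: allocation must fail
```

With wrapping arithmetic the computed size is 8 bytes, so an allocator would happily return a buffer far too small for `count` elements. The saturated result is a size no allocator can satisfy, which turns a potential heap overflow into an ordinary allocation failure.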
Internal kernel changes
- The first big chunk of work from the fast kernel-headers tree has found its way in with a significant reorganization of the scheduler header files.
- The block-layer congestion-tracking code, which was found to be unused last year, has been removed.
- The memory-management code has been enhanced with remote per-CPU page list draining.
- More of the folio patch series has been merged; this set converts internal memory-management functions (including the varieties of get_user_pages()) to folios and enables the creation of large folios in the readahead code. A second set converts a set of address_space_operations to folios.
- The set_fs() infrastructure has finally been fully removed.
A quick check shows that linux-next currently contains nearly 9,000 commits
that have not yet been pulled into the mainline, so it would seem that the
5.18 kernel will have a lot more to offer still. The merge window can be
expected to remain open until April 3; tune in shortly after that for
a summary of the remaining work pulled for this release.
Posted Mar 25, 2022 17:27 UTC (Fri)
by deater (subscriber, #11746)
Posted Mar 25, 2022 20:38 UTC (Fri)
by james (subscriber, #1325)
Posted Mar 26, 2022 8:37 UTC (Sat)
by lkundrak (subscriber, #43452)
Posted Mar 26, 2022 4:48 UTC (Sat)
by alison (subscriber, #63752)
The new feature sounds similar to user-level statically defined tracepoints (USDT), described in https://lwn.net/Articles/753601/. That article states that creating USDT probes requires Systemtap, but one can use the folly library instead (https://github.com/facebook/folly/blob/main/folly/tracing...). As far as I can tell, the main difference between the two features is that USDT relies on bpf() to load a uprobe, which then has debugfs uprobe_events. Perhaps that interface is simply less convenient than controlling an event via ftrace and then mmap()-ing shared memory, as with the new user_events?
Posted Mar 27, 2022 3:42 UTC (Sun)
by KJ7RRV (subscriber, #153595)
What do those do? Does FSD require a different type of clock than normal computers (or normal navigation systems)?
Posted Mar 27, 2022 6:18 UTC (Sun)
by zdzichu (subscriber, #17118)
Posted Mar 27, 2022 16:36 UTC (Sun)
by Paf (subscriber, #91811)
I actually didn’t realize those things interacted with the OS at all. I wish I knew why - my naive view suggests they wouldn’t need to.
Posted Mar 27, 2022 17:56 UTC (Sun)
by excors (subscriber, #95769)
I'm not an expert but I think the basic reason is that the OS knows which SoC hardware blocks need to be powered on for the current application, and that determines what clock signals are required, so the OS has to have some control over the clocks. Also applications might want to configure UART baud rates, display refresh rates, etc, for which the hardware will need different clock frequencies, so the OS needs to mediate between those applications and the hardware.
As SoCs get increasingly complex, you may need rather sophisticated logic to work out precisely how to configure the low-level clock hardware (all the oscillators and PLLs and dividers and muxes and whatever) to generate those signals power-efficiently and with the required levels of accuracy and synchronisation. You've already got a kernel driver so you might as well put all the logic in there (instead of in hardware or firmware) and the drivers end up reflecting the internal complexity of every variant of every SoC.
Posted Mar 27, 2022 18:27 UTC (Sun)
by excors (subscriber, #95769)
Posted Mar 28, 2022 5:17 UTC (Mon)
by Paf (subscriber, #91811)
Posted Mar 27, 2022 20:23 UTC (Sun)
by KJ7RRV (subscriber, #153595)
Posted Apr 3, 2022 13:43 UTC (Sun)
by linusw (subscriber, #40300)
A simple example is an MMC/SD card: the clock speed is determined by which card you plug in. The controller asks the card (at low frequency) what clock frequencies it supports and then scales up to what both the card and the controller can handle.
Other examples include SoCs designed to handle several different displays, which are LED panels clocked directly from the chip. The SoC then has to adjust to whatever is connected.
Intel systems often hide the clock control inside the BIOS and the like, but on Arm SoC systems the trust in BIOS-like constructs is low, so much of that low-level control ends up in the operating system instead.
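For a rough feel of the kind of decision such a clock driver makes, here is a minimal sketch of the MMC/SD case. It assumes a single integer divider fed by a fixed parent clock. The function name, the parameters, and the 1024 divider limit are all made up for illustration; real clock trees add PLLs, muxes, and rounding policies on top of this.

```python
def pick_divider(parent_hz, card_max_hz, controller_max_hz, max_div=1024):
    """Return (divider, resulting_hz) for the fastest clock both sides accept.

    Hypothetical helper: models one integer divider on a fixed parent clock.
    """
    target = min(card_max_hz, controller_max_hz)
    if target <= 0:
        raise ValueError("target frequency must be positive")
    # Smallest divider whose output does not exceed the target (ceiling division).
    div = max(1, -(-parent_hz // target))
    if div > max_div:
        raise ValueError("parent clock too fast for this divider range")
    return div, parent_hz // div


# 200 MHz parent, card reports 50 MHz, controller tops out at 52 MHz.
fast = pick_divider(200_000_000, 50_000_000, 52_000_000)
# Identification phase: the card must first be probed at a low frequency.
slow = pick_divider(200_000_000, 400_000, 52_000_000)
```

Here `fast` selects a divider of 4 (50 MHz) and `slow` a divider of 500 (400 kHz); the driver would switch from the second setting to the first once the card has reported what it supports.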
I/O is usually associated with a different set of dwarfs: "I/O, I/O, it's off to work we go .."
> then enable that event like any other event, which will change the mapped byte to true,
> telling the user space application to start writing the event to the tracing buffer.
You can look at the patchset at https://lore.kernel.org/lkml/20220113121143.22280-1-alim....
For something more human-consumable, here's a patch adding DT documentation: https://lore.kernel.org/lkml/20220113121143.22280-2-alim....
Keep in mind that the Linux kernel contains similar drivers for a multitude of clocks embedded in SoCs. The Tesla FSD driver is nothing special.