
5.18 Merge window, part 1

By Jonathan Corbet
March 25, 2022
As of this writing, 4,127 non-merge changesets have found their way into the mainline repository for the 5.18 development cycle. That may seem like a relatively slow start to the merge window, but there are a lot of changes packed into those commits. Read on for a summary of the most significant changes to land in the first half of the 5.18 merge window.

Architecture-specific

  • 32-bit Arm systems have gained support for separate interrupt stacks and virtually-mapped kernel stacks.
  • Support for older Arm systems (ARMv4 and ARMv5) without a memory-management unit has been removed. MMU-less support for ARMv7-M systems remains, though.
  • The arm64 architecture supports the new "QARMA3" pointer-authentication algorithm. This variant of Arm's QARMA is evidently faster while still being sufficiently secure.
  • Arm64 systems can be built with shadow-stack support using the GCC 12 compiler release.
  • The PA-RISC architecture now has minimal vDSO support which, in turn, enables the system to run with a non-executable stack for the first time. The initial version of this patch was posted in 2006; some things take longer than others to get into the mainline, it seems.
  • Support for the Intel "hardware feedback interface" has been added. This mechanism allows the hardware to inform the kernel about the current performance and energy-efficiency capability of each CPU in the system. These capabilities can change over time as the result of, for example, thermal constraints. This documentation patch has some more information.
  • Support for the nds32 architecture has been removed. According to the merge changelog:

    The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release.

Core kernel

  • The io_uring subsystem has seen a number of improvements. The new IORING_SETUP_SUBMIT_ALL option will cause a full batch of requests to be submitted even if an error is encountered partway through. The file descriptors for the ring itself can be registered with the ring, providing a performance improvement for threaded applications; see this changelog for some details. The new IORING_OP_MSG_RING operation allows one ring to signal another. Finally, it is now possible to perform the NAPI busy poll on sockets directly from the ring.
  • Support for the a.out executable format is no longer built by default for the alpha and m68k architectures — the last two that were still using it. The a.out code has not actually been removed yet but that is probably coming soon.
  • Some tweaks to the restartable-sequences API have been merged in preparation for support in the GNU C Library.
  • The DAMON operation schemes (DAMOS) mechanism gives user space more control over memory-management operations (and page reclaim in particular).
  • The tracing system now supports "user events", which are essentially dynamic tracepoints in user-space applications. The feature is described in the merge changelog as:

    User space can register an event with the kernel describing the format of the event. Then it will receive a byte in a page mapping that it can check against. A privileged task can then enable that event like any other event, which will change the mapped byte to true, telling the user space application to start writing the event to the tracing buffer.

    See the commits adding documentation and a sample program for more information.
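The enable-byte handshake described in that changelog can be modeled in a few lines. The following is only a toy sketch of the idea, not the kernel's ABI: the structures and the names status_page, trace_buffer, emit_event() and set_event_enabled() are invented for illustration, and a real application would obtain the status page by mapping it from the tracing filesystem rather than declaring it locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS 64   /* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for the page the kernel maps into the process:
 * one status byte per registered event. */
struct status_page {
    volatile unsigned char enabled[MAX_EVENTS];
};

/* Hypothetical stand-in for the tracing buffer. */
struct trace_buffer {
    char data[256];
    size_t used;
};

/* Application side: check the mapped byte, and write the event only
 * when it is nonzero. Returns true if the payload was recorded. */
bool emit_event(const struct status_page *page, int event_index,
                struct trace_buffer *buf, const void *payload, size_t len)
{
    assert(page != NULL && buf != NULL);
    if (event_index < 0 || event_index >= MAX_EVENTS)
        return false;
    if (!page->enabled[event_index])
        return false;               /* disabled: costs one byte load */
    if (len > sizeof(buf->data) - buf->used)
        return false;               /* toy buffer is full: drop */
    memcpy(buf->data + buf->used, payload, len);
    buf->used += len;
    return true;
}

/* "Privileged task" side: flipping the byte turns the event on or off. */
void set_event_enabled(struct status_page *page, int event_index, bool on)
{
    assert(page != NULL);
    if (event_index >= 0 && event_index < MAX_EVENTS)
        page->enabled[event_index] = on ? 1 : 0;
}
```

The point of the design is that a disabled event costs the application little more than a single memory read; no system call is made until something has actually asked for the data.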

Filesystems and block I/O

  • The inline-encryption capabilities of block request queues can now be viewed in sysfs; see this changelog for details.
  • Direct I/O is not normally a possibility for encrypted files, since the data must be buffered through the kernel for encryption or decryption anyway. If the hardware does the crypto work, though, the situation is different. In 5.18, files encrypted with fscrypt can be accessed with direct I/O if inline encryption is in use. This documentation patch contains a little more information.
  • The F2FS filesystem has gained support for ID-mapped mounts.
  • Support for NFSv3 will always be built into the NFS server if NFS is enabled at all. This is done with the intent of making NFSv3 become the base, "always-supported" version of NFS in preparation for the eventual removal of NFSv2 support.
  • There are two new ioctl() operations for Btrfs (BTRFS_IOC_ENCODED_READ and BTRFS_IOC_ENCODED_WRITE) that allow direct reading from and writing to a file's extents. The main use case for these commands is to support newer, more efficient send and receive operations.

Hardware support

  • Hardware monitoring: ASUS ACPI embedded controllers, Vicor PLI1209BC digital power supervisors, Aquacomputer Farbwerk 360 RGB controllers, and Texas Instruments TMP464 and TMP468 temperature sensors.
  • Media: Microchip CSI2 demux controllers, Hynix Hi-847 sensors, OmniVision OV08D10 and OG01A1B sensors, and Intersil ISL7998x video decoders.
  • Miscellaneous: Qualcomm MSM power manager controllers, Xilinx ZynqMP SHA3 accelerators, TI TPS6286x power regulators, Richtek RT5190A power-management ICs, Sunplus SP7021 SPI controllers, LiteX MMC host controllers, and Tesla full-self-driving clock controllers.
  • Sound: Texas Instruments TAS5805M speaker amplifiers, AMD PCI audio coprocessors, and Awinic AW8738 audio amplifiers.

Miscellaneous

Security-related

  • There is a new kernel keyring called machine; it contains the machine-owner keys implemented by the shim bootloader interface. Keys in the machine keyring can be trusted within the kernel and thus used to sign artifacts (such as modules or integrity data) used after the initial boot process.
  • Support for asymmetric TPM-backed private keys has been removed. This feature, initially added for the 3.7 release, depends on an obsolete TPM version and had some security issues of its own; it is hoped that nobody is using it.
  • The random-number generator has seen a lot of work. The differences between /dev/random and /dev/urandom have been removed (though some of the urandom changes had to be reverted after a regression was reported). There is a new mechanism for the avoidance of random-stream duplication when a virtual machine forks. The BLAKE2s algorithm is now used internally. There is more; see the merge changelog and this page for lots more details.
  • The kernel now provides saturating arithmetic helpers for size_t values; these can be used to harden code against integer-overflow bugs. See this commit for more information.
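
To illustrate what saturating size arithmetic buys, here is a minimal user-space sketch. It is not the kernel's implementation; the names sat_size_add(), sat_size_mul() and array_alloc() are hypothetical and were chosen for this example. The idea is that a calculation that would overflow yields SIZE_MAX instead of wrapping around to a small number, so the subsequent allocation fails cleanly rather than returning a buffer that is too short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Add two sizes, returning SIZE_MAX instead of wrapping on overflow. */
size_t sat_size_add(size_t a, size_t b)
{
    if (b > SIZE_MAX - a)
        return SIZE_MAX;
    return a + b;
}

/* Multiply two sizes, returning SIZE_MAX instead of wrapping on overflow. */
size_t sat_size_mul(size_t a, size_t b)
{
    if (a != 0 && b > SIZE_MAX / a)
        return SIZE_MAX;
    return a * b;
}

/* Hypothetical caller: allocate an array of count elements. A saturated
 * size is treated as an error, so an overflowing request returns NULL. */
void *array_alloc(size_t count, size_t elem_size)
{
    size_t bytes = sat_size_mul(count, elem_size);

    if (bytes == SIZE_MAX)
        return NULL;
    return malloc(bytes);
}

/* Quick self-check of the saturation behavior. */
void sat_self_check(void)
{
    assert(sat_size_add(SIZE_MAX, 1) == SIZE_MAX);
    assert(sat_size_mul(SIZE_MAX, 2) == SIZE_MAX);
    assert(sat_size_add(1, 2) == 3);
}
```

With plain unsigned multiplication, a request for SIZE_MAX / 2 + 1 two-byte elements would wrap to zero bytes; with the saturating version the request is refused.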

Internal kernel changes

A quick check shows that linux-next currently contains nearly 9,000 commits that have not yet been pulled into the mainline, so it would seem that the 5.18 kernel will have a lot more to offer still. The merge window can be expected to remain open until April 3; tune in shortly after that for a summary of the remaining work pulled for this release.


5.18 Merge window, part 1

Posted Mar 25, 2022 17:27 UTC (Fri) by deater (subscriber, #11746) [Link] (2 responses)

I'm impressed you can talk about io_uring for so long without slipping up and going into LOTR mode. "And finally, you can take the one ring, and then in the darkness bind() it"

5.18 Merge window, part 1

Posted Mar 25, 2022 20:38 UTC (Fri) by james (subscriber, #1325) [Link] (1 responses)

I/O is usually associated with a different set of dwarfs: "I/O, I/O, it's off to work we go .."

5.18 Merge window, part 1

Posted Mar 26, 2022 8:37 UTC (Sat) by lkundrak (subscriber, #43452) [Link]

There's also a Bee Gees song about I/O

5.18 Merge window, part 1

Posted Mar 26, 2022 4:48 UTC (Sat) by alison (subscriber, #63752) [Link]

> User space can register an event with the kernel describing the format of the event. Then
> it will receive a byte in a page mapping that it can check against. A privileged task can
> then enable that event like any other event, which will change the mapped byte to true,
> telling the user space application to start writing the event to the tracing buffer.

The new feature sounds similar to user-level statically defined tracepoints (USDT), described in https://lwn.net/Articles/753601/. That article states that creating USDT probes requires SystemTap, but one can instead use the folly library (https://github.com/facebook/folly/blob/main/folly/tracing...). As far as I can tell, the main difference between the two features is that USDT relies on bpf() to load a uprobe, which then has debugfs uprobe_events. Perhaps that interface is simply less convenient than controlling an event via ftrace and then mmap()-ing shared memory in the case of the new user_events?

5.18 Merge window, part 1

Posted Mar 27, 2022 3:42 UTC (Sun) by KJ7RRV (subscriber, #153595) [Link] (7 responses)

> Tesla full-self-driving clock controllers

What do those do? Does FSD require a different type of clock than normal computers (or normal navigation systems)?

5.18 Merge window, part 1

Posted Mar 27, 2022 6:18 UTC (Sun) by zdzichu (subscriber, #17118) [Link]

The driver consists mainly of register-name definitions and some information about how the clock controller is interconnected.
You can look at the patchset at https://lore.kernel.org/lkml/20220113121143.22280-1-alim....
For something more human-readable, here's a patch adding DT documentation: https://lore.kernel.org/lkml/20220113121143.22280-2-alim....
Keep in mind that the Linux kernel contains similar drivers for a multitude of clocks embedded in SoCs. The Tesla FSD driver is nothing special.

5.18 Merge window, part 1

Posted Mar 27, 2022 16:36 UTC (Sun) by Paf (subscriber, #91811) [Link] (5 responses)

I think there’s a misunderstanding - this is a clock generator, as in *clock frequency*, not a *clock* as in timekeeping. It’s an implementation detail for an SoC.

I actually didn’t realize those things interacted with the OS at all. I wish I knew why - my naive view suggests they wouldn’t need to.

5.18 Merge window, part 1

Posted Mar 27, 2022 17:56 UTC (Sun) by excors (subscriber, #95769) [Link] (2 responses)

> I actually didn’t realize those things interacted with the OS at all. I wish I knew why - my naive view suggests they wouldn’t need to.

I'm not an expert but I think the basic reason is that the OS knows which SoC hardware blocks need to be powered on for the current application, and that determines what clock signals are required, so the OS has to have some control over the clocks. Also applications might want to configure UART baud rates, display refresh rates, etc, for which the hardware will need different clock frequencies, so the OS needs to mediate between those applications and the hardware.

As SoCs get increasingly complex, you may need rather sophisticated logic to work out precisely how to configure the low-level clock hardware (all the oscillators and PLLs and dividers and muxes and whatever) to generate those signals power-efficiently and with the required levels of accuracy and synchronisation. You've already got a kernel driver so you might as well put all the logic in there (instead of in hardware or firmware) and the drivers end up reflecting the internal complexity of every variant of every SoC.

5.18 Merge window, part 1

Posted Mar 27, 2022 18:27 UTC (Sun) by excors (subscriber, #95769) [Link] (1 responses)

(In case it makes my explanation clearer, I think the concepts are nicely illustrated by the image on https://stackoverflow.com/questions/40214987/stm32-intern... . That shows the multiple clock sources (a high-accuracy 32768Hz driven by an external crystal, a low-accuracy 16MHz from an internal RC oscillator, etc), PLLs (which multiply frequencies), dividers (which, uh, divide frequencies), and muxes to select between different inputs, eventually leading to various groups of peripherals. You need a driver to configure all of that stuff. And this example is a pretty simple microcontroller - a big SoC should use similar concepts but lots more of everything.)

5.18 Merge window, part 1

Posted Mar 28, 2022 5:17 UTC (Mon) by Paf (subscriber, #91811) [Link]

Thanks, this is very interesting

5.18 Merge window, part 1

Posted Mar 27, 2022 20:23 UTC (Sun) by KJ7RRV (subscriber, #153595) [Link]

Oh, okay, so it's a frequency generator? That makes a lot more sense.

5.18 Merge window, part 1

Posted Apr 3, 2022 13:43 UTC (Sun) by linusw (subscriber, #40300) [Link]

> I actually didn’t realize those things interacted with the OS at all. I wish I knew why - my naive view suggests they wouldn’t need to.

A simple example is an MMC/SD card: the required clock speed depends on which card you plug in. The controller asks the card (at low frequency) what kind of clock frequencies it supports and then scales up to what the card and controller can handle.

Other examples include SoCs designed to handle several different displays, which are LED panels clocked directly from the chip. It then has to adjust to whatever is connected.

Intel systems often hide the clock control inside the BIOS etc., but on Arm SoC systems the trust in BIOS-like constructs is low, and instead much of that low-level control ends up in the operating system.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds