
4.20/5.0 Merge window part 1

By Jonathan Corbet
October 26, 2018
Linus Torvalds has returned as the keeper of the mainline kernel repository, and the merge window for the next release which, depending on his mood, could be called either 4.20 or 5.0, is well underway. As of this writing, 5,735 non-merge changesets have been pulled for this release; experience suggests that we are thus at roughly the halfway point.

Some of the more significant changes merged so far are:

Architecture-specific

  • The arm64 architecture can make use of the new hardware-provided SSBS state bit to defend against Spectre variant 4 attacks.
  • RISC-V now supports the futex() system call and associated operations.

Core kernel

  • There are two new types of BPF maps for implementing queues and stacks. Documentation is missing, but an example of their use can be found in the selftest code.
  • On systems with asymmetric CPUs (big.LITTLE systems, for example), the CPU scheduler can now detect "misfit" processes that need the resources of a fast CPU but which are stuck on a slow one. When load balancing is performed, the scheduler will try to move misfits to a more appropriate processor.
  • Signal handling within the kernel has been extensively reworked; the result should be simpler and more robust handling. There is a slight change in structure sizes that is visible to user space, but patch author Eric Biederman couldn't find any programs that would be affected by it. There's also one other visible change that is hinted at: "Testing also revealed bad things can happen if a negative signal number is passed into the system calls."
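
The misfit idea described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the scheduler's actual code: the `Cpu` and `Task` types, the `FIT_MARGIN` headroom factor, and the `balance()` helper are all hypothetical names chosen for illustration, and the real load balancer weighs many more factors.

```python
from dataclasses import dataclass

# Hypothetical headroom factor: in this sketch a task "fits" a CPU only if
# it would use less than 80% of that CPU's capacity (util * 1.25 < capacity).
FIT_MARGIN = 1.25


@dataclass
class Cpu:
    name: str
    capacity: int  # relative compute capacity; the fastest CPU is 1024 here


@dataclass
class Task:
    name: str
    util: int  # estimated utilization, on the same scale as Cpu.capacity
    cpu: Cpu   # CPU the task is currently running on


def fits(task: Task, cpu: Cpu) -> bool:
    """Return True if the task's demand leaves headroom on this CPU."""
    return task.util * FIT_MARGIN < cpu.capacity


def is_misfit(task: Task) -> bool:
    """A misfit is a task that does not fit the CPU it is stuck on."""
    return not fits(task, task.cpu)


def balance(tasks, cpus):
    """Move each misfit task to the smallest CPU that can accommodate it."""
    for task in tasks:
        if is_misfit(task):
            candidates = [c for c in cpus if fits(task, c)]
            if candidates:
                task.cpu = min(candidates, key=lambda c: c.capacity)
    return tasks


little = Cpu("little", 400)
big = Cpu("big", 1024)
encoder = Task("encoder", util=600, cpu=little)  # too heavy for the slow CPU
logger = Task("logger", util=100, cpu=little)    # comfortable where it is
balance([encoder, logger], [little, big])
```

In this model the heavy task ends up on the big CPU while the light one stays put, which is the behavior the changelog describes at a high level.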

Filesystems and block I/O

  • Numerous block drivers have been converted to the multiqueue API. Current plans call for the legacy API to be removed in the next development cycle.

Hardware support

  • Audio: Texas Instruments PCM3060 codecs, Amlogic AXG PDM input ports, Allwinner sun50i codec analog controls, and Nuvoton NAU88C22 codecs.
  • Miscellaneous: STMicroelectronics STPMIC1 PMIC regulators, Cirrus Logic Lochnagar regulators, UniPhier SD/eMMC Host controllers, Spreadtrum SDIO host controllers, SIOX GPIO controllers, Panasonic AN30259A LED controllers, BigBen Interactive gamepads, Spreadtrum SC2731 charger controllers, Freescale eDMA engines, and Mylex DAC960/DAC1100 PCI RAID controllers.
  • Network: DEC FDDIcontroller 700/700-C network interfaces (hardware designed in 1990; it is not clear why anybody wants this now) and Intel Ethernet Controller I225-LM/I225-V adapters.
  • Pin control: Nuvoton BMC NPCM750/730/715/705 pinmux and GPIO controllers, Meson g12a SoC pin controllers, Mediatek MT6765, MT7623 and MT8183 pin controllers, Qualcomm SDM660 and QCS404 pin controllers, Broadcom Northstar pin controllers, and Renesas RZ/N1, r8a774a1 and r8a774c0 pin controllers.
  • SPI: Spreadtrum SC9860 SPI controllers, MediaTek SPI slave devices, Qualcomm QuadSPI controllers, Qualcomm GENI-based SPI controllers, STMicroelectronics STM32 QUAD SPI controllers, and Atmel USART SPI controllers.
  • Additionally, the "LED pattern driver" can be used to drive an LED given a brightness pattern from user space; see this commit for more information.

Networking

  • The TCP stack has moved to an "earliest departure time" model for the pacing of outgoing traffic. This model, inspired by a talk by Van Jacobson [PDF] at the 2018 Netdev conference, aims to address scalability problems by replacing outgoing packet queues with a timer wheel describing the earliest time that each packet can be sent. The result is meant to be better pacing and more accurate round-trip-time calculations to drive that pacing.
  • Network flow dissectors can now be loaded as BPF programs, which should provide both better hardening and better performance.
  • The new "taprio" traffic scheduler allows the control of packet scheduling according to a pre-generated time sequence. Documentation is naturally scarce; a little can be found in this commit.
  • The rtnetlink protocol has been enhanced with a "strict checking" option that allows user space to be sure it is getting the actual information it asked for.
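
The "earliest departure time" model mentioned above can be sketched in a few lines. This is an illustrative toy, not the kernel's implementation: the `EdtPacer` class and its methods are hypothetical names, a fixed pacing rate is assumed, and a binary heap stands in for the timer wheel. The point is only that each packet is stamped with the earliest time it may leave, and the sender releases whatever is due.

```python
import heapq


class EdtPacer:
    """Toy pacer: stamp each packet with an earliest departure time (EDT)."""

    def __init__(self, rate_bytes_per_sec: float):
        self.rate = rate_bytes_per_sec
        self.next_edt = 0.0   # earliest time the next packet may depart
        self._pending = []    # heap of (edt, sequence, payload)
        self._seq = 0         # tie-breaker that keeps FIFO order for equal EDTs

    def enqueue(self, now: float, size: int, payload) -> float:
        """Assign an EDT to the packet and store it; return the EDT."""
        edt = max(now, self.next_edt)
        # The following packet must wait for this one's transmission time.
        self.next_edt = edt + size / self.rate
        heapq.heappush(self._pending, (edt, self._seq, payload))
        self._seq += 1
        return edt

    def release(self, now: float):
        """Return, in order, every packet whose EDT has been reached."""
        due = []
        while self._pending and self._pending[0][0] <= now:
            due.append(heapq.heappop(self._pending)[2])
        return due


pacer = EdtPacer(rate_bytes_per_sec=1000)
for name in ("a", "b", "c"):
    pacer.enqueue(now=0.0, size=500, payload=name)
first = pacer.release(now=0.0)   # only the first packet is due immediately
```

Because the departure times are computed up front, the sender never needs to manage a backlog queue per flow; it only has to ask which timestamps have expired.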

Security-related

  • The kernel now makes more aggressive use of barriers when switching between unrelated processes in an attempt to provide stronger protection against Spectre variant-2 attacks.
  • The controversial Speck crypto algorithm has been removed from the kernel.
  • There is a new mechanism for obtaining statistics from the cryptographic subsystem. Naturally, it is thoroughly undocumented, but there is an example program showing its use.

Internal kernel changes

  • The read-copy-update (RCU) subsystem has seen a lot of refactoring, ending in the removal of many of the "flavors" of RCU. There are now two primary flavors, one of which is adapted to preemptible kernels and one for non-preemptible kernels.
  • The PCI subsystem can now support peer-to-peer DMA operations between peripherals.

If the usual schedule is followed, this merge window will end on November 4, with the final release happening just before the end of the year. Stay tuned for the followup article, which will cover the changes pulled in the second half of the 4.20 (or 5.0) merge window.




Crypto side channels

Posted Oct 26, 2018 19:52 UTC (Fri) by abatters (✭ supporter ✭, #6932) [Link]

> obtaining statistics from the cryptographic subsystem

Could this be used in a side channel attack on crypto algorithms?

4.20/5.0 Merge window part 1

Posted Oct 26, 2018 21:58 UTC (Fri) by jhoblitt (subscriber, #77733) [Link] (1 responses)

The "LED pattern driver" looks useful -- this is something that I expect to be popular on RPIs.

LED patterns

Posted Nov 8, 2018 14:43 UTC (Thu) by robbe (guest, #16131) [Link]

I must admit that I thought: when will BPF programs drive this?

4.20/5.0 Merge window part 1

Posted Oct 27, 2018 9:40 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (6 responses)

I like the crypto statistics. It'd probably be good to have that for KVM too, instead of the current debugfs hack...

4.20/5.0 Merge window part 1

Posted Oct 27, 2018 10:06 UTC (Sat) by lkundrak (subscriber, #43452) [Link] (5 responses)

My first thought when I saw the crypto stats thing was: "Doesn't this belong in debugfs?" Why is a proc file or a netlink protocol better than debugfs for this sort of thing?

I suppose it's mostly used for debugging and with debugfs you don't have to stick to a stable interface.

4.20/5.0 Merge window part 1

Posted Oct 27, 2018 10:19 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (4 responses)

Debugfs is a mishmash of interfaces, some of which might even let you get arbitrary code execution in the firmware; statistics can be retrieved and logged in production systems and it would be nice if production systems could avoid mounting debugfs altogether.

It is also not particularly efficient if you want to log thousands of statistics every second.

4.20/5.0 Merge window part 1

Posted Oct 30, 2018 19:30 UTC (Tue) by fartman (guest, #128226) [Link] (3 responses)

But netlink is also tied to network namespaces; don't you feel that's a little odd for something like crypto?

4.20/5.0 Merge window part 1

Posted Oct 31, 2018 14:20 UTC (Wed) by montjoie (subscriber, #110115) [Link] (2 responses)

My first patch for cryptostat was using a /sys/kernel/crypto, but the crypto maintainer said to use netlink.
It seems that the rule for all crypto management is to use netlink.

4.20/5.0 Merge window part 1

Posted Oct 31, 2018 15:17 UTC (Wed) by fartman (guest, #128226) [Link] (1 responses)

I can see why; they probably don't want to deviate, given that NETLINK_CRYPTO is already the place for that kind of thing.

The problem, to me, appears to be the mixed granularity of procfs and sysfs. The former often has files varying in representation and results in more data being read than you need, which is slow. sysfs is granular, with one file per datum, but when you need to collect a lot of things it quickly gets expensive.

If you had some sort of readv where you could submit what files to read together, as some sort of megaread, and it took a new iovec type where one could specify file descriptors to read from, the sysfs model would work well, and you would only need to make the kernel fetch what you need to.

It's still perhaps expensive in that you need to open all these files, but since that happens only once per initialization (and all reads can be coalesced into one), I think it would help a lot. Then move all the snowflake files that don't belong in /proc to /sys/kernel/subsystem.

Of course, someone needs to do this work, and since netlink is good enough, I don't think it will happen.

4.20/5.0 Merge window part 1

Posted Oct 31, 2018 15:37 UTC (Wed) by fartman (guest, #128226) [Link]

It might be interesting to explore BPF as an alternative for grabbing information from the kernel, instead of resorting to netlink (and it solves the problem of parsing things in userspace, and is fast enough). The files on virtual fs can remain for human readability and legacy tools.

4.20/5.0 Merge window part 1

Posted Oct 27, 2018 21:49 UTC (Sat) by mtaht (subscriber, #11087) [Link] (4 responses)

I'd really like there to be a 4.20 release. There's all sorts of in-jokes to be made.

I'd call it: smokey salmon.

4.20/5.0 Merge window part 1

Posted Oct 29, 2018 10:06 UTC (Mon) by nilsmeyer (guest, #122604) [Link] (2 responses)

I'd prefer 5.0 and "Linux 2000" in keeping with "Linux for Workgroups" released some time ago ;)

4.20/5.0 Merge window part 1

Posted Oct 29, 2018 20:47 UTC (Mon) by naptastic (guest, #60139) [Link]

Why not both? :)

4.20/5.0 Merge window part 1

Posted Nov 10, 2018 9:09 UTC (Sat) by Wol (subscriber, #4433) [Link]

Teenage Angst, the Galaxy, and part of everything :-)

Cheers,
Wol

4.20/5.0 Merge window part 1

Posted Oct 30, 2018 15:51 UTC (Tue) by nhippi (subscriber, #34640) [Link]

Linux 4.20, serves the best clouds

4.20/5.0 Merge window part 1

Posted Oct 30, 2018 15:16 UTC (Tue) by johnsoda (guest, #127477) [Link]

When releasing the 3.0 kernel, part of the reasoning that Linus stated was that he couldn't comfortably count that high.

Now we are at the point of asking: does he consider 4.20 to be "too high"?


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds