The first half of the 6.3 merge window
Changes merged to date include:
Architecture-specific
- A large set of old and unused Arm board files has been removed, reducing the size of the kernel tree by over 150,000 lines. This (6.0) commit describes the list of systems for which board files have been removed. Meanwhile, devicetree files have been added to support 46 new arm64 systems.
- The new virtconfig build target for arm64 systems creates a relatively lightweight configuration intended to be booted on virtual systems.
- AMD's "automatic IBRS" feature is now supported. This is a Spectre defense that restricts indirect-branch speculation with less of a performance cost than that imposed by retpolines.
- The m68k architecture has gained support for system-call filtering with seccomp().
- Arm scalable matrix extension 2 instructions are now supported.
- BPF trampolines are now fully supported on s390x and RISC-V RV64 systems.
Core kernel
- The list of enhancements to the kernel's embryonic support for the Rust language is relatively small this time, but that support is, according to Miguel Ojeda, "getting closer to a point where the first Rust modules can be upstreamed". These changes include the removal of a non-applicable part of the alloc crate, an implementation of the Arc type (which provides a reference-counted pointer), the ScopeGuard type (which runs some cleanup code when it goes out of scope), and the ForeignOwnable type, which facilitates moving pointers between Rust and C code.
- There is a new document covering the stability expectations for BPF kfuncs; it describes the current status in the ongoing discussion of how stable the BPF API should be.
- The cgroup.memory=nobpf command-line parameter disables memory accounting for BPF programs; see this merge message for a discussion of the motivation behind this feature.
- There is a new red-black tree data structure available to BPF programs. See this merge message for more information.
- The restartable sequences mechanism now exports a "per-memory-map concurrency ID" to processes. This ID can be thought of (and treated like) a CPU number, but the numbers are kept as close to zero as possible. Its purpose is to enable more efficient per-CPU data structures in applications that are only using a subset of the CPUs on a large system. This commit contains some more information.
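To make the concurrency-ID idea from the restartable-sequences item above more concrete, here is a small user-space simulation. It is only a sketch: the `ConcurrencyIdAllocator` class and its "hand out the smallest free ID" policy are assumptions made for illustration, not the kernel's implementation. It shows why a table indexed by a compact ID can be much smaller than one indexed by CPU number.

```python
import heapq


class ConcurrencyIdAllocator:
    """Toy model of concurrency-ID assignment (not the kernel's code).

    Assumption: a running thread is given the smallest ID not in use by
    another running thread, which keeps the IDs close to zero.
    """

    def __init__(self):
        self._free = []   # min-heap of IDs that were released
        self._next = 0    # smallest ID that has never been handed out

    def acquire(self):
        """Return the smallest available ID."""
        if self._free:
            return heapq.heappop(self._free)
        cid = self._next
        self._next += 1
        return cid

    def release(self, cid):
        """Give an ID back when its thread stops running."""
        heapq.heappush(self._free, cid)


# Three threads running on widely separated CPUs of a large machine.
cpus_in_use = [200, 17, 93]
allocator = ConcurrencyIdAllocator()
cids = [allocator.acquire() for _ in cpus_in_use]

# Indexing by CPU number needs a slot for every CPU up to the highest one
# used; indexing by concurrency ID needs one slot per running thread.
slots_by_cpu = max(cpus_in_use) + 1
slots_by_cid = max(cids) + 1
print(cids, slots_by_cpu, slots_by_cid)
```

In this model, an application that runs only a few threads touches only the first few slots of its "per-CPU" table, however many CPUs the machine has.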
Filesystems and block I/O
- The tmpfs filesystem now supports ID-mapped mounts.
- Erofs has gained support for per-CPU file-data decompression, leading to reduced data-access latency.
- The Btrfs block allocator will now segregate extents by their size, so that any given block group is limited to extents that are small (less than 128KB), medium (up to 8MB), or large. This reportedly reduces fragmentation, especially in workloads where allocation size correlates with file lifetime, which evidently does happen in practice. See this commit message for some details.
- Rotating disk drives still exist, and are even becoming more complex: multi-actuator drives have independently controllable arms that, for best performance, must all be kept busy. The BFQ I/O scheduler has gained support for such drives; this commit message has a bit more information on how it works.
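The Btrfs size classes described above can be sketched as a simple classification function. This is an illustration only, not Btrfs code: the names `size_class()` and `segregate()` are hypothetical, the thresholds come from the text, and the exact handling of the boundaries (whether an extent of exactly 128KB counts as small or medium) is an assumption.

```python
from collections import defaultdict

KIB = 1024
MIB = 1024 * KIB

SMALL_LIMIT = 128 * KIB   # assumption: "small" means strictly less than this
MEDIUM_LIMIT = 8 * MIB    # assumption: "medium" includes exactly this size


def size_class(extent_bytes):
    """Return the size class ("small", "medium", "large") for an extent."""
    if extent_bytes <= 0:
        raise ValueError("extent size must be positive")
    if extent_bytes < SMALL_LIMIT:
        return "small"
    if extent_bytes <= MEDIUM_LIMIT:
        return "medium"
    return "large"


def segregate(extent_sizes):
    """Group extents so that each group holds a single size class."""
    groups = defaultdict(list)
    for size in extent_sizes:
        groups[size_class(size)].append(size)
    return dict(groups)


print(segregate([4 * KIB, 1 * MIB, 64 * MIB, 16 * KIB]))
```

Keeping each group to one class means that freeing a batch of similarly sized extents tends to leave holes that later allocations of the same class can reuse.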
Hardware support
- GPIO and pin control: Qualcomm QDU1000/QRU1000, IPQ5332, SA8775P, and SM8550 pin controllers, Mediatek MT7981 pin controllers, and StarFive JH7110 pin and GPIO controllers.
- Hardware monitoring: MPS MPQ7932 regulators, HPE GXP fan controllers, NXP MC34VR500 power-management ICs, and Infineon TDA38640 voltage regulators.
- Input: EVision keyboards and Steam Deck force feedback controllers.
- Miscellaneous: Xilinx ZynqMP on-chip-memory controllers, MediaTek low-voltage thermal sensor controllers, Intel topology aware register/pm capsule interfaces, Aspeed ACRY RSA engines, StarFive JH7110 random number generators, Maxim MAX20411 single step-down converters, and Broadcom BCMBCA HS SPI controllers.
- Networking: Microchip KSZ9563/LAN937x Ethernet switch PTP clocks, Realtek RTL8188EU wireless interfaces, Ocelot VSC7511, VSC7512, VSC7513 and VSC7514 external switches, Amlogic GXL-based MDIO bus multiplexers, Motorcomm 8531 PHYs, and Qualcomm WiFi 7 (ath12k) interfaces.
- Sound: MediaTek MT8188 controllers, Iron Device SMA1303 audio amplifiers, Renesas IDT821034 quad PCM codecs, Awinic AW88395 audio amplifiers, Realtek RT712 SDCA codecs, and Infineon PEB2466 quad PCM codecs.
- Also: preliminary support for writing human-interface device drivers in BPF has been merged, though the mechanism for distributing such drivers is still to be worked out. See this document for more information.
Networking
- Support for the Physical Layer Collision Avoidance (PLCA) Reconciliation Sublayer has been added; it is said to improve access performance on shared media Ethernet. This documentation patch describes how to configure and use this feature.
- The "wireless extensions" API for the control of WiFi interfaces ran into trouble in 2006, but is still supported as an emulation layer. This API will no longer be supported for WiFi 7 (802.11be) interfaces, since it is unable to configure all of the available features. The use of the wireless extensions API will generate a warning for most current devices as of 6.3.
- The process of documenting the netlink API continues; the results can be seen in the core API and user-space API manuals. Also added is a new tool to generate netlink protocol code from YAML specifications.
- The new IP_LOCAL_PORT_RANGE socket option makes it easier for multiple hosts to make outgoing connections through a NAT gateway; this commit contains details.
- Multi-path TCP can now handle mixed flows using both the IPv4 and IPv6 protocols.
- BIG TCP support has been extended to IPv4.
- The new default_rps_mask sysctl knob allows the creation of a default, per-net-namespace receive packet steering (RPS) configuration.
- Support for a number of queuing disciplines (specifically class-based queuing (CBQ), ATM virtual circuits (ATM), differentiated service marker (dsmark), traffic-control index (tcindex), and resource reservation protocol (RSVP)) has been removed due to a lack of maintenance and interest.
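As an illustration of the IP_LOCAL_PORT_RANGE option mentioned above, the following sketch sets a per-socket local port range from Python. It rests on a few assumptions: the option number 51 is taken from the kernel's `linux/in.h` header and is used only when the `socket` module does not export the constant, and the value is a 32-bit integer with the lower bound in the low 16 bits and the upper bound in the high 16 bits, as described in the ip(7) manual page. The helper names are hypothetical. On kernels older than 6.3 the call is expected to fail, so the helper returns `None` in that case.

```python
import socket
import struct

# Assumption: 51 is the Linux value of IP_LOCAL_PORT_RANGE (linux/in.h).
IP_LOCAL_PORT_RANGE = getattr(socket, "IP_LOCAL_PORT_RANGE", 51)


def pack_port_range(low, high):
    """Encode a port range: low bound in bits 0-15, high bound in bits 16-31."""
    if not (0 <= low <= high <= 0xFFFF):
        raise ValueError("need 0 <= low <= high <= 65535")
    return (high << 16) | low


def unpack_port_range(value):
    """Decode a packed range back into (low, high)."""
    return value & 0xFFFF, value >> 16


def socket_with_port_range(low, high):
    """Return a TCP socket limited to the given local ports, or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Pass the value as 4 raw bytes so that large upper bounds do not
        # overflow the signed int that setsockopt() uses for integers.
        sock.setsockopt(socket.IPPROTO_IP, IP_LOCAL_PORT_RANGE,
                        struct.pack("I", pack_port_range(low, high)))
    except OSError:
        sock.close()   # option not supported on this system
        return None
    return sock
```

A host behind a NAT gateway that has been assigned ports 40000 to 40999, for example, could call `socket_with_port_range(40000, 40999)` before connecting, leaving the kernel to pick a free port inside that range.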
Internal kernel changes
- The old memory-allocation function get_kernel_pages() has been removed now that there are no more in-tree users.
The 6.3 merge window can be expected to remain open until March 5, at which
point 6.3-rc1 will come out and the kernel will enter the stabilization
phase of the development cycle. Quite a few more changes are poised to
enter the mainline before that happens, though; tune in once the merge
window closes for a summary of the rest of that work.
Posted Feb 23, 2023 20:37 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (9 responses)
Linked lists, trees, custom data types... Guys, stop reinventing WASM.
Posted Feb 24, 2023 1:41 UTC (Fri)
by davemarchevsky (guest, #85534)
[Link] (7 responses)
Posted Feb 24, 2023 11:42 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (6 responses)
Reminds me of when I, in Linux's early days, was so fed up with the then-abysmal state of Linux networking that I linked the BSD network stack into it. It was not a particularly good fit, of course, but it worked.
Posted Feb 25, 2023 21:39 UTC (Sat)
by Sesse (subscriber, #53779)
[Link] (5 responses)
Posted Feb 26, 2023 2:03 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
And I'm pretty sure it's very easy to make BPF programs run for quite some time, if you combine list lookups and function calls. The raw instruction count has stopped being a good predictor for the maximum BPF runtime.
Posted Feb 26, 2023 6:20 UTC (Sun)
by Sesse (subscriber, #53779)
[Link] (3 responses)
The point isn't as much to avoid slowness as to have deterministic forward progress in the kernel.
Posted Feb 26, 2023 19:33 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
BPF doesn't guarantee this either. It has an early exit instruction (bpf_exit) that allows you to terminate the program earlier. It's entirely possible to take a lock and then do an early exit. Or to take two locks in the wrong order resulting in a deadlock, and the verifier will be happy. The only locking BPF allows is holding ONE spinlock at a time from a structure in a BPF map: https://lwn.net/Articles/779120/
A similar lock helper can be created for WASM via a simple helper that will release the lock on timeout.
With the scheduler example, BPF doesn't provide anything that can't be expressed in WASM. You can't express the invariant "BPF picks at least one process" in a way that the verifier understands.
Posted Feb 27, 2023 13:19 UTC (Mon)
by kkdwivedi (subscriber, #130744)
[Link] (1 responses)
The verifier does complain if you try to exit while holding a spinlock. Also, it's totally possible to support holding more than one lock at a time. Deadlock avoidance is a challenge, but there are some cases (which have a substantial overlap with common usage scenarios) where you can easily prove or enforce it statically. I think it has not been done yet because no strong use case came up, rather than some kind of fundamental limitation in BPF.
Posted Feb 27, 2023 20:46 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
The last time I checked, the verifier supported only one lock at a time.
> Deadlock avoidance is a challenge
I don't think it's even possible if BPF is allowed to use general-purpose locks that are used in other parts of the kernel. For the more restricted use-case, it's possible to force lock ordering. But this will require runtime tracking to be useful, you can't have static verification for anything non-trivial.
The simplest way to do runtime tracking is to have a consistent numbering for locks, and when you take a lock, store the "lock tickets" in a linked list. This way you can verify that your previous lock has a greater number than the current one. It still will be somewhat limited (so no hand-over-hand locking), but it'll do for a large number of practical applications.
But this of course can be expressed as a simple API exposed to WASM code, just as with the BPF use case.
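The runtime tracking scheme described in the comment above might look something like the sketch below. It is only an illustration of the idea, not BPF or WASM code: the `LockOrderTracker` class is hypothetical, a Python list stands in for the linked list of "lock tickets", and the rule that locks must be released in reverse order is an assumption that matches the "no hand-over-hand locking" limitation.

```python
class LockOrderError(RuntimeError):
    """Raised when a lock operation would violate the ordering rule."""


class LockOrderTracker:
    """Track the tickets (numbers) of the locks currently held."""

    def __init__(self):
        self._held = []   # tickets of held locks, most recent last

    def acquire(self, ticket):
        """Record a lock; the previously taken lock must have a greater number."""
        if self._held and self._held[-1] <= ticket:
            raise LockOrderError(
                f"lock {ticket} taken while holding lock {self._held[-1]}")
        self._held.append(ticket)

    def release(self, ticket):
        """Drop a lock; only the most recently taken lock may be released."""
        if not self._held or self._held[-1] != ticket:
            raise LockOrderError(f"lock {ticket} released out of order")
        self._held.pop()

    def held(self):
        """Return the tickets currently held, oldest first."""
        return list(self._held)
```

Because every task takes locks in strictly decreasing order, no cycle of tasks waiting on each other can form; the price is that a lock cannot be dropped while a lock taken after it is still held.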
Posted Feb 24, 2023 7:32 UTC (Fri)
by PengZheng (subscriber, #108006)
[Link]
Posted Feb 24, 2023 10:11 UTC (Fri)
by georgm (subscriber, #19574)
[Link]
Posted Mar 2, 2023 14:52 UTC (Thu)
by kpfleming (subscriber, #23250)
[Link] (3 responses)
Posted Mar 4, 2023 18:49 UTC (Sat)
by geofft (subscriber, #59789)
[Link] (2 responses)
Posted Mar 4, 2023 20:28 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Mar 6, 2023 10:19 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
It's basically a way to let you reuse a CANbus topology for Ethernet, so that you can entirely replace CANbus with Ethernet in your vehicle without paying a weight penalty.
RTL8188EU
Shared-media Ethernet is still a thing?
This blog post says the use case is "automotive Ethernet," where apparently you don't want to run a whole new Ethernet cable to a new switch port to add some new device inside a car, you just want to attach the device to the cable. Seems like this is the core feature of "10BASE-T1S."