The rest of the 6.10 merge window

By Jonathan Corbet
May 27, 2024
Linus Torvalds released 6.10-rc1 and closed the 6.10 merge window on May 26. By that time, 11,534 non-merge changesets had been pulled into the mainline for the next release; nearly 5,000 of those came in after "The first half of the 6.10 merge window" was written. While the latter half of the merge window tends to focus more on fixes, there was also a lot of new functionality that landed during this time.

Significant changes merged since the first-half summary include:

Architecture-specific

  • 32-bit Arm systems can now be built with Clang-based control-flow integrity.
  • The PowerPC BPF JIT compiler now supports kfuncs.
  • The RISC-V architecture has gained support for the Rust language.

Core kernel

  • It is now possible to map tracing ring buffers directly into user space. See this merge message and this documentation commit for more information.
  • An initial set of patches toward the eventual consolidation of hugetlbfs into the core memory-management subsystem has been merged; there should be no user-visible changes.
  • The ntsync subsystem, which provides a set of Windows NT synchronization primitives for Linux, has been merged. It is, however, marked as "broken" for this release and cannot yet be used for its intended purpose.
  • After a significant amount of discussion and change, the mseal() system call was merged as one of the final features for this development cycle. mseal() allows a process to forbid future changes to portions of its address space; the initial application is in the Chrome browser, which will use it to strengthen its internal sandboxing. More information can be found in this documentation commit.
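
As a rough illustration of the mseal() semantics described above, the following Python sketch seals one anonymous page and then checks that a later mprotect() on it is refused. Several things here are assumptions rather than anything stated in the article: the system-call number 462 (the x86-64 value; check the headers for other architectures), the use of ctypes to reach syscall() in the C library, and the helper name seal_and_probe(). On kernels older than 6.10 the call should fail with ENOSYS, in which case the sketch reports "unsupported".

```python
import ctypes
import errno
import mmap
import os

SYS_MSEAL = 462  # assumption: x86-64 syscall number; verify for your arch
PROT_READ = 0x1

libc = ctypes.CDLL(None, use_errno=True)
libc.syscall.restype = ctypes.c_long


def seal_and_probe():
    """Seal one anonymous page, then try to change its protection.

    Returns (state, mprotect_errno): state is "sealed" or "unsupported";
    mprotect_errno is 0 if the later mprotect() call succeeded.
    """
    page = mmap.mmap(-1, mmap.PAGESIZE)  # page-aligned, read/write
    view = ctypes.c_char.from_buffer(page)
    addr = ctypes.c_void_p(ctypes.addressof(view))
    size = ctypes.c_size_t(len(page))

    # mseal(addr, len, flags); flags is passed as zero here.
    ret = libc.syscall(ctypes.c_long(SYS_MSEAL), addr, size,
                       ctypes.c_ulong(0))
    if ret == 0:
        state = "sealed"
    else:
        err = ctypes.get_errno()
        # ENOSYS: kernel lacks mseal(); EPERM: not permitted here.
        if err not in (errno.ENOSYS, errno.EPERM):
            raise OSError(err, os.strerror(err))
        state = "unsupported"

    # On a sealed mapping this request should be refused with EPERM.
    if libc.mprotect(addr, size, ctypes.c_int(PROT_READ)) == 0:
        return state, 0
    return state, ctypes.get_errno()
```

Note that a sealed mapping cannot be unmapped either, so in this sketch the page simply stays in place until the process exits.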

Filesystems and block I/O

  • There is a new netlink-based protocol for the control of the NFS server in the kernel; a new nfsdctl tool is said to be on its way into the nfs-utils package.
  • The XFS filesystem continues to gain more online repair functionality.
  • The filesystems in user space (FUSE) subsystem now supports integrity protection with fs-verity.
  • The overlayfs filesystem is now able to create temporary files using the O_TMPFILE option.
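
For readers who have not used O_TMPFILE: it asks the filesystem for an unnamed file inside a given directory, and that file goes away when the last descriptor is closed. A minimal Python sketch follows; the helper name roundtrip_unnamed() and the choice to signal lack of support by returning None are illustrative assumptions. Pointed at a directory on an overlayfs mount, it should succeed on 6.10 and report no support on earlier kernels.

```python
import errno
import os
import tempfile


def roundtrip_unnamed(directory, payload):
    """Write payload to an unnamed file in directory and read it back.

    Returns the bytes read, or None if O_TMPFILE is not supported there.
    """
    try:
        fd = os.open(directory, os.O_TMPFILE | os.O_RDWR, 0o600)
    except OSError as exc:
        # EOPNOTSUPP: filesystem lacks support; EISDIR: very old kernel.
        if exc.errno in (errno.EOPNOTSUPP, errno.EISDIR):
            return None
        raise
    try:
        os.write(fd, payload)
        # The file has no directory entry; read it through the descriptor.
        return os.pread(fd, len(payload), 0)
    finally:
        os.close(fd)  # the unnamed file is released here


with tempfile.TemporaryDirectory() as scratch:
    data = roundtrip_unnamed(scratch, b"scratch data")
    print("O_TMPFILE unsupported" if data is None else data)
```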

Hardware support

  • Clock: Sophgo CV1800 series SoCs clock controllers, STMicroelectronics stm32mp25x clocks, NXP i.MX95 BLK CTL clocks, and Epson RX8111 realtime clocks.
  • Media: Broadcom BCM283x/BCM271x CSI-2 receivers and sixth-generation Intel image processing units.
  • Miscellaneous: Acer Aspire 1 embedded controllers, Lenovo WMI camera buttons, ACPI Quickstart buttons, Lenovo Yoga Tablet 2 1380 fast chargers, Dell AIO UART backlight interfaces, MeeGoPad ANX7428 Type-C switches, Zhaoxin I2C interfaces, Lenovo SE10 watchdog timers, ARM MHUv3 mailbox controllers, Samsung HDMI PHYs, MediaTek 10GE SerDes XFI T-PHYs, and Rockchip USBDP COMBO PHYs.

Miscellaneous

  • The perf tool has, as usual, seen a lot of changes; details can be found in this merge message.

Networking

  • The IORING_CQE_F_SOCK_NONEMPTY completion flag for io_uring is now set for accept requests as well; it can be used to determine whether there are more connection requests waiting on a socket.

Security-related

  • The Landlock security module is now able to apply policies to ioctl() calls; see this documentation commit for a bit more information.
  • The new init_mlocked_on_free boot option will cause any memory that is locked into RAM with mlock() to be zeroed if (and only if) it is freed without having been first unlocked with munlock(). The purpose is to protect memory that may be holding cryptographic keys from being exposed after an application crash.
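
The case that init_mlocked_on_free targets can be made concrete with a small sketch of how an application typically handles key material: lock the page, use it, wipe it, and only then unlock and free it. If the process crashes between the mlock() call and the wipe, the page is freed while still locked, which is the situation the new option is meant to cover. The Python code below is an illustration only; the helper name hold_secret() and the use of ctypes with an anonymous mmap are assumptions, and mlock() may fail under a tight RLIMIT_MEMLOCK, which the sketch tolerates.

```python
import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)


def hold_secret(secret):
    """Keep secret in a locked page briefly; return (locked, wiped)."""
    page = mmap.mmap(-1, mmap.PAGESIZE)
    view = ctypes.c_char.from_buffer(page)
    addr = ctypes.c_void_p(ctypes.addressof(view))
    size = ctypes.c_size_t(len(page))

    # Pin the page in RAM so the secret cannot be written to swap.
    locked = libc.mlock(addr, size) == 0
    try:
        page[:len(secret)] = secret
        # ... use the secret here ...
    finally:
        # Orderly path: wipe first, then unlock, then free.
        ctypes.memset(addr, 0, len(page))
        wiped = page[:len(secret)] == bytes(len(secret))
        if locked:
            libc.munlock(addr, size)
        del view      # release the buffer export so close() is allowed
        page.close()  # unmaps the page
    return locked, wiped
```

On the orderly path shown here the boot option has nothing to do, since the memory is unlocked before it is freed; it only matters when the finally block never runs.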

Internal kernel changes

  • Developers may be unaware of the no_printk() macro. Its job is to do nothing, but to preserve printk() statements in the code should somebody need to restore them for future debugging purposes. In prior kernels, no_printk() still contributed indexing data to the kernel image, even though it printed nothing; that has been fixed for 6.10.
  • Some changes to how memory for executable code in the kernel is allocated have made it possible to enable ftrace and kprobes without the need to enable loadable-module support.
  • Work items in BH workqueues can now be enabled and disabled; with this change, it should be possible to convert all tasklet users over to the new mechanism.
  • The (sometimes controversial) memory-allocation profiling subsystem has been merged; this should help developers optimize memory use and track down memory leaks. See this documentation commit for some more information.
  • There are 371 more symbols exported to modules in 6.10, and 18 new kfuncs; see this page for the full list of changes.

If this development cycle follows the usual timeline (and they all do anymore), then the final 6.10 release will happen on July 14 or 21. Between now and then, though, there will be a need for a lot of testing and bug fixing.

[Note that LWN subscribers can find more information about the contributions to 6.10-rc1 in the LWN kernel-source database.]

Index entries for this article
KernelReleases/6.10


Posted May 27, 2024 16:07 UTC (Mon) by aszs (subscriber, #50252) [Link] (22 responses)

Could sshd (or its linker) have used this mseal() system call to block the xz backdoor from overwriting its entry points?

Posted May 27, 2024 16:51 UTC (Mon) by mussell (subscriber, #170320) [Link]

Yes and no. The dynamic linker can use mseal() to prevent the permissions of GOT/PLT from being modified during runtime, thus ensuring it is always read-only throughout the entire execution of a program. It wouldn't have helped against the xz backdoor as it used IFUNC handlers to modify the PLT during loading before it would have been marked read-only.

Posted May 27, 2024 19:52 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (20 responses)

No, but it will prevent libsystemd from working properly because new lazy dynamic dependencies won't work if the text is sealed before one of them is called.

Posted May 27, 2024 22:13 UTC (Mon) by aszs (subscriber, #50252) [Link] (17 responses)

ok... considering that change would have blocked the xz backdoor, apparently a naive use of mseal() would be a step backwards... and it sounds like in this case preventing transitive dependencies from setting IFUNCs would be the most obvious hole to plug.

Posted May 28, 2024 0:34 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (16 responses)

The correct change would have been to split libsystemd into libmeaculpa and libjournald, where libmeaculpa would have contained only the basic systemd interaction functionality (readiness and other housekeeping features) and libjournald would have had everything to do with the journal (including decompressors). It also would have solved the problems with ssh, but without adding tons of new infrastructure.

Posted May 28, 2024 8:24 UTC (Tue) by bluca (subscriber, #118303) [Link] (15 responses)

Do you really have to repeat this nonsense under every article? It's really tiring, you obviously have no idea how any of this works the way it does and why, so how about you just drop it? Pretty please?

Posted May 28, 2024 12:25 UTC (Tue) by diegor (subscriber, #1967) [Link]

Do you have a link to the rationale for not splitting the library? I'm quite interested to know why.

Thanks.

Posted May 28, 2024 15:07 UTC (Tue) by pbonzini (subscriber, #60935) [Link] (13 responses)

But it's true. There's parts of libsystemd that are simple stuff, that has already been implemented dozens of times and that should be a simple copylib rather than a shared library. The rest is the journal.

Using libsystemd for the former is just laziness as things stand. As a copylib it makes more sense.

Posted May 28, 2024 16:13 UTC (Tue) by bluca (subscriber, #118303) [Link] (12 responses)

Yeah, trivial stuff like a complete D-Bus library, session management APIs and device events handling. Couple of lines that can be copy pasted with no trouble.

Posted May 28, 2024 16:42 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Yet compressor dependencies are only needed for the journal part. Unless I'm missing something?

Posted May 28, 2024 16:51 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

Also, having a DBUS implementation in a library that gets loaded into every process is NOT a point of pride.

Posted May 28, 2024 17:55 UTC (Tue) by daroc (editor, #160859) [Link] (3 responses)

I think this is getting somewhat off topic; the technical details of how mseal() could or couldn't be used by systemd are interesting. Relitigating systemd's design decisions is both not really relevant to that, and unlikely to come to anything when nobody seems to be changing their mind.

So let's please end this thread (and the related one with Lennart) here.

Posted May 28, 2024 20:24 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

I'm sorry, but I think the design decisions related to libsystemd are absolutely relevant here. mseal() interaction with dynamic loading seems to be relevant, as is the possible future pinsyscalls() analog.

libsystemd is certainly not the only offender, glibc with NSS and PAM modules is another example. But glibc is moving in the _opposite_ direction and has removed dynamically loaded libpthread.

And I don't think that the overall systemd design is in question here, just the libsystemd part. And for the record, I _love_ systemd infrastructure in general.

Posted May 28, 2024 20:49 UTC (Tue) by daroc (editor, #160859) [Link] (1 responses)

I do agree that it's interesting to know how mseal() interacts with dynamic loading, and that design decisions of systemd touch on that. The message I responded to seemed to me to be getting away from that -- and also to be somewhat confrontational. If you had said something like "And having a DBUS connection in every process is a source of unnecessary complexity, since many processes do not use it", I wouldn't have said anything.

I definitely don't want to stop you and bluca from having an interesting conversation about the future of this technology and the systemd project. I do really like the details we get from discussions in the comments; I just wanted to try and prevent another heated argument like the one we had this past weekend. We ended up having to turn on comment moderation for that article, so perhaps I'm just being overly-sensitive to comment moderation right now.

Please do talk about how mseal() will or will not change things for systemd. Please don't get heated about it.

Posted May 28, 2024 20:51 UTC (Tue) by daroc (editor, #160859) [Link]

(And, I should say -- this naturally applies to everyone. I'm not trying to single out Cyberax. Let's all remember to remain polite, respectful, and informative, like the box above the comment submission form says)

Posted May 28, 2024 17:56 UTC (Tue) by bluca (subscriber, #118303) [Link] (3 responses)

Usual rubbish. What you, again, fail to understand, is that it's there for free, given there _will_ be multiple processes actively using those APIs, which means the shared library is already loaded in memory by the kernel, and cached among all processes using it. So there is (next to) no extra cost if you don't need it, and if suddenly requirements change and you do need it, you get it for free.

Posted May 28, 2024 20:14 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

The question here is not just the overhead (though it IS an issue for containers), but the attack/failure surface. I don't think it's good that every application now has DBUS interface code, purely from a design-cleanliness perspective, even if it's mostly dormant.

Posted May 28, 2024 22:19 UTC (Tue) by Wol (subscriber, #4433) [Link] (1 responses)

THIS!

Does this mean any attacker who manages to hijack my program, now has access to DBUS with my credentials?

Not that I understand DBUS in the slightest, but the less code there is in my programs that I don't understand, the happier I am.

And that is actually a real danger - if I'm clueless about DBUS (because I don't knowingly use it), then how am I supposed to defend myself against its misuse (and more to the point - why should I have to)?

Cheers,
Wol

Posted May 28, 2024 22:36 UTC (Tue) by mjg59 (subscriber, #23239) [Link]

The entire point of arbitrary code execution in a process context is that you can execute arbitrary code, not just whatever code already existed in the process.

Posted May 29, 2024 16:40 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (1 responses)

You're right, it's actually much worse than I remembered.

I was referring basically to sd-daemon.h. Those are the only bits that are of interest to projects outside systemd. If you want to keep it as a .so, move everything else out of the way; but they shouldn't be in the same place as journal, D-BUS and everything else.

Posted May 29, 2024 16:51 UTC (Wed) by bluca (subscriber, #118303) [Link]

> Those are the only bits that are of interest to projects outside systemd.

A quick search for 'sd_bus_message' on Github, excluding the systemd org, shows ~12700 results, and that's just one subset of one family of APIs, so I'm going to go with a big and fat <citation needed>

> If you want to keep it as a .so, move everything else out of the way

No

Posted May 28, 2024 14:30 UTC (Tue) by mezcalero (subscriber, #45103) [Link] (1 responses)

That's just rubbish, you apparently have no idea what you are talking about. When you have dlopen() dependencies you resolve your symbols manually via dlsym(), not implicitly via ELF's GOT/PLT. Hence, mseal()ing the GOT/PLT won't affect things for dlopen()-based "weak" deps at all. (It won't be able to lock down the security of the pointers you store dlsym() return values in either, though, but that's not quite the same as preventing "libsystemd from working properly".)

Anyway, would appreciate if you'd stop your uneducated FUD, not helpful.

Lennart
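
[Editor's sketch] To make the distinction in this comment concrete: with a dlopen()-based optional dependency, the program looks up each symbol itself with dlsym() and keeps the resulting pointer in its own data, so nothing is resolved through the GOT/PLT. The Python sketch below uses ctypes, whose CDLL constructor calls dlopen() and whose attribute lookup calls dlsym(). The helper name optional_symbol() and the choice of zlib as the example library are assumptions for illustration; none of this is taken from libsystemd.

```python
import ctypes


def optional_symbol(soname, symbol):
    """Return a callable for symbol in soname, or None if unavailable."""
    try:
        library = ctypes.CDLL(soname)  # dlopen() under the hood
    except OSError:
        return None                    # library could not be loaded
    # Attribute lookup performs dlsym(); a missing symbol raises
    # AttributeError, which getattr() turns into the default.
    return getattr(library, symbol, None)


zlib_version = optional_symbol("libz.so.1", "zlibVersion")
if zlib_version is None:
    print("zlib not available; compression disabled")
else:
    zlib_version.restype = ctypes.c_char_p
    print("zlib", zlib_version().decode())
```

The function pointer ends up in an ordinary writable variable, which is the part that, as the comment notes, sealing the GOT/PLT would not protect.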

Posted May 28, 2024 16:50 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

> It won't be able to lock down the security of the pointers you store dlsym() return values in either though, but that's not quite the same as preventing "libsystemd from working properly"

As I understand it, mseal() should prevent dlopen() from working, should it not?

> Anyway, would appreciate if you'd stop your uneducated FUD, not helpful.

Introducing brittle workarounds for clearly specious reasons is not helpful either. You screwed up by turning libsystemd from a small 50kb library with utility functions into a large library with decompressors and DBUS implementation.


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds