
4.12 Merge window part 1

By Jonathan Corbet
May 3, 2017
The 4.12 merge window opened on May 1; as of this writing, just over 4,300 non-merge changesets have been pulled into the mainline repository. Though things are just beginning, it has the look of yet another busy development cycle for the kernel community. Thus far, the bulk of the changes merged have been in the block I/O and networking areas.

Some of the more interesting user-visible changes merged thus far include:

  • As expected, the BFQ and Kyber block I/O schedulers have been merged. The kernel now has two multiqueue I/O schedulers suitable for widely varying use cases, and the long wait for BFQ in the mainline is at an end.

  • The blk-throttle control-group controller has a new "low" limit that serves as a sort of soft cap. No group is allowed to exceed its low limit until all active groups have reached their respective low limits. It is also now possible to adjust the sample period used by the controller, trading off fine control against CPU overhead.

  • The LightNVM subsystem has gained a "pblk" target which will expose an open-channel SSD as an ordinary-looking block device.

  • The arch_prctl() system call has two new operations: ARCH_SET_CPUID to allow trapping of the CPUID instruction, and ARCH_GET_CPUID to get the current state of that trapping. This feature, which is only implemented on the x86 architecture, is expected to be useful for tracing applications that want to trap and emulate this instruction.

  • As usual, the perf events subsystem has seen a number of changes; see this merge commit for the list.

  • The BPF virtual machine subsystem has seen a few improvements. Maps are now able to contain other maps, allowing them to be cascaded to multiple levels. There is a new in-kernel testing framework for BPF programs, controlled by the new BPF_PROG_TEST_RUN command to the bpf() system call. And there is now a just-in-time BPF compiler for the SPARC64 architecture.

  • The epoll_wait() system call can now perform busy-polling of network sockets, reducing packet-reception latencies.

  • The "hybrid consistency model" for live kernel patching has been merged. This model, discussed in this article, enables the application of patches that change function or data semantics. See this commit for an overview of how it works.

  • The MD RAID5 implementation has gained "partial parity log" support. This feature can reduce the possibility of corruption when running with a degraded array. See Documentation/md/raid5-ppl.txt for more information.

  • The device mapper supports a new dm-integrity target; it emulates a device with extra per-sector integrity tags. See Documentation/device-mapper/dm-integrity.txt for details.

  • New hardware support includes:

    • Cryptographic: Cavium ThunderX "ZIP" compression engines, Freescale CAAM Queue-Interface crypto engines, STMicroelectronics STM32 crypto accelerators, and Mediatek random number generators.

    • Input: Accutouch 2216 touch controllers.

    • Miscellaneous: ASPEED AST2400/AST2500 PWM and fan controllers, Motorola CPCAP PMIC battery chargers, LEGO MINDSTORMS EV3 batteries, Broadcom FlexRM ring managers, Mediatek MT6323 PMIC LED controllers, Motorola CPCAP PMIC LED controllers, DaVinci DM816 AHCI SATA controllers, and NVIDIA Tegra186 CPU-frequency controllers.

    • Multi-media card: Broadcom BCM2835 SDHOST MMC controllers, Cavium ThunderX and Octeon SD/MMC card interfaces, and Marvell Xenon eMMC/SD/SDIO SDHCI interfaces.

    • Networking: APM X-Gene SoC Ethernet interfaces, Synopsys DWC Enterprise Ethernet adapters, Holt HI311x SPI CAN controllers, Cascoda CA8210 SPI 802.15.4 wireless controllers, SMSC/MicroChip LAN9303 three-port Ethernet switches, and Microchip CAN bus analyzer interfaces.

    • Pin control: Axis ARTPEC-6 pin controllers, Marvell 37xx SoC pin controllers, and STMicroelectronics STM32F469 pin controllers.
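
As an illustration of the CPUID-trapping operations in the list above, the following is a minimal sketch of how a user-space program might query the current state. It assumes the operations are issued through the raw arch_prctl() system call and that the constants match the x86 <asm/prctl.h> header as of 4.12 (they are defined locally in case the installed headers are older). The helper name cpuid_state() is hypothetical, and the code has not been checked against every libc or architecture.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values from the x86 <asm/prctl.h> header in 4.12. */
#ifndef ARCH_GET_CPUID
#define ARCH_GET_CPUID 0x1011
#endif
#ifndef ARCH_SET_CPUID
#define ARCH_SET_CPUID 0x1012
#endif

/*
 * Hypothetical helper: ask the kernel whether CPUID is currently
 * allowed to execute in this thread.
 *
 * Returns 1 if CPUID executes normally, 0 if it is set to trap,
 * and -1 (with errno set) if the kernel or architecture does not
 * support the operation.
 */
static long cpuid_state(void)
{
#ifdef SYS_arch_prctl
    return syscall(SYS_arch_prctl, ARCH_GET_CPUID, 0UL);
#else
    /* arch_prctl() is not available on this architecture. */
    errno = ENOSYS;
    return -1;
#endif
}
```

A tracer would presumably pair this with ARCH_SET_CPUID (passing zero to request trapping) and then handle the resulting signal to emulate the instruction; that part is omitted here because it depends on hardware support for CPUID faulting.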

Changes visible to kernel developers include:

  • The "hd" disk driver, written by Linus and present since the 0.01 release, has been removed at last.

  • The new "AnalyzeBoot" tool can create a timeline of the kernel's bootstrap process in HTML format.

  • The code for accessing user-space data from the kernel has been significantly reworked, resulting in the removal of a lot of architecture-specific code.

  • The AVR32 architecture has been removed from the kernel. The chips have been past their end of life for some time, and the kernel code has been poorly maintained at best.

  • The "generic XDP" functionality in the networking stack implements express data path functionality on devices that lack their own optimized implementation. It is meant to make XDP functionality more widely available, especially for developers who are new to it.

The 4.12 merge window will likely remain open through May 14, and the 4.12 release will probably happen in early July. As always, LWN will continue to follow the patch stream as this merge window runs its course.

Index entries for this article
KernelReleases/4.12



4.12 Merge window part 1

Posted May 4, 2017 6:58 UTC (Thu) by alison (subscriber, #63752) [Link] (3 responses)

'The new "AnalyzeBoot" tool can create a timeline of the kernel's bootstrap process in HTML format.'

How is AnalyzeBoot different from systemd-bootchart? Speaking of systemd, are we right to assume that Andy Lutomirski has killed the most recent attempt to get D-Bus-like broadcast IPC notifications into the kernel?

4.12 Merge window part 1

Posted May 4, 2017 16:01 UTC (Thu) by Tara_Li (guest, #26706) [Link]

Hopefully, AnalyzeBoot doesn't *depend* on systemd. There actually *are* people who prefer to use a different init system.

4.12 Merge window part 1

Posted May 4, 2017 16:20 UTC (Thu) by smcv (subscriber, #53363) [Link]

As a user-space component, systemd-bootchart doesn't know anything before pid 1 comes up (it starts as pid 1, forks to run in the background, and execs the real init as pid 1). It also doesn't have access to kernel internals in detail.

If the majority of your boot cost is happening in user-space, systemd-bootchart is probably still the more useful tool (particularly for low-hanging fruit like "everything seems to be waiting for foo, can I avoid that?"), but if a large part of your boot cost is in kernel-space, this new thing is how you would analyze that part.

4.12 Merge window part 1

Posted May 6, 2017 9:10 UTC (Sat) by flussence (guest, #85566) [Link]

To take that question further back in history: how is systemd-bootchart different from pybootchart?

"AnalyzeBoot"

Posted May 4, 2017 8:33 UTC (Thu) by johnjingleheimer (guest, #115088) [Link] (6 responses)

Is the "AnalyzeBoot" tool mentioned here the same as analyze_boot.py[1], which is part of Intel's pm-graph tools[2], which are in turn part of the work being done to improve suspend and resume operations?[3]

[1] https://github.com/01org/pm-graph/blob/master/analyze_boo...
[2] https://github.com/01org/pm-graph
[3] https://01.org/suspendresume

"AnalyzeBoot"

Posted May 4, 2017 9:41 UTC (Thu) by ntnn (guest, #109693) [Link]

Yes, in this[1] message the pull for 4.12-rc1 is requested with the pm-graph tools, containing AnalyzeBoot.
This[2] is the actual script, if someone wants to compare with the official repo[3].

[1]: http://lkml.iu.edu/hypermail/linux/kernel/1705.0/00334.html
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/li...
[3]: https://github.com/01org/pm-graph/blob/master/analyze_boo...

Suspend/Resume

Posted May 5, 2017 22:39 UTC (Fri) by fratti (guest, #105722) [Link] (4 responses)

>https://01.org/suspendresume

Huh, isn't the normal thing for "fast" power-saving to set the processor into lower p-states? I mostly know suspend/resume from laptops, since for phones you'll still want to occasionally do things in the background, and for servers you're really just waiting for a new request of some form to come in, or a batch job to start.

Is this some work to make suspend/resume fast enough so it can be initiated automatically without possibly annoying the user?

Suspend/Resume

Posted May 6, 2017 9:29 UTC (Sat) by flussence (guest, #85566) [Link] (3 responses)

The automatic suspend/resume feature already exists in the kernel under CONFIG_PM_AUTOSLEEP. It's nearly useless on an average x86 ACPI machine; ARM platforms can enter/leave the equivalent of S1/S3 state in milliseconds, so it's the main method of power saving there. Compare the responsiveness of the "power" button on a phone to the suspend button on a laptop - they do the same thing.

It sounds like Intel is making a serious attempt to catch up on that front, and if they can get a high wattage Xeon to wake on LAN and start servicing requests in under a second I can see it becoming a big selling point. They need to get there before ARM servers start eating their lunch en-masse, though.

Suspend/Resume

Posted May 6, 2017 15:01 UTC (Sat) by fratti (guest, #105722) [Link] (2 responses)

Ah, I did not realise phones entered such a deep sleep when the screen was locked. I always figured if that was the case, the wireless connections and such would drop, but I guess these are handled by external chips and they can wake up the main processor when something interesting is happening.

Thanks for the insight on this!

Suspend/Resume

Posted May 12, 2017 9:55 UTC (Fri) by oldtomas (guest, #72579) [Link]

> but I guess these are handled by external chips

Well, kind of, only that they are subsystems (with CPU, RAM, ROM)... on the same chip. Cost pressure has forced manufacturers to move next to everything into That One Big Chip.

Although unrelated, these two articles have the nice side effect of updating one's idea of the current State of the Phone:

https://googleprojectzero.blogspot.de/2017/04/over-air-ex...
https://googleprojectzero.blogspot.de/2017/04/over-air-ex...

(and, btw, it's through LWN that I was made aware of them).

Suspend/Resume

Posted May 16, 2017 15:42 UTC (Tue) by flussence (guest, #85566) [Link]

Wi-Fi standards are a twisty maze of layered hacks to facilitate power saving. Chances are your phone is mostly disconnected when it's switched off, but reconnects fast enough when it wakes up to put on a convincing illusion that it's always-on.

Most Android phones also have Play Services and its server-push API, so instead of O(n) apps polling for new messages separately it's a single connection to Google's servers. (Anecdotally, I found my phone's battery life tripled after removing the Google-ware, but that's probably more to do with it triggering the OOM killer constantly...)

hd disk driver

Posted May 5, 2017 22:30 UTC (Fri) by fratti (guest, #105722) [Link] (3 responses)

 *  Bugfix: max_sectors must be <= 255 or the wheels tend to come
 *  off in a hurry once you queue things up - Paul G. 02/2001

and

/* Uncomment the following if you want verbose error reports. */
/* #define VERBOSE_ERRORS */

Man, this code is gold.

hd disk driver

Posted May 8, 2017 13:08 UTC (Mon) by BenHutchings (subscriber, #37955) [Link] (2 responses)

  • Gets high-precision timestamp by reading the i8253 directly and assuming CONFIG_HZ == 100
  • Non-configurable I/O port addresses
  • Does (nearly) everything in hard interrupt handlers

hd disk driver

Posted May 8, 2017 15:05 UTC (Mon) by excors (subscriber, #95769) [Link]

And there's the nice "for (i = 0; i < 1000; i++) barrier();" (an optimisation of Linux 0.01's "for(i = 0; i < 1000; i++) nop();" - the barrier version saves the cost of the explicit nop instruction). Back in 1991 I guess that would have delayed for a couple of hundred microseconds, nowadays it's a couple of hundred nanoseconds. I hope the hardware has very tolerant timing constraints.

Seems a good idea to be deleting this kind of obsolete code, in case people writing new drivers happen to look at it and don't realise it's obsolete and pick up bad habits. (I suspect I've written some bad code myself for that reason.)
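
For readers who want to check the delay estimate above on their own hardware, here is a rough user-space sketch. It assumes barrier() behaves like the kernel's compiler-only barrier (an empty asm statement with a memory clobber); the function name time_barrier_loop_ns() is made up for this example, and the measured time includes the overhead of the two clock reads.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Compiler barrier only: emits no instruction, but keeps the
 * compiler from optimizing the empty loop away. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical helper: run the barrier loop and return the elapsed
 * wall-clock time in nanoseconds. */
static int64_t time_barrier_loop_ns(int iterations)
{
    struct timespec start, end;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++)
        barrier();
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
           + (end.tv_nsec - start.tv_nsec);
}
```

Calling time_barrier_loop_ns(1000) matches the loop count quoted from the driver; the result will vary with the CPU, the compiler, and the clock source.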

hd disk driver

Posted May 8, 2017 15:17 UTC (Mon) by nix (subscriber, #2304) [Link]

It even uses interrupts rather than function calls. It's sort of like a really twisted continuation-passing style. :)


Copyright © 2017, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds