2.6.36 merge window: the sequel
User-visible changes include:
- The ext3 filesystem, once again, defaults to the (safer) "ordered"
mode at mount time. This reverses the change (to "writeback" mode)
made in 2009, which was typically overridden by distributions.
- The out-of-memory killer has
been rewritten. The practical result is that the system may
choose different processes to kill in out-of-memory situations, and
the user-space API for adjusting how attractive processes appear to
the OOM killer has changed.
- The fanotify mechanism
has been merged. Fanotify allows a user-space daemon to obtain
notification of file operations and, perhaps, block access to specific
files. It is intended for use with malware scanning applications, but
there are other potential uses (hierarchical storage management, for
example) as well.
- There is a new system call for working with resource limits:
int prlimit64(pid_t pid, unsigned int resource, const struct rlimit64 *new_rlim, struct rlimit64 *old_rlim);
It is meant to (someday) replace setrlimit(); the differences include the ability to modify limits belonging to other processes and the ability to query and set a limit in a single operation.
- The TTY driver has gained support for the EXTPROC mode
supported by BSD for the last 20 years or so. This option was
originally developed to
facilitate telnet's "linemode", but it is useful for contemporary
protocols as well.
- New drivers:
- Processors and systems: Ingenic JZ4740 SOC systems,
Trapeze ITS GPR boards,
ifm PDM360NG boards,
Freescale P1022DS reference boards,
TQM mcp8xx-based boards,
TI TNETV107X-based systems,
OMAP4430-based PandaBoards,
NVIDIA Tegra-based systems, and
Tilera TILEPro and TILE64 processors (a whole new architecture).
- Block:
QLogic ISP82XX host adaptors,
AppliedMicro 460EX processor on-chip SATA controllers,
Samsung S3C/S5P board PATA controllers, and
Moorestown NAND Flash controllers.
- Media:
EasyCAP USB video adapters,
Softlogic 6x10 MPEG codec cards,
Winbond/Nuvoton NUC900-based audio controllers,
Cirrus Logic CS42L51 codecs,
Cirrus Logic EP93xx series audio devices,
Marvell Kirkwood I2S audio devices,
Ingenic JZ4740-based audio devices,
SmartQ board audio devices,
Wolfson Micro WM8741 codecs, and
Samsung S5P FIMC video postprocessors.
- Miscellaneous: Silicon Image sil164 TMDS transmitters, TI DSP bridge devices, PCILynx TSB12LV21/A/B controllers (as a FireWire sniffer; the user-space side has also been added under tools/firewire), Bosch Sensortec BMP085 digital pressure sensors, ROHM BH1780GLI ambient light sensors, Honeywell HMC6352 compasses, Summit Microelectronics SMM665 six-channel active DC output controller/monitor devices, JEDEC JC 42.4 compliant temperature sensors, Intel Topcliff PCH DMA controllers, Intel Moorestown DMAC1 and DMAC2 controllers, Intel Moorestown MAX3110 and MAX3107 UARTs, Intel Medfield UARTs, Quatech SSU-100 USB serial ports, and ARM Primecell SP805 watchdog timers.
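The query-and-set behavior of prlimit64() described in the list above can be tried from user space. The following is a minimal sketch, assuming a C library that provides the prlimit() wrapper (glibc added one after this kernel was released); the helper name swap_soft_limit() is hypothetical and not part of any API.

```c
#define _GNU_SOURCE            /* exposes the C library's prlimit() wrapper */
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Hypothetical helper: install a new soft limit for `resource` on process
 * `pid` (0 means the caller) while keeping the hard limit unchanged.
 * The limits in force before the change are returned through *old.
 * Returns 0 on success, -1 with errno set on failure.
 */
int swap_soft_limit(pid_t pid, int resource, rlim_t new_soft,
                    struct rlimit *old)
{
    struct rlimit cur, wanted;

    assert(old != NULL);

    /* Query only: a NULL new-limit pointer leaves the limit untouched. */
    if (prlimit(pid, resource, NULL, &cur) != 0)
        return -1;

    wanted.rlim_cur = new_soft;
    wanted.rlim_max = cur.rlim_max;

    /* Set and query in one call: *old receives the previous values. */
    return prlimit(pid, resource, &wanted, old);
}
```

Calling swap_soft_limit(0, RLIMIT_CORE, 0, &old) would disable core dumps for the calling process and leave the previous limits in old. Passing the PID of another process exercises the cross-process case, which is subject to the usual permission checks.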
Changes visible to kernel developers include:
- The SCSI layer now supports runtime power management, but almost no
work has been done (yet) to push that support down into individual
drivers.
- The MIPS architecture now has kprobes support.
- The KGDB debugger is now supported on the Microblaze architecture.
- There are a few new build-time configuration commands:
listnewconfig outputs a list of new configuration options,
oldnoconfig sets all new configuration options to "no"
without asking,
alldefconfig sets all options to their default values, and
savedefconfig writes a minimal configuration file in
defconfig. (The
patch adding the first two options above also introduces a new
Whatevered-by: patch tag, with unknown semantics.)
- There is a new scripts/coccinelle directory containing a
number of Coccinelle
"semantic patches" which perform various useful checks. They can be
run with "make coccicheck".
- The kmemtrace ftrace plugin is gone; "perf kmem" should be used
instead. The ksym plugin has also been superseded by perf, and, thus,
removed.
- There is a new function for short, blocking delays:
void usleep_range(unsigned long min, unsigned long max);
This function will sleep (uninterruptibly) for a period between min and max microseconds. It is based on hrtimers, so the timing will be more precise than that obtained with msleep().
- The new IRQF_NO_SUSPEND flag for request_irq() will cause
the interrupt line
not to be disabled during suspend; IRQF_TIMER can no longer
be (mis)used for this purpose.
- The concurrency-managed
workqueues patch set has been merged, completely changing the way
workqueues are implemented. One immediate user-visible result will be
that there should be far fewer kernel threads running on most systems.
All users of the "slow work" API have been converted to
concurrency-managed workqueues, so the slow work mechanism has been
removed from the kernel.
- The cpuidle mechanism has been enhanced to allow for the set of
available idle states to change over time. Details can be found in this
patch.
- The Blackfin architecture has gained dynamic ftrace support.
- There is a new super_operations method called
evict_inode(); it handles all of the necessary work when an
in-core inode is being removed. It should be used instead of
clear_inode() and delete_inode().
- The in-kernel inotify API has been removed; kernel code must use the
fsnotify mechanism instead. (Of course, the user-space
inotify interface is still supported.)
- The Video4Linux2 layer has gained a new framework which simplifies the
handling of controls; see this
commit and Documentation/video4linux/v4l2-controls.txt
for details.
- The open() and release() functions in struct
block_device_operations are now called without the big kernel
lock held. Additionally, the locked_ioctl() function has
gone away; all block drivers must implement their own locking there as
well.
- The domain name resolution code has been pulled out of the CIFS filesystem and made generic. It works by using the key mechanism to request DNS resolution from user space; see Documentation/networking/dns-resolver.txt for details.
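As noted in the list above, only the in-kernel inotify API has gone away; user-space programs continue to use the inotify system calls as before. The following is a minimal sketch of that unchanged interface; the helper name sees_create_event() is hypothetical, and the code assumes a Linux system with GCC or Clang.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: watch `dir` for file creation, create `name` in it,
 * and read the resulting event back. Returns 1 if an IN_CREATE event for
 * `name` was seen, 0 if some other event arrived first, -1 on error.
 */
int sees_create_event(const char *dir, const char *name)
{
    char path[PATH_MAX];
    /* Room for one event carrying a maximum-length file name. */
    char buf[sizeof(struct inotify_event) + NAME_MAX + 1]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    const struct inotify_event *ev;
    ssize_t len;
    int ifd, fd, result = -1;

    ifd = inotify_init1(IN_CLOEXEC);
    if (ifd < 0)
        return -1;
    if (inotify_add_watch(ifd, dir, IN_CREATE) < 0)
        goto out;

    snprintf(path, sizeof(path), "%s/%s", dir, name);
    fd = open(path, O_CREAT | O_WRONLY | O_EXCL, 0600);
    if (fd < 0)
        goto out;
    close(fd);

    /* The event was queued when the file was created, so this returns. */
    len = read(ifd, buf, sizeof(buf));
    if (len >= (ssize_t)sizeof(struct inotify_event)) {
        ev = (const struct inotify_event *)buf;
        result = (ev->mask & IN_CREATE) && ev->len > 0 &&
                 strcmp(ev->name, name) == 0;
    }
    unlink(path);
out:
    close(ifd);
    return result;
}
```

A caller would pass a directory it can write to, for example one created with mkdtemp(), along with a file name that does not yet exist there.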
The merge window remains open as of this writing, so we may yet see more
interesting features merged for 2.6.36. Watch this space next week for the
final merge window updates for this development cycle.
Index entries for this article | |
---|---|
Kernel | Releases/2.6.36 |
Posted Aug 12, 2010 0:07 UTC (Thu)
by viro (subscriber, #7872)
[Link]
Current rules are pretty simple:
1) ->drop_inode() is called when we release the last reference to struct inode. It tells us whether fs wants inode to be evicted (as opposed to retained in inode cache). Doesn't do actual eviction (as it used to), just returns an int. The normal policy is "if it's unhashed or has no links left, evict it now". generic_drop_inode() does these checks. NULL ->drop_inode means that it'll be used. generic_delete_inode() is "just evict it". Or fs can set rules of its own; grep and you'll see.
2) ->delete_inode() and ->clear_inode() are gone; ->evict_inode() is called in all cases when inode (without in-core references to it) is about to be kicked out, no matter why that happens (->drop_inode() telling that it shouldn't be kept around, memory pressure, umount, etc.) It will be called exactly once per inode's lifetime. Once it returns, inode is basically just a piece of memory about to be freed.
3) ->evict_inode() _must_ call end_writeback(inode) at some point. At that point all async access from VFS (writeback, basically) will be completed and inode will be fs's to deal with. That's what calls of clear_inode() in original ->delete_inode() should turn into. Don't dirty an inode past that point; it never worked to start with (writeback logics would've refused to trigger ->write_inode() on such inodes) and now it'll be detected and whined about.
4) kicking the pages out of page cache (== calling truncate_inode_pages()) is up to ->evict_inode() instance; that was already the case for ->delete_inode(), but not for ->clear_inode(). Of course, if fs doesn't use page cache for that inode, it doesn't have to bother. Other than that, ->evict_inode() instance is basically a mix of old ->clear_inode() and ->delete_inode(). inodes with NULL ->evict_inode() behave exactly as ones with NULL ->delete_inode() and NULL ->clear_inode() used to.
That's it. Original was much more convoluted...
Posted Aug 19, 2010 9:33 UTC (Thu)
by renox (guest, #23785)
[Link] (1 responses)
This kind of change of the default setting looks really amateurish from an external point of view. Is there an interesting inside story with this?
Posted Aug 19, 2010 11:45 UTC (Thu)
by cladisch (✭ supporter ✭, #50193)
[Link]
From this thread:
Re: ext3: default to ordered mode
Ok, so now I know *why* that one filesystem got busted - I built a
kernel without CONFIG_EXT3_DEFAULTS_TO_ORDERED set and it got a
forced reboot (echo b > proc/sysrq-trigger). That'll teach me for
trying to reproduce bugs Andrew is tripping over with his config
files.
Quite frankly, data=writeback mode for ext3 is a dangerous,
dangerous configuration to run by default. IMO, it shouldn't be the
default. Patch below.
Jan's changelog:
data=writeback mode is dangerous as it leads to higher data loss and stale data
exposure when systems crash. It should not be the default, especially when all
major distros ensure their ext3 filesystems default to ordered mode. Change the
default mode to the safer data=ordered mode, because we should be caring far
more about avoiding stale data exposure than performance.