User-space lockdep
Lockdep works by adding wrappers around the locking calls in the kernel. Every time a specific type of lock is taken or released, that fact is noted, along with ancillary details like whether the processor was servicing an interrupt at the time. Lockdep also notes which other locks were already held when the new lock is taken; that is the key to much of the checking that lockdep is able to perform.
To illustrate this point, imagine that two threads each need to acquire two locks, called A and B.
If one thread acquires A first while the other grabs B first, each thread ends up holding exactly one of the two locks.
Now, when each thread goes for the lock it lacks, the system is in trouble.
Each thread will now wait forever for the other to release the lock it holds; the system is now deadlocked. Things may not come to this point often at all; this deadlock requires each thread to acquire its lock at exactly the wrong time. But, with computers, even highly unlikely events will come to pass sooner or later, usually at a highly inopportune time.
This situation can be avoided: if both threads adhere to a rule stating that A must always be acquired before B, this particular deadlock (called an "AB-BA deadlock" for obvious reasons) cannot happen. But, in a system with a large number of locks, it is not always clear what the rules for locking are, much less that they are consistently followed. Mistakes are easy to make. That is where lockdep comes in: by tracking the order of lock acquisition, lockdep can raise the alarm anytime it sees a thread acquire A while already holding B. No actual deadlock is required to get a "splat" (a report of a locking problem) out of lockdep, meaning that even highly unlikely deadlock situations can be found before they ruin somebody's day. There is no need to wait for that one time when the timing is exactly wrong to see that there is a problem.
Lockdep is able to detect more complicated deadlock scenarios than the one described above. It can also detect related problems, such as locks that are not interrupt-safe being acquired in interrupt context. As one might expect, running a kernel with lockdep enabled tends to slow things down considerably; it is not an option that one would enable on a production system. But enough developers test with lockdep enabled that most problems are found before they make their way into a stable kernel release. As a result, reports of deadlocks on deployed systems are now quite rare.
Kernel-based tools often do not move readily to user space; the kernel's programming environment differs markedly from a normal C environment, so kernel code can normally only be expected to run in the kernel itself. In this case, though, Sasha Levin noticed that there is not much in the lockdep subsystem that is truly kernel-specific. Lockdep collects data and builds graphs describing observed lock acquisition patterns; it is code that could be run in a non-kernel context relatively easily. So Sasha proceeded to put together a patch set creating a lockdep library that is available to programs in user space.
Lockdep does, naturally, call a number of kernel functions, so a big part of Sasha's patch set is a long list of stub implementations shorting out calls to functions like local_irq_enable() that have no meaning in user space. An abbreviated version of struct task_struct is provided to track threads in user space, and functions like print_stack_trace() are replaced with user-space equivalents (backtrace_symbols_fd() in this case). The locks that lockdep itself uses internally are reimplemented using POSIX thread ("pthread") mutexes. Stub versions of the include files used by the lockdep code are provided in a special directory. And so on. Once all that is done, the lockdep code can be built directly out of the kernel tree and turned into a library.
User-space code wanting to take advantage of the lockdep library needs to start by including <liblockdep/mutex.h>, which, among other things, adds a set of wrappers around the pthread_mutex_t and pthread_rwlock_t types and the functions that work with them. A call to liblockdep_init() is required; each thread should also make a call to liblockdep_set_thread() to set up information for any problem reports. That is about all that is required; programs that are instrumented in this way will have their pthreads mutex and rwlock usage checked by lockdep.
As a proof of concept, the patch adds instrumentation to the (thread-based) perf tool contained within the kernel source tree.
One of the key aspects of Sasha's patch is that it requires no changes to the in-kernel lockdep code at all. The user-space lockdep library can be built directly out of the kernel tree. Among other things, that means that any future lockdep fixes and enhancements will automatically become available to user space with no additional effort required on the kernel developers' part.
In summary, this patch looks like a significant win for everybody involved; it is thus not surprising that opposition to its inclusion has been hard to find. There has been a call for some better documentation, explicit mention that the resulting user-space library is GPL-licensed, and a runtime toggle for lock validation (so that the library could be built into applications but not actually track locking unless requested). Such details should not be hard to fill in, though. So, with luck, user space should have access to lockdep in the near future, resulting in more reliable lock usage.
Posted Feb 8, 2013 23:17 UTC (Fri)
by cmccabe (guest, #60281)
[Link] (5 responses)
https://github.com/cmccabe/lksmith
I've been trying to get more people interested in it, but I have a day job and I don't have a ton of time to devote to this. Hopefully I'll get a chance to give a talk on it at a conference or two this year.
I admit I only skimmed this patch set, but from a first glance the differences are: they require an init() call, whereas my library does not; and they're GPLv2-only, whereas my library is BSD-licensed.
Another tool that can be used to debug these kinds of problems is helgrind, from the awesome valgrind suite of tools. helgrind has some limitations, though: for example, the documentation urges you not to use condition variables because it doesn't support them. Also, helgrind runs your program rather slowly.
I think one thing that is important for a lot of projects is the ability to roll their own locks and have them be instrumented. For example, if you rolled your own locks with futexes or atomic instructions, you need some way to notify your lock debugging library of what you've done. (I received this request from one potential Locksmith user). Please correct me if I'm wrong, but it seems to me that it's going to be difficult to do that in your GPLv3/BSD/proprietary program if it means linking against GPLv2-only source like liblockdep.
Posted Feb 9, 2013 1:37 UTC (Sat)
by sashal (✭ supporter ✭, #81842)
[Link] (3 responses)
1. We need an init call.
This is why I sent another patch series a couple of hours before this article was published; it first eliminates the need for init calls, and then allows liblockdep to be used as an LD_PRELOAD library.
In essence, it means that both problems are solved: you don't need to call anything in your code to start liblockdep, and if you don't want to link with it, you don't have to!
You can now have liblockdep test code from the outside, without being compiled in your program. For example:
"lockdep perf sched record"
will now run a lockdep-analyzed perf doing its usual sched recording work.
Posted Feb 9, 2013 4:20 UTC (Sat)
by cmccabe (guest, #60281)
[Link]
I have a bunch of features I've been planning on adding to Locksmith, but I'm pretty busy right now so it will be a few weeks probably.
What would be even better is to see this kind of functionality implemented at the libc / pthreads level.
Posted Feb 10, 2013 23:49 UTC (Sun)
by asnast (guest, #74907)
[Link]
It would be nice if one could modify the output stream of liblockdep (currently it spits its output to stdout).
How about adding liblockdep_set_stream(FILE *) to achieve the above?
Posted Feb 28, 2013 9:34 UTC (Thu)
by bergwolf (guest, #55931)
[Link]
Do you have a git somewhere so people can pull the latest code?
Thanks!
Posted May 10, 2013 7:37 UTC (Fri)
by mingo (guest, #31122)
[Link]
If you want to permanently link it into your application then it obviously needs to be license compatible with the kernel's lockdep code.
2. This is GPL code.