CAP_SYS_ADMIN: the new root

March 14, 2012

This article was contributed by Michael Kerrisk.

Capabilities are—at least in theory—a nice idea: divide the privileges of root (user ID 0) into small pieces so that a process can be granted just enough power to perform specific privileged tasks. If the pieces are small enough, and well chosen, then, even if a privileged program is compromised (e.g., by a buffer overrun), the damage that can be done is limited by the set of capabilities that are available to the process. Good examples of the use of such fine-grained privileges are CAP_KILL, which permits sending signals to arbitrary processes, and CAP_SYS_TIME, which permits setting the system clock.

As of Linux 3.2, there are 36 capabilities. You can see a list of them, along with some of the main powers they each grant, in the capabilities(7) manual page. Capabilities can (since Linux 2.6.24) be attached to an executable file, to create the capabilities equivalent of a set-user-ID-root program: when the executable is run, the resulting process starts with a limited set of capabilities (instead of the full power of root, as is the case for set-user-ID-root programs).
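To make "a limited set of capabilities" concrete: the kernel reports a process's effective capability set as a hexadecimal bit mask in the CapEff field of /proc/[pid]/status. The sketch below decodes such a mask into names. The function name decode_cap_mask and the sample mask are illustrative only; the table assumes the capability numbering of include/linux/capability.h as of Linux 3.2.

```python
# Capability numbers 0..35, in the order defined by
# include/linux/capability.h as of Linux 3.2 (an assumption of this sketch).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP", "CAP_MAC_OVERRIDE",
    "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
]


def decode_cap_mask(hex_mask):
    """Return the names of the capabilities whose bits are set in hex_mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


# Hypothetical mask with only bit 5 (CAP_KILL) and bit 25 (CAP_SYS_TIME) set,
# formatted the way the CapEff field is printed (16 hex digits).
sample = format((1 << 5) | (1 << 25), "016x")
print(sample, decode_cap_mask(sample))
```

A process holding only these two bits could signal arbitrary processes and set the clock, but nothing more; a mask with bit 21 set is the case the rest of this article worries about.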

The key point from the beginning of this article is small pieces, and it's here that the Linux capabilities implementation has gone astray.

When a kernel developer adds a new feature that should require privilege, what capability should they use, or should they perhaps even create a new capability? Although parceling root privileges into small pieces is useful from a security perspective, we don't want too many pieces, since then the task of administering capabilities would become unwieldy. Thus, it usually makes sense to employ an appropriate existing capability to control access to a new privileged kernel feature.

And this is where the problem begins. First, there is—unsurprisingly, given the Linux development model—no central authority determining how capabilities should be assigned to privileged operations. Second, there is very little guidance on what capability to choose. (Probably the best existing guide is to look at the capabilities(7) man page. By comparing with existing uses in that page, we can get some guidance on choosing the capability that best matches a new use case.)

So in practice, what happens? A kernel developer looks at the list of available capabilities in the kernel include/linux/capability.h header file, and is likely left bewildered wondering which capability to choose. (It appears that the original intent was that this header file would be updated with comments for all of the usages of each capability, so as to give an overview of capability usage, but in practice those comments have been updated only sporadically.) But the developer does know one thing: their feature will likely be administered by system administrators, and, helpfully, there is a capability called CAP_SYS_ADMIN. So, lacking sufficient information for a decision, the developer chooses CAP_SYS_ADMIN for their new feature.

Which brings us to where we are today: of the 1167 uses of capabilities in C files in the Linux 3.2 source code, 451 of those uses are CAP_SYS_ADMIN. That's rather more than a third of all capability checks. We might wonder if CAP_SYS_ADMIN is overrepresented because of duplications of similar operations in the kernel arch/ trees, or because CAP_SYS_ADMIN is commonly assigned as the capability governing administrative functions on device drivers. However, even after eliminating drivers/ and architectures other than x86, CAP_SYS_ADMIN still accounts for 167—about 30%—of the 552 uses of capabilities. (Fuller details about usage of capabilities in current and earlier kernels can be found here.)
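The article does not spell out how the uses were counted, so the following is only a rough sketch of one possible approach: treat every CAP_* identifier found in C source text as a "use" and tally them. That assumption will overcount (it also matches comments and helper macros), and the function names and the demo snippets are made up for illustration. The last line simply recomputes the shares from the figures quoted above.

```python
import re
from collections import Counter

# Assumption: a "use" is any CAP_* identifier appearing in C source text.
CAP_RE = re.compile(r"\bCAP_[A-Z][A-Z_]*\b")


def count_cap_uses(sources):
    """Tally CAP_* identifiers across an iterable of C source strings."""
    counts = Counter()
    for text in sources:
        counts.update(CAP_RE.findall(text))
    return counts


def share(part, total):
    """Percentage of total represented by part, rounded to one decimal."""
    return round(100.0 * part / total, 1)


# Made-up snippets standing in for kernel source files.
demo = [
    "if (!capable(CAP_SYS_ADMIN)) return -EPERM;",
    "if (!capable(CAP_NET_ADMIN)) return -EPERM;\n"
    "if (!capable(CAP_SYS_ADMIN)) return -EACCES;",
]
print(count_cap_uses(demo).most_common())

# Shares implied by the figures quoted in the article for Linux 3.2.
print(share(451, 1167), share(167, 552))
```

In a real run, the strings would come from reading each .c file under the kernel tree, with drivers/ and the non-x86 arch/ directories filtered out for the second figure.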

So, on the one hand, the powers granted by CAP_SYS_ADMIN are so numerous and wide ranging that, armed with that capability, there are several avenues of attack by which a rogue process could gain all of the other capabilities. (As has been summarized by Brad Spengler, the ability to be leveraged for full root privileges is a weakness of many existing capabilities; CAP_SYS_ADMIN is just the most egregious example.) On the other hand, so many privileged operations require CAP_SYS_ADMIN that it is the capability most likely to be assigned to a privileged program.

To summarize: CAP_SYS_ADMIN has become the new root. If the goal of capabilities is to limit the power of privileged programs to be less than root, then once we give a program CAP_SYS_ADMIN the game is more or less over. That is the manifest problem revealed from the above analysis. However, if we look further, there is evidence of an additional problem, one that lies in the Linux development model.

As noted above, if we eliminate drivers/ and architectures other than x86, CAP_SYS_ADMIN accounts for 30% of the uses of capabilities. However, when capabilities were first introduced in Linux 2.2, the corresponding figures were 23 of 147 uses (16%). This supports a hypothesis that when random kernel developers are faced with the question "What capability should I use to govern access to the privileged feature that I'm adding to the kernel?", the answer often goes "I'm not sure… maybe CAP_SYS_ADMIN?". In other words, the Linux kernel development model (where, for example, there is no overall coordination of the use of capabilities) appears not to scale well when multiple developers face questions of this sort. (In retrospect, it also seems clear that the choice of the name CAP_SYS_ADMIN was rather unfortunate. The name conveys no real information about what operations the capability should govern, and it's an easy choice that looks safe to kernel developers who are uncertain of what capability to use.)

What could be done to improve matters? There's no quick and easy way out of the existing situation, but there are some steps that could be taken:

Avoid new kinds of uses of CAP_SYS_ADMIN. (As this article was being written, Linux 3.3-rc is adding 13 new uses of capabilities. Most of them are CAP_SYS_ADMIN, and at least some of them may be new kinds of uses of that capability. One such use has been averted, however.)

Rename CAP_SYS_ADMIN to CAP_AS_GOOD_AS_ROOT. Well, maybe not. But such a change would help get the point across to kernel developers looking to choose a capability for their new feature.

Publish better guidelines on the use of capabilities. Past attempts to do this (the capabilities(7) man page and comments in include/linux/capability.h) have only had limited success (the guidelines are incomplete, and haven't done much to alleviate the problem). However, some more explicit guidelines, coupled with some measurements of the kernel source (see next point), might achieve better results.

Regularly publish statistics on the use of capabilities in the kernel source and monitor new uses of capabilities in each kernel release (e.g., employ some scripting to look at capability-related changes in the diff for the current -rc release).

Existing uses of CAP_SYS_ADMIN could be divided out into other existing capabilities, and possibly some new capabilities. Those capabilities could then be assigned to privileged programs instead of CAP_SYS_ADMIN. (For application backward-compatibility, the kernel capability checks wouldn't remove CAP_SYS_ADMIN, but rather would check for CAP_SYS_ADMIN or its replacement. This would allow old binaries that have the CAP_SYS_ADMIN capability to continue to work, while new binaries would be assigned the replacement capability.) One or two steps in this direction have already been made, for example, with the addition of the CAP_SYSLOG capability in Linux 2.6.37. An obvious first point of focus would be non-generic uses of CAP_SYS_ADMIN in areas other than drivers and the file-system trees. Next points of focus could be generic uses of CAP_SYS_ADMIN in the drivers/ and fs/ trees.

Do a similar analysis of other heavily used capabilities, especially CAP_NET_ADMIN, to see whether splitting would be useful for those capabilities. (CAP_NET_ADMIN has 395 uses in Linux 3.2. However, all of those uses are restricted to code in the drivers/net/ and net/ subdirectories. If we remove CAP_NET_ADMIN from the discussion, then there are more uses of CAP_SYS_ADMIN in the kernel source than all of the remaining capabilities combined.)
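Two of the steps above lend themselves to small sketches. The first function scans a unified diff for added lines that mention a CAP_* identifier, roughly the kind of scripting suggested for monitoring each -rc release. The second models the backward-compatible check, in which an operation is permitted if the process holds either the narrower replacement capability (CAP_SYSLOG in the example already mentioned) or CAP_SYS_ADMIN. Both are illustrations under stated assumptions: the function names, the sample diff, and the file path in it are invented, and the real check would live in kernel C code rather than in Python.

```python
import re

CAP_RE = re.compile(r"\bCAP_[A-Z][A-Z_]*\b")


def new_cap_uses(diff_text):
    """Return (file, capability) pairs for lines that a unified diff adds."""
    found = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the post-change file, e.g. "+++ b/fs/foo.c".
            current_file = line[4:].split("\t")[0]
        elif line.startswith("+"):
            for cap in CAP_RE.findall(line):
                found.append((current_file, cap))
    return found


def may_perform(effective_caps, replacement):
    """Model of the compatibility rule: replacement capability OR CAP_SYS_ADMIN."""
    return replacement in effective_caps or "CAP_SYS_ADMIN" in effective_caps


# Invented diff fragment; the path and function are hypothetical.
sample_diff = """\
--- a/drivers/example/foo.c
+++ b/drivers/example/foo.c
@@ -10,3 +10,4 @@
 static int foo_ioctl(void)
 {
+\tif (!capable(CAP_SYS_ADMIN))
+\t\treturn -EPERM;
-\treturn 0;
"""

print(new_cap_uses(sample_diff))
print(may_perform({"CAP_SYSLOG"}, "CAP_SYSLOG"))     # new binary
print(may_perform({"CAP_SYS_ADMIN"}, "CAP_SYSLOG"))  # old binary keeps working
print(may_perform({"CAP_KILL"}, "CAP_SYSLOG"))       # unrelated capability
```

A report built from new_cap_uses() for each -rc diff would give reviewers a short list of new capability checks to examine, rather than requiring a search of the whole tree.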

As well as the above, of course the problem outlined by Brad Spengler that many capabilities can be leveraged to gain full root access remains to be addressed. (Ongoing work on namespaces will help improve this situation for some capabilities when used in conjunction with containers.)

In summary, capabilities go some way toward improving application security, but there's still further work needed before they can deliver on their early promise of being a mechanism for providing discrete, non-elevatable privileges to applications. Furthermore, as the example of the ever-widening scope of CAP_SYS_ADMIN shows, some questions requiring coordinated answers are currently not well addressed by the distributed Linux development model.

[Acknowledgment: Thanks to Serge Hallyn for comments on an early draft of this article.]

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 17:29 UTC (Wed) by dpquigl (guest, #52852) [Link] (4 responses)

It's unclear to me why Serge isn't listed as the maintainer for the capabilities subsystem. He is the main proponent of Linux capabilities (at least the last time I paid attention to them). I would say anything having to do with capabilities should be run past Serge first. We don't let people introduce new security features without them being put through the security subsystem maintainer so I don't know why we let uses of the capability system slide.

CAP_SYS_ADMIN: the new root

Posted Mar 19, 2012 2:20 UTC (Mon) by jamesmorris (subscriber, #82698) [Link] (3 responses)

What do you mean? He is the maintainer :-)

CAP_SYS_ADMIN: the new root

Posted Mar 19, 2012 19:04 UTC (Mon) by dpquigl (guest, #52852) [Link] (2 responses)

I looked in the MAINTAINERS file and didn't see his name anywhere. If it were there, people might be more inclined to make sure they CC him on any capabilities-related code.

CAP_SYS_ADMIN: the new root

Posted Mar 19, 2012 19:08 UTC (Mon) by corbet (editor, #1) [Link] (1 responses)

You were probably looking at an inexcusably ancient kernel, like 3.3-rc7 or something. The MAINTAINERS file addition went in just before the 3.3 release.

CAP_SYS_ADMIN: the new root

Posted Mar 19, 2012 20:09 UTC (Mon) by dpquigl (guest, #52852) [Link]

It was even more inexcusable than 3.3-rc7: I hadn't done a pull since February 28th. The shame. I see that he is listed now.

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 17:56 UTC (Wed) by JoeBuck (subscriber, #2330) [Link] (1 responses)

Suggestion: determine which capabilities are "as good as root" and either eliminate those capabilities or eliminate the ability for a program with that capability to obtain all the others. Programs that previously relied on eliminated capabilities would then have to run as root.

To do otherwise gives a false sense of security and just adds complexity.

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 18:25 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

It's tricky. If you look at Brad's sample escalations, you can see there are lots of assumptions involved. Some seem fairly clear (bind mounting a filesystem you control over the root is probably going to get attackers what they want or near enough) while others demand expert knowledge of the system being compromised (is there a root process treating information received over IPC as trusted? No? Too bad then, CAP_IPC_OWNER doesn't buy attackers root equivalence by that route).

The kernel is not in charge of local system policy. If the underlying block device on which your root filesystem is written is read-only then a kernel privilege to write to the device is no use to attackers, for example. But if they also have a device driver privilege that lets them flip the read-write switch on the hardware then suddenly they're in business...

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 18:15 UTC (Wed) by tialaramex (subscriber, #21167) [Link] (4 responses)

Splitting privileges that are each equal to root into their own capability doesn't seem to achieve much, at least from a security point of view. Forty capabilities that are root equivalent isn't better than ten, or indeed one.

So it seems the main target should be those privileges or groups of privileges controlled by CAP_SYS_ADMIN which, after thorough examination, are useful separately without being root-equivalent in common systems. These could be given a new capability bit or added to an appropriate existing one.

Doing that "thorough examination" first is necessary I think, particularly for capabilities that already exist. Mistakenly adding some root-equivalent privilege to a capability because it "looked appropriate" superficially would be almost as bad as accidentally removing the capability checks from something vital. Having a new privilege temporarily in the CAP_SYS_ADMIN catch-all is much less awful.

Of course as with any bug, exactly how much a capability buys you will vary from one system to another. Snooping old-fashioned telnet was usually a goldmine. Snooping SSH is much less so (but far from completely useless). On some systems reading /etc/shadow is a big coup, on others not so much (e.g. there may be nothing in there but a (hash of the) local root password which can only be used on a physical console...). For this reason I don't much like Brad's classification of some escalations as "generic" but the idea of figuring out what attackers _might_ do with a privilege is definitely something to be left to white or grey hats and not the people doing routine Linux kernel development.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 2:54 UTC (Thu) by lutchann (subscriber, #8872) [Link]

> Mistakenly adding some root-equivalent privilege to a capability because it "looked appropriate" superficially would be almost as bad as accidentally removing the capability checks from something vital.

Yes, exactly. I don't want to have to grep every new kernel for CAP_.* to see if my containers are suddenly going to gain privileges that I didn't want them to have. I'm much happier with everything new going under CAP_SYS_ADMIN, which is already widely known to be a root equivalent.

CAP_SYS_ADMIN: the new BKL (;-))

Posted Mar 15, 2012 15:21 UTC (Thu) by davecb (subscriber, #1574) [Link]

I suspect we need to treat CAP_SYS_ADMIN the same way as we did the big kernel lock: carefully break it up into the small locks that we actually needed, and create a mechanism for allowing drivers to be kept in sync with specific locks they needed.

This is harder when one is breaking up a "lock" that is accessible from user-land, but once we've paid the price of no longer having the equivalent of the BKL, it gets *way* easier.

If asked, I can write a rant on how we migrated an equivalent problem out of existence for the GCOS C compiler (;-))

--dave

CAP_SYS_ADMIN: the new root

Posted Mar 17, 2012 17:55 UTC (Sat) by giraffedata (guest, #1954) [Link]

> Splitting privileges that are each equal to root into their own capability doesn't seem to achieve much, at least from a security point of view.

I agree, but the non-security point of view is also important, which is why I like the present situation.

I use capabilities mainly to prevent a process from accidentally exercising privilege I never meant it to have. For example, it's extremely useful to have a process forbidden to update a file owned by someone else even if the process has the ability to change its UID to the owner's.

CAP_SYS_ADMIN: the new root

Posted Mar 22, 2012 9:11 UTC (Thu) by kevinm (guest, #69913) [Link]

Absolutely agreed.

There will always be many operations which fundamentally are equivalent to root, because they can be used to subvert the kernel itself. Splitting these dangerous operations up into many different capabilities is counter-productive - they should all be under one "root-equivalent" capability. It doesn't much matter whether you call that capability CAP_SYS_ADMIN, CAP_SYS_RAWIO or CAP_AS_GOOD_AS_ROOT.

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 19:11 UTC (Wed) by mjthayer (guest, #39183) [Link] (21 responses)

Isn't PolicyKit a handier alternative to capabilities? You can define a capability pretty precisely (and like anything else security-like it is as good or as bad as its auditing from a compromise point of view) and have decent control over who can use it and who not. And as a bonus - as far as I am aware at least - it only uses standard POSIX mechanisms. (And yes, I am afraid I am one of those rare people who rather likes DBus.)

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 19:38 UTC (Wed) by dpquigl (guest, #52852) [Link] (20 responses)

The problem is that policykit is a security mechanism implemented in userspace. In addition to that you need your application to be policykit aware. It does nothing to actually provide the capability to stop your program from doing something bad. You still need a kernel level mechanism to enforce actual access control whether that be capabilities, SELinux, or GRSecurity RBAC.

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 20:01 UTC (Wed) by mjthayer (guest, #39183) [Link] (19 responses)

> You still need a kernel level mechanism to enforce actual access control whether that be capabilities, SELinux, or GRSecurity RBAC.

Perhaps I am seeing something wrong here. My thinking is that the enforcing is done by auditing the code to make sure it won't do anything you don't want it to. And the system administrator only installs policy modules which are known to be properly audited. Surely capabilities, SELinux, or GRSecurity RBAC are also only as good as the auditing which has been done on them, and the rights they provide are in fact analogous to PolicyKit modules but lower down the stack?

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 20:53 UTC (Wed) by dpquigl (guest, #52852) [Link] (18 responses)

Maybe I just don't understand where the actual enforcement of policykit policy is. What makes the yes/no decision? Is it just this library that's linked into your application? Is there a policykitd somewhere that makes these decisions (it seems there is)? What stops the program from just sending a message to the service it wants without having to deal with policykit? Does this require you to put all sorts of policykit calls into both the client and the privileged service? That's a lot of work to get protection on objects that policykit doesn't even own. In the end the kernel still needs to provide actual protection over kernel objects. It doesn't matter if policykit says no, you can't do this, if I can, through an exploit in your program, run my code to do it anyway.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 6:17 UTC (Thu) by cmccabe (guest, #60281) [Link] (17 responses)

From freedesktop.org:

> PolicyKit is an application-level toolkit for defining and handling
> the policy that allows unprivileged processes to speak to privileged
> processes: It is a framework for centralizing the decision making process
> with respect to granting access to privileged operations for unprivileged
> applications. PolicyKit is specifically targeting applications in rich
> desktop environments on multi-user UNIX-like operating systems. It
> does not imply or rely on any exotic kernel features.

The basic idea, if I understand it correctly, is to reduce the number of processes that need to run with root permissions by switching to a message passing model where a small set of privileged daemons do things on behalf of other processes. Android did something similar with their security model.

In fact, I have to ask who is actually using Linux's fine-grained capabilities that were discussed in this article? Nearly every programmer knows what root is, but mention CAP_SYS_ADMIN and you are likely to get a blank stare. Is all this complexity really necessary for something that people are not going to actually use?

If you're a userspace programmer writing a daemon that needs root permissions, you would be better off spending your time rewriting the code to use privilege separation, which works on any OS. OpenSSH did this. Or you could just audit the code, or invest in writing an SELinux policy. Why on earth would you waste your time with capabilities, which don't seem to be as stable as some of the other ABIs, and are mostly root-equivalent anyway?

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 9:03 UTC (Thu) by mjthayer (guest, #39183) [Link] (15 responses)

cmccabe wrote:
> The basic idea, if I understand it correctly, is to reduce the number of processes that need to run with root permissions by switching to a message passing model where a small set of privileged daemons do things on behalf of other processes. Android did something similar with their security model.

In fact they are not daemons but executables which are started on demand by DBus (the only running daemon required by the framework) with root privilege to perform an action. PolicyKit itself is a framework in the form of library APIs which the privileged modules can use to check whether the user who triggered them has the privileges to perform the action.

As far as I recall this is a pretty simple process - DBus starts the executables and passes them a cookie of some sort, and the new process makes a single API call, passing in the cookie, which reads in the PolicyKit configuration files from /etc and wherever else and returns a boolean "allowed" or "not".

PolicyKit configuration is a set of rules like "user michael is allowed to execute module 'setdate'", "group network is allowed to execute module 'setipaddress'", "local (non-ssh) users are allowed to execute module 'shutdown'". (I wonder how reliable the check for local users is!)

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 11:27 UTC (Thu) by dpquigl (guest, #52852) [Link] (14 responses)

So I read the policykit documentation last night, and mechanisms can be either daemons or normal executables. DBUS-activatable daemons are still daemons. This page explains a typical policykit interaction complete with client, mechanism, and authorization agent[1]. There are more than just allow and not allowed responses from policykit. There is also an authorize return which kicks things off to an authorization agent which handles the 8 or so different authorization modes that can be specified by policykit. Going back to your earlier statement yesterday it seems your argument boils down to we should write code properly and rely on userspace to police itself. It also ignored the fact that the set of actions that policykit is policing and the objects it is protecting are completely disjoint from the objects that the kernel protects. policykit isn't going to provide you any sort of access control to the things under the hood. The way I read it, policykit is more of an authentication mechanism than an access control mechanism.

As a side note, it's unfortunate that people aren't using the SELinux extensions to DBUS which would make policykit more effective. From what I can tell any "client" can send any message that a user is allowed to a policykit enabled "mechanism". The information transmitted with a DBUS message for policykit is uid and pid (and potentially an SELinux context). So in theory I could exploit an application and have the user think he's typing in his password to say change the system time but instead send a DBUS message to perform some other action. James Carter a long time ago extended DBUS to be a userspace object manager for SELinux, and it's a shame we don't see distros using that to ensure that certain clients are only allowed to talk to certain mechanisms. If it is restricted in some other way I'd like to be proven wrong, but the documentation doesn't seem to mention anything about that.

[1] http://www.manpagez.com/html/PolicyKit/PolicyKit-0.9/mode...

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 11:54 UTC (Thu) by mjthayer (guest, #39183) [Link] (13 responses)

dpquigl wrote:
> Going back to your earlier statement yesterday it seems your argument boils down to we should write code properly and rely on userspace to police itself.

I think I could go along with that summary. Badly written code can be exploited whether it is in the kernel (including SELinux, as Brad Spengler has shown) or in user space. And the system administrator is responsible for installing both kernel and user space, and (it seems to me) should worry more about whether the mechanisms work and are well audited than exactly at what level they are implemented. (Speaking as someone who often writes non-security-related code which can live on either or both sides of the kernel/user space dividing line.)

> It also ignored the fact that the set of actions that policykit is policing and the objects it is protecting are completely disjoint from the objects that the kernel protects. policykit isn't going to provide you any sort of access control to the things under the hood.

Could you please give an example of what you mean there?

> From what I can tell any "client" can send any message that a user is allowed to a policykit enabled "mechanism".

I thought there was a school of thought that claimed that associating privileged actions with users, not with applications, was a sound thing to do, but I am not knowledgeable enough about the subject to have an opinion about whether you or they are right.

> So in theory I could exploit an application and have the user think he's typing in his password to say change the system time but instead send a DBUS message to perform some other action.

Surely anti-spoofing password entry mechanisms are an orthogonal problem? To give just one example, when requesting the password the privileged module could require some unspoofable sequence (like Ctrl-Alt-Del in Windows) which brings up the password box, complete with a description of the action.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 12:12 UTC (Thu) by dpquigl (guest, #52852) [Link] (12 responses)

What Brad has proven is that you can't reliably protect a level of the system at that level. If you have access to the kernel address space, anything in the kernel is fair game, and you need protection at a lower level (hardware, for example) to help address that. With that same reasoning you can't protect an application layer framework solely at that layer. If you think that the way to secure userspace is by properly auditing all userspace code to make sure it's all ok then we'll have to disagree.

Here is an example for the second part of your response. The only thing policy kit will get you is a yes, no, or authorize answer. It will not protect you if there is a fault in the "mechanism" that is using policy kit. If you find a way to send a bad DBUS message which allows you to hop the policy kit check through some exploitable code then you can still run. As a matter of fact, a while back there was a race condition in one of the policykit authentication agents which allowed you to exhaust the pid space and run any policykit action as root. This is just skipping the authentication. Similarly, if I can get some sort of arbitrary code execution within this privileged mechanism, I now have the ability to do whatever I want. Without one of the security mechanisms I mentioned above, you're relying on the correctness of the userspace code to enforce access control on other things in the system. That's just not acceptable.

To address your third question: restricting access control based on users has long been debunked as an acceptable mechanism. Whether it be SELinux, SMACK, Tomoyo, GRSecurity RBAC, or any number of other mechanisms, people have said time and time again that binding permissions to executing code is much more effective than binding them to a particular user. For example, if you bind permissions to the user, anything that user can do can be done by any application the user runs. So an exploit in the web browser can read out the user's SSH keys if it wants.

To respond to the last part of your response: it's not a question of anti-spoofing. Trusted path is an important area where Linux is lacking, but the underlying problem is still that actions are not bound to specific applications. If you bind an action to a user any policykit enabled program can authenticate and send a related message to a mechanism regardless on whether or not it is supposed to send that message type. I would really like to be proven wrong here, because if I am not wrong this is a large hole which, from what I can tell, is only mitigated by putting in an SELinux dbus policy.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 14:09 UTC (Thu) by mjthayer (guest, #39183) [Link] (11 responses)

dpquigl wrote:
> With that same reasoning you can't protect an application layer framework solely at that layer. If you think that the way to secure userspace is by properly auditing all userspace code to make sure it's all ok then we'll have to disagree.

Perhaps you did misunderstand me then. I certainly wasn't talking about auditing all of user space. The whole point of PolicyKit is to restrict the trusted code base which needs to be audited to a minimum, particularly by sharing a lot of code which is typically duplicated in e.g. setuid applications. And it uses kernel protection mechanisms to achieve that - like user separation, where the user that the module runs as is privileged and the other is not. However these are standard POSIX mechanisms, not "homebrew" Linux ones.

> It will not protect you if there is a fault in the "mechanism" that is using policy kit. If you find a way to send a bad DBUS message which allows you to hop the policy kit check through some exploitable code then you can still run.

Perhaps it is simpler to talk about a malicious user of the PolicyKit mechanism? But how does this differ from a malicious user space binary gaining privileges by exploiting a hole in SELinux code in the kernel?

> If you bind an action to a user any policykit enabled program can authenticate and send a related message to a mechanism regardless on whether or not it is supposed to send that message type.

Indeed - my thought regarding spoofing and the trusted path was that this is somewhat mitigated if the policy module can unspoofably communicate back to the user what it is proposing to do at the same time as it asks for the password, so that if a malicious application running as the user has requested "overwrite /etc/passwd" the user will be reliably told that by entering their password they will cause /etc/passwd to be overwritten. I don't think that PolicyKit can currently do this in an unspoofable way, though I think it is designed in such a way that it can be added.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 14:25 UTC (Thu) by dpquigl (guest, #52852) [Link] (10 responses)

I didn't misunderstand you, but you don't seem to understand the size of the TCB on a modern Linux system. It's massive, and relying on proper auditing of the code is unreasonable. Some even argue that the kernel itself shouldn't be contained in the TCB.

If a malicious userspace binary can exploit a kernel vulnerability it doesn't matter what you do because it's game over. You still seem to be missing the idea that protections at a given level can't protect that level reliably. SELinux doesn't claim to protect against kernel vulnerabilities. It claims to contain the accesses made by userspace programs and at best mitigates damage caused by an exploited application by confining the actions the application may take to only what it requires to run (assuming your policy is configured correctly). I'm talking about an exploit in a userspace framework allowing for attacks on other userspace applications. This is entirely reasonable considering it's how attacks work today on systems that use simple DAC protections. Own a process running as root and do whatever you want, including poking into the address space of other processes. You're trying to argue here that DAC protections are sufficient. This has been shown time and time again to be false. You might want to read up on the MAC vs DAC discussion to see exactly why they are insufficient. POSIX does not provide sufficient access control protections for any modern system.

Also, with the exception of the mmap_minaddr bug which Brad found (and which was subsequently fixed), SELinux does not grant permissions over your existing permissions. The LSM framework is designed to provide further restrictions, not to act as a privilege granting mechanism. So unless you've found an exploit in SELinux code which allows for arbitrary code execution or memory manipulation in the kernel, I'm not sure what kind of buggy SELinux code you'd be referring to.

With respect to trusted path again there is currently no way to do this and relying on userspace to provide a mechanism for trusted path won't work. The fact that any number of components can be overwritten to trick you into typing a password in for an action that isn't the one you think you're authorizing makes that not possible today.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 14:54 UTC (Thu) by mjthayer (guest, #39183) [Link] (8 responses)

Replying to dpquigl:

Just to start with, thanks for your patience here, given that I am no security expert. I won't be offended when you decide to give up, but I am sure that I will learn something in-between. That said...!

> You still seem to be missing the idea that protections at a given level can't protect that level reliably.

I am probably misunderstanding you here somewhere, but I get the feeling that you lump all of user space as one "level". Surely the whole point here is that we have (at least) two levels, a small privileged subset of user space binaries which is the set of policy modules which DBus is configured to start and the set of binaries which a given user is allowed to execute, with DBus and PolicyKit the bridge and the communication mechanism between the two. I suppose I am slightly tainted here by experience of QNX where a lot of what is done in the kernel in e.g. Linux takes place in user space. (And of virtualisation development for that matter.)

> You might want to read up on the MAC vs DAC discussion to see exactly why they are insufficient.

I must admit that my grasp of MAC and DAC is very limited. As far as I can see, DAC is roughly allocating permission to access resources on a per-user basis, whereas MAC is more fine-grained permission to carry out particular actions. But that is also exactly what PolicyKit manages.

> With respect to trusted path again there is currently no way to do this and relying on userspace to provide a mechanism for trusted path won't work. The fact that any number of components can be overwritten to trick you into typing a password in for an action that isn't the one you think you're authorizing makes that not possible today.

The last I heard, the idea for doing that based on today's Linux/X11 systems was to have a second X server which only PolicyKit (that is, the policy modules) has access to, and to put up the password prompt there along with a clear message about what action was about to be taken. I'm not sure what the plan was for proving to the user that this was indeed the "privileged" X server (Ctrl-Alt-Fx could verify that, but of course no one will do that every time).

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 15:35 UTC (Thu) by mjthayer (guest, #39183) [Link] (7 responses)

Replying to myself:
> I must admit that my grasp of MAC and DAC is very limited. As far as I can see, DAC is roughly allocating permission to access resources on a per-user basis, whereas MAC is more fine-grained permission to carry out particular actions. But that is also exactly what PolicyKit manages.

Taking a look at the CentOS documentation[1] to get an idea of what can be done with SELinux which can't be easily done in other ways, I see examples of things like forbidding a user from making their .ssh keys world-readable. I presume that in practice one would also restrict the set of applications able to read them even as that user. To achieve the same using PolicyKit one would have to have the keys stored in a file to which the user has no access at all and provide a policy module to access the keys. Clearly the SELinux approach has the advantage of being easier to retro-fit. On the other hand, SELinux has something of the feel of a retro-fitted solution.

Basically though if I get it right MAC vs DAC means separating rights to access a file from rights to control its access rights.

[1] http://wiki.centos.org/HowTos/SELinux#head-01f53a6fa1f203...

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 16:00 UTC (Thu) by dpquigl (guest, #52852) [Link] (6 responses)

Central administration of security policy is just one property of MAC. The other more important one in SELinux is binding permissions to code and not user identity.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 16:12 UTC (Thu) by mjthayer (guest, #39183) [Link] (5 responses)

dpquigl wrote:
> Central administration of security policy is just one property of MAC. The other more important one in SELinux is binding permissions to code and not user identity.

So you are saying that the key feature of MAC in SELinux which PolicyKit is lacking is that it allows you to say "this action can only be performed by this user or set of users in combination with this binary or set of binaries", rather than just the first part of that? I realise of course that you will wince at the way I formulated that.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 17:19 UTC (Thu) by dpquigl (guest, #52852) [Link] (4 responses)

It's a different level of abstraction. PolicyKit makes high-level abstractions of what a program does, like "do this privileged operation". SELinux and MAC policies say this program performs these actions on these specific objects in the system. These objects can be files or sockets or whatever you like. It's a different concept because PolicyKit really doesn't map policy to system objects. It just maps it to high-level concepts exposed by the mechanism. A PolicyKit rule could be that to read my address book provided by some other DBus service I need to authenticate myself again. This has nothing to do with how the address book is stored on disk or any of the other resources the address book service needs to function. They are disjoint sets of permissions. Now, DBus has SELinux extensions in it, where you can say that a process running with a certain SELinux label can contact a "mechanism", aka another service running with a different SELinux label. That's baked into DBus; however, no one uses it (to the best of my knowledge). It would strengthen the use of PolicyKit because you couldn't have arbitrary applications contact arbitrary mechanisms and request authorization. I guess the point that I haven't made very well in all of this is that PolicyKit isn't an access control mechanism. It more resembles an authentication mechanism, and access control is still left to the underlying security mechanisms which protect the individual PolicyKit "mechanisms". I really don't like that PolicyKit called their service providers mechanisms; it makes the terminology confusing. Client, service provider, authentication agent make much more sense.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 17:25 UTC (Thu) by dpquigl (guest, #52852) [Link] (1 responses)

It would be nice if LWN had an edit button that tracked the edit history of the comment. That way you can fix stupid grammatical mistakes and because you kept the history you can make sure people don't white wash their comments.

CAP_SYS_ADMIN: the new root

Posted Mar 16, 2012 0:17 UTC (Fri) by filteredperception (guest, #5692) [Link]

"It would be nice if LWN had an edit button that tracked the edit history of the comment. That way you can fix stupid grammatical mistakes and because you kept the history you can make sure people don't white wash their comments."

+1. Yeah yeah yeah I should proofread more before hitting submit, but still... (Not saying that, on the tight budget LWN probably has, they should dedicate a lot of resources. Just saying, if somebody has that itch, +1 more person would be gratified)

CAP_SYS_ADMIN: the new root

Posted Mar 16, 2012 5:18 UTC (Fri) by mjthayer (guest, #39183) [Link] (1 responses)

This has strayed pretty far from the original question of what practical things one can accomplish with *capabilities* that PolicyKit can't do, but since we are here... I found the first example (the sshd one) in this posting[1] interesting and educational as to what SELinux MAC is useful for. Obviously one couldn't use PolicyKit to accomplish anything like this, as it is more for controlling privilege escalation than preventing it. And unlike the other person in the discussion, I don't think that privsep is the answer here, as that won't prevent the unprivileged person logged in through ssh from escalating their privileges afterwards through some buggy setuid binary. (Note that SELinux would not protect from a buggy PolicyKit module in the example either, as that would potentially allow the ssh user to trigger an escalation in a different process, though it would be harder to exploit than if it were in the same one.)

[1] http://lwn.net/Articles/103705/

CAP_SYS_ADMIN: the new root

Posted Mar 16, 2012 9:11 UTC (Fri) by mjthayer (guest, #39183) [Link]

To continue off-track and start on the old song - I think that one of the things that irks me most about SELinux is that I have so much trouble nailing down what it is and does, which I think is down to the fact that it doesn't try to solve a precise problem but more to be a general solution to all security issues. As an example, it covers both forbidding people to change the permissions on sensitive files they own and forbidding binaries from modifying their own executable code without express permission (actually, to add to the confusion, I think there is an official workaround for that involving having two mappings for the memory, one writeable and one executable). Both laudable goals, but perhaps they should be a bit more clearly separated.

DAC vs MAC, and Posix Capabilities

Posted Mar 24, 2012 5:32 UTC (Sat) by gmatht (subscriber, #58961) [Link]

The fundamental difference between MAC and DAC is that in MAC the administrator/system decides whether a right is shared, while in DAC each object that holds a right makes the choice as to whether to delegate it. Traditional capability operating systems such as GNOSIS [1] from 1979 are all about being able to delegate only the rights you want to particular applications, and those applications in turn being able to delegate rights to modules. But they aren't MAC as such, though you could implement MAC on top of them.

The broad consensus at cap-talk is that MAC is usually a liability, as what you really need is finely divisible rights and the overhead of having to change a central security policy means that MAC systems rarely have finely divisible rights (they also agree that POSIX "capabilities" give true capability systems a bad name). Compare the "capabilities" in POSIX to a traditional DAC capability system such as GNOSIS/KeyKOS. Most capabilities in KeyKOS aren't equivalent to root, or even the user "nobody". Indeed in KeyKOS a process running with the rights of "nobody" would be considered highly privileged, as it has a huge number of rights. For example, it has direct access to the filesystem, which is a highly complex, sensitive and hence exploitable piece of code.

The real argument for MAC is that it stops users delegating rights; the first example given was stopping users sharing their maildir. In general malicious users and objects can bypass this by proxying the right, over a side channel if need be. In practice, the benefit of MAC is stopping users from accidentally doing something stupid. But in MAC there is always a trade-off between functionality and security. Maybe the user is going away and wants to allow their friend to access their mail to deal with any important issues that crop up. Pathologically, this may encourage the user to give their friend their password, resulting in even worse security.

In true capability systems, objects can delegate precisely those rights required. So, if an object calls "compress(a,b)", compress gets the right to a and b, but does not even know whether a "c" exists. This doesn't come at the cost of functionality, as a well written program should never access variables that are out-of-scope; indeed such a program shouldn't even compile. This makes security an "Inexpensive lunch" [2] because you get it for free with a well decomposed object oriented design. Likewise the program "gedit" would get the right to modify ~/.bashrc if and only if the user selected .bashrc in the file open dialog box; this can be retrofitted to existing POSIX applications, often without even a recompile [3]. In a traditional MAC system the admin would have to decide which rights gedit should have, and would probably just decide to give it rights to the whole home directory.
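
As a rough illustration of the compress(a,b) point, here is a minimal Python sketch in which the callee is handed two already-open streams rather than path names. The name compress comes from the comment; the use of zlib and in-memory streams is an assumption made purely for the example. Python does not enforce capability discipline (the function could still call open() on anything), so this shows the calling convention only, not a secure implementation.

```python
import io
import zlib


def compress(src, dst):
    """Read everything from src and write the zlib-compressed bytes to dst.

    The function receives open stream objects, not file names, so the only
    data it is handed is what the caller chose to delegate.
    """
    dst.write(zlib.compress(src.read()))


# The caller decides exactly which objects to delegate.
a = io.BytesIO(b"capabilities " * 50)
b = io.BytesIO()
c = io.BytesIO(b"never passed to compress")  # compress() is never given this

compress(a, b)
round_trip = zlib.decompress(b.getvalue())
```

In a real object-capability system the language or kernel would guarantee that compress has no other way to reach c; here that guarantee rests only on convention.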

[1] http://www.cis.upenn.edu/~KeyKOS/Gnosis/Gnosis.html
[2] http://wiki.erights.org/wiki/Walnut/Secure_Distributed_Co...
[3] http://plash.beasts.org/powerbox.html

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 11:38 UTC (Thu) by dpquigl (guest, #52852) [Link]

You are correct. Binder plus the userspace security policy framework in android is very much like policykit/dbus. The major difference is that Android makes use of their own special IPC mechanism (Binder) because of what they saw as deficiencies in DBUS at the time.

I don't know of many people using Linux capabilities currently (although I don't claim to be an expert on it.) Dan Walsh and others at Red Hat are working on using file capabilities to remove the need for suid on binaries in Fedora. If you're going to make a capabilities system you need to make sure you do it right otherwise you miss the benefits of having it in the first place. That's why they really need to be auditing what actions go under what capabilities and breaking them out as necessary. People say the same thing about the complexity of SELinux. I personally find capabilities and all of their semantics far more complicated than any SELinux concepts and I was introduced to both at the same time.

I think the solution here is to go over all the calls to capabilities and make sure that 1) they are the correct capability, and 2) if they are not, that there is the appropriate granularity present for those capabilities. The way that LSM is set up currently is that the capabilities module is the default security model unless something else is specified, and then it is chained together with whatever LSM is loaded. So it is there regardless of what you do, so we should do it right.

I personally would use SELinux instead of capabilities but I do see a benefit to making use of capabilities to remove suid behavior in the system for when someone decides to not use SELinux. In general I'd say SELinux is a superior solution because it actually controls access to specific objects in the system where capabilities give you access to entire classes of objects. For example with SELinux I can limit an application to binding to a specific port. With capabilities, from my understanding, the only thing I can do is say whether or not a program can bind to privileged ports (those below 1024) at all, via CAP_NET_BIND_SERVICE.
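
To make the granularity difference concrete, here is a toy Python model. It is not SELinux policy syntax and not kernel code; the function names, the policy dictionary and the use of "httpd_t" as the domain label are all illustrative assumptions. The capability check answers one question for the whole class of privileged ports, while the per-domain table names individual ports and is applied as an additional restriction on top.

```python
PRIVILEGED_PORT_LIMIT = 1024  # ports below this need CAP_NET_BIND_SERVICE


def capability_allows_bind(caps, port):
    """Coarse check: one capability covers every privileged port."""
    return port >= PRIVILEGED_PORT_LIMIT or "CAP_NET_BIND_SERVICE" in caps


def port_policy_allows_bind(policy, domain, port):
    """Fine-grained check: the domain must be granted this specific port."""
    return port in policy.get(domain, frozenset())


def may_bind(caps, policy, domain, port):
    """Both checks must pass; the policy can only restrict further."""
    return (capability_allows_bind(caps, port)
            and port_policy_allows_bind(policy, domain, port))


# Hypothetical policy: a web server domain limited to ports 80 and 443.
toy_policy = {"httpd_t": frozenset({80, 443})}
caps = {"CAP_NET_BIND_SERVICE"}
```

With these definitions, capability_allows_bind(caps, 25) is true because the capability covers all low ports, whereas may_bind(caps, toy_policy, "httpd_t", 25) is false because port 25 is not in the domain's table.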

CAP_SYS_ADMIN: the new root

Posted Mar 14, 2012 21:22 UTC (Wed) by ballombe (subscriber, #9523) [Link] (1 responses)

Maybe an extra level of indirection would help:
Linux developers would create new virtual capabilities for each new usage,
and the capabilities maintainer would associate them to real capabilities
in separate patches.
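
A minimal sketch of what such an indirection table might look like, written in Python for brevity. The VCAP_* names and the table itself are hypothetical; nothing like this exists in the kernel. The second helper groups virtual capabilities by the real capability they map to, which makes it easy to see how many distinct uses end up sharing one real capability.

```python
from collections import defaultdict

# Hypothetical table: each new use gets its own virtual capability, and a
# maintainer decides separately which real capability backs it.
VIRTUAL_TO_REAL = {
    "VCAP_SETHOSTNAME": "CAP_SYS_ADMIN",
    "VCAP_QUOTA_CONTROL": "CAP_SYS_ADMIN",
    "VCAP_READ_KERNEL_LOG": "CAP_SYSLOG",
}


def real_capability(virtual_cap):
    """Resolve a virtual capability to the real one that is checked."""
    return VIRTUAL_TO_REAL[virtual_cap]


def fan_in():
    """Group virtual capabilities by the real capability behind them."""
    groups = defaultdict(list)
    for virtual_cap, real_cap in sorted(VIRTUAL_TO_REAL.items()):
        groups[real_cap].append(virtual_cap)
    return dict(groups)
```

Remapping a use then means editing one table entry rather than every call site.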

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 20:05 UTC (Thu) by bronson (subscriber, #4806) [Link]

That might help but I'd be afraid that it opens another attack surface. A virtual capability may appear safe, but mapping it to a real capability could cause rather nonobvious holes to appear. Especially if multiple virtual capabilities get mapped into a single real one.

CAP_SYS_ADMIN: the new root

Posted Mar 15, 2012 16:11 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

I absolutely loathe the capabilities. Their current implementation is braindead and their pushers should be put up against the wall and shot.

First, in the good old times I could just look at an executable and see if it's a setuid executable. Which means "it may be dangerous, beware".

Right now we have tons of capabilities with quite a lot of them equivalent to root access, which are hidden away in extended attributes. And people somehow think it's a GOOD thing.
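
For anyone who wants to look inside that extended attribute, here is a small Python sketch that decodes the "security.capability" xattr. The layout used (a little-endian magic/flags word followed by permitted/inheritable pairs of 32-bit words, one pair for revision 1 and two for later revisions) follows the kernel's vfs_cap_data structure; the helper names and the abbreviated name table are assumptions for illustration. The getcap tool from libcap is the usual way to do this.

```python
import os
import struct

VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_1 = 0x01000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001

# Abbreviated table; unknown bits are reported by number.
CAP_NAMES = {5: "CAP_KILL", 10: "CAP_NET_BIND_SERVICE", 21: "CAP_SYS_ADMIN",
             25: "CAP_SYS_TIME", 34: "CAP_SYSLOG"}


def names(mask):
    """Turn a capability bitmask into a list of names."""
    return [CAP_NAMES.get(bit, "bit%d" % bit)
            for bit in range(64) if (mask >> bit) & 1]


def parse_file_caps(raw):
    """Decode the raw bytes of a security.capability xattr."""
    (magic,) = struct.unpack_from("<I", raw, 0)
    words = 1 if (magic & VFS_CAP_REVISION_MASK) == VFS_CAP_REVISION_1 else 2
    permitted = inheritable = 0
    for i in range(words):
        p, inh = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= p << (32 * i)
        inheritable |= inh << (32 * i)
    return {
        "permitted": names(permitted),
        "inheritable": names(inheritable),
        "effective": bool(magic & VFS_CAP_FLAGS_EFFECTIVE),
    }


def read_file_caps(path):
    """Linux only; raises OSError if the file has no capability xattr."""
    return parse_file_caps(os.getxattr(path, "security.capability"))
```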

Then there's a question of braindead el-dumbo capability inheritance. I have not been able after literally hours of trying to grant my Java program access to restricted ports. Should be easy, right? There definitely should be a program which you can run as root, and which will drop excessive capabilities and set uid to another user. Right? Well, think again.
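
The reason this is harder than it looks can be sketched with the exec-time transformation rule as capabilities(7) described it at the time: the new permitted set is (process inheritable AND file inheritable) OR (file permitted AND bounding set), and the new effective set is the new permitted set only if the file's effective bit is set. The Python model below is an assumption-laden simplification: it ignores the special treatment of UID 0 and set-user-ID-root programs, and it predates ambient capabilities, which later kernels (4.3 and up) added and which change the rule. Class and function names are invented for the example.

```python
from dataclasses import dataclass

NBS = "CAP_NET_BIND_SERVICE"


@dataclass(frozen=True)
class ProcCaps:
    permitted: frozenset
    inheritable: frozenset
    effective: frozenset


@dataclass(frozen=True)
class FileCaps:
    permitted: frozenset = frozenset()
    inheritable: frozenset = frozenset()
    effective: bool = False  # the single file "effective" bit


def caps_after_exec(proc, file, bounding):
    """Apply the pre-ambient execve() rule for a non-root process."""
    new_permitted = ((proc.inheritable & file.inheritable)
                     | (file.permitted & bounding))
    new_effective = new_permitted if file.effective else frozenset()
    return ProcCaps(new_permitted, proc.inheritable, new_effective)


bounding = frozenset({NBS, "CAP_SYS_ADMIN"})

# A wrapper that dropped to an ordinary user but kept one capability.
wrapper = ProcCaps(frozenset({NBS}), frozenset({NBS}), frozenset({NBS}))

# Executing a binary with no file capabilities: everything is lost.
plain_binary = FileCaps()
after_plain = caps_after_exec(wrapper, plain_binary, bounding)

# Executing a binary whose file inheritable set includes the capability.
tagged_binary = FileCaps(inheritable=frozenset({NBS}), effective=True)
after_tagged = caps_after_exec(wrapper, tagged_binary, bounding)
```

Under this model the wrapper alone is not enough: the target executable (here, the Java launcher) must also carry matching file capabilities, which is exactly the part that is awkward to arrange.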

CAP_SYS_ADMIN: the new root

Posted Mar 19, 2012 20:34 UTC (Mon) by BenHutchings (subscriber, #37955) [Link] (2 responses)

systemd apparently is that program.

CAP_SYS_ADMIN: the new root

Posted Mar 20, 2012 2:07 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Systemd is indeed quite nice. Alas, it's not supported in Debian Stable. And it probably won't be integrated properly in Wheezy either. So the earliest date I can use it is around 2016. Oh well...

BTW, I see that Wheezy now supports AppArmor ( http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=598408 ).

Some time ago ( http://lwn.net/Articles/459460/ ) I promised to send you a case of beer or a yearly subscription to LWN in that case. So what do you choose? :)

CAP_SYS_ADMIN: the new root

Posted Mar 20, 2012 2:33 UTC (Tue) by foom (subscriber, #14868) [Link]

It looks most likely that it'll be a supported alternative init system in Wheezy, although not the default. Which is already pretty sweet, although being default would of course be better.

CAP_SYS_ADMIN: the new root

Posted Mar 16, 2012 5:40 UTC (Fri) by Arach (guest, #58847) [Link]

There's another problem that should be considered in this context. The kernel code restricted with capabilities might be written with a relaxed sense of security and/or without due audit, because of a false assumption that capable processes are more trusted than unprivileged ones.

CAP_SYS_ADMIN: the new root

Posted Mar 16, 2012 22:51 UTC (Fri) by hallyn (subscriber, #22558) [Link]

Thanks very much, Michael, for writing this article. A few notes below,
some of which we discussed in email:

1. Besides their finer-grained nature, capabilities have another
advantage over setuid root: you lose the cap after exec. So shellcode
for execve(/bin/bash) doesn't grant a root shell by itself.

Put another way, your blog post starts with:

"The idea of capabilities is to break the power of root (user ID 0) into
independently assigned pieces governing specific privileged operations"

But the other fundamental property of capabilities is intended to be
that programs, not people, wield privilege, so that privilege is granted
to a combination of the logged in user and the program being executed.
Note that to fully achieve this programs ought to also lock themselves
into a noroot|nosuid_fixup securelevel.

2. Regarding breaking up some of the coarser capabilities, I had
suggested making what we implicitly did with CAP_SYSLOG explicit, namely
introducing a hierarchy. At the top level, there is CAP_ALL_CAPS. This
is not quite the same as being root since after exec you can lose this
privilege. But of course it's enough power to let you ensure you can
keep your privilege. Then come most of the current ones, CAP_SYS_ADMIN,
CAP_NET_ADMIN, etc. Then come newer fine grained ones, like CAP_SYSLOG
and CAP_IPC_ADMIN.

Perhaps we can introduce through capability.h, through the use of some
annotations, a graph of the hierarchy, and a hierarchical way to refer
to the capabilities in the code. The point here is to let userspace
decide how fine grained to get. It can use CAP_SYS_ADMIN, or, over
time, choose CAP_SYSLOG. I think if we guarantee that a given
capability will never become insufficient privilege for what it could
previously achieve, that helps userspace.

Some privileges need to be reconsidered. For instance, it's been
pointed out that mount is dangerous because you can overmount /.
But really, there are several issues with mount:

1. Unprivileged mounts patches have been out there for years,
and apart from the issue of unpriv users preventing admins
from deleting files using mounts in unreachable namespaces,
it's understood how to allow many mount actions safely.
2. Mounting to targets which you don't own is obviously one
dangerous aspect to mount, as you can overwrite "trusted
paths" like /sbin or /etc.
3. Lack of trust in the in-kernel filesystem (and especially
super block) parsers is another, separate concern. You may
be able to write garbage to a file, mount it loopback, and
cause ext2's read_super to crash the kernel (or worse).

3. Analysis. Thanks for the great ideas. I've somewhat lost track of
kernel dev cycles, but your idea to analyze changes to capabilities at
rcs is a great one, and I should act on it. I also should reproduce and
expand on the analysis of the current capability checks that you've
done.
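
Going back to the hierarchy suggested in point 2, a check against such a hierarchy could be sketched as follows. This is a Python model, not kernel code: CAP_ALL_CAPS and CAP_IPC_ADMIN are the proposed names from the comment, and the parent assignments in the table are assumptions (in particular, which coarse capability CAP_IPC_ADMIN would sit under is a guess). A request for a fine-grained capability succeeds if the process holds it or any ancestor, which gives the guarantee that an existing coarse capability never becomes insufficient for what it could previously do.

```python
# child -> parent; a capability missing from the table is a root.
PARENT = {
    "CAP_SYS_ADMIN": "CAP_ALL_CAPS",
    "CAP_NET_ADMIN": "CAP_ALL_CAPS",
    "CAP_SYSLOG": "CAP_SYS_ADMIN",
    "CAP_IPC_ADMIN": "CAP_SYS_ADMIN",  # assumed parent, for illustration
}


def capable(held, wanted):
    """Return True if `held` contains `wanted` or one of its ancestors."""
    cap = wanted
    while cap is not None:
        if cap in held:
            return True
        cap = PARENT.get(cap)  # walk up; None once past the root
    return False
```

A program granted only CAP_SYSLOG passes capable({"CAP_SYSLOG"}, "CAP_SYSLOG") but not a check for CAP_SYS_ADMIN, while an older program that still holds CAP_SYS_ADMIN keeps passing the CAP_SYSLOG check.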

CAP_SYS_ADMIN: the new root

Posted Mar 17, 2012 18:00 UTC (Sat) by giraffedata (guest, #1954) [Link]

The article seems to imply that many of the things that today require CAP_SYS_ADMIN could instead require some other existing capability. But that's not my impression.

I see CAP_SYS_ADMIN as the miscellaneous category, for things that don't merit their own capability. When I've added privileged operations, I have always scanned all the existing categories and almost never found any more fitting than CAP_SYS_ADMIN.

History Repeats

Posted Mar 22, 2012 2:20 UTC (Thu) by ldo (guest, #40946) [Link]

I saw much the same thing play out in the 1980s with VMS, and its “privileges” system (of which Linux capabilities are a very close copy in principle). Even with a (presumably) centrally-managed design and implementation, you still get overlaps and odd divisions.

And have you figured out what you’re trying to achieve, anyway? Are you trying to guard against accidents, or malice? Guarding against malice means trying to ensure that none of the privileges/capabilities is on its own effectively equivalent to full root access—a task which seems hopeless.

CAP_SYS_ADMIN: the new root

Posted Apr 2, 2012 23:14 UTC (Mon) by nwmcsween (guest, #62367) [Link]

Please stop with POSIX 'capabilities'; adopt something that isn't garbage, maybe Capsicum, or finish seccomp2? Actual capability based security could be so much better...

CAP_SYS_ADMIN: the new root

Posted Aug 2, 2019 16:29 UTC (Fri) by mkerrisk (subscriber, #1978) [Link]

And an update: back in Linux 3.2, CAP_SYS_ADMIN was 38% of uses in the kernel. I checked Linux 5.2 today. Now it's just over 45%.
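
The comment does not say how the figures were obtained. One rough way to get a comparable number is to scan a kernel source tree for capable()-style calls and tally the CAP_* argument, as in the Python sketch below. The regular expression is a heuristic assumed for this example (it will miss checks made through wrappers or macros and may pick up a few false matches), so its output should not be expected to reproduce the percentages above exactly.

```python
import os
import re
from collections import Counter

# Matches capable(CAP_X), ns_capable(ns, CAP_X), capable_foo(..., CAP_X).
CAP_CHECK = re.compile(r"capable(?:_\w+)?\s*\([^;{}]*?\b(CAP_[A-Z_]+)\b")


def count_capability_checks(source_text):
    """Tally CAP_* names used in capable()-style calls in one string."""
    return Counter(CAP_CHECK.findall(source_text))


def count_in_tree(root):
    """Walk a source tree and tally checks across all .c files."""
    totals = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".c"):
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as f:
                    totals += count_capability_checks(f.read())
    return totals


def share(counts, cap):
    """Fraction of all counted checks that use `cap`."""
    total = sum(counts.values())
    return counts[cap] / total if total else 0.0


sample = """
if (!capable(CAP_SYS_ADMIN)) return -EPERM;
if (!ns_capable(net->user_ns, CAP_NET_ADMIN)) return -EPERM;
if (capable(CAP_SYS_ADMIN) || capable(CAP_SYSLOG)) ok();
"""
sample_counts = count_capability_checks(sample)
```

Running share(count_in_tree("/path/to/linux"), "CAP_SYS_ADMIN") against two kernel versions would give a before/after comparison; the path is a placeholder.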