Linux capabilities: permission control

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 10:33

http://www.ibm.com/developerworks/library/l-posixcap.html

Some programs need to perform privileged operations on behalf of an unprivileged user. For instance, the passwd program writes to the very sensitive /etc/passwd and /etc/shadow files. On UNIX® systems, you achieve this control by setting the setuid bit on the binary file. This bit tells the system that while the program is running -- regardless of who executed it -- it should be treated as belonging to the user who owns the file, typically the root user. Because the passwd program cannot be written to by the user, and is very constrained in what it allows the user to do, this setup is usually safe. More complicated programs make use of saved uids to switch back and forth between root and a non-root user.

POSIX capabilities break the root privileges into smaller pieces, and allow tasks to run with only a subset of the root user's privileges. File capabilities allow such privileges to be attached to a program, greatly simplifying the use of capabilities. POSIX capabilities have been available in Linux for years. Using capabilities has several advantages over being the root user:

You can remove the capabilities from the effective set but retain them in the permitted set to prevent inadvertent abuse of the capabilities.
You can remove all unneeded capabilities from the permitted set, such that they cannot be regained. Admittedly, most of the capabilities are dangerous and can be abused, but reducing the capabilities available to an attacker might well protect your system from harm.
After an exec(3) of a regular executable file, all capabilities are lost. (The details are more complicated and are soon expected to change, as will be explained later in this article.)

exec(3)

From the Linux man pages: The exec() family of functions replaces the current process image with a new process image. Find more details in Resources below.

This article shows you how programs can make use of POSIX capabilities, how to investigate which capabilities are needed by a program, and how to assign those capabilities to the program.

Process capabilities

For years, POSIX capabilities could be assigned to processes, but not to files. A program therefore had to be started by root (or be owned by root and have its setuid bit set) before it could drop some of its root privileges while keeping others. Additionally, the order in which capabilities had to be dropped was very specific:

The program would tell the system that it wanted to keep its capabilities while changing its effective userid. That is done using prctl.
The program would change its userid to something other than root.
The program would construct sets of required capabilities and make those its active sets.

A process carries three capability sets: permitted (P), inheritable (I), and effective (E). When a process forks, the child's capability sets are copied from the parent. When a process executes a new program, its new capability sets are calculated according to a formula I will discuss in a moment.

The effective set consists of those capabilities that the process can currently use. The effective set must always be a subset of the permitted set. The process can change the contents of the effective set at any time as long as the effective set does not exceed the permitted set. The inheritable set is used only for calculating the new capability sets after exec().
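The subset rule can be pictured with plain bitmasks. The following is a minimal sketch for illustration only, not kernel or libcap code: the names cap_mask_t, proc_caps, and set_effective() are hypothetical, and the model assumes one bit per capability number.

```c
#include <stdint.h>

/* Hypothetical model: bit N stands for capability number N. */
typedef uint64_t cap_mask_t;

struct proc_caps {
    cap_mask_t permitted;
    cap_mask_t inheritable;
    cap_mask_t effective;
};

/*
 * Try to replace the effective set. Succeeds (returns 0) only if every
 * requested bit is also present in the permitted set. Otherwise returns
 * -1 and leaves the process state unchanged.
 */
int set_effective(struct proc_caps *p, cap_mask_t wanted)
{
    if (wanted & ~p->permitted)
        return -1;          /* asks for a bit that is not permitted */
    p->effective = wanted;
    return 0;
}
```

A process can therefore lower and raise effective bits freely, but only within the bounds of its permitted set.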

Listing 1 shows the three formulas that dictate a process's new capability sets after file execution according to the POSIX draft (see Resources for a link to IEEE Std 1003.1-2001).

Listing 1. Formulas for new capability sets after exec()

pI' = pI
pP' = fP | (fI & pI)
pE' = pP' & fE

A value ending with a ' indicates the newly calculated value. A value beginning with a p indicates a process capability. A value beginning with an f indicates a file capability.

The inheritable set is taken unchanged from the parent process, so once a process drops a capability from its inheritable set, it should never be able to regain it (but read the discussion of SECURE_NOROOT below). The new permitted set is taken as a union of the file's permitted set and the result of intersecting the file's and process's inheritable sets. The process's effective set is the intersection of the new permitted and file effective sets. Technically, in Linux fE is not a set but a boolean. If true, then pE' is set to pP'. If false, then pE' starts empty.
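The formulas in Listing 1 translate almost directly into bit operations. The sketch below is an illustrative model under stated assumptions, not the kernel's implementation: the types cap_mask_t, proc_caps, and file_caps and the function caps_after_exec() are hypothetical names, each capability is one bit, and fE is treated as a boolean as described above.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model: bit N stands for capability number N. */
typedef uint64_t cap_mask_t;

struct proc_caps { cap_mask_t permitted, inheritable, effective; };
struct file_caps { cap_mask_t permitted, inheritable; bool effective; };

/* Apply the Listing 1 formulas to get the process sets after exec(). */
struct proc_caps caps_after_exec(struct proc_caps p, struct file_caps f)
{
    struct proc_caps n;

    /* pI' = pI */
    n.inheritable = p.inheritable;
    /* pP' = fP | (fI & pI) */
    n.permitted = f.permitted | (f.inheritable & p.inheritable);
    /* pE' = pP' if the file effective bit is on, else empty */
    n.effective = f.effective ? n.permitted : 0;
    return n;
}
```

Note that the old permitted and effective sets of the process do not appear on the right-hand side at all. Only the inheritable set carries anything across exec().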

For a process to keep any capabilities after executing a file, the capabilities must be in the file's permitted or inheritable set. Because Linux has not implemented file capabilities for most of its life, this posed an unworkable restriction. To get around it, a "secure mode" was implemented. It consists of two bits:

When SECURE_NOROOT is not set, then when a process executes a file, the new capability sets may be calculated as though some of the file's capability sets were fully populated. In particular:
- The file inheritable and permitted sets will be full on if the process's real or effective uid is 0 (root) or the file is setuid root.
- The file effective set will be full on if the process's effective uid is root or the file is setuid root.
When SECURE_NO_SETUID_FIXUP is not set, then when a process switches its real or effective uids to or from 0, capability sets are further shifted around:
- If a process switches its effective uid from 0 to non-0, then its effective capability set is cleared.
- If a process switches its real, effective, or saved uids from at least one being 0 to all being non-zero, then both the permitted and effective capabilities are cleared.
- If a process sets its effective uid from non-zero to 0, then the effective capabilities are set equal to the permitted capabilities.

This set of rules allows a process to have capabilities either by virtue of being root or by running a setuid root file. However, the SECURE_NO_SETUID_FIXUP rules prevent a process from keeping any capabilities after becoming non-root. But with SECURE_NOROOT unset, a root process having dropped some capabilities can simply execute another program to regain its capabilities. So in order for capabilities to be useful, a root process must be able to irrevocably switch its uid to non-zero while keeping a few capabilities.

Using prctl(2), a process can request keeping its capabilities across its next setuid(2) call. This means that a process can:

Start as root, either by authenticating as root or executing a setuid root binary.
Call prctl(2) to set PR_SET_KEEPCAPS, which asks the system to let it keep its capabilities across setuid(2).
Call setuid(2) or a related system call to change its userid.
Call cap_set_proc(3) to drop capabilities.

Now the process can continue running with a subset of root privileges. If it is compromised, the attacker can use only the capabilities present in the effective set, or, with a call to cap_set_proc(3), in its permitted set. And if the attacker should coerce the program into executing another file, all capabilities will be dropped and the file will be executed as an unprivileged user.
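The interplay between the uid-change rules and PR_SET_KEEPCAPS can be modeled in a few lines. This is a simplified sketch, not kernel code: the struct task and the function change_uids() are hypothetical, the model assumes SECURE_NO_SETUID_FIXUP is not set, and it assumes that the keep-capabilities flag protects only the permitted set while the effective set is still cleared when the effective uid leaves 0.

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t cap_mask_t;    /* hypothetical: one bit per capability */

struct task {
    unsigned ruid, euid, suid;
    cap_mask_t permitted, effective;
    bool keep_caps;             /* models prctl(PR_SET_KEEPCAPS, 1) */
};

/* Change the three uids and apply the capability fixups. */
void change_uids(struct task *t, unsigned ruid, unsigned euid, unsigned suid)
{
    bool had_root = t->ruid == 0 || t->euid == 0 || t->suid == 0;
    bool has_root = ruid == 0 || euid == 0 || suid == 0;
    unsigned old_euid = t->euid;

    t->ruid = ruid;
    t->euid = euid;
    t->suid = suid;

    /* At least one uid was 0 and now none is: drop everything,
     * unless the task asked to keep its capabilities. */
    if (had_root && !has_root && !t->keep_caps) {
        t->permitted = 0;
        t->effective = 0;
    }
    /* Effective uid 0 -> non-0: the effective set is cleared. */
    if (old_euid == 0 && euid != 0)
        t->effective = 0;
    /* Effective uid non-0 -> 0: effective becomes permitted. */
    if (old_euid != 0 && euid == 0)
        t->effective = t->permitted;
}
```

In this model, a task with keep_caps set that switches all three uids to 1000 ends up with its permitted set intact and an empty effective set, which is why a cap_set_proc(3) call is still needed afterward.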

The exec_with_caps() function in Listing 2 can be used by a setuid root program to continue execution at a specified function, as a specified userid, with a set of capabilities specified as a string.

Listing 2. Executing code with reduced capabilities

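#define _GNU_SOURCE             /* for setresuid() */
#include <sys/prctl.h>
#include <sys/capability.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

/* Print the uid and capability sets of the calling process. */
int printmycaps(void *d)
{
    cap_t cap = cap_get_proc();

    printf("Running with uid %d\n", (int)getuid());
    printf("Running with capabilities: %s\n", cap_to_text(cap, NULL));
    cap_free(cap);
    return 0;
}

/* Become newuid, keep only the capabilities in capstr, then call f(data). */
int exec_with_caps(int newuid, char *capstr, int (*f)(void *data), void *data)
{
    int ret;
    cap_t newcaps;

    /* Ask to keep the permitted capabilities across the uid change. */
    ret = prctl(PR_SET_KEEPCAPS, 1);
    if (ret) {
        perror("prctl");
        return -1;
    }
    ret = setresuid(newuid, newuid, newuid);
    if (ret) {
        perror("setresuid");
        return -1;
    }
    newcaps = cap_from_text(capstr);
    if (!newcaps) {
        perror("cap_from_text");
        return -1;
    }
    /* Reduce the process to exactly the requested capability sets. */
    ret = cap_set_proc(newcaps);
    if (ret) {
        perror("cap_set_proc");
        return -1;
    }
    cap_free(newcaps);
    return f(data);
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        printf("Usage: %s <capability_list>\n", argv[0]);
        return 1;
    }
    return exec_with_caps(1000, argv[1], printmycaps, NULL);
}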

To test this, paste the code into a file named execwithcaps.c, and compile and run it as root:

gcc -o execwithcaps execwithcaps.c -lcap
./execwithcaps cap_sys_admin=eip

File capabilities

File capabilities are currently implemented in the -mm kernel tree, and are expected in the mainline kernel by 2.6.24. With file capabilities, you can assign capabilities to a program. For example, the ping program requires CAP_NET_RAW in order to function. For this reason, it has historically been a setuid root program. With file capabilities, you can reduce the amount of privilege invested in the program by doing:

chmod u-s /bin/ping
setfcaps -c cap_net_raw=p -e /bin/ping

This requires the newest version of the libcap libraries and related programs, which are available at GoogleCode (see Resources for a link). This first removes the setuid bit from the binary, then assigns it the CAP_NET_RAW privilege it needs. Now any user can run ping with the CAP_NET_RAW privilege, but if the ping program is compromised, the attacker can exercise no other privileges.

The question arises how you would determine the minimal capability set required for an unprivileged user to run any particular program. If there were only the one program, a worthwhile approach would be to scour the application, its dynamically linked libraries, and the kernel sources. This action, though, would need to be repeated for all setuid root programs. Of course, this approach may not be a bad idea before allowing an application to be run as root by an unprivileged user, but it is unfortunately an unrealistic prospect.

If a program were verbose and well behaved, it might be possible to simply run the program without privilege and have it complain about which privileges it lacks. Let's try that with ping.

chmod u-s /bin/ping
setfcaps -r /bin/ping
su - myuser
ping google.com
ping: icmp open socket: Operation not permitted

This technique could be helpful depending on our understanding of the implementation of icmp, but it certainly isn't spelled out for us.

Next, we can try to run the program (again without the suid bit) under strace. strace reports all system calls used by the program along with their return values, so we can look through the strace output for return values indicating lack of permission.

strace -oping.out ping google.com
grep EPERM ping.out
   socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = -1 EPERM (Operation not permitted)

The permission we lack is to create a socket of type SOCK_RAW. Reading through /usr/include/linux/capability.h, you'll see that:

/* Allow use of RAW sockets */
/* Allow use of PACKET sockets */
#define CAP_NET_RAW          13

In this case, it is clear that CAP_NET_RAW is the capability needed in order to allow unprivileged users to use ping. However, it seems likely that some programs will attempt many operations that they don't actually need to perform and be denied with -EPERM. It's also likely that the capability a program needs won't always be as simple to guess.

Another more practical approach may be to insert a probe into the kernel at the place where capabilities are checked. The probe will print debugging information about denied capabilities.

kprobes allow developers to write small kernel modules to run code at the start of a function (jprobe), the end of a function (kretprobe), or at any address (kprobe). Enabling this ability allows you to obtain information about which capabilities the kernel requires to run certain programs. (The remainder of this section assumes that you have a kernel with both kprobes and file capabilities enabled.)

Listing 3 is a kernel module that inserts a jprobe to instrument the start of the cap_capable() kernel function.

Listing 3. capable_probe.c

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/sched.h>

static const char *probed_func = "cap_capable";

/* Runs at entry to cap_capable(): log the task name and the capability. */
int cr_capable (struct task_struct *tsk, int cap)
{
    printk(KERN_NOTICE "%s: asking for capability %d for %s\n",
           __FUNCTION__, cap, tsk->comm);
    jprobe_return();    /* resume the real cap_capable() */
    return 0;           /* not reached */
}

static struct jprobe jp = {
    .entry = JPROBE_ENTRY(cr_capable)
};

static int __init kprobe_init(void)
{
    int ret;

    jp.kp.symbol_name = (char *)probed_func;
    if ((ret = register_jprobe(&jp)) < 0) {
        printk("%s: register_jprobe failed, returned %d\n",
               __FUNCTION__, ret);
        return -1;
    }
    return 0;
}

static void __exit kprobe_exit(void)
{
    unregister_jprobe(&jp);
    printk("capable kprobes unregistered\n");
}

module_init(kprobe_init);
module_exit(kprobe_exit);
MODULE_LICENSE("GPL");

When this kernel module is inserted, any calls to cap_capable() are replaced by a call to the cr_capable() function. This function prints the name of the program that requires capabilities and the capability being checked. It then continues executing the actual cap_capable() call through the call to jprobe_return().

Compile the module using the makefile in Listing 4:

Listing 4. Makefile for capable_probe

obj-m := capable_probe.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

clean:
	rm -f *.mod.c *.ko *.o

Then execute as root:

/sbin/insmod capable_probe.ko

Now in one window, watch the system logs using:

tail -f /var/log/messages

In another window, as non-root, execute the ping binary without the setuid bit set:

/bin/ping google.com

The system logs now contain multiple entries for ping. These are the capabilities that the program attempted to use. Not all of these are needed. We can cross-reference /usr/include/linux/capability.h to convert the integer to a capability name and see that ping requested 21, 13, and 7.

21 is CAP_SYS_ADMIN. Avoid granting this catch-all to any program.
7 is CAP_SETUID. Ping should not require this.
13 is CAP_NET_RAW. Ping should require this.

Let's grant it that capability and see whether it succeeds.

setfcaps -c cap_net_raw=p -e /bin/ping
(become non root user)
ping google.com

As we expected, ping succeeded.
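The manual cross-referencing of logged capability numbers against capability.h can be sketched as a small helper. This is an illustration under assumptions: the function names parse_probe_line() and cap_number_to_name() are hypothetical, the table holds only the three numbers discussed above, and the parser assumes the log line contains the message text produced by the printk() in Listing 3, possibly preceded by a syslog prefix.

```c
#include <stdio.h>
#include <string.h>

/* Only the three capabilities discussed in the text, with the numbers
 * given there from /usr/include/linux/capability.h. */
static const struct { int number; const char *name; } cap_names[] = {
    { 7,  "CAP_SETUID" },
    { 13, "CAP_NET_RAW" },
    { 21, "CAP_SYS_ADMIN" },
};

/* Return the name for a capability number, or NULL if not in the table. */
const char *cap_number_to_name(int number)
{
    for (size_t i = 0; i < sizeof(cap_names) / sizeof(cap_names[0]); i++)
        if (cap_names[i].number == number)
            return cap_names[i].name;
    return NULL;
}

/*
 * Extract the capability number and task name from one log line written
 * by the probe. Any prefix before the message is skipped. comm must have
 * room for 16 bytes. Returns 0 on success, -1 if the line does not match.
 */
int parse_probe_line(const char *line, int *cap, char comm[16])
{
    const char *p = strstr(line, "asking for capability ");

    if (!p)
        return -1;
    if (sscanf(p, "asking for capability %d for %15s", cap, comm) != 2)
        return -1;
    return 0;
}
```

Feeding each line of the log through parse_probe_line() and then cap_number_to_name() gives a per-program list of requested capabilities. Deciding which of them the program genuinely needs still requires judgment.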

Complications

Existing software is often written to be as secure as possible with few changes across many UNIX variants. On top of this, distributions sometimes apply their own patches, which can make it impossible to replace the root setuid bit with file capabilities in some situations.

An example of such a program on Fedora is at. The at program allows users to schedule jobs for execution at a later time. For instance, a cheap way to get a pop-up reminder to dial into a meeting at 2 p.m. would be:

echo "xterm -display :0.0 -e \"echo Call customer 555-5555; echo ^V^G; sleep 10m\" " | at 14:00

The at program is available for all UNIX systems and can be used by any user. Users share a common job spool under /var/spool. Security is therefore of the utmost importance, but the program is also coded to work across many systems and so does not make use of system-specific security mechanisms like capabilities. Nevertheless, it attempts to reduce privilege through the use of setuid(2). On top of this, the Fedora package adds patches to make use of PAM modules.

The quickest way to check whether at could be made to run by a non-root user without being setuid root is to remove the setuid bit and then grant it all capabilities.

chmod u-s /usr/bin/at
setfcaps -c all=p -e /usr/bin/at
su - (non root user)
/usr/bin/at

By specifying -c all=p, we asked for a fully populated permitted, or forced, capability set on /usr/bin/at. So any user running this program will do so with all of root's privileges. But on Fedora 7, running /usr/bin/at will now result in:

You do not have permission to run at.

The reason is evident if you download and study the source code, but the details are not helpful for this exercise. While certainly it is possible to change the source code to make at usable with file capabilities, the setuid bit cannot be substituted by simply assigning file capabilities on Fedora.

File capability details

So far, we have been using a very specific format for the capabilities we assign to executables. For ping we used:

setfcaps -c cap_net_raw=p -e /bin/ping

setfcaps is a program that sets the target file's capabilities by setting an extended attribute named security.capability. The -c flag is followed by a list of capabilities in a somewhat free-flowing format:

capability_list=capability_set(s)

capability_set can contain i and p, and capability_list can contain any valid capabilities. The capability types represent inheritable and permitted sets, respectively, and separate capability lists can be specified for each set. The -e or -d flag dictates whether the capabilities in the permitted set are in the program's effective set on startup or not, respectively. If the capabilities are not in the program's effective set, then the program must be capability aware and must activate the bits in its effective set itself in order to make use of the capabilities.
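To make the format concrete, here is a minimal sketch that splits a single name=flags clause such as cap_net_raw=p into its parts. It is not the parser used by setfcaps: the struct fcap_clause and the function parse_fcap_clause() are hypothetical, the sketch handles only one capability name per clause, and it accepts only the i and p flags described above.

```c
#include <string.h>
#include <stdbool.h>

/* One parsed clause: a capability name plus the sets it goes into. */
struct fcap_clause {
    char name[64];
    bool inheritable;
    bool permitted;
};

/*
 * Parse "name=flags", where flags is a non-empty string of 'i' and 'p'.
 * Returns 0 on success, -1 on malformed input. On failure the contents
 * of *out are unspecified.
 */
int parse_fcap_clause(const char *clause, struct fcap_clause *out)
{
    const char *eq = strchr(clause, '=');
    size_t len;

    if (!eq || eq == clause || eq[1] == '\0')
        return -1;                  /* missing '=', name, or flags */
    len = (size_t)(eq - clause);
    if (len >= sizeof(out->name))
        return -1;                  /* name too long for the buffer */
    memcpy(out->name, clause, len);
    out->name[len] = '\0';

    out->inheritable = false;
    out->permitted = false;
    for (const char *f = eq + 1; *f; f++) {
        if (*f == 'i')
            out->inheritable = true;
        else if (*f == 'p')
            out->permitted = true;
        else
            return -1;              /* only i and p are valid here */
    }
    return 0;
}
```

The effective bit is deliberately absent from the clause: in this tool it is a single per-file switch chosen with -e or -d, not a per-capability flag.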

Until now, we have asserted the desired capabilities in the permitted set but not the inheritable set. In fact, there are subtler and more powerful things we could do with capabilities. Recall Listing 1, repeated here:

Repeat of Listing 1. Formulas for new capability sets after exec()

pI' = pI
pP' = fP | (fI & pI)
pE' = pP' & fE

The file inheritable set specifies which of the process's inheritable capabilities can be in the process's new permitted set. If only cap_dac_override is in the file inheritable set, then only that capability can be inherited into the process's new permitted set.

The file permitted set, also known as the "forced" set, is the set that is forced on in the new permitted set, regardless of whether it was in the task's inheritable set or not.

Finally, the file effective bit dictates whether the bits in the task's new permitted set should be in its new effective set; that is, whether the program should be able to actually exercise the capabilities without explicitly asking to using cap_set_proc(3).

Recall that the system makes a few changes for the root user when SECURE_NOROOT is not set. In particular, the system pretends that on the file being executed, the inheritable (fI), permitted (fP), and effective (fE) sets are fully populated. So the fI set on a binary is only useful for a non-root process with non-empty capability sets. In particular, for a program that has kept capabilities while becoming a non-root user, the above formulas will apply without such finagling. It is likely that SECURE_NOROOT will become a per-process setting so that process trees can choose whether to use true capabilities or use a root-user-is-privileged model. But at the time of this writing, this is a system-wide setting that, for any practical system, is set such that the root user is always all-powerful by default.

To illustrate the interactions of these sets, let's assume that the administrator has used the following command to set file capabilities on /bin/some_program:

setfcaps -c cap_sys_admin=i,cap_dac_read_search=p -e /bin/some_program

If a non-root user runs this program while running with full capabilities, its inheritable set pI is first masked against fI so it is reduced to just cap_sys_admin. Next, fP is unioned with that set, so the interim result is cap_sys_admin+cap_dac_read_search. This set becomes the task's new permitted set.

Finally, since the effective bit is on, the task's new effective set will contain both the bits that are in its new permitted set.

In contrast, if a completely unprivileged user runs this same program, his empty inheritable set is masked against fI, resulting in the empty set. This is unioned with fP, resulting in cap_dac_read_search. This becomes the task's new permitted set. Finally, since the effective bit is on, the new permitted set is copied to the new effective set, resulting again in cap_dac_read_search.

In either case, if the file effective bit were not set, then the task would need to use cap_set_proc(3) to copy any bits it wanted to use from its permitted set to its effective set.

Summary and exercises

To summarize:

The file effective bit dictates whether the program can exercise its permitted capabilities by default.
The file permitted set is a set that will always be on in the resulting process.
The file inheritable set is the set that can be inherited from the parent process's inheritable set into its new permitted set.

To illustrate what we've covered, let's experiment with the programs in Listings 5 and 6. In Listing 5, print_caps simply prints out the capability sets with which it is running. In Listing 6, exec_as_nonroot_priv is intended to be executed as the root user. It asks to keep its capabilities across the next setuid(2), becomes the non-root user specified as the first command-line argument, sets its capability sets to those indicated in the second command-line argument, and then executes the program specified as the third command-line argument.

Listing 5. print_caps.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/capability.h>

/* Print the capability sets this program is running with. */
int main(int argc, char *argv[])
{
    cap_t cap = cap_get_proc();

    if (!cap) {
        perror("cap_get_proc");
        exit(1);
    }
    printf("%s: running with caps %s\n", argv[0], cap_to_text(cap, NULL));
    cap_free(cap);
    return 0;
}

Listing 6. exec_as_nonroot_priv.c

#define _GNU_SOURCE             /* for setresuid() */
#include <sys/prctl.h>
#include <sys/capability.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/* Print the capability sets of the calling process. */
void printmycaps(void)
{
    cap_t cap = cap_get_proc();

    if (!cap) {
        perror("cap_get_proc");
        return;
    }
    printf("%s\n", cap_to_text(cap, NULL));
    cap_free(cap);
}

int main(int argc, char *argv[])
{
    cap_t cur;
    int ret;
    int newuid;

    if (argc < 4) {
        printf("Usage: %s <uid> <capset> "
               "<program_to_run>\n", argv[0]);
        exit(1);
    }
    /* Ask to keep the permitted capabilities across the uid change. */
    ret = prctl(PR_SET_KEEPCAPS, 1);
    if (ret) {
        perror("prctl");
        return 1;
    }
    newuid = atoi(argv[1]);
    printf("Capabilities before setuid: ");
    printmycaps();
    ret = setresuid(newuid, newuid, newuid);
    if (ret) {
        perror("setresuid");
        return 1;
    }
    printf("Capabilities after setuid, before capset: ");
    printmycaps();
    cur = cap_from_text(argv[2]);
    ret = cap_set_proc(cur);
    if (ret) {
        perror("cap_set_proc");
        return 1;
    }
    printf("Capabilities after capset: ");
    cap_free(cur);
    printmycaps();
    /* execl() returns only on failure. */
    ret = execl(argv[3], argv[3], (char *)NULL);
    if (ret)
        perror("exec");
    return 1;
}

Let's use these programs to verify the effect of the inheritable and permitted file capabilities. We will do this by placing file capabilities on print_caps, then executing print_caps with initial process capability sets carefully set up using exec_as_nonroot_priv. First, set some capabilities just in print_caps's permitted set:

gcc -o print_caps print_caps.c -lcap
setfcaps -c cap_dac_override=p -d print_caps

Now execute print_caps as a non-root user:

su - (username)
./print_caps

Next, as root, execute print_caps through exec_as_nonroot_priv:

./exec_as_nonroot_priv 1000 cap_dac_override=eip ./print_caps

In either case, print_caps ran with cap_dac_override=p. Note that the effective set is empty. That means that print_caps would have to use cap_set_proc(3) before it would actually be able to make use of the cap_dac_override capability. To change that, use the -e flag to setfcaps to set the effective bit.

setfcaps -c cap_dac_override=p -e print_caps

print_caps has an empty fI, so none of the process's pI is pulled into pP'. The single bit in pP' came from the file forced set, fP.

A more interesting test, though, is to test the effect of the inheritable file capability and run print_caps again both as a non-root user and through the exec_as_nonroot_priv program:

setfcaps -c cap_dac_override=i -e print_caps
su - (nonroot_user)
./print_caps
exit
./exec_as_nonroot_priv 1000 cap_dac_override=eip ./print_caps

This time, the non-root user has an empty capability set, while the process started as a root user has cap_dac_override in its permitted and effective sets.

Run print_caps one more time, this time simply as the root user without going through exec_as_nonroot_priv. Note that the capability set is full. The root user always receives a full capability set after executing a program, regardless of file capabilities. The exec_as_nonroot_priv program does not run print_caps as the root user. Rather, it uses the privileges of the root user to set up a non-root process with some inheritable capabilities.

Conclusion

Now you know how to determine which capabilities are needed by a program, how to set the capabilities, and how to do some other interesting things with file capabilities.

Always handle capabilities with care; they are still dangerous pieces of root privilege. On the other hand, experience with the sendmail capabilities bug (see Resources for a link) shows that providing too few capabilities can be dangerous as well. Nevertheless, file capabilities applied judiciously to system binaries in place of making them setuid root can help protect your systems.

Resources

Learn

The Secure programmer series on developerWorks includes several articles that comment on setuid():
- "Keep an eye on inputs" (developerWorks, December 2003) discusses ways data gets into your program and how to deal with them
- "Minimizing privileges" (May 2004) discusses ways to provide the minimal privileges without starving system users
- "Call components safely" (December 2004) explains how to prevent attackers from exploiting component calls
"Speaking UNIX, Part 8" (developerWorks, April 2007) shows how to control processes and use a number of commands to peer into your system.
"Anatomy of the Linux kernel" (developerWorks, June 2007) is a good place to start to understand how Linux bits fit together.
Read about the sendmail capabilities bug.
From the Linux man pages: The exec() family of functions replaces the current process image with a new process image.
From the Linux man pages: prctl(), operations on a process, is called with a first argument describing what to do (with values defined in <linux/prctl.h>) and further parameters with a significance depending on the first one.
From the Linux man pages: setuid() sets the effective user ID of the current process. If the effective UID of the caller is root, the real UID and saved set-user-ID are also set. Under Linux, it is implemented like the POSIX version with the _POSIX_SAVED_IDS feature. This allows a set-user-ID (other than root) program to drop all of its user privileges, do some un-privileged work, and then re-engage the original effective user ID in a secure manner.
From the Linux man pages: cap_set_proc() sets the values for all capability flags for all capabilities with the capability state identified by cap_p. The new capability state of the process will be completely determined by the contents of cap_p upon successful return from this function. If any flag in cap_p is set for any capability not currently permitted for the calling process, the function will fail, and the capability state of the process will remain unchanged.
POSIX Threads Programming is a great tutorial that begins with an introduction to concepts and takes you all the way through such topics as how to develop hybrid MPI/Pthreads.
POSIX, also known as IEEE Std 1003.1-2001, defines a standard operating system interface and environment, including a command interpreter (or "shell") and common utility programs to support applications portability at the source code level. It is intended to be used by both applications developers and system implementors.
Take a look at "Using ReiserFS with Linux" (developerWorks, April 2006) for an "alternative, advanced filesystem for the adventurous."
In "Differentiating UNIX and Linux" (developerWorks, March 2006), get a quick lesson in the differences in filesystem support between Linux and UNIX—look for the heading "Filesystem support."
For more on filesystems, "System Administration Toolkit: Migrating and moving UNIX filesystems" (developerWorks, July 2006) shows you how to transfer an entire filesystem on a live system, including how to create, copy, and re-enable.
In the developerWorks Linux zone, find more resources for Linux developers, and scan our most popular articles and tutorials.
See all Linux tips and Linux tutorials on developerWorks.
Stay current with developerWorks technical events and Webcasts.

Get products and technologies

Linux PAM is a flexible mechanism for authenticating users that lets developers craft programs that are independent of authentication scheme (so, "new device" doesn't have to equal recoding of all the authentication support programs).
With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.

About the author

Serge Hallyn is a part of IBM's Linux Technology Center, focusing on Linux kernel and security. He obtained his Ph.D. in computer science from the College of William and Mary. He has written and contributed to several security modules. He currently focuses on adding support for virtual server functionality, application checkpoint/restart, and POSIX file capabilities.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Other resources

http://www.symantec.com/connect/articles/introduction-linux-capabilities-and-acls

http://linux.die.net/man/7/capabilities