As I have been griping on Twitter lately, about how I dislike the design of modern UNIX operating systems, an interesting conversation about object capabilities came up with the author of musl-libc. This conversation caused me to realize that systems programmers don’t really have a understanding of object capabilities, and how they can be used to achieve environments that are aligned with the principle of least authority.
In general, I think this is largely because we’ve failed to effectively disseminate the research output in this area to the software engineering community at large — for various reasons, people complete their distributed systems degrees and go to work in decentralized finance, as unfortunately, Coinbase pays better. An unfortunate reality is that the security properties guaranteed by Web3 platforms are built around object capabilities, by necessity – the output of a transaction, which then gets consumed for another transaction, is a form of object capability. And while Web3 is largely a planet-incinerating Ponzi scheme run by grifters, object capabilities are a useful concept for building practical security into real-world systems.
Most literature on this topic try to describe these concepts in the framing of, say, driving a car: by default, nobody has permission to drive a given car, so it is compliant with the principle of least authority, meanwhile the car’s key can interface with the ignition, and allow the car to be driven. In this example, the car’s key is an object capability: it is an opaque object, that can be used to acquire the right to drive the car. Afterwards, they usually go on to describe the various aspects of their system without actually discussing why anybody would want this.
the principle of least authority
The principle of least authority is hopefully obvious, it is the idea that a process should only have the rights that are necessary for the process to complete. In other words, the calculator app shouldn’t have the right to turn on your camera or microphone, or snoop around in your photos. In addition, there is an expectation that the user should be able to express consent and make a conscious decision to grant rights to programs as they request them, but this isn’t necessarily required: a user can delegate her consent to an agent to make those decisions on her behalf.
In practice, modern web browsers implement the principle of least authority. It is also implemented on some desktop computers, such as those running recent versions of macOS, and it is also implemented on iOS devices. Android has also made an attempt to implement security primitives that are aligned with the principle of least authority, but it has various design flaws that mean that apps have more authority in practice than they should.
the object-capability model
The object-capability model refers to the use of object capabilities to enforce the principle of least authority: processes are spawned with a minimal set of capabilities, and then are granted additional rights through a mediating service. These rights are given to the requestor as an opaque object, which it references when it chooses to exercise those rights, in a process sometimes referred to as capability invocation.
Because the object is opaque, it can be represented in many different ways: as a digital signature (like in the various blockchain platforms), or it can simply be a reference to a kernel handle (such as with Capsicum’s capability descriptors). Similarly, invocation can happen directly, or indirectly. For example, a mediation service which responds to a request to turn on the microphone with a sealed file descriptor over
SCM_RIGHTS, would still be considered an object-capability model, as it is returning the capability to listen to the microphone by way of providing a file descriptor that can be used to read the PCM audio data from the microphone.
Some examples of real-world object capability models include: the Xen hypervisor’s
Sandbox.framework in macOS (derived from FreeBSD’s Capsicum), Laurent Bercot’s
xdg-portal specification from freedesktop.org and privilege separation as originally implemented in OpenBSD.
capability forfeiture, e.g. OpenBSD pledge(2)/unveil(2)
OpenBSD 5.9 introduced the
pledge(2) system call, which allows a process to voluntarily forfeit a set of capabilities, by restricting itself to pre-defined groups of syscalls. In addition, OpenBSD 6.4 introduced the
unveil(2) syscall, which allows a process to voluntarily forfeit the ability to perform filesystem I/O except to a pre-defined set of paths. In the object capability vocabulary, this is referred to as forfeiture.
The forfeiture pattern is seen in UNIX daemons as well: because you historically need
root permission (or in Linux, the
CAP_NET_BIND_SERVICE process capability bit) to bind to a port lower than 1024, a daemon will start as
root, and then voluntarily drop itself down to an EUID which does not possess
root privileges, effectively forfeiting them.
Because processes start with high levels of privilege and then gives up those rights, this approach is not aligned with the principle of least authority, but does have tangible security benefits in practice.
Linux process capabilities
Since Linux 2.2, there has been a feature called process capabilities, the
CAP_NET_BIND_SERVICE mentioned above is one of the capabilities that can be set on a binary. These are basically used as an alternative to setting the SUID bit on binaries, and are totally useless outside of that use case. They have nothing to do with object capabilities, although an object capability system could be designed to facilitate many of the same things SUID binaries presently do today.
Fuchsia uses a combination of filesystem namespacing and object capability mediation to restrict access to specific devices on a per-app basis. This could be achieved on Linux as well with namespaces and a mediation daemon.