A few days ago, Qualys dropped CVE-2021-4034, which they have called “PwnKit”. While Alpine itself was not directly vulnerable to this issue due to different engineering decisions made in the way musl and glibc handle SUID binaries, this is intended to be a deeper look into what went wrong to enable successful exploitation on GNU/Linux systems.
a note on blaming systemd
Before we get into this, I have seen a lot of people on Twitter blaming systemd for this vulnerability. It should be clarified that systemd has basically nothing to do with polkit, and has nothing at all to do with this vulnerability: systemd and polkit are separate projects largely maintained by different people.
We should try to be empathetic toward software maintainers, including those from systemd and polkit, so writing inflammatory posts blaming systemd or its maintainers for polkit does not really help to fix the problems that made this a useful security vulnerability.
the theory behind exploiting CVE-2021-4034
For an idea of how one might exploit CVE-2021-4034, let’s look at blasty’s “blasty vs pkexec” exploit. Take a look at the code for a few minutes, and come back here. There are multiple components to this exploit that have to all come together to make it work. A friend on IRC described it as a “Rube Goldberg machine” when I outlined it to him.
The first component of the exploit is the creation of a GNU iconv plugin: this is used to convert data from one character set to another. The plugin itself is the final step in the pipeline, and is used to gain the root shell.
The second component of the exploit is using execve(2) to arrange for pkexec to be run in a scenario where argc < 1. Although some POSIX rules lawyers will argue that this is a valid execution state, because the POSIX specification only says that argv[0] should be the name of the program being run, I argue that it is really a nonsensical execution state under UNIX, and that defensive programming against this scenario is ridiculous, which is why I sent a patch to the Linux kernel to remove the ability to do this.
The third component of the exploit is the use of GLib by pkexec. GLib is a commonly used C development framework, and it contains a lot of helpful infrastructure for developers, but that framework comes at the cost of a large attack surface, which is undesirable for an SUID binary.
The final component of the exploit is the design decision of the GLIBC authors to attempt to sanitize the environment of SUID programs rather than simply ignore known-harmful environment variables when running as an SUID program. In essence, Qualys figured out a way to bypass the sanitization entirely. When these things combine, we are able to use pkexec to pop a root shell, as I will demonstrate.
how things went wrong
Now that we have an understanding of what components are involved in the exploit, we can take a look at what happens from beginning to end. We have our helper plugin, which launches the root shell, and we have an understanding of the underlying configuration and its design flaws. How does all of this come together?
The exploit itself does not happen in blasty-vs-pkexec.c; that program just sets up the necessary preconditions for everything else to fall into place, and then runs pkexec. But it runs pkexec in a way that basically results in an execution state that could be described as a weird machine: it uses execve(2) to launch it in an execution state where there are no arguments provided, not even an argv[0]. Because pkexec is running in this weird state that it was never designed to run in, it executes as normal, except that we wind up in a situation where argv[1] is actually the beginning of the program’s environment. The first value in the environment is lol, which is a valid argument, but not a valid environment variable, since it is missing a value. If we run pkexec lol in a terminal, we get:
[kaniini@localhost ~]$ pkexec lol
Cannot run program lol: No such file or directory
This is interesting because that message is actually generated by g_log(), and that’s where the fun begins. In initializing the GLog subsystem, there is a code path where g_utf8_validate() gets called on argv[0]. When running as a weird machine, this validation fails, because argv[0] is NULL. This results in GLib trying to convert argv[0] to UTF-8, which uses iconv, a libc function.
On GLIBC, the iconv function is provided by GLIBC’s own gconv framework, which supports loading plugins to add additional character sets from a directory specified by the GCONV_PATH environment variable. Normally, GCONV_PATH is removed from an SUID program’s environment because GLIBC sanitizes the environment of SUID programs, but Qualys figured out a way to glitch the sanitization, and so GCONV_PATH remains in the environment. As a result, we get a root shell as soon as it tries to convert argv[0] to UTF-8.
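The overlap between the argument list and the environment described in this section can be illustrated with a toy model. The sketch below is plain Python rather than anything taken from the kernel or from pkexec: build_initial_vector and argv_slot are hypothetical names, and a Python list stands in for the contiguous array of pointers a new process receives (the argv entries, a NULL, the environment entries, a NULL).

```python
# Toy model of the pointer array a new process receives on Linux:
# argv[0..argc-1], NULL, envp[0..n-1], NULL, laid out back to back.
# None stands in for a NULL pointer. All names here are hypothetical.

def build_initial_vector(argv, envp):
    """Return argv and envp laid out contiguously, each NULL-terminated."""
    return list(argv) + [None] + list(envp) + [None]


def argv_slot(vector, index):
    """Read argv[index] the way C would: by offset, with no bounds check."""
    return vector[index]


# Normal case: `pkexec lol` has argc == 2, so argv[1] is a real argument.
normal = build_initial_vector(["pkexec", "lol"], ["PATH=/usr/bin"])
print(argv_slot(normal, 1))   # lol

# Weird-machine case: argc == 0, so argv[0] is the NULL terminator and
# argv[1] is really the first entry of the environment.
weird = build_initial_vector([], ["lol", "PATH=/usr/bin"])
print(argv_slot(weird, 0))    # None
print(argv_slot(weird, 1))    # lol
```

In both cases the program sees lol in argv[1], which is why pkexec carries on as if it had been given a normal command line.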
where do we go from here?
On Alpine and other musl-based systems, iconv is implemented by musl itself and does not load plugins, so we are not vulnerable to blasty’s PoC. musl also makes a more robust decision: instead of trying to sanitize the environment of SUID programs, it just ignores variables which would lead to musl loading additional code, such as LD_PRELOAD, entirely when running in SUID mode.
This means that ultimately three things need to be fixed: pkexec itself should be fixed (which has already been done) to close the vulnerability on older kernels, the kernel itself should be fixed to disallow this weird execution state (which my patch does), and GLIBC should be fixed to ignore dangerous environment variables instead of trying to sanitize them.
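The difference between sanitizing the environment once and ignoring dangerous variables at the point of use can be sketched with another toy model. This is illustrative Python, not code from GLIBC or musl: UNSAFE_VARS, sanitize_at_startup, lookup_trusting and lookup_ignoring are hypothetical names, and the write-back step is an assumption standing in for whatever manages to put the variable back into the environment after start-up.

```python
# Toy comparison of two ways to protect an SUID program from hostile
# environment variables. Hypothetical names; not GLIBC or musl code.

UNSAFE_VARS = {"GCONV_PATH", "LD_PRELOAD"}


def sanitize_at_startup(env):
    """Strategy 1: strip unsafe variables once, before the program runs."""
    return {k: v for k, v in env.items() if k not in UNSAFE_VARS}


def lookup_trusting(env, name):
    """Later lookup that assumes the start-up sanitization is still intact."""
    return env.get(name)


def lookup_ignoring(env, name, secure):
    """Strategy 2: refuse to honour unsafe variables whenever running SUID."""
    if secure and name in UNSAFE_VARS:
        return None
    return env.get(name)


# The attacker's variable is removed at start-up, as intended...
env = sanitize_at_startup({"GCONV_PATH": "./evil", "PATH": "/usr/bin"})
print(lookup_trusting(env, "GCONV_PATH"))               # None

# ...but if something writes it back afterwards (assumed here), strategy 1
# honours it, while strategy 2 still ignores it.
env["GCONV_PATH"] = "./evil"
print(lookup_trusting(env, "GCONV_PATH"))               # ./evil
print(lookup_ignoring(env, "GCONV_PATH", secure=True))  # None
```

The point of the second strategy is that it does not depend on the environment staying clean for the whole lifetime of the process.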