vm.overcommit_memory=2 is always the right setting for servers

The Linux kernel exposes a knob for tuning how memory allocations are granted: the vm.overcommit_memory sysctl. It takes three values: 0 (heuristic overcommit, the default), 1 (always overcommit) and 2 (never overcommit). When overcommit is enabled (sadly, as noted, the default), the kernel will typically return a mapping when brk(2) or mmap(2) is called to grow a program's address space, regardless of whether or not backing memory is available. Sounds good, right?
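To see which policy a machine is running, you can read the sysctl straight out of procfs. A minimal sketch in Python (the helper name is mine; this is Linux-specific, hence the fallback):

```python
def overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the current overcommit policy (0, 1, or 2), or None
    when the file is unavailable (e.g. on non-Linux systems)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

The same value can be read or set with sysctl(8), as in `sysctl vm.overcommit_memory`.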

Not really. While overcommit is convenient for application developers, it fundamentally changes the contract of memory allocation: a successful allocation no longer represents an atomic acquisition of a real resource. Instead, the returned mapping serves as a deferred promise, which will only be fulfilled by the page fault handler if and when the memory is first accessed. This is an important distinction, as it means overcommit effectively replaces a fail-fast transactional allocation model with a best-effort one where failures are only caught after the fact rather than at the point of allocation.

To understand how this deferral works in practice, let’s consider what happens when a program calls malloc(3) to get a new memory allocation. At a high level, the allocator calls brk(2) or mmap(2) to request additional virtual address space from the kernel, which is represented by virtual memory area objects, also known as VMAs.
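The mmap(2) path is easy to observe from user space. Here is a small Python sketch that requests anonymous virtual address space directly, much as an allocator would for a large malloc(3); the sizes are arbitrary:

```python
import mmap

# Ask the kernel for 1 MiB of anonymous, private address space; on
# Linux this is an mmap(2) call that creates a new VMA.
region = mmap.mmap(-1, 1 << 20)

# Pages are only committed on first touch, via the page fault
# handler: writing here is what actually consumes physical memory.
region[0] = 1

region.close()
```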

On a system where overcommit is disabled, the kernel ensures that enough backing memory is available to satisfy the request before allowing the allocation to succeed. In contrast, when overcommit is enabled, the kernel simply allocates a VMA object without guaranteeing that backing memory is available: the mapping succeeds immediately, even though it is not known whether the request can ultimately be satisfied.
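This difference is visible from user space: with overcommit, a mapping can succeed even when the machine could never back it, so long as it is never touched. A hedged Python sketch (the function name is mine, and whether a huge size succeeds depends on the machine's policy and its RAM + swap):

```python
import mmap

def can_reserve(nbytes):
    """Try to map nbytes of anonymous memory without touching it.
    Under mode 1 this succeeds even for absurd sizes; mode 0 applies
    a heuristic; mode 2 fails up front unless backing (RAM + swap)
    is actually accounted for."""
    try:
        m = mmap.mmap(-1, nbytes)
        m.close()
        return True
    except (OSError, ValueError, OverflowError):
        return False
```

A small request like `can_reserve(1 << 20)` succeeds everywhere; it is the large requests whose outcome reveals the policy in effect.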

This decoupling of success from backing memory availability makes allocation failures impossible to handle correctly. A program has no choice but to proceed as if the allocation succeeded, even though the kernel has not yet determined whether the request can be fulfilled. Disabling overcommit solves this problem by restoring admission control at allocation time, ensuring that allocations either fail immediately or succeed with a guarantee of backing memory.
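What fail-fast behavior looks like can be demonstrated without changing the sysctl: a hard address-space cap makes an oversized allocation fail synchronously, at the allocation site, just as mode 2's strict accounting would. A Linux-specific Python sketch (the function names are mine, and RLIMIT_AS stands in for overcommit accounting here):

```python
import resource

def read_vsize():
    # /proc/self/statm: first field is total virtual size, in pages (Linux).
    with open("/proc/self/statm") as f:
        pages = int(f.read().split()[0])
    return pages * resource.getpagesize()

def alloc_fails_fast(extra=1 << 30, headroom=1 << 26):
    """Cap the address space just above current usage, then try to
    allocate ~1 GiB more. The allocation fails immediately with a
    catchable error, rather than succeeding and blowing up later."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (read_vsize() + headroom, hard))
    try:
        buf = bytearray(extra)  # the underlying mmap(2) fails right here
        del buf
        return False
    except MemoryError:
        return True             # synchronous, debuggable failure
    finally:
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
```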

Failure locality is important for debugging

When allocations fail fast, they are dramatically easier to debug, as the failure is synchronous with the request. When a program crashes due to an allocation failure, the entire context of that allocation is preserved: the requested allocation size, the subsystem making the allocation and the underlying operation that required it are already known.

With overcommit, this locality is lost by design. Allocations appear to succeed and the program proceeds under the assumption that the memory is available. When the memory is eventually touched and no free page can be found to back it, the kernel typically responds by invoking the OOM killer and terminating a process outright (often, but not necessarily, the one that faulted). From the program's perspective, there is no allocation failure to handle, only a SIGKILL. From the operator's perspective, there is no stack trace pointing to the failure. There are only post-mortem logs which often fail to paint a clear picture of what happened.
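The deferral is easy to watch: resident set size only grows when pages are first written, which is exactly when an OOM kill would land. A Linux-oriented Python sketch using getrusage (the function name and sizes are mine):

```python
import mmap
import resource

def touch_and_measure(size=32 << 20):
    """Map `size` bytes, write one byte per page, and report how much
    peak RSS grew. The growth happens at first touch, not at mmap(2)
    time; under overcommit, a loop like this is where an OOM kill
    would strike, far from any allocation call site."""
    m = mmap.mmap(-1, size)
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # KiB on Linux
    for off in range(0, size, mmap.PAGESIZE):
        m[off] = 1  # page fault commits one physical page
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    m.close()
    return (after - before) * 1024  # approximate bytes of new peak RSS
```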

Would you rather debug a crash at the allocation site or reconstruct an outage caused by an asynchronous OOM kill? Overcommit doesn’t make allocation failure recoverable. It makes it unreportable.

Dishonorable mention: Redis

So why am I writing about this, anyway? The cost of overcommit isn't just technical: it also reflects bad engineering culture, shifting responsibility for correctness away from application developers and onto the kernel. As an example, when you start Redis with overcommit disabled, it prints a scary warning telling you to re-enable it:

WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

No. Code that requires overcommit to function correctly is failing to handle memory allocation errors correctly. The answer is not to print a warning that overcommit is disabled, but rather to surface low memory conditions explicitly so the system administrator can understand and resolve them.