<rss xmlns:source="http://source.scripting.com/" version="2.0">
  <channel>
    <title>Ariadne&#39;s Space</title>
    <link>https://ariadne.space/</link>
    <description></description>
    
    <language>en</language>
    
    <lastBuildDate>Sat, 28 Mar 2026 08:44:47 -0700</lastBuildDate>
    <item>
      <title>Why I am looking for Jellycat alternatives now</title>
      <link>https://ariadne.space/2026/03/28/why-i-am-looking-for.html</link>
      <pubDate>Sat, 28 Mar 2026 08:44:47 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2026/03/28/why-i-am-looking-for.html</guid>
      <description>&lt;p&gt;Many readers of my blog will note that I &lt;a href=&#34;https://ariadne.space/2022/02/11/how-to-refresh-older-stuffed.html&#34;&gt;used to be quite enthusiastic about Jellycat stuffed animals&lt;/a&gt;, especially their Bashful series, but I haven&amp;rsquo;t talked much about them lately.&lt;/p&gt;
&lt;p&gt;Outside of being well-designed, they also supported small business: the overwhelming majority of my collection has been purchased from independently-run stores, including one run by a close friend of mine in downtown Seattle. Unfortunately in the past few years, Jellycat have pursued a “brand elevation strategy” (their words, not mine), incrementally dropping small businesses from their network in favor of large chains and directing people to their website instead.&lt;/p&gt;
&lt;p&gt;In 2025, this included dropping hundreds of independent shops; &lt;a href=&#34;https://www.briscoepr.co.uk/what-the-jellycat-stockist-scandal-teaches-us-about-brand-loyalty/&#34;&gt;around 100 of these were in the UK alone&lt;/a&gt;, with no option for appeal, explicitly as part of this strategy. Many of these retailers had supported the brand for decades, and described the move as abrupt and poorly communicated.&lt;/p&gt;
&lt;p&gt;This is a baffling change in direction. Jellycat’s popularity was built in independent shops, which makes this strategy a good way to burn that goodwill.&lt;/p&gt;
&lt;p&gt;Frustratingly, this new direction is now becoming somewhat problematic for me, as it turns out they also make one of the best tactile tools I’ve found.&lt;/p&gt;
&lt;p&gt;In my case, I use one of the larger Jellycat Bashful Bunny variants (what they call the “Really Big” size) as a lap object during the day. It provides consistent tactile feedback, something to hold onto, and, when needed, doubles as a small pillow. The end result is that it helps with focus and anxiety during long stretches of work.&lt;/p&gt;
&lt;h2 id=&#34;what-i-am-looking-for-in-a-stuffed-animal&#34;&gt;What I am looking for in a stuffed animal&lt;/h2&gt;
&lt;p&gt;It needs to be large enough to sit comfortably in the lap and double as a small pillow, without becoming cumbersome. Beyond that, the details matter more than one might expect: it needs to be soft without being irritating, usable for long stretches, and have a shape that can be held onto or adjusted easily.&lt;/p&gt;
&lt;p&gt;Most alternatives fail one of these constraints. They are too small and effectively disappear, too large and unwieldy, or made from materials that are either unpleasant or distracting over time. Even small differences matter here; a Bunnies by the Bay bunny that was just slightly smaller was already enough to be annoying.&lt;/p&gt;
&lt;p&gt;This would be a minor annoyance if it were purely theoretical, but it is not: my current daily-use bunny is approaching end-of-life after sustained use. It has held up well, but it is not realistically replaceable with another one without undermining the earlier decision to stop supporting Jellycat.&lt;/p&gt;
&lt;p&gt;So this is both an explanation and a request: if anyone has found alternatives that meet roughly the same criteria, I would be interested in hearing about them.&lt;/p&gt;
&lt;p&gt;This is, apparently, a harder problem than it should be.&lt;/p&gt;
</description>
      <source:markdown>Many readers of my blog will note that I [used to be quite enthusiastic about Jellycat stuffed animals](https://ariadne.space/2022/02/11/how-to-refresh-older-stuffed.html), especially their Bashful series, but I haven&#39;t talked much about them lately.

Outside of being well-designed, they also supported small business: the overwhelming majority of my collection has been purchased from independently-run stores, including one run by a close friend of mine in downtown Seattle. Unfortunately in the past few years, Jellycat have pursued a “brand elevation strategy” (their words, not mine), incrementally dropping small businesses from their network in favor of large chains and directing people to their website instead.

In 2025, this included dropping hundreds of independent shops; [around 100 of these were in the UK alone](https://www.briscoepr.co.uk/what-the-jellycat-stockist-scandal-teaches-us-about-brand-loyalty/), with no option for appeal, explicitly as part of this strategy. Many of these retailers had supported the brand for decades, and described the move as abrupt and poorly communicated.

This is a baffling change in direction. Jellycat’s popularity was built in independent shops, which makes this strategy a good way to burn that goodwill.

Frustratingly, this new direction is now becoming somewhat problematic for me, as it turns out they also make one of the best tactile tools I’ve found.

In my case, I use one of the larger Jellycat Bashful Bunny variants (what they call the “Really Big” size) as a lap object during the day. It provides consistent tactile feedback, something to hold onto, and, when needed, doubles as a small pillow. The end result is that it helps with focus and anxiety during long stretches of work.

## What I am looking for in a stuffed animal

It needs to be large enough to sit comfortably in the lap and double as a small pillow, without becoming cumbersome. Beyond that, the details matter more than one might expect: it needs to be soft without being irritating, usable for long stretches, and have a shape that can be held onto or adjusted easily.

Most alternatives fail one of these constraints. They are too small and effectively disappear, too large and unwieldy, or made from materials that are either unpleasant or distracting over time. Even small differences matter here; a Bunnies by the Bay bunny that was just slightly smaller was already enough to be annoying.

This would be a minor annoyance if it were purely theoretical, but it is not: my current daily-use bunny is approaching end-of-life after sustained use. It has held up well, but it is not realistically replaceable with another one without undermining the earlier decision to stop supporting Jellycat.

So this is both an explanation and a request: if anyone has found alternatives that meet roughly the same criteria, I would be interested in hearing about them.

This is, apparently, a harder problem than it should be.
</source:markdown>
    </item>
    
    <item>
      <title>Why leaders often disappoint us</title>
      <link>https://ariadne.space/2026/01/22/why-leaders-often-disappoint-us.html</link>
      <pubDate>Thu, 22 Jan 2026 14:54:41 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2026/01/22/why-leaders-often-disappoint-us.html</guid>
      <description>&lt;p&gt;There&amp;rsquo;s an old saying about not meeting your heroes.  In practice, leaders tend to confirm this over time.  This is true across domains, and it&amp;rsquo;s rarely a single gaffe that does it.  The interesting question is why the disappointment usually takes the same shape.&lt;/p&gt;
&lt;p&gt;Disappointment does not always show up in the form of a bad conversation.  Often there isn&amp;rsquo;t any conversation at all, at least not in the way people imagine one.  As space disappears, interaction collapses into reaction.  Responses come faster, positions are stated rather than tested, and dialogue gives way to declaration.  At a certain distance, leadership becomes parasocial by default, taking the form of broadcast.  There is nothing to push back on, only things to react to.  By the time the gaffe happens, the system has already collapsed.&lt;/p&gt;
&lt;h2 id=&#34;the-accumulation-of-influence&#34;&gt;The accumulation of influence&lt;/h2&gt;
&lt;p&gt;Much of the time, leadership emerges through accumulated influence: someone does useful or visible work and attention gathers.  Over time, that influence carries more weight, and interaction quietly changes.  Exchange becomes presentation as influence becomes legible, and once influence is legible, it becomes performative by default.  At that point, silence loses neutrality.&lt;/p&gt;
&lt;p&gt;None of this requires bad intent.  The same pattern shows up in very different kinds of people, which makes individual explanations less convincing.  When silence carries cost, behavior tends to shift in predictable ways.  Reaction becomes safer than response, not because people are reckless, but because the underlying structure rewards it.&lt;/p&gt;
&lt;p&gt;A way to understand this shift is through ego development.  Early on, most people rely on external feedback to regulate their sense of self.  Attention and reinforcement help stabilize an emergent identity.  This is normal, temporary, and usually invisible, and under supportive conditions that reliance softens, allowing ongoing ego development to become more internally anchored.  When those conditions are absent, the process can stall.&lt;/p&gt;
&lt;h2 id=&#34;the-pressure-of-performance&#34;&gt;The pressure of performance&lt;/h2&gt;
&lt;p&gt;Performance pressure changes those conditions in reliable yet subtle ways.  Once visibility is continuous, there&amp;rsquo;s less room to disengage without consequence.  Attention has to be managed, silence has to be explained.  Over time, the space that ongoing ego development depends on gets crowded out by the need to remain legible, responsive, and present.&lt;/p&gt;
&lt;p&gt;As the space for integration disappears, the ego adapts by turning outward again.  Identity is maintained through response and visibility rather than reflection.  Over time, performance stops feeling optional and becomes the only mode that consistently receives feedback, recognition, and relief from pressure.&lt;/p&gt;
&lt;p&gt;Over time, this adaptation compresses.  Everything has to be answered, nothing gets to sit.  Reactionary performance closes off options until the ego is boxed in, with no clean way to pause or step back.  Listening to feedback becomes difficult, refusal stops reading as information and starts registering as pressure.&lt;/p&gt;
&lt;p&gt;Eventually the corner closes.  With no room left to pause or revise, the ego can&amp;rsquo;t maintain coherence through performance alone.  The result isn&amp;rsquo;t always silence; sometimes it&amp;rsquo;s escalation, overreach, or the need to say something definitive.  You may be familiar with the old &amp;ldquo;escalate to de-escalate&amp;rdquo; saying.  What collapses isn&amp;rsquo;t belief, but the capacity to remain integrated under pressure.&lt;/p&gt;
&lt;h2 id=&#34;the-collapse-of-integration&#34;&gt;The collapse of integration&lt;/h2&gt;
&lt;p&gt;Once integration collapses, the pressure doesn&amp;rsquo;t disappear.  It just looks for a new outlet.&lt;/p&gt;
&lt;p&gt;While the responses vary, the shape remains familiar.  Escalation replaces reflection, withdrawal masquerades as clarity.  Performance hardens into something more transactional.  What these paths share is an attempt to regain psychological safety without reopening space.  Coherence is rebuilt outwardly, even as inward integration remains unavailable.&lt;/p&gt;
&lt;p&gt;Another path out of collapse is grift.  Performance keeps going, but it changes shape.  Attention becomes stabilizing rather than incidental.  Some people who end up here may also have narcissistic pathology, but grift itself isn’t cleanly reducible to that: it emerges when identity can only be held together through external return.&lt;/p&gt;
&lt;p&gt;Sometimes people don&amp;rsquo;t collapse this way.  Usually it&amp;rsquo;s because the environment gives them the opportunity to pause: silence isn&amp;rsquo;t punished, and saying no doesn&amp;rsquo;t threaten belonging.  Influence is shared rather than concentrated.  Under those conditions, performance loosens its grip, and integration doesn’t have to be outsourced to constant response.&lt;/p&gt;
&lt;p&gt;The reason leaders disappoint us so often isn’t hard to find once you know what to look for.  Influence changes the shape of interaction, and performance gradually replaces integration as the primary stabilizer.  Over time, the space required for reflection and listening erodes.  What follows isn’t corruption so much as compression, and then collapse.  The disappointment comes from mistaking these outcomes for personal failure, when they’re often the trace left by a structure that no longer supports integration.&lt;/p&gt;
</description>
      <source:markdown>There&#39;s an old saying about not meeting your heroes.  In practice, leaders tend to confirm this over time.  This is true across domains, and it&#39;s rarely a single gaffe that does it.  The interesting question is why the disappointment usually takes the same shape.

Disappointment does not always show up in the form of a bad conversation.  Often there isn&#39;t any conversation at all, at least not in the way people imagine one.  As space disappears, interaction collapses into reaction.  Responses come faster, positions are stated rather than tested, and dialogue gives way to declaration.  At a certain distance, leadership becomes parasocial by default, taking the form of broadcast.  There is nothing to push back on, only things to react to.  By the time the gaffe happens, the system has already collapsed.

## The accumulation of influence

Much of the time, leadership emerges through accumulated influence: someone does useful or visible work and attention gathers.  Over time, that influence carries more weight, and interaction quietly changes.  Exchange becomes presentation as influence becomes legible, and once influence is legible, it becomes performative by default.  At that point, silence loses neutrality.

None of this requires bad intent.  The same pattern shows up in very different kinds of people, which makes individual explanations less convincing.  When silence carries cost, behavior tends to shift in predictable ways.  Reaction becomes safer than response, not because people are reckless, but because the underlying structure rewards it.

A way to understand this shift is through ego development.  Early on, most people rely on external feedback to regulate their sense of self.  Attention and reinforcement help stabilize an emergent identity.  This is normal, temporary, and usually invisible, and under supportive conditions that reliance softens, allowing ongoing ego development to become more internally anchored.  When those conditions are absent, the process can stall.

## The pressure of performance

Performance pressure changes those conditions in reliable yet subtle ways.  Once visibility is continuous, there&#39;s less room to disengage without consequence.  Attention has to be managed, silence has to be explained.  Over time, the space that ongoing ego development depends on gets crowded out by the need to remain legible, responsive, and present.

As the space for integration disappears, the ego adapts by turning outward again.  Identity is maintained through response and visibility rather than reflection.  Over time, performance stops feeling optional and becomes the only mode that consistently receives feedback, recognition, and relief from pressure.

Over time, this adaptation compresses.  Everything has to be answered, nothing gets to sit.  Reactionary performance closes off options until the ego is boxed in, with no clean way to pause or step back.  Listening to feedback becomes difficult, refusal stops reading as information and starts registering as pressure.

Eventually the corner closes.  With no room left to pause or revise, the ego can&#39;t maintain coherence through performance alone.  The result isn&#39;t always silence; sometimes it&#39;s escalation, overreach, or the need to say something definitive.  You may be familiar with the old &#34;escalate to de-escalate&#34; saying.  What collapses isn&#39;t belief, but the capacity to remain integrated under pressure.

## The collapse of integration

Once integration collapses, the pressure doesn&#39;t disappear.  It just looks for a new outlet.

While the responses vary, the shape remains familiar.  Escalation replaces reflection, withdrawal masquerades as clarity.  Performance hardens into something more transactional.  What these paths share is an attempt to regain psychological safety without reopening space.  Coherence is rebuilt outwardly, even as inward integration remains unavailable.

Another path out of collapse is grift.  Performance keeps going, but it changes shape.  Attention becomes stabilizing rather than incidental.  Some people who end up here may also have narcissistic pathology, but grift itself isn’t cleanly reducible to that: it emerges when identity can only be held together through external return.

Sometimes people don&#39;t collapse this way.  Usually it&#39;s because the environment gives them the opportunity to pause: silence isn&#39;t punished, and saying no doesn&#39;t threaten belonging.  Influence is shared rather than concentrated.  Under those conditions, performance loosens its grip, and integration doesn’t have to be outsourced to constant response.

The reason leaders disappoint us so often isn’t hard to find once you know what to look for.  Influence changes the shape of interaction, and performance gradually replaces integration as the primary stabilizer.  Over time, the space required for reflection and listening erodes.  What follows isn’t corruption so much as compression, and then collapse.  The disappointment comes from mistaking these outcomes for personal failure, when they’re often the trace left by a structure that no longer supports integration.
</source:markdown>
    </item>
    
    <item>
      <title>vm.overcommit_memory=2 is always the right setting for servers</title>
      <link>https://ariadne.space/2025/12/16/vmovercommitmemory-is-always-the-right.html</link>
      <pubDate>Tue, 16 Dec 2025 17:23:19 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2025/12/16/vmovercommitmemory-is-always-the-right.html</guid>
      <description>&lt;p&gt;The Linux kernel has a feature where you can tune the behavior of memory allocations: the &lt;code&gt;vm.overcommit_memory&lt;/code&gt; sysctl.  When overcommit is enabled (sadly, this is the default), the kernel will typically return a mapping when brk(2) or mmap(2) is called to increase a program&amp;rsquo;s heap size, regardless of whether or not memory is available.  Sounds good, right?&lt;/p&gt;
&lt;p&gt;Not really.  While overcommit is convenient for application developers, it fundamentally changes the contract of memory allocation: a successful allocation no longer represents an atomic acquisition of a real resource.  Instead, the returned mapping serves as a &lt;em&gt;deferred promise&lt;/em&gt;, which will only be fulfilled by the page fault handler if and when the memory is first accessed.  This is an important distinction, as it means overcommit effectively replaces a fail-fast transactional allocation model with a best-effort one where failures are only caught after the fact rather than at the point of allocation.&lt;/p&gt;
&lt;p&gt;To understand how this deferral works in practice, let&amp;rsquo;s consider what happens when a program calls malloc(3) to get a new memory allocation.  At a high level, the allocator calls brk(2) or mmap(2) to request additional virtual address space from the kernel, which is represented by virtual memory area objects, also known as VMAs.&lt;/p&gt;
&lt;p&gt;On a system where overcommit is disabled, the kernel ensures that enough backing memory is available to satisfy the request before allowing the allocation to succeed.  In contrast, when overcommit is enabled, the kernel simply allocates a VMA object without guaranteeing that backing memory is available: the mapping succeeds immediately, even though it is not known whether the request can ultimately be satisfied.&lt;/p&gt;
&lt;p&gt;The decoupling of success from backing memory availability makes allocation failures impossible to handle correctly.  Programs have no other option but to assume the allocation has succeeded before the kernel has actually determined whether the request can be fulfilled.  Disabling overcommit solves this problem by restoring admission control at allocation time, ensuring that allocations either fail immediately or succeed with a guarantee of backing memory.&lt;/p&gt;
&lt;h2 id=&#34;failure-locality-is-important-for-debugging&#34;&gt;Failure locality is important for debugging&lt;/h2&gt;
&lt;p&gt;When allocations fail fast, they are dramatically easier to debug, as the failure is synchronous with the request.  When a program crashes due to an allocation failure, the entire context of that allocation is preserved: the requested allocation size, the subsystem making the allocation and the underlying operation that required it are already known.&lt;/p&gt;
&lt;p&gt;With overcommit, this locality is lost by design.  Allocations appear to succeed and the program proceeds under the assumption that the memory is available.  When the allocation is eventually accessed, the kernel typically responds by invoking the OOM killer and terminating the process outright.  From the program&amp;rsquo;s perspective, there is no allocation failure to handle, only a &lt;code&gt;SIGKILL&lt;/code&gt;.  From the operator&amp;rsquo;s perspective, there is no stack trace pointing to the failure.  There are only post-mortem logs which often fail to paint a clear picture of what happened.&lt;/p&gt;
&lt;p&gt;Would you rather debug a crash at the allocation site or reconstruct an outage caused by an asynchronous OOM kill?  Overcommit doesn&amp;rsquo;t make allocation failure recoverable.  It makes it unreportable.&lt;/p&gt;
&lt;h2 id=&#34;dishonorable-mention-redis&#34;&gt;Dishonorable mention: Redis&lt;/h2&gt;
&lt;p&gt;So why am I writing about this, anyway?  The cost of overcommit isn&amp;rsquo;t just technical, it also represents bad engineering culture: shifting responsibility for correctness away from application developers and onto the kernel.  As an example, when you start Redis with overcommit disabled, it prints a scary warning that you should re-enable it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;WARNING Memory overcommit must be enabled!
Without it, a background save or replication may fail under low memory condition.
Being disabled, it can also cause failures without low memory condition, see &lt;a href=&#34;https://github.com/jemalloc/jemalloc/issues/1328&#34;&gt;https://github.com/jemalloc/jemalloc/issues/1328&lt;/a&gt;.
To fix this issue add &amp;lsquo;vm.overcommit_memory = 1&amp;rsquo; to /etc/sysctl.conf and then reboot or run the command &amp;lsquo;sysctl vm.overcommit_memory=1&amp;rsquo; for this to take effect.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;No.  Code that requires overcommit to function correctly is &lt;em&gt;failing to handle memory allocation errors correctly&lt;/em&gt;.  The answer is &lt;em&gt;not&lt;/em&gt; to print a warning that overcommit is disabled, but rather to surface low memory conditions explicitly so the system administrator can understand and resolve them.&lt;/p&gt;
</description>
      <source:markdown>The Linux kernel has a feature where you can tune the behavior of memory allocations: the `vm.overcommit_memory` sysctl.  When overcommit is enabled (sadly, this is the default), the kernel will typically return a mapping when brk(2) or mmap(2) is called to increase a program&#39;s heap size, regardless of whether or not memory is available.  Sounds good, right?

Not really.  While overcommit is convenient for application developers, it fundamentally changes the contract of memory allocation: a successful allocation no longer represents an atomic acquisition of a real resource.  Instead, the returned mapping serves as a *deferred promise*, which will only be fulfilled by the page fault handler if and when the memory is first accessed.  This is an important distinction, as it means overcommit effectively replaces a fail-fast transactional allocation model with a best-effort one where failures are only caught after the fact rather than at the point of allocation.

To understand how this deferral works in practice, let&#39;s consider what happens when a program calls malloc(3) to get a new memory allocation.  At a high level, the allocator calls brk(2) or mmap(2) to request additional virtual address space from the kernel, which is represented by virtual memory area objects, also known as VMAs.

On a system where overcommit is disabled, the kernel ensures that enough backing memory is available to satisfy the request before allowing the allocation to succeed.  In contrast, when overcommit is enabled, the kernel simply allocates a VMA object without guaranteeing that backing memory is available: the mapping succeeds immediately, even though it is not known whether the request can ultimately be satisfied.

The decoupling of success from backing memory availability makes allocation failures impossible to handle correctly.  Programs have no other option but to assume the allocation has succeeded before the kernel has actually determined whether the request can be fulfilled.  Disabling overcommit solves this problem by restoring admission control at allocation time, ensuring that allocations either fail immediately or succeed with a guarantee of backing memory.
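The difference between the two modes is visible from userspace.  As a minimal sketch (assuming a Linux system; the mode semantics are documented in proc(5)), the current policy and the kernel's commit accounting can be inspected directly:

```sh
# Current policy: 0 = heuristic overcommit (the default), 1 = always
# overcommit, 2 = strict accounting (never overcommit).
cat /proc/sys/vm/overcommit_memory

# Under mode 2, an allocation is admitted only while Committed_AS would
# stay below CommitLimit (swap plus overcommit_ratio% of RAM by default).
grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo
```

Watching `Committed_AS` approach `CommitLimit` under load is often enough to explain why a given allocation was refused.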

## Failure locality is important for debugging

When allocations fail fast, they are dramatically easier to debug, as the failure is synchronous with the request.  When a program crashes due to an allocation failure, the entire context of that allocation is preserved: the requested allocation size, the subsystem making the allocation and the underlying operation that required it are already known.

With overcommit, this locality is lost by design.  Allocations appear to succeed and the program proceeds under the assumption that the memory is available.  When the allocation is eventually accessed, the kernel typically responds by invoking the OOM killer and terminating the process outright.  From the program&#39;s perspective, there is no allocation failure to handle, only a `SIGKILL`.  From the operator&#39;s perspective, there is no stack trace pointing to the failure.  There are only post-mortem logs which often fail to paint a clear picture of what happened.

Would you rather debug a crash at the allocation site or reconstruct an outage caused by an asynchronous OOM kill?  Overcommit doesn&#39;t make allocation failure recoverable.  It makes it unreportable.

## Dishonorable mention: Redis

So why am I writing about this, anyway?  The cost of overcommit isn&#39;t just technical, it also represents bad engineering culture: shifting responsibility for correctness away from application developers and onto the kernel.  As an example, when you start Redis with overcommit disabled, it prints a scary warning that you should re-enable it:

&gt; WARNING Memory overcommit must be enabled!
&gt; Without it, a background save or replication may fail under low memory condition.
&gt; Being disabled, it can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328.
&gt; To fix this issue add &#39;vm.overcommit_memory = 1&#39; to /etc/sysctl.conf and then reboot or run the command &#39;sysctl vm.overcommit_memory=1&#39; for this to take effect.

No.  Code that requires overcommit to function correctly is *failing to handle memory allocation errors correctly*.  The answer is *not* to print a warning that overcommit is disabled, but rather to surface low memory conditions explicitly so the system administrator can understand and resolve them.
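To make the recommendation in the title concrete, here is a sketch of a persistent configuration; the file name and ratio below are assumptions to be tuned per deployment.  `vm.overcommit_ratio` matters because under strict accounting, `CommitLimit` defaults to swap plus only 50% of RAM, which is far too small for a swapless server:

```ini
# /etc/sysctl.d/99-overcommit.conf  (hypothetical file name; any
# sysctl.d fragment works, applied with `sysctl --system`)

# Strict accounting: allocations fail at brk()/mmap() time rather than
# being deferred to the OOM killer.
vm.overcommit_memory = 2

# CommitLimit = swap + (overcommit_ratio% of RAM).  The default of 50
# strands half of RAM on swapless machines, so raise it accordingly.
vm.overcommit_ratio = 100
```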
</source:markdown>
    </item>
    
    <item>
      <title>Rethinking sudo with object capabilities</title>
      <link>https://ariadne.space/2025/12/12/rethinking-sudo-with-object-capabilities.html</link>
      <pubDate>Fri, 12 Dec 2025 06:36:06 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2025/12/12/rethinking-sudo-with-object-capabilities.html</guid>
      <description>&lt;p&gt;I hate &lt;code&gt;sudo&lt;/code&gt; with a passion.  It represents everything I find offensive about the modern Unix security model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;like &lt;code&gt;su&lt;/code&gt;, it must be a SUID binary to work&lt;/li&gt;
&lt;li&gt;it is monolithic: everything &lt;code&gt;sudo&lt;/code&gt; does runs as &lt;code&gt;root&lt;/code&gt;, there is no privilege separation&lt;/li&gt;
&lt;li&gt;it uses a non-declarative and non-hierarchical configuration format leading to forests of complex access-control policies and user errors due to lack of concision&lt;/li&gt;
&lt;li&gt;it supports plugins to extend the policy engine which run directly in the privileged SUID process&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I could go on, but hopefully you get the point.  Alpine moved to &lt;code&gt;doas&lt;/code&gt; as the default privilege escalation tool several years ago, in Alpine 3.15, because of the large attack surface that &lt;code&gt;sudo&lt;/code&gt; brings due to its design.&lt;/p&gt;
&lt;p&gt;Systems built around identity-based access control tend to rely on ambient authority: policy is centralized and errors in the policy configuration or bugs in the policy engine can allow attackers to make full use of that ambient authority.  In the case of a SUID binary like &lt;code&gt;doas&lt;/code&gt; or &lt;code&gt;sudo&lt;/code&gt;, that means an attacker can obtain root access in the event of a bug or misconfiguration.&lt;/p&gt;
&lt;p&gt;What if there was a better way?  Instead of thinking about privilege escalation as becoming root for a moment, what if it meant being handed a narrowly scoped capability, one with just enough authority to perform a specific action and nothing more?  Enter the &lt;a href=&#34;https://en.wikipedia.org/wiki/Object-capability_model&#34;&gt;object-capability model&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In an object-capability system, there is no global decision point that asks who you are and what you might be allowed to do. Authority is explicit and local: a program can only perform an action if it has been given the capability to do so.  This makes privilege boundaries visible, composable, and far easier to reason about, shifting privilege escalation from a question of identity to a question of possession.&lt;/p&gt;
&lt;p&gt;Inspired by the object-capability model, I&amp;rsquo;ve been working on a project named &lt;a href=&#34;https://github.com/kaniini/capsudo&#34;&gt;capsudo&lt;/a&gt;.  Instead of treating privilege escalation as a temporary change of identity, capsudo reframes it as a mediated interaction with a service called &lt;code&gt;capsudod&lt;/code&gt; that holds specific authority, which may range from full root privileges to a narrowly scoped set of capabilities depending on how it is deployed.&lt;/p&gt;
&lt;h2 id=&#34;delegating-root-privilege-with-object-capabilities&#34;&gt;Delegating root privilege with object capabilities&lt;/h2&gt;
&lt;p&gt;What does that look like in practice?  First, let&amp;rsquo;s consider a system service which needs to perform a few privileged operations, such as mounting and unmounting filesystems, and how capsudo can be used to provide capabilities to that service.  With capsudo, we have a few different options.  We could, for example, grant generic &lt;code&gt;mount&lt;/code&gt; and &lt;code&gt;umount&lt;/code&gt; capabilities, or alternatively we could grant constrained &lt;code&gt;mount&lt;/code&gt; and &lt;code&gt;umount&lt;/code&gt; capabilities to specific device nodes instead.&lt;/p&gt;
&lt;p&gt;First, let&amp;rsquo;s take a look at what generic &lt;code&gt;mount&lt;/code&gt; and &lt;code&gt;umount&lt;/code&gt; capabilities would look like, as they provide a good baseline for understanding how constrained capabilities work.  To begin with, consider a volume management service running under the &lt;code&gt;mountd&lt;/code&gt; user.  By running a few instances of the &lt;code&gt;capsudod&lt;/code&gt; daemon, we will grant capabilities to the &lt;code&gt;mountd&lt;/code&gt; user, which it can then invoke using &lt;code&gt;capsudo&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount -- mount &amp;amp;
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/umount -- umount &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You might notice that the capabilities above have had commands bound to them.  This is an important feature of capsudo which I will elaborate on in a moment.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s say that the user has plugged in a USB stick and wants it to be mounted to &lt;code&gt;/media/usb&lt;/code&gt;.  To do this with capsudo, the volume manager simply makes use of the &lt;code&gt;/run/user/mountd/cap/mount&lt;/code&gt; capability which has been delegated to it:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;mountd$ capsudo -s /run/user/mountd/cap/mount -- /dev/sdb1 /media/usb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What is going on here?  When &lt;code&gt;capsudod&lt;/code&gt; was started, it bound the capability it provides to the &lt;code&gt;mount&lt;/code&gt; command by setting the executable it will run.  This means that the &lt;code&gt;/run/user/mountd/cap/mount&lt;/code&gt; capability cannot run any other command besides &lt;code&gt;mount&lt;/code&gt;.  This delegation is still suboptimal, however, because the bare command name is resolved through &lt;code&gt;PATH&lt;/code&gt;.  Let&amp;rsquo;s fix it by stopping the &lt;code&gt;capsudod&lt;/code&gt; process and restarting it with an absolute path:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount -- /usr/sbin/mount &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now when the capability is invoked, &lt;code&gt;capsudod&lt;/code&gt; will run &lt;code&gt;/usr/sbin/mount&lt;/code&gt; directly rather than resolving the command through the &lt;code&gt;PATH&lt;/code&gt; environment variable it was spawned with.&lt;/p&gt;
&lt;p&gt;We can build on this by creating a specific capability for mounting the device node, and another for unmounting the specific mount point:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount-dev-sdb1 -- /usr/sbin/mount /dev/sdb1 &amp;amp;
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/umount-media-usb -- /usr/sbin/umount /media/usb &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;These would then be invoked as one would expect:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;mountd$ capsudo -s /run/user/mountd/cap/mount-dev-sdb1 -- /media/usb
mountd$ capsudo -s /run/user/mountd/cap/umount-media-usb
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;So, in essence, a &lt;em&gt;capability&lt;/em&gt; is represented as a Unix socket, an optional argv list, and optionally a set of mandatory environment variables, creating a very composable interface for delegating authority.&lt;/p&gt;
&lt;h2 id=&#34;non-root-delegations-or-service-accounts-meet-service-capabilities&#34;&gt;Non-root delegations, or service accounts meet service capabilities&lt;/h2&gt;
&lt;p&gt;Now let&amp;rsquo;s talk about a scenario where traditionally root privilege is not required: service accounts.&lt;/p&gt;
&lt;p&gt;Suppose we have a web application deployment system where developers are allowed to update files in a specific directory and restart a service, but otherwise shouldn&amp;rsquo;t have administrative access to the system. Traditionally, this might still be implemented using &lt;code&gt;sudo&lt;/code&gt;, despite the fact that no global privileges are actually needed.&lt;/p&gt;
&lt;p&gt;With capsudo, we can instead run the &lt;code&gt;capsudod&lt;/code&gt; daemon under a dedicated service account which only owns the resources it is meant to manage.&lt;/p&gt;
&lt;p&gt;Assume a deployment service running under the &lt;code&gt;www-deployment&lt;/code&gt; user, which owns &lt;code&gt;/srv/www/app&lt;/code&gt; and is allowed to reload the uWSGI service via a delegated capability.  We can start &lt;code&gt;capsudod&lt;/code&gt; instances under that user directly:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;root# capsudod -o www-deployment:www-deployment &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -s /run/user/www-deployment/cap/service-uwsgi &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -- /usr/sbin/rc-service uwsgi &amp;amp;
www-deployment$ capsudod -o www-deployment:www-developers &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -s /run/user/www-deployment/cap/update-site -- /usr/bin/rsync -a &amp;amp;
www-deployment$ capsudod -o www-deployment:www-developers &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -s /run/user/www-deployment/cap/reload-site -- &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   /usr/bin/capsudo -s /run/user/www-deployment/cap/service-uwsgi &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;      -- reload &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A developer in the &lt;code&gt;www-developers&lt;/code&gt; group might then invoke these capabilities:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;dev$ capsudo -s /run/user/www-deployment/cap/update-site &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;  -- ./build/ /srv/www/app/
dev$ capsudo -s /run/user/www-deployment/cap/reload-site
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;unpacking-the-delegations&#34;&gt;Unpacking the delegations&lt;/h3&gt;
&lt;p&gt;There is a lot going on here, so let&amp;rsquo;s walk through it step by step.&lt;/p&gt;
&lt;p&gt;First, the system administrator delegates a small amount of authority to the &lt;code&gt;www-deployment&lt;/code&gt; service account.  This is done by running a &lt;code&gt;capsudod&lt;/code&gt; instance that is able to manage the uWSGI service:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;root# capsudod -o www-deployment:www-deployment &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;  -s /run/user/www-deployment/cap/service-uwsgi &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;  -- /usr/sbin/rc-service uwsgi &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This capability is owned by the &lt;code&gt;www-deployment&lt;/code&gt; user and allows exactly one operation: invoking the system&amp;rsquo;s service manager to act on the &lt;code&gt;uwsgi&lt;/code&gt; service.  No other services can be touched, and no other commands can be executed through this capability.&lt;/p&gt;
&lt;p&gt;Second, the &lt;code&gt;www-deployment&lt;/code&gt; account uses that authority to construct more narrowly scoped capabilities for others. Two additional &lt;code&gt;capsudod&lt;/code&gt; instances are started under the &lt;code&gt;www-deployment&lt;/code&gt; account, but with ownership granted to the &lt;code&gt;www-developers&lt;/code&gt; group:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sh&#34; data-lang=&#34;sh&#34;&gt;www-deployment$ capsudod -o www-deployment:www-developers &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -s /run/user/www-deployment/cap/update-site -- /usr/bin/rsync -a &amp;amp;

www-deployment$ capsudod -o www-deployment:www-developers &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   -s /run/user/www-deployment/cap/reload-site -- &lt;span style=&#34;color:#ae81ff&#34;&gt;\
&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;&lt;/span&gt;   /usr/bin/capsudo -s /run/user/www-deployment/cap/service-uwsgi -- reload &amp;amp;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The first of these allows developers to update the application files using &lt;code&gt;rsync&lt;/code&gt;, but only through the &lt;code&gt;www-deployment&lt;/code&gt; account&amp;rsquo;s existing filesystem permissions.  The second is more interesting: it does not directly reload the service.  Instead, it &lt;em&gt;delegates a constrained use&lt;/em&gt; of the previously granted uWSGI capability.&lt;/p&gt;
&lt;p&gt;When a developer invokes &lt;code&gt;reload-site&lt;/code&gt;, they are not calling &lt;code&gt;rc-service&lt;/code&gt; themselves, and they are not interacting with the system service manager directly. They are invoking a capability that is itself &lt;em&gt;built on top of another capability&lt;/em&gt;, with additional constraints applied.&lt;/p&gt;
&lt;p&gt;The important property here is that authority only ever moves downward in scope.  It is possible to further delegate a subset of the authority granted by a capability, but not more.  This kind of authority layering is a natural fit for the object-capability model, but it is awkward and fragile to express with identity-based access control.&lt;/p&gt;
&lt;p&gt;Identity-based access control asks who should be allowed to act.  Object-capability systems ask where authority should live and how it should flow.  &lt;code&gt;capsudo&lt;/code&gt; today is an exploration of what happens when you take the second question seriously, treating privilege escalation as explicit delegation and opening the door to further refinements like passing concrete resources, such as pre-opened file descriptors, instead of whole identities.&lt;/p&gt;
</description>
      <source:markdown>I hate `sudo` with a passion.  It represents everything I find offensive about the modern Unix security model:

* like `su`, it must be a SUID binary to work
* it is monolithic: everything `sudo` does runs as `root`, there is no privilege separation
* it uses a non-declarative and non-hierarchical configuration format leading to forests of complex access-control policies and user errors due to lack of concision
* it supports plugins to extend the policy engine which run directly in the privileged SUID process

I could go on, but hopefully you get the point.  Alpine moved to `doas` as the default privilege escalation tool several years ago, in Alpine 3.15, because of the large attack surface that `sudo` brings due to its design.

Systems built around identity-based access control tend to rely on ambient authority: policy is centralized and errors in the policy configuration or bugs in the policy engine can allow attackers to make full use of that ambient authority.  In the case of a SUID binary like `doas` or `sudo`, that means an attacker can obtain root access in the event of a bug or misconfiguration.
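
To make the ambient-authority hazard concrete, here is a hypothetical `sudoers` rule of the kind this model encourages.  It looks narrowly scoped, but because the delegated program runs with all of root&#39;s ambient authority, a single shell escape widens it back to everything:

```sh
# Hypothetical sudoers entry: "deploy may only page through one log file."
#     deploy ALL=(root) NOPASSWD: /usr/bin/less /var/log/app.log
# But less runs as root, with root's full ambient authority:
deploy$ sudo /usr/bin/less /var/log/app.log
# ...and typing "!sh" at the less prompt spawns a root shell.
```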

What if there was a better way?  Instead of thinking about privilege escalation as becoming root for a moment, what if it meant being handed a narrowly scoped capability, one with just enough authority to perform a specific action and nothing more?  Enter the [object-capability model](https://en.wikipedia.org/wiki/Object-capability_model).

In an object-capability system, there is no global decision point that asks who you are and what you might be allowed to do. Authority is explicit and local: a program can only perform an action if it has been given the capability to do so.  This makes privilege boundaries visible, composable, and far easier to reason about, shifting privilege escalation from a question of identity to a question of possession.

Inspired by the object-capability model, I&#39;ve been working on a project named [capsudo][capsudo].  Instead of treating privilege escalation as a temporary change of identity, capsudo reframes it as a mediated interaction with a service called `capsudod` that holds specific authority, which may range from full root privileges to a narrowly scoped set of capabilities depending on how it is deployed.

   [capsudo]: https://github.com/kaniini/capsudo

## Delegating root privilege with object capabilities

What does that look like in practice?  First, let&#39;s consider a system service which needs to perform a few privileged operations, such as mounting and unmounting filesystems, and how capsudo can be used to provide capabilities to that service.  With capsudo, we have a few different options.  We could, for example, grant generic `mount` and `umount` capabilities, or alternatively we could grant constrained `mount` and `umount` capabilities to specific device nodes instead.

First, let&#39;s take a look at what generic `mount` and `umount` capabilities would look like, as they provide a good baseline for understanding how constrained capabilities work.  To begin with, consider a volume management service running under the `mountd` user.  By running a few instances of the `capsudod` daemon, we will grant capabilities to the `mountd` user, which it can then invoke using `capsudo`.

```sh
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount -- mount &amp;
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/umount -- umount &amp;
```
You might notice that the capabilities above have had commands bound to them.  This is an important feature of capsudo which I will elaborate on in a moment.

Let&#39;s say that the user has plugged in a USB stick and wants it to be mounted to `/media/usb`.  To do this with capsudo, the volume manager simply makes use of the `/run/user/mountd/cap/mount` capability which has been delegated to it:

```sh
mountd$ capsudo -s /run/user/mountd/cap/mount -- /dev/sdb1 /media/usb
```
What is going on here?  When `capsudod` was started, it bound the capability it provides to the `mount` command by setting the executable it will run.  This means that the `/run/user/mountd/cap/mount` capability cannot run any other command besides `mount`.  This delegation is still suboptimal, however, because the bare command name is resolved through `PATH`.  Let&#39;s fix it by stopping the `capsudod` process and restarting it with an absolute path:

```sh
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount -- /usr/sbin/mount &amp;
```
Now when the capability is invoked, `capsudod` will run `/usr/sbin/mount` directly rather than resolving the command through the `PATH` environment variable it was spawned with.
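
To see why resolving a bare command name through `PATH` is risky, consider what the earlier delegation would have done if `capsudod` had inherited an unfortunate environment.  This is a hypothetical illustration of ordinary `PATH` lookup semantics, not capsudo-specific behavior beyond what is described above:

```sh
# Hypothetical: capsudod started from a shell whose PATH puts a
# writable directory ahead of /usr/sbin...
root# PATH=/opt/tools/bin:/usr/sbin:/usr/bin capsudod -o mountd:mountd \
   -s /run/user/mountd/cap/mount -- mount &amp;
# ...now the bare name "mount" resolves to /opt/tools/bin/mount if
# such a file exists.  Binding /usr/sbin/mount removes the ambiguity.
```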

We can build on this by creating a specific capability for mounting the device node, and another for unmounting the specific mount point:

```sh
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/mount-dev-sdb1 -- /usr/sbin/mount /dev/sdb1 &amp;
root# capsudod -o mountd:mountd -s /run/user/mountd/cap/umount-media-usb -- /usr/sbin/umount /media/usb &amp;
```
These would then be invoked as one would expect:

```sh
mountd$ capsudo -s /run/user/mountd/cap/mount-dev-sdb1 -- /media/usb
mountd$ capsudo -s /run/user/mountd/cap/umount-media-usb
```
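Spelling out what `capsudod` executes in each case may help.  Assuming, as the invocations above suggest, that caller-supplied arguments are appended to the argv bound at delegation time:

```sh
# capability        bound argv                  + caller argv = executed command
# mount-dev-sdb1    /usr/sbin/mount /dev/sdb1   + /media/usb  = /usr/sbin/mount /dev/sdb1 /media/usb
# umount-media-usb  /usr/sbin/umount /media/usb + (nothing)   = /usr/sbin/umount /media/usb
```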
So, in essence, a *capability* is represented as a Unix socket, an optional argv list, and optionally a set of mandatory environment variables, creating a very composable interface for delegating authority.

## Non-root delegations, or service accounts meet service capabilities

Now let&#39;s talk about a scenario where traditionally root privilege is not required: service accounts.

Suppose we have a web application deployment system where developers are allowed to update files in a specific directory and restart a service, but otherwise shouldn&#39;t have administrative access to the system. Traditionally, this might still be implemented using `sudo`, despite the fact that no global privileges are actually needed.

With capsudo, we can instead run the `capsudod` daemon under a dedicated service account which only owns the resources it is meant to manage.

Assume a deployment service running under the `www-deployment` user, which owns `/srv/www/app` and is allowed to reload the uWSGI service via a delegated capability.  We can start `capsudod` instances under that user directly:

```sh
root# capsudod -o www-deployment:www-deployment \
   -s /run/user/www-deployment/cap/service-uwsgi \
   -- /usr/sbin/rc-service uwsgi &amp;
www-deployment$ capsudod -o www-deployment:www-developers \
   -s /run/user/www-deployment/cap/update-site -- /usr/bin/rsync -a &amp;
www-deployment$ capsudod -o www-deployment:www-developers \
   -s /run/user/www-deployment/cap/reload-site -- \
   /usr/bin/capsudo -s /run/user/www-deployment/cap/service-uwsgi \
      -- reload &amp;
```
A developer in the `www-developers` group might then invoke these capabilities:

```sh
dev$ capsudo -s /run/user/www-deployment/cap/update-site \
  -- ./build/ /srv/www/app/
dev$ capsudo -s /run/user/www-deployment/cap/reload-site
```
### Unpacking the delegations

There is a lot going on here, so let&#39;s walk through it step by step.

First, the system administrator delegates a small amount of authority to the `www-deployment` service account.  This is done by running a `capsudod` instance that is able to manage the uWSGI service:

```sh
root# capsudod -o www-deployment:www-deployment \
  -s /run/user/www-deployment/cap/service-uwsgi \
  -- /usr/sbin/rc-service uwsgi &amp;
```
This capability is owned by the `www-deployment` user and allows exactly one operation: invoking the system&#39;s service manager to act on the `uwsgi` service.  No other services can be touched, and no other commands can be executed through this capability.

Second, the `www-deployment` account uses that authority to construct more narrowly scoped capabilities for others. Two additional `capsudod` instances are started under the `www-deployment` account, but with ownership granted to the `www-developers` group:

```sh
www-deployment$ capsudod -o www-deployment:www-developers \
   -s /run/user/www-deployment/cap/update-site -- /usr/bin/rsync -a &amp;

www-deployment$ capsudod -o www-deployment:www-developers \
   -s /run/user/www-deployment/cap/reload-site -- \
   /usr/bin/capsudo -s /run/user/www-deployment/cap/service-uwsgi -- reload &amp;
```
The first of these allows developers to update the application files using `rsync`, but only through the `www-deployment` account&#39;s existing filesystem permissions.  The second is more interesting: it does not directly reload the service.  Instead, it *delegates a constrained use* of the previously granted uWSGI capability.

When a developer invokes `reload-site`, they are not calling `rc-service` themselves, and they are not interacting with the system service manager directly. They are invoking a capability that is itself *built on top of another capability*, with additional constraints applied.

The important property here is that authority only ever moves downward in scope.  It is possible to further delegate a subset of the authority granted by a capability, but not more.  This kind of authority layering is a natural fit for the object-capability model, but it is awkward and fragile to express with identity-based access control.
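
As a sketch of how far this layering can go, nothing stops a developer from attenuating further, assuming unprivileged users may run `capsudod` as the model suggests.  The following is hypothetical (the socket path and `ci-bots` group are invented for illustration), but it uses only the mechanics shown above: a member of `www-developers` wraps the `reload-site` capability in a new one that a CI account can invoke:

```sh
# Hypothetical: dev re-delegates reload-site to the ci-bots group.
# The new capability carries only dev's authority to use reload-site,
# so the delegation can only narrow, never widen.
dev$ capsudod -o dev:ci-bots -s /run/user/dev/cap/ci-reload -- \
   /usr/bin/capsudo -s /run/user/www-deployment/cap/reload-site &amp;
```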

Identity-based access control asks who should be allowed to act.  Object-capability systems ask where authority should live and how it should flow.  `capsudo` today is an exploration of what happens when you take the second question seriously, treating privilege escalation as explicit delegation and opening the door to further refinements like passing concrete resources, such as pre-opened file descriptors, instead of whole identities.
</source:markdown>
    </item>
    
    <item>
      <title>I want you to understand</title>
      <link>https://ariadne.space/2025/12/02/i-want-you-to-understand.html</link>
      <pubDate>Tue, 02 Dec 2025 20:22:23 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2025/12/02/i-want-you-to-understand.html</guid>
      <description>&lt;p&gt;I want you to &lt;a href=&#34;https://aphyr.com/posts/397-i-want-you-to-understand-chicago&#34;&gt;understand&lt;/a&gt; what it is like to be transgender during this time.&lt;/p&gt;
&lt;p&gt;I want you to understand the threat to doctor-patient confidentiality.  In June, the Department of Justice &lt;a href=&#34;https://www.healthlawadvisor.com/doj-subpoena-seeks-health-information-of-hospital-patients-receiving-gender-affirming-care-will-judge-grant-motion-to-quash&#34;&gt;began targeting clinics and health systems which provide treatment for gender dysphoria&lt;/a&gt; with subpoenas requesting personally identifying information about patients.  While these subpoenas currently target clinics which provide services to minors, it is clear that they are testing the waters for expanding their inquiry to adult patients.  Although compliance with these subpoenas is likely illegal as disclosure of these records would violate HIPAA, I worry that I will be included on a &lt;a href=&#34;https://www.annefrank.org/en/timeline/147/all-jews-need-to-register/&#34;&gt;list of transgender individuals&lt;/a&gt; and targeted for discrimination as a result.&lt;/p&gt;
&lt;p&gt;I want you to understand the threat to medical care for trans people more broadly.  Like with the subpoenas, these efforts are starting &lt;a href=&#34;https://www.npr.org/sections/shots-health-news/2025/10/30/nx-s1-5588655/transgender-trump-medicare-medicaid-gender-affirming-care&#34;&gt;with trans children&lt;/a&gt;.  Although I am privileged to have private health insurance through my employer, private insurers often use Medicare coverage determination criteria as a baseline for their policies.  I worry that I could be denied access to medically necessary health care in the future.&lt;/p&gt;
&lt;p&gt;I want you to understand that one does not simply quit taking hormones.  Abruptly stopping HRT can leave the body in a hormonal state that may never fully return to baseline and can potentially reverse some of the desired effects.  This outcome is often distressing, and loss of access to medical care may lead many to self-manage their HRT.  Due to confidentiality concerns, such self-managed treatment will likely not be monitored with lab work.  Managing hormone therapy &lt;a href=&#34;https://academic.oup.com/jcem/article/102/11/3869/4157558&#34;&gt;without proper medical supervision can be dangerous&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I want you to understand what it is like to travel as a transgender US citizen.  As a result of Trump&amp;rsquo;s Executive Order 14168, it is no longer possible for transgender people &lt;a href=&#34;https://travel.state.gov/content/travel/en/passports/passport-help/sex-marker.html&#34;&gt;to obtain a US passport that correctly reflects their gender presentation&lt;/a&gt;.  Traveling with identity documents that do not match your gender presentation can be dangerous abroad.  In some cases you can even be &lt;a href=&#34;https://www.travelguard.com/travel-resources/travel-safety/lgbtq-travel-safety/advice-for-transgender-and-non-binary-travelers&#34;&gt;denied entry or even deported&lt;/a&gt;.  Such policies &lt;a href=&#34;https://www.migrationpolicy.org/article/x-marker-trans-nonbinary-travelers&#34;&gt;discourage trans people from traveling&lt;/a&gt; due to fear of discrimination.&lt;/p&gt;
&lt;p&gt;I want you to understand what it is like to be a transgender worker.  A report from The Williams Institute at UCLA School of Law shows that &lt;a href=&#34;https://williamsinstitute.law.ucla.edu/publications/transgender-workplace-discrim/&#34;&gt;over 80% of transgender employees in the US have experienced discrimination or harassment at work&lt;/a&gt; at some point.  Contrary to some optimistic portrayals during Pride Month, this is actually getting worse: the Movement Advancement Project 2025 NORC survey reports a &lt;a href=&#34;https://www.mapresearch.org/policy-and-issue-analysis/2025-norc-survey-report&#34;&gt;significant uptick in discrimination and harassment complaints&lt;/a&gt;.  If that wasn&amp;rsquo;t enough, Lambda Legal also reports a &lt;a href=&#34;https://lambdalegal.org/newsroom/us_20250626_record-breaking-surge-in-help-desk-requests-related-to-anti-lgbtq-discrimination-post-trump&#34;&gt;surge in the volume of requests submitted to their help desk&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I want you to understand what it is like to be a transgender entrepreneur.  Based on a report from PitchBook, only &lt;a href=&#34;https://nvca.org/wp-content/uploads/2025/10/Q3-2025-PitchBook-NVCA-Venture-Monitor.pdf&#34;&gt;0.8% of venture capital funding went to female-founded companies in 2025&lt;/a&gt;, the lowest since 2015.  While we do not yet have data for LGBTQ founders in 2025, StartOut estimated that only &lt;a href=&#34;https://startout.org/wp-content/uploads/2023/09/2023-State-of-LGBTQ-Entrepreneurship-Report.pdf&#34;&gt;0.5% of companies which raised venture capital from 2000-2022&lt;/a&gt; were founded by LGBTQ founders.  These numbers plainly highlight ongoing social inequities.&lt;/p&gt;
&lt;p&gt;I want you to understand what it is like to be a transgender leader in open source.  While the open source community has made progress toward inclusion, a study by the Linux Foundation &lt;a href=&#34;https://8112310.fs1.hubspotusercontent-na1.net/hubfs/8112310/LF%20Research/2021%20DEI%20Survey%20-%20Report.pdf&#34;&gt;observes that people identifying as women, non-binary, LGBTQ+ or disabled were three times more likely&lt;/a&gt; to report threats.  Another study found that simply having a Code of Conduct did not make projects safer.  Without meaningful enforcement, &lt;a href=&#34;https://arxiv.org/pdf/2409.04511&#34;&gt;participants continued to experience harassment&lt;/a&gt;.  Even meaningful enforcement isn&amp;rsquo;t enough.  For example, after rejecting Xlibre in Alpine due to their reactionary background, a notable alt-right Linux podcaster made a video targeting me, focusing on my transgender identity rather than the technical merits.&lt;/p&gt;
&lt;p&gt;I need you to understand that while things are dire for trans people right now, we can fight back and win.  At the same time, we must confront these realities: human decency demands it.  Support politicians who fight anti-trans policies.  Donate to law firms like &lt;a href=&#34;https://lambdalegal.org&#34;&gt;Lambda Legal&lt;/a&gt;.  If you are a business owner, hire trans people: we have &lt;a href=&#34;https://en.wikipedia.org/wiki/Sophie_Wilson&#34;&gt;been driving innovation since time immemorial&lt;/a&gt;.  If you are an investor, invest in trans founders: the same StartOut report that shows that only 0.5% of funded companies were founded by LGBTQ founders also observed that those founders created more jobs with less funding than their peers.&lt;/p&gt;
</description>
      <source:markdown>I want you to [understand][u-1] what it is like to be transgender during this time.

   [u-1]: https://aphyr.com/posts/397-i-want-you-to-understand-chicago

I want you to understand the threat to doctor-patient confidentiality.  In June, the Department of Justice [began targeting clinics and health systems which provide treatment for gender dysphoria][u-2] with subpoenas requesting personally identifying information about patients.  While these subpoenas currently target clinics which provide services to minors, it is clear that they are testing the waters for expanding their inquiry to adult patients.  Although compliance with these subpoenas is likely illegal as disclosure of these records would violate HIPAA, I worry that I will be included on a [list of transgender individuals][u-3] and targeted for discrimination as a result.

   [u-2]: https://www.healthlawadvisor.com/doj-subpoena-seeks-health-information-of-hospital-patients-receiving-gender-affirming-care-will-judge-grant-motion-to-quash
   [u-3]: https://www.annefrank.org/en/timeline/147/all-jews-need-to-register/

I want you to understand the threat to medical care for trans people more broadly.  Like with the subpoenas, these efforts are starting [with trans children][u-4].  Although I am privileged to have private health insurance through my employer, private insurers often use Medicare coverage determination criteria as a baseline for their policies.  I worry that I could be denied access to medically necessary health care in the future.

   [u-4]: https://www.npr.org/sections/shots-health-news/2025/10/30/nx-s1-5588655/transgender-trump-medicare-medicaid-gender-affirming-care

I want you to understand that one does not simply quit taking hormones.  Abruptly stopping HRT can leave the body in a hormonal state that may never fully return to baseline and can potentially reverse some of the desired effects.  This outcome is often distressing, and loss of access to medical care may lead many to self-manage their HRT.  Due to confidentiality concerns, such self-managed treatment will likely not be monitored with lab work.  Managing hormone therapy [without proper medical supervision can be dangerous][u-5].

   [u-5]: https://academic.oup.com/jcem/article/102/11/3869/4157558

I want you to understand what it is like to travel as a transgender US citizen.  As a result of Trump&#39;s Executive Order 14168, it is no longer possible for transgender people [to obtain a US passport that correctly reflects their gender presentation][u-6].  Traveling with identity documents that do not match your gender presentation can be dangerous abroad.  In some cases you can even be [denied entry or even deported][u-7].  Such policies [discourage trans people from traveling][u-8] due to fear of discrimination.

   [u-6]: https://travel.state.gov/content/travel/en/passports/passport-help/sex-marker.html
   [u-7]: https://www.travelguard.com/travel-resources/travel-safety/lgbtq-travel-safety/advice-for-transgender-and-non-binary-travelers
   [u-8]: https://www.migrationpolicy.org/article/x-marker-trans-nonbinary-travelers

I want you to understand what it is like to be a transgender worker.  A report from The Williams Institute at UCLA School of Law shows that [over 80% of transgender employees in the US have experienced discrimination or harassment at work][u-11] at some point.  Contrary to some optimistic portrayals during Pride Month, this is actually getting worse: the Movement Advancement Project 2025 NORC survey reports a [significant uptick in discrimination and harassment complaints][u-12].  If that wasn&#39;t enough, Lambda Legal also reports a [surge in the volume of requests submitted to their help desk][u-13].

   [u-11]: https://williamsinstitute.law.ucla.edu/publications/transgender-workplace-discrim/
   [u-12]: https://www.mapresearch.org/policy-and-issue-analysis/2025-norc-survey-report
   [u-13]: https://lambdalegal.org/newsroom/us_20250626_record-breaking-surge-in-help-desk-requests-related-to-anti-lgbtq-discrimination-post-trump

I want you to understand what it is like to be a transgender entrepreneur.  Based on a report from Pitchbook, only [0.8% of venture capital funding went to female-founded companies in 2025][u-9], the lowest since 2015.  While we do not yet have data for LGBTQ founders in 2025, StartOut estimated that only [0.5% of companies which raised venture capital from 2000-2022][u-10] were founded by LGBTQ founders.  These numbers plainly highlight ongoing social inequities.

   [u-9]: https://nvca.org/wp-content/uploads/2025/10/Q3-2025-PitchBook-NVCA-Venture-Monitor.pdf
   [u-10]: https://startout.org/wp-content/uploads/2023/09/2023-State-of-LGBTQ-Entrepreneurship-Report.pdf

I want you to understand what it is like to be a transgender leader in open source.  While the open source community has made progress toward inclusion, a study by the Linux Foundation [observes that people identifying as women, non-binary, LGBTQ+ or disabled were three times more likely][u-14] to report threats.  Another study found that simply having a Code of Conduct did not make projects safer.  Without meaningful enforcement, [participants continued to experience harassment][u-15].  Even meaningful enforcement isn&#39;t enough.  For example, after Xlibre was rejected in Alpine due to its maintainers&#39; reactionary background, a notable alt-right Linux podcaster made a video targeting me, focusing on my transgender identity rather than the technical merits of the decision.

   [u-14]: https://8112310.fs1.hubspotusercontent-na1.net/hubfs/8112310/LF%20Research/2021%20DEI%20Survey%20-%20Report.pdf
   [u-15]: https://arxiv.org/pdf/2409.04511

I need you to understand that while things are dire for trans people right now, we can fight back and win.  At the same time, we must confront these realities: human decency demands it.  Support politicians who fight anti-trans policies.  Donate to law firms like [Lambda Legal][u-16].  If you are a business owner, hire trans people: we have [been driving innovation since time immemorial][u-17].  If you are an investor, invest in trans founders: the same StartOut report that shows that only 0.5% of funded companies were founded by LGBTQ founders also observed that those founders created more jobs with less funding than their peers.

   [u-16]: https://lambdalegal.org
   [u-17]: https://en.wikipedia.org/wiki/Sophie_Wilson
</source:markdown>
    </item>
    
    <item>
      <title>Two weeks of wayback</title>
      <link>https://ariadne.space/2025/07/07/two-weeks-of-wayback.html</link>
      <pubDate>Mon, 07 Jul 2025 21:28:10 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2025/07/07/two-weeks-of-wayback.html</guid>
      <description>&lt;p&gt;A poorly kept secret is that the X11 graphics stack is under-maintained as resources shift towards the maintenance of Wayland&amp;rsquo;s graphics stack instead.  To some extent, technical steering committees in major distributions have been watching this situation develop for the past few years with increasing concern, as limited maintenance becomes a security risk: bugs accumulate and already burdened distribution security teams have to carry the security maintenance load in the absence of new releases.&lt;/p&gt;
&lt;p&gt;In Alpine, we have been discussing the sunset of the standalone X.org server implementation for several years for these reasons, aiming to come up with a strategy that allows us to keep supporting X11-based desktop environments in a world without the X.org server.  Recently, a group of neofascist reactionaries announced a fork of the X.org server which, amongst other things, &lt;a href=&#34;https://github.com/X11Libre/xserver/pull/56&#34;&gt;has introduced new security bugs into the X server they forked from X.org&lt;/a&gt;.  This brought Alpine to a new crossroads in the general discussion we&amp;rsquo;ve been having about X11.  While Alpine has rejected this fork on the grounds that collaborating with neofascist reactionaries is fundamentally incompatible with our values, the overarching problem of X11 under-maintenance still persists, and is unlikely to change any time soon, leading us to begin directly looking for a solution.&lt;/p&gt;
&lt;h2 id=&#34;enter-wayback-just-enough-wayland-to-make-xwayland-work&#34;&gt;Enter Wayback: just enough Wayland to make Xwayland work&lt;/h2&gt;
&lt;p&gt;For the past year or so, the main idea circulating around the Alpine community to solve the X11 maintenance problem has been the creation of a stub Wayland compositor that can sit in front of Xwayland and act as a full X server.  Given the timing and desire to put the X11 maintenance issue to bed entirely, I decided to write a quick and dirty proof of concept over a weekend, sharing it on Mastodon and BlueSky:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&#34;https://ariadne.space/uploads/2025/a0f851df74c8f574.png&#34; width=&#34;600&#34; height=&#34;337&#34; alt=&#34;The first proof of concept of Wayback running on a 720p virtual output.  xterm shows the wayland plumbing underneath via wayland-info.&#34;&gt;
&lt;/center&gt;
&lt;p&gt;Since then, a lot has happened: we have been slowly putting the foundational pieces together to build a replacement X stack around Xwayland, and enough is now there for people with simple setups to use Wayback as their daily X11 implementation, as long as they don&amp;rsquo;t mind bugs.&lt;/p&gt;
&lt;h2 id=&#34;towards-the-first-wayback-release&#34;&gt;Towards the first Wayback release&lt;/h2&gt;
&lt;p&gt;There&amp;rsquo;s a lot still left to do before we can confidently say that Wayback is ready for distributions to switch to.  This work is across the stack: Wayback still needs to expose surfaces that Xwayland can use; Xwayland needs to implement a few new features, such as cursor warping; and some X extensions inside Xwayland itself need to be properly plumbed (such as Xinerama being able to make use of the Wayland output layout data).&lt;/p&gt;
&lt;p&gt;Longer term goals aside, we are at most a few weeks away from the first alpha-quality release of Wayback.  The main focus of this release is to get to a point where enough is working that users with basic setups and requirements can be reasonably served by Wayback in place of the X.org server, to allow for further testing.  It&amp;rsquo;s already at the point where I am daily driving it:&lt;/p&gt;
&lt;center&gt;
&lt;img src=&#34;https://ariadne.space/uploads/2025/2025-07-07-211610-3840x2160-scrot.png&#34; width=&#34;600&#34; height=&#34;337&#34; alt=&#34;Wayback running Window Maker on bare metal, as well as several X applications, including Firefox editing my blog.&#34;&gt;
&lt;/center&gt;
&lt;p&gt;Of course, even with the first release coming soon, the project remains in an experimental state, and the release will itself be experimental; still, we&amp;rsquo;re making real progress towards a sustainable solution for the X11 problem.  Come join us in IRC (irc.libera.chat #wayback) or Matrix (#wayback:catircservices.org)!  Unlike other projects, we are focused on building real solutions rather than fascism.&lt;/p&gt;
</description>
      <source:markdown>A poorly kept secret is that the X11 graphics stack is under-maintained as resources shift towards the maintenance of Wayland&#39;s graphics stack instead.  To some extent, technical steering committees in major distributions have been watching this situation develop for the past few years with increasing concern, as limited maintenance becomes a security risk: bugs accumulate and already burdened distribution security teams have to carry the security maintenance load in the absence of new releases.

In Alpine, we have been discussing the sunset of the standalone X.org server implementation for several years for these reasons, aiming to come up with a strategy that allows us to keep supporting X11-based desktop environments in a world without the X.org server.  Recently, a group of neofascist reactionaries announced a fork of the X.org server which, amongst other things, [has introduced new security bugs into the X server they forked from X.org](https://github.com/X11Libre/xserver/pull/56).  This brought Alpine to a new crossroads in the general discussion we&#39;ve been having about X11.  While Alpine has rejected this fork on the grounds that collaborating with neofascist reactionaries is fundamentally incompatible with our values, the overarching problem of X11 under-maintenance still persists, and is unlikely to change any time soon, leading us to begin directly looking for a solution.

## Enter Wayback: just enough Wayland to make Xwayland work

For the past year or so, the main idea circulating around the Alpine community to solve the X11 maintenance problem has been the creation of a stub Wayland compositor that can sit in front of Xwayland and act as a full X server.  Given the timing and desire to put the X11 maintenance issue to bed entirely, I decided to write a quick and dirty proof of concept over a weekend, sharing it on Mastodon and BlueSky:
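
Concretely, the layering can be sketched as three processes.  The binary names below are illustrative assumptions, not the actual Wayback command-line interface; the point is that a minimal Wayland compositor hosts a rootful Xwayland, which ordinary X11 clients then treat as a full X server:

```shell
# Hypothetical sketch of the layering; binary names are illustrative
# assumptions, not the real Wayback CLI.  Run each in its own terminal.
wayback-compositor    # 1. start the stub Wayland compositor (name hypothetical)
Xwayland :0           # 2. rootful Xwayland, backed by the stub compositor
DISPLAY=:0 xterm      # 3. ordinary X11 clients connect as usual
```

In a real release, session management would wire these pieces together for you.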

&lt;center&gt;
&lt;img src=&#34;https://ariadne.space/uploads/2025/a0f851df74c8f574.png&#34; width=&#34;600&#34; height=&#34;337&#34; alt=&#34;The first proof of concept of Wayback running on a 720p virtual output.  xterm shows the wayland plumbing underneath via wayland-info.&#34;&gt;
&lt;/center&gt;

Since then, a lot has happened: we have been slowly putting the foundational pieces together to build a replacement X stack around Xwayland, and enough is now there for people with simple setups to use Wayback as their daily X11 implementation, as long as they don&#39;t mind bugs.

## Towards the first Wayback release

There&#39;s a lot still left to do before we can confidently say that Wayback is ready for distributions to switch to.  This work is across the stack: Wayback still needs to expose surfaces that Xwayland can use; Xwayland needs to implement a few new features, such as cursor warping; and some X extensions inside Xwayland itself need to be properly plumbed (such as Xinerama being able to make use of the Wayland output layout data).

Longer term goals aside, we are at most a few weeks away from the first alpha-quality release of Wayback.  The main focus of this release is to get to a point where enough is working that users with basic setups and requirements can be reasonably served by Wayback in place of the X.org server, to allow for further testing.  It&#39;s already at the point where I am daily driving it:

&lt;center&gt;
&lt;img src=&#34;https://ariadne.space/uploads/2025/2025-07-07-211610-3840x2160-scrot.png&#34; width=&#34;600&#34; height=&#34;337&#34; alt=&#34;Wayback running Window Maker on bare metal, as well as several X applications, including Firefox editing my blog.&#34;&gt;
&lt;/center&gt;

Of course, even with the first release coming soon, the project remains in an experimental state, and the release will itself be experimental; still, we&#39;re making real progress towards a sustainable solution for the X11 problem.  Come join us in IRC (irc.libera.chat #wayback) or Matrix (#wayback:catircservices.org)!  Unlike other projects, we are focused on building real solutions rather than fascism.
</source:markdown>
    </item>
    
    <item>
      <title>C SBOMs, and how pkgconf can solve this problem</title>
      <link>https://ariadne.space/2025/02/08/c-sboms-and-how-pkgconf.html</link>
      <pubDate>Sat, 08 Feb 2025 20:45:37 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2025/02/08/c-sboms-and-how-pkgconf.html</guid>
      <description>&lt;p&gt;I recently attended FOSDEM, and saw a talk in the &lt;a href=&#34;https://fosdem.org/2025/schedule/event/fosdem-2025-4846-struggles-with-making-sboms-for-c-apps/&#34;&gt;SBOM devroom about a software engineer&amp;rsquo;s attempts to build an SBOM for a C project&lt;/a&gt;.
There are a number of reasons why the C ecosystem is difficult to reflect in SBOMs, but the largest problem is that the C ecosystem is fractured across a handful of build systems: GNU Autotools, CMake and Meson are the primary build systems used by projects, but there are hundreds of others in the long tail.&lt;/p&gt;
&lt;p&gt;A key thing that these build systems have in common is that they can integrate with pkg-config, which is a database that describes available build dependencies and their use.
This database, naturally, is of significant relevance to SBOM generation, because it already has most of the relevant information needed to generate an SBOM.&lt;/p&gt;
&lt;p&gt;pkgconf has a &lt;a href=&#34;https://github.com/pkgconf/pkgconf/blob/master/cli/bomtool/main.c&#34;&gt;bomtool utility which is intended to generate SBOMs using pkg-config data&lt;/a&gt;.
But how can this be leveraged in practice?&lt;/p&gt;
&lt;p&gt;To show how bomtool can be used to generate SBOMs, let&amp;rsquo;s make a simple project using Meson.  First, we need a simple program:&lt;/p&gt;
&lt;h4 id=&#34;mainc&#34;&gt;&lt;code&gt;main.c&lt;/code&gt;&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;glib.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;main&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; argc, &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;argv[])
{
        g_print(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;hello world&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;\n&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;);
        &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now we can compile this program by hand:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;~/bomtool-example $ gcc -o main main.c `pkg-config --cflags --libs glib-2.0`
~/bomtool-example $ ./main
hello world
&lt;/code&gt;&lt;/pre&gt;&lt;h4 id=&#34;mesonbuild&#34;&gt;&lt;code&gt;meson.build&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;Now we can generate a Meson project to build this program:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-meson&#34; data-lang=&#34;meson&#34;&gt;project(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;bomtool-example&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;c&amp;#39;&lt;/span&gt;)

glib &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; dependency(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;glib-2.0&amp;#39;&lt;/span&gt;)

exe &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; executable(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;bomtool-example&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;main.c&amp;#39;&lt;/span&gt;, dependencies: [glib])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;At that point, you can run &lt;code&gt;meson setup _build &amp;amp;&amp;amp; cd _build &amp;amp;&amp;amp; ninja&lt;/code&gt; and get a &lt;code&gt;bomtool-example&lt;/code&gt; binary that effectively matches the one built by hand earlier.&lt;/p&gt;
&lt;p&gt;How do we turn that into an SBOM though?  Well, we need to extend the Meson build script to generate a pkg-config module, which we can do by adding:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-meson&#34; data-lang=&#34;meson&#34;&gt;pkg &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; import(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;pkgconfig&amp;#39;&lt;/span&gt;)
pcfile &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; pkg.generate(name: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;bomtool-example&amp;#39;&lt;/span&gt;, filebase: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;bomtool-example&amp;#39;&lt;/span&gt;, description: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;bomtool example&amp;#39;&lt;/span&gt;, version: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;0.0.1&amp;#39;&lt;/span&gt;, requires: [glib])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This causes the following &lt;code&gt;.pc&lt;/code&gt; file to be generated:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-pc&#34; data-lang=&#34;pc&#34;&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;prefix&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;/usr/local&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;includedir&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;prefix&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;/include&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;Name&lt;/span&gt;: bomtool-example
&lt;span style=&#34;color:#f92672&#34;&gt;Description&lt;/span&gt;: bomtool example
&lt;span style=&#34;color:#f92672&#34;&gt;Version&lt;/span&gt;: 0.0.1
&lt;span style=&#34;color:#f92672&#34;&gt;Requires&lt;/span&gt;: glib-2.0
&lt;span style=&#34;color:#f92672&#34;&gt;Cflags&lt;/span&gt;: -I&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;includedir&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now we can use bomtool to generate an SBOM:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-shell&#34; data-lang=&#34;shell&#34;&gt;~/bomtool-example/build $ bomtool ./meson-private/bomtool-example.pc &amp;gt; bomtool-example.spdx.txt
~/bomtool-example/build $ tail -n &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; bomtool-example.spdx.txt 
Relationship: SPDXRef-Package-glib-2.0C642.82.4 DEPENDENCY_OF SPDXRef-Package-bomtool-exampleC640.0.1


Relationship: SPDXRef-Package-glib-2.0C642.82.4 DEPENDS_ON SPDXRef-Package-libpcre2-8C6410.43
Relationship: SPDXRef-Package-libpcre2-8C6410.43 DEV_DEPENDENCY_OF SPDXRef-Package-glib-2.0C642.82.4


Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-bomtool-exampleC640.0.1
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-glib-2.0C642.82.4
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-libpcre2-8C6410.43
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What is left to do?  A few things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Build systems like CMake and Meson should become aware of bomtool and leverage it automatically.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;pkg-config module authors should add SPDX license expressions to their pkg-config modules; support for this was added in pkgconf 1.9, so it is now a reasonably stable feature.  This will improve the quality of the SBOMs generated by bomtool.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Support for more useful output formats, such as the new SPDX 3 JSON-LD format and CycloneDX.  Tools exist today which allow for translation between these formats, however, so it is not a pressing need.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It is pretty clear, at least to me, that the pkg-config ecosystem has a large role to play in the future of C SBOMs, as the necessary information about dependencies and other relationships is richly expressed at this layer.
But at the same time, bomtool is still new, and .pc files are still being updated to reflect their projects&#39; license data.&lt;/p&gt;
</description>
      <source:markdown>I recently attended FOSDEM, and saw a talk in the [SBOM devroom about a software engineer&#39;s attempts to build an SBOM for a C project](https://fosdem.org/2025/schedule/event/fosdem-2025-4846-struggles-with-making-sboms-for-c-apps/).
There are a number of reasons why the C ecosystem is difficult to reflect in SBOMs, but the largest problem is that the C ecosystem is fractured across a handful of build systems: GNU Autotools, CMake and Meson are the primary build systems used by projects, but there are hundreds of others in the long tail.

A key thing that these build systems have in common is that they can integrate with pkg-config, which is a database that describes available build dependencies and their use.
This database, naturally, is of significant relevance to SBOM generation, because it already has most of the relevant information needed to generate an SBOM.
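
This relationship data can be inspected directly with the pkg-config CLI; the exact output depends on what is installed on your system, so none is shown here:

```shell
$ pkg-config --modversion glib-2.0
$ pkg-config --print-requires glib-2.0
$ pkg-config --print-requires-private glib-2.0
```

The `--print-requires-private` query surfaces private dependencies (such as libpcre2-8 on typical glib builds), which is exactly the kind of relationship data an SBOM needs.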

pkgconf has a [bomtool utility which is intended to generate SBOMs using pkg-config data](https://github.com/pkgconf/pkgconf/blob/master/cli/bomtool/main.c).
But how can this be leveraged in practice?

To show how bomtool can be used to generate SBOMs, let&#39;s make a simple project using Meson.  First, we need a simple program:

#### `main.c`

```c
#include &lt;glib.h&gt;

int main(int argc, const char *argv[])
{
        g_print(&#34;hello world\n&#34;);
        return 0;
}
```
Now we can compile this program by hand:

```
~/bomtool-example $ gcc -o main main.c `pkg-config --cflags --libs glib-2.0`
~/bomtool-example $ ./main
hello world
```
#### `meson.build`

Now we can generate a Meson project to build this program:

```meson
project(&#39;bomtool-example&#39;, &#39;c&#39;)

glib = dependency(&#39;glib-2.0&#39;)

exe = executable(&#39;bomtool-example&#39;, &#39;main.c&#39;, dependencies: [glib])
```
At that point, you can run `meson setup _build &amp;&amp; cd _build &amp;&amp; ninja` and get a `bomtool-example` binary that effectively matches the one built by hand earlier.

How do we turn that into an SBOM though?  Well, we need to extend the Meson build script to generate a pkg-config module, which we can do by adding:

```meson
pkg = import(&#39;pkgconfig&#39;)
pcfile = pkg.generate(name: &#39;bomtool-example&#39;, filebase: &#39;bomtool-example&#39;, description: &#39;bomtool example&#39;, version: &#39;0.0.1&#39;, requires: [glib])
```
This causes the following `.pc` file to be generated:

```pc
prefix=/usr/local
includedir=${prefix}/include

Name: bomtool-example
Description: bomtool example
Version: 0.0.1
Requires: glib-2.0
Cflags: -I${includedir}
```
Now we can use bomtool to generate an SBOM:

```shell
~/bomtool-example/build $ bomtool ./meson-private/bomtool-example.pc &gt; bomtool-example.spdx.txt
~/bomtool-example/build $ tail -n 10 bomtool-example.spdx.txt 
Relationship: SPDXRef-Package-glib-2.0C642.82.4 DEPENDENCY_OF SPDXRef-Package-bomtool-exampleC640.0.1


Relationship: SPDXRef-Package-glib-2.0C642.82.4 DEPENDS_ON SPDXRef-Package-libpcre2-8C6410.43
Relationship: SPDXRef-Package-libpcre2-8C6410.43 DEV_DEPENDENCY_OF SPDXRef-Package-glib-2.0C642.82.4


Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-bomtool-exampleC640.0.1
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-glib-2.0C642.82.4
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-libpcre2-8C6410.43
```
What is left to do?  A few things:

* Build systems like CMake and Meson should become aware of bomtool and leverage it automatically.

* pkg-config module authors should add SPDX license expressions to their pkg-config modules; support for this was added in pkgconf 1.9, so it is now a reasonably stable feature.  This will improve the quality of the SBOMs generated by bomtool.

* Support for more useful output formats, such as the new SPDX 3 JSON-LD format and CycloneDX.  Tools exist today which allow for translation between these formats, however, so it is not a pressing need.
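
To illustrate the second item: assuming the field is spelled `License:` (my assumption; check the pkgconf documentation), the example module from earlier would gain a single line carrying an SPDX license expression:

```pc
# License field name assumed from the pkgconf 1.9 SPDX support
prefix=/usr/local
includedir=${prefix}/include

Name: bomtool-example
Description: bomtool example
Version: 0.0.1
License: MIT
Requires: glib-2.0
Cflags: -I${includedir}
```

With license data present in the .pc files, bomtool can propagate it into the generated SBOM rather than leaving those fields empty.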

It is pretty clear, at least to me, that the pkg-config ecosystem has a large role to play in the future of C SBOMs, as the necessary information about dependencies and other relationships is richly expressed at this layer.
But at the same time, bomtool is still new, and .pc files are still being updated to reflect their projects&#39; license data.
</source:markdown>
    </item>
    
    <item>
      <title>The XZ Utils backdoor is a symptom of a larger problem</title>
      <link>https://ariadne.space/2024/04/01/the-xz-utils-backdoor-is.html</link>
      <pubDate>Mon, 01 Apr 2024 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2024/04/02/the-xz-utils-backdoor-is.html</guid>
      <description>&lt;p&gt;On March 29th, Andres Freund &lt;a href=&#34;https://www.openwall.com/lists/oss-security/2024/03/29/4&#34;&gt;dropped a bombshell on the oss-security mailing list&lt;/a&gt;: recent XZ Utils source code tarball releases made by Jia Tan were released with a backdoor.
Thankfully, for multiple reasons, &lt;a href=&#34;https://alpinelinux.org/posts/XZ-backdoor-CVE-2024-3094.html&#34;&gt;Alpine was not impacted by this backdoor&lt;/a&gt;, despite the recent source code tarball releases being published in Alpine &lt;code&gt;edge&lt;/code&gt;.
But what lessons do we need to learn from this incident?&lt;/p&gt;
&lt;h2 id=&#34;the-software-supply-chain-is-not-real&#34;&gt;The software &amp;ldquo;supply chain&amp;rdquo; is not real&lt;/h2&gt;
&lt;p&gt;As a community of hackers, we have built an exhaustive commons of free software released under various free licenses such as the GPL and the Apache 2.0 license.
Software packages in this commons have taken over the corporate world, because adopting them enabled more rapid innovation by allowing developers to focus more on the business logic of their applications, rather than low-level details.
This has been overall a good thing for society: from the open commons we have spawned a whole world of applications which have become the foundational bedrock of modern society.
It can certainly be argued that the invention of FOSS licensing models has been as revolutionary for the digital economy as the steam engine was for industry.&lt;/p&gt;
&lt;p&gt;There is one problem, however &amp;ndash; when we take software from the commons, we are like raccoons digging through a dumpster to find something useful.
There is no &amp;ldquo;supply chain&amp;rdquo; in reality, &lt;em&gt;but&lt;/em&gt; there is an effort by corporations which consume software from the commons to pretend there is one in order to shift the obligations related to ingesting third-party code away from themselves and to the original authors and maintainers of the code they are using.&lt;/p&gt;
&lt;p&gt;For there to be a &amp;ldquo;supply chain&amp;rdquo;, there must be a supplier, which in turn requires a contractual relationship between two parties.
With software licensed under FOSS licensing terms, a consumer receives a non-exclusive license to make use of the software however they wish (in accordance with the license requirements, of course), but non-exclusive licenses cannot and do not imply a contractual supplier-consumer relationship.&lt;/p&gt;
&lt;p&gt;With that said, many of the proposals made by people working to improve security of the software &amp;ldquo;supply chain&amp;rdquo; have practical and valuable uses for protecting the integrity of the commons, and are worthy of further examination.&lt;/p&gt;
&lt;h2 id=&#34;junk-drawer-libraries-are-valuable-targets&#34;&gt;&amp;ldquo;Junk drawer&amp;rdquo; libraries are valuable targets&lt;/h2&gt;
&lt;p&gt;CVE-2024-3094 happened for a simple reason: &lt;a href=&#34;https://bugs.debian.org/778913&#34;&gt;distributions patching OpenSSH to support systemd&amp;rsquo;s readiness notifications&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Frequently, authors looking to add systemd readiness notifications to their software tend to just look for the &lt;code&gt;systemd&lt;/code&gt; pkg-config package, and use its &lt;code&gt;CFLAGS&lt;/code&gt; and &lt;code&gt;LIBS&lt;/code&gt;.
This results in the software linking to &lt;code&gt;libsystemd&lt;/code&gt;.
What does &lt;em&gt;that&lt;/em&gt; have to do with anything, after all we are talking about a backdoor in &lt;code&gt;liblzma&lt;/code&gt;, not &lt;code&gt;libsystemd&lt;/code&gt;?
Simple: although &lt;code&gt;sd_notify()&lt;/code&gt; does not make use of any functionality in &lt;code&gt;liblzma&lt;/code&gt;, because &lt;code&gt;libsystemd&lt;/code&gt; has a &lt;code&gt;DT_NEEDED&lt;/code&gt; entry against &lt;code&gt;liblzma&lt;/code&gt;, it will pull in &lt;code&gt;liblzma&lt;/code&gt; as a shared object dependency.
Once that happens, the &lt;em&gt;constructor&lt;/em&gt; functions in &lt;code&gt;liblzma&lt;/code&gt; will run, as it is being loaded due to being in the dependency graph.&lt;/p&gt;
&lt;p&gt;What can be done about this?
A simple solution would be to start to split up the various &lt;code&gt;libsystemd&lt;/code&gt; routines into smaller packages.
This would allow for these packages to link against &lt;code&gt;libsystemd-daemon&lt;/code&gt; or similar instead, which would presumably not link against &lt;code&gt;liblzma&lt;/code&gt;, as it is unnecessary for readiness notifications.
The &lt;code&gt;systemd&lt;/code&gt; pkg-config package could be kept around as a metapackage pulling in the other libraries as a migration path.&lt;/p&gt;
&lt;p&gt;I call libraries which are large amalgamations of unrelated routines &amp;ldquo;junk drawer&amp;rdquo; libraries because they are basically the programming equivalent to a junk drawer: routines and dependencies accumulate over years and suddenly you have a mess of programs which depend on this library but only use some small portion of the library.
As these unnecessary dependencies accumulate, these &amp;ldquo;junk drawer&amp;rdquo; libraries become valuable points of interest when scouting for projects to compromise.
I would recommend auditing any of the other dependencies of systemd for possible backdoors for this reason.
There are a number of other libraries which could have been targeted in this way as well, which are also in the libsystemd dependency graph, such as PCRE.&lt;/p&gt;
&lt;h2 id=&#34;be-kind-to-software-maintainers&#34;&gt;Be kind to software maintainers&lt;/h2&gt;
&lt;p&gt;Although I am not certain that this lesson is particularly applicable to the xz-utils situation, since the actor who implemented the backdoor most likely made use of sockpuppet personas to advocate for his becoming a maintainer, the mental health of software maintainers is important.&lt;/p&gt;
&lt;p&gt;Directly what this means is that if you see somebody harassing a maintainer with specific demands, you should not join in on the thread.
Let the maintainer deal with it publicly, and reach out privately if you are concerned about the situation.
Otherwise, even if you are concerned about burnout or the maintainer overworking, you may wind up advocating for a threat actor to become a maintainer of something.&lt;/p&gt;
</description>
      <source:markdown>
On March 29th, Andres Freund [dropped a bombshell on the oss-security mailing list][0]: recent XZ Utils source code tarball releases made by Jia Tan were released with a backdoor.
Thankfully, for multiple reasons, [Alpine was not impacted by this backdoor][1], despite the recent source code tarball releases being published in Alpine `edge`.
But what lessons do we need to learn from this incident?

  [0]: https://www.openwall.com/lists/oss-security/2024/03/29/4
  [1]: https://alpinelinux.org/posts/XZ-backdoor-CVE-2024-3094.html

## The software &#34;supply chain&#34; is not real

As a community of hackers, we have built an exhaustive commons of free software released under various free licenses such as the GPL and the Apache 2.0 license.
Software packages in this commons have taken over the corporate world, because it enabled more rapid innovation by allowing developers to focus more on the business logic of their applications, rather than low-level details.
This has been overall a good thing for society: from the open commons we have spawned a whole world of applications which have become the foundational bedrock of modern society.
It can certainly be argued that the invention of FOSS licensing models has been as revolutionary for the digital economy as the steam engine was for industry.

There is one problem, however -- when we take software from the commons, we are like raccoons digging through a dumpster to find something useful.
There is no &#34;supply chain&#34; in reality, *but* there is an effort by corporations which consume software from the commons to pretend there is one in order to shift the obligations related to ingesting third-party code away from themselves and to the original authors and maintainers of the code they are using.

For there to be a &#34;supply chain&#34;, there must be a supplier, which in return requires a contractual relationship between two parties.
With software licensed under FOSS licensing terms, a consumer receives a non-exclusive license to make use of the software however they wish (in accordance with the license requirements, of course), but non-exclusive licenses cannot and do not imply a contractual supplier-consumer relationship.

With that said, many of the proposals made by people working to improve security of the software &#34;supply chain&#34; have practical and valuable uses for protecting the integrity of the commons, and are worthy of further examination.

## &#34;Junk drawer&#34; libraries are valuable targets

CVE-2024-3094 happened for a simple reason: [distributions patching OpenSSH to support systemd&#39;s readiness notifications][2].

  [2]: https://bugs.debian.org/778913

Frequently, authors looking to add systemd readiness notifications to their software tend to just look for the `systemd` pkg-config package, and use its `CFLAGS` and `LIBS`.
This results in the software linking to `libsystemd`.
What does *that* have to do with anything?  After all, we are talking about a backdoor in `liblzma`, not `libsystemd`.
Simple: although `sd_notify()` does not make use of any functionality in `liblzma`, because `libsystemd` has a `DT_NEEDED` entry against `liblzma`, the dynamic linker will pull in `liblzma` as a shared object dependency.
Once that happens, the *constructor* functions in `liblzma` will run, as it is being loaded due to being in the dependency graph.

What can be done about this?
A simple solution would be to start to split up the various `libsystemd` routines into smaller packages.
This would allow for these packages to link against `libsystemd-daemon` or similar instead, which would presumably not link against `liblzma`, as it is unnecessary for readiness notifications.
The `systemd` pkg-config package could be kept around as a metapackage pulling in the other libraries as a migration path.
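
As a sketch of what that split might look like, here is a hypothetical pair of pkg-config files (the names, version, and fields are illustrative, not actual systemd artifacts):

```
# libsystemd-daemon.pc -- hypothetical split-out readiness library
Name: libsystemd-daemon
Description: sd_notify() and related readiness-notification routines
Version: 255
Libs: -lsystemd-daemon

# systemd.pc -- hypothetical metapackage kept as a migration path
# (shipped as a separate file; shown together here for brevity)
Name: systemd
Description: pulls in the split-out libsystemd libraries
Version: 255
Requires: libsystemd-daemon
```

A consumer that only needs readiness notifications would then query `libsystemd-daemon` directly and avoid the rest of the dependency graph.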

I call libraries which are large amalgamations of unrelated routines &#34;junk drawer&#34; libraries because they are the programming equivalent of a junk drawer: routines and dependencies accumulate over the years, and suddenly you have a mess of programs which depend on the library but each use only a small portion of it.
As these unnecessary dependencies accumulate, &#34;junk drawer&#34; libraries become valuable points of interest when scouting for projects to compromise.
For this reason, I would recommend auditing the other dependencies of systemd for possible backdoors as well.
A number of other libraries in the libsystemd dependency graph, such as PCRE, could have been targeted in the same way.

## Be kind to software maintainers

Although I am not certain this lesson applies cleanly to the xz-utils situation, given that the actor who implemented the backdoor most likely used sockpuppet personas to lobby for his own promotion to maintainer, the mental health of software maintainers is important.

In practical terms, this means that if you see somebody harassing a maintainer with specific demands, you should not join in on the thread.
Let the maintainer deal with it publicly, and reach out privately if you are concerned about the situation.
Otherwise, even if you are genuinely worried about burnout or overwork, you may wind up advocating for a threat actor to become a maintainer.
</source:markdown>
    </item>
    
    <item>
      <title>Most breaches actually begin in corp</title>
      <link>https://ariadne.space/2023/12/06/most-breaches-actually-begin-in.html</link>
      <pubDate>Wed, 06 Dec 2023 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2023/12/07/most-breaches-actually-begin-in.html</guid>
      <description>&lt;p&gt;Readers of my blog will note that while I believe Rust is an excellent
tool for developers to leverage when building software, that there is
a disconnect between the developers leveraging Rust features to improve
their software and many of the advocates who talk about the language,
which I believe is counterproductive when it comes to Rust advocacy.&lt;/p&gt;
&lt;p&gt;For example, I see &lt;a href=&#34;https://www.linkedin.com/feed/update/urn:li:activity:7138201685847453697/&#34;&gt;takes like these&lt;/a&gt; frequently, which generally
advocate that if &lt;em&gt;only&lt;/em&gt; we adopted memory safe languages, we would solve
all security problems in computing forever:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If it&amp;rsquo;s estimated that writing in a memory safe language prevented
750 vulnerabilities (in just one codebase!) and IBM calculated [1]
the average cost of a data breach is $4.45 million, that&amp;rsquo;s over
$3.3 &lt;em&gt;billion&lt;/em&gt; saved by moving to memory safety.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Don&amp;rsquo;t get me wrong: it sure would be nice to change to a memory safe
language and save $3.3 billion in losses, but in reality it&amp;rsquo;s far
more complicated than that.&lt;/p&gt;
&lt;p&gt;Every year, Verizon&amp;rsquo;s security group releases a &lt;a href=&#34;https://www.verizon.com/business/resources/Tbcb/reports/2023-data-breach-investigations-report-dbir.pdf&#34;&gt;Data Breach
Investigations Report&lt;/a&gt;.  These reports are &lt;em&gt;fascinating&lt;/em&gt;
to read, and I highly recommend giving them a read if you&amp;rsquo;re
interested about the past year&amp;rsquo;s notable data breaches and
how they actually happened.&lt;/p&gt;
&lt;p&gt;What we learn from these reports is that, in general:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Over 70% of data breaches actually involve a human element
instead of a software vulnerability, for example a phishing
attack or a misconfiguration of a service.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Almost 50% of data breaches actually involve compromised
credentials, such as leaked OAuth tokens which did not
expire.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Roughly 15% of data breaches have phishing as their root cause.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Only 5% of data breaches actually come from exploitation of
a software vulnerability.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Don&amp;rsquo;t get me wrong &amp;ndash; software vulnerabilities are bad and should be
fixed in an expedient manner.  However, to circle back to the prior
example I quoted: if we consider each data breach to carry a price
tag of $4.45 million, and we are talking about 750 security incidents
in practice, then only about 38 of those incidents (5%) would have
the potential to have memory safety as their root cause, a much
smaller price tag of $169.1 million attributable to memory safety.&lt;/p&gt;
&lt;p&gt;The point is not that we shouldn&amp;rsquo;t refactor, or even rewrite software,
to improve its memory safety.  But we should be honest about why we are
doing it.  While memory safety &lt;em&gt;is&lt;/em&gt; important, the real benefit in
doing this refactoring work is to improve the &lt;em&gt;clarity&lt;/em&gt; of the underlying
software&amp;rsquo;s technical design: technical constraints can be enforced using
Rust&amp;rsquo;s trait system, for example &amp;ndash; a form of behavioral modeling.&lt;/p&gt;
&lt;p&gt;By leveraging features such as traits to enforce behavioral correctness
of the code you are writing, you wind up having a much better
vulnerability posture &lt;em&gt;overall&lt;/em&gt;, not just in the area of memory safety.
This is the reason why refactoring software to use code written in Rust
and other modern languages with these features is advantageous.&lt;/p&gt;
&lt;p&gt;This is a far more interesting story than the talking points about
memory safety I hear.  At this point, with features such as &lt;code&gt;FORTIFY&lt;/code&gt;
and AddressSanitizer, it is possible to address memory safety
defects without having to go to such lengths to refactor pre-existing
code.&lt;/p&gt;
&lt;p&gt;Features like ASan do not even have to carry significant runtime
performance penalties.  To illustrate my point, Justine Tunney proposed
building a modified version of Alpine with ASan enabled in 2021 using
a production-tuned variant of &lt;a href=&#34;https://github.com/jart/cosmopolitan/blob/master/libc/intrin/asan.c&#34;&gt;her ASan runtime included in her
Cosmopolitan libc project&lt;/a&gt;.  It was estimated that enabling
ASan in conjunction with this variant of her ASan runtime would only
result in a 3 to 5% performance reduction over code that did not have
ASan enabled.  Adopting this work would have immediately derisked the
use of memory unsafe code in all packages as they would be built with
ASan by default.&lt;/p&gt;
&lt;p&gt;And, of course, even with the borrow checker, and traits, and type
enforcement, and the other code verification features provided by the
Rust compiler, you still have &lt;code&gt;unsafe{}&lt;/code&gt; blocks, and the Rust compiler
provides support for ASan as a mitigation for these blocks.  So you
&lt;em&gt;still&lt;/em&gt; really need ASan even in a memory safe world, because even when
you build such a thing with perfect memory safe abstractions over a
memory unsafe world, you really are still building on top of a memory
unsafe world.&lt;/p&gt;
&lt;p&gt;The point here isn&amp;rsquo;t that these abstractions are meaningless.  They do
provide significant harm reduction when working with otherwise memory
unsafe interfaces, but even the most perfect abstraction is still, by
its very nature of being an abstraction, leaky.  Instead, we should
recognize &lt;em&gt;why&lt;/em&gt; Rust improves memory safety, and how the techniques
which improve memory safety can also be used to enforce elements of
the underlying software&amp;rsquo;s design at compile time.  This is a much
better story than the handwaving I usually see about memory safety
from advocates.&lt;/p&gt;
</description>
      <source:markdown>
Readers of my blog will note that while I believe Rust is an excellent
tool for developers to leverage when building software, there is
a disconnect between the developers leveraging Rust features to improve
their software and many of the advocates who talk about the language,
which I believe is counterproductive when it comes to Rust advocacy.

For example, I see [takes like these][linkedin] frequently, which generally
advocate that if *only* we adopted memory safe languages, we would solve
all security problems in computing forever:

&gt; If it&#39;s estimated that writing in a memory safe language prevented
&gt; 750 vulnerabilities (in just one codebase!) and IBM calculated [1]
&gt; the average cost of a data breach is $4.45 million, that&#39;s over
&gt; $3.3 *billion* saved by moving to memory safety. 

   [linkedin]: https://www.linkedin.com/feed/update/urn:li:activity:7138201685847453697/

Don&#39;t get me wrong: it sure would be nice to change to a memory safe
language and save $3.3 billion in losses, but in reality it&#39;s far
more complicated than that.

Every year, Verizon&#39;s security group releases a [Data Breach
Investigations Report][dbir].  These reports are *fascinating*
to read, and I highly recommend giving them a read if you&#39;re
interested about the past year&#39;s notable data breaches and
how they actually happened.

   [dbir]: https://www.verizon.com/business/resources/Tbcb/reports/2023-data-breach-investigations-report-dbir.pdf

What we learn from these reports is that, in general:

 * Over 70% of data breaches actually involve a human element
   instead of a software vulnerability, for example a phishing
   attack or a misconfiguration of a service.

 * Almost 50% of data breaches actually involve compromised
   credentials, such as leaked OAuth tokens which did not
   expire.

 * Roughly 15% of data breaches have phishing as their root cause.

 * Only 5% of data breaches actually come from exploitation of
   a software vulnerability.

Don&#39;t get me wrong -- software vulnerabilities are bad and should be
fixed in an expedient manner.  However, to circle back to the prior
example I quoted: if we consider each data breach to carry a price
tag of $4.45 million, and we are talking about 750 security incidents
in practice, then only about 38 of those incidents (5%) would have
the potential to have memory safety as their root cause, a much
smaller price tag of $169.1 million attributable to memory safety.

The point is not that we shouldn&#39;t refactor, or even rewrite software,
to improve its memory safety.  But we should be honest about why we are
doing it.  While memory safety *is* important, the real benefit in
doing this refactoring work is to improve the *clarity* of the underlying
software&#39;s technical design: technical constraints can be enforced using
Rust&#39;s trait system, for example -- a form of behavioral modeling.

By leveraging features such as traits to enforce behavioral correctness
of the code you are writing, you wind up having a much better
vulnerability posture *overall*, not just in the area of memory safety.
This is the reason why refactoring software to use code written in Rust
and other modern languages with these features is advantageous.

This is a far more interesting story than the talking points about
memory safety I hear.  At this point, with features such as `FORTIFY`
and AddressSanitizer, it is possible to address memory safety
defects without having to go to such lengths to refactor pre-existing
code.

Features like ASan do not even have to carry significant runtime
performance penalties.  To illustrate my point, Justine Tunney proposed
building a modified version of Alpine with ASan enabled in 2021 using
a production-tuned variant of [her ASan runtime included in her
Cosmopolitan libc project][cosmo-asan].  It was estimated that enabling
ASan in conjunction with this variant of her ASan runtime would only
result in a 3 to 5% performance reduction over code that did not have
ASan enabled.  Adopting this work would have immediately derisked the
use of memory unsafe code in all packages as they would be built with
ASan by default.

   [cosmo-asan]: https://github.com/jart/cosmopolitan/blob/master/libc/intrin/asan.c

And, of course, even with the borrow checker, and traits, and type
enforcement, and the other code verification features provided by the
Rust compiler, you still have `unsafe{}` blocks, and the Rust compiler
provides support for ASan as a mitigation for these blocks.  So you
*still* really need ASan even in a memory safe world, because even when
you build such a thing with perfect memory safe abstractions over a
memory unsafe world, you really are still building on top of a memory
unsafe world.

The point here isn&#39;t that these abstractions are meaningless.  They do
provide significant harm reduction when working with otherwise memory
unsafe interfaces, but even the most perfect abstraction is still, by
its very nature of being an abstraction, leaky.  Instead, we should
recognize *why* Rust improves memory safety, and how the techniques
which improve memory safety can also be used to enforce elements of
the underlying software&#39;s design at compile time.  This is a much
better story than the handwaving I usually see about memory safety
from advocates.
</source:markdown>
    </item>
    
    <item>
      <title>Writing portable ARM64 assembly</title>
      <link>https://ariadne.space/2023/04/12/writing-portable-arm-assembly.html</link>
      <pubDate>Wed, 12 Apr 2023 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2023/04/13/writing-portable-arm-assembly.html</guid>
      <description>&lt;p&gt;An unfortunate side effect of the rising popularity of Apple&amp;rsquo;s ARM-based
computers is an increase in unportable assembly code which targets the
64-bit ARM ISA.  This is because developers are writing these bits of
assembly code to speed up their programs when run on Apple&amp;rsquo;s ARM-based
computers, without considering the other 64-bit ARM devices out there,
such as SBCs and servers running Linux or BSD.&lt;/p&gt;
&lt;p&gt;The good news is that it is very easy to write assembly which targets
Apple&amp;rsquo;s computers as well as the other 64-bit ARM devices running
operating systems other than Darwin.  It just requires being aware of
a few differences between the Mach-O and ELF ABIs, as well as knowing
what Apple-specific syntax extensions to avoid.  By following the
guidance in this blog, you will be able to write assembly code which
is portable between Apple&amp;rsquo;s toolchain, the official ARM assembly
toolchain, and the GNU toolchain.&lt;/p&gt;
&lt;h2 id=&#34;differences-between-the-elf-and-mach-o-abis&#34;&gt;Differences between the ELF and Mach-O ABIs&lt;/h2&gt;
&lt;p&gt;Modern UNIX systems, including Linux-based systems, largely use the
&lt;a href=&#34;https://en.wikipedia.org/wiki/Executable_and_Linkable_Format&#34;&gt;ELF binary format&lt;/a&gt;.  Apple uses &lt;a href=&#34;https://en.wikipedia.org/wiki/Mach-O&#34;&gt;Mach-O&lt;/a&gt; in Darwin
instead for historical reasons.  This is not a requirement imposed by
their use of Mach; indeed, OSFMK, the kernel that Darwin,
MkLinux and OSF/1 are all based on, supports ELF binaries just fine.
Apple just decided to use the Mach-O format instead.&lt;/p&gt;
&lt;p&gt;When it comes to writing assembly (or, really, just linking code
in general) targeting Darwin, the main difference to be aware of is
that all symbols are prefixed with a single underscore.  For example,
if you have a function that would be declared in C like:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;extern&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;unmask&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;payload, &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;mask, size_t len);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On Darwin, the function in your assembly code must be defined as &lt;code&gt;_unmask&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The other major difference is that ELF defines distinct symbol types,
for example &lt;code&gt;STT_FUNC&lt;/code&gt; and &lt;code&gt;STT_OBJECT&lt;/code&gt;.  There is no equivalent
in Mach-O, and thus the &lt;code&gt;.type&lt;/code&gt; directive that you would use when writing
assembly for ELF targets is not supported.&lt;/p&gt;
&lt;h3 id=&#34;a-brief-note-on-platform-abis&#34;&gt;A brief note on Platform ABIs&lt;/h3&gt;
&lt;p&gt;You will also need to be aware of minor differences between the Darwin
ABI and other platform ABIs.  A notable example is that the &lt;code&gt;x18&lt;/code&gt;
register is reserved by the Darwin ABI and is explicitly zeroed on
context switches in some cases.  This register is also reserved on
Android, but not on GNU/Linux or Alpine.&lt;/p&gt;
&lt;h2 id=&#34;apple-specific-vector-mnemonics&#34;&gt;Apple-specific vector mnemonics&lt;/h2&gt;
&lt;p&gt;The other main thing to watch out for is Apple&amp;rsquo;s custom mnemonics for
NEON.  In order to make writing NEON code less cumbersome, Apple
introduced a set of mnemonics that allow simplification of specifying
NEON instructions.  For example, if you are targeting Apple devices
only, you might write an exclusive-or NEON instruction like so:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-asm&#34; data-lang=&#34;asm&#34;&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;eor.16b&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;v2&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;v2&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;v0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is an Apple-specific extension to the ARM assembly syntax.  The
&lt;a href=&#34;https://developer.arm.com/documentation/dui0802/b/A64-SIMD-Vector-Instructions/EOR--vector-&#34;&gt;official ARM assembly manual&lt;/a&gt; specifies that the memory layout
must be specified for each register:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-asm&#34; data-lang=&#34;asm&#34;&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;eor&lt;/span&gt;     &lt;span style=&#34;color:#66d9ef&#34;&gt;v2.16b&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;v2.16b&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;v0.16b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;abstracting-the-abi-details-with-some-macros&#34;&gt;Abstracting the ABI details with some macros&lt;/h2&gt;
&lt;p&gt;The good news is that the ABI details can easily be abstracted with a
few macros.  As for using NEON functions, the answer is simple: stick to
what the ARM manual says to do, rather than using Apple&amp;rsquo;s mnemonics.&lt;/p&gt;
&lt;p&gt;There are two macros that you need.  These can be placed in a header
file somewhere if wanted.&lt;/p&gt;
&lt;p&gt;The first macro allows you to deal with the underscore requirement of the
Darwin ABI:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#ifdef __APPLE__
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# define PROC_NAME(__proc) _ ## __proc
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#else
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# define PROC_NAME(__proc) __proc
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#endif
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The second macro is optional, but it allows you to define the correct
ELF symbol types outside of Apple&amp;rsquo;s toolchain:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#ifdef __clang__
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# define TYPE(__proc, __typ)
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#else
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# define TYPE(__proc, __typ) .type __proc, __typ
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#endif
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then you just write your assembly as normal, but using these macros:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-asm&#34; data-lang=&#34;asm&#34;&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;.global&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;PROC_NAME&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;unmask&lt;/span&gt;)
&lt;span style=&#34;color:#a6e22e&#34;&gt;.align&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;span style=&#34;color:#a6e22e&#34;&gt;TYPE&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;unmask&lt;/span&gt;, &lt;span style=&#34;color:#960050;background-color:#1e0010&#34;&gt;@&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;)
&lt;span style=&#34;color:#a6e22e&#34;&gt;PROC_NAME&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;unmask&lt;/span&gt;):
   &lt;span style=&#34;color:#a6e22e&#34;&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And that&amp;rsquo;s all there is to it.  As long as you follow these guidelines,
you will have assembly which is portable to any UNIX-like environment on
64-bit ARM.&lt;/p&gt;
</description>
      <source:markdown>
An unfortunate side effect of the rising popularity of Apple&#39;s ARM-based
computers is an increase in unportable assembly code which targets the
64-bit ARM ISA.  This is because developers are writing these bits of
assembly code to speed up their programs when run on Apple&#39;s ARM-based
computers, without considering the other 64-bit ARM devices out there,
such as SBCs and servers running Linux or BSD.

The good news is that it is very easy to write assembly which targets
Apple&#39;s computers as well as the other 64-bit ARM devices running
operating systems other than Darwin.  It just requires being aware of
a few differences between the Mach-O and ELF ABIs, as well as knowing
what Apple-specific syntax extensions to avoid.  By following the
guidance in this blog, you will be able to write assembly code which
is portable between Apple&#39;s toolchain, the official ARM assembly
toolchain, and the GNU toolchain.

## Differences between the ELF and Mach-O ABIs

Modern UNIX systems, including Linux-based systems, largely use the
[ELF binary format][elf].  Apple uses [Mach-O][mach-o] in Darwin
instead for historical reasons.  This is not a requirement imposed by
their use of Mach; indeed, OSFMK, the kernel that Darwin,
MkLinux and OSF/1 are all based on, supports ELF binaries just fine.
Apple just decided to use the Mach-O format instead.

   [elf]: https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
   [mach-o]: https://en.wikipedia.org/wiki/Mach-O

When it comes to writing assembly (or, really, just linking code
in general) targeting Darwin, the main difference to be aware of is
that all symbols are prefixed with a single underscore.  For example,
if you have a function that would be declared in C like:

```c
extern void unmask(const char *payload, const char *mask, size_t len);
```
On Darwin, the function in your assembly code must be defined as `_unmask`.

The other major difference is that ELF defines distinct symbol types,
for example `STT_FUNC` and `STT_OBJECT`.  There is no equivalent
in Mach-O, and thus the `.type` directive that you would use when writing
assembly for ELF targets is not supported.

### A brief note on Platform ABIs

You will also need to be aware of minor differences between the Darwin
ABI and other platform ABIs.  A notable example is that the `x18`
register is reserved by the Darwin ABI and is explicitly zeroed on
context switches in some cases.  This register is also reserved on
Android, but not on GNU/Linux or Alpine.

## Apple-specific vector mnemonics

The other main thing to watch out for is Apple&#39;s custom mnemonics for
NEON.  In order to make writing NEON code less cumbersome, Apple
introduced a set of mnemonics that allow simplification of specifying
NEON instructions.  For example, if you are targeting Apple devices
only, you might write an exclusive-or NEON instruction like so:

```asm
eor.16b v2, v2, v0
```
This is an Apple-specific extension to the ARM assembly syntax.  The
[official ARM assembly manual][armasm] specifies that the memory layout
must be specified for each register:

```asm
eor     v2.16b, v2.16b, v0.16b
```
   [armasm]: https://developer.arm.com/documentation/dui0802/b/A64-SIMD-Vector-Instructions/EOR--vector-

## Abstracting the ABI details with some macros

The good news is that the ABI details can easily be abstracted with a
few macros.  As for using NEON functions, the answer is simple: stick to
what the ARM manual says to do, rather than using Apple&#39;s mnemonics.

There are two macros that you need.  These can be placed in a header
file somewhere if wanted.

The first macro allows you to deal with the underscore requirement of the
Darwin ABI:

```c
#ifdef __APPLE__
# define PROC_NAME(__proc) _ ## __proc
#else
# define PROC_NAME(__proc) __proc
#endif
```
The second macro is optional, but it allows you to define the correct
ELF symbol types outside of Apple&#39;s toolchain:

```c
#ifdef __clang__
# define TYPE(__proc, __typ)
#else
# define TYPE(__proc, __typ) .type __proc, __typ
#endif
```
Then you just write your assembly as normal, but using these macros:

```asm
.global PROC_NAME(unmask)
.align 2
TYPE(unmask, @function)
PROC_NAME(unmask):
   ...
```
And that&#39;s all there is to it.  As long as you follow these guidelines,
you will have assembly which is portable to any UNIX-like environment on
64-bit ARM.
</source:markdown>
    </item>
    
    <item>
      <title>Help migrate a community from Discord to something else</title>
      <link>https://ariadne.space/2023/03/07/help-migrate-a-community-from.html</link>
      <pubDate>Tue, 07 Mar 2023 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2023/03/08/help-migrate-a-community-from.html</guid>
      <description>&lt;p&gt;During the height of the pandemic, I set up a community using Discord.
Since then, it has evolved into being one of the most active (yet tight-knit) technical
communities on Discord: members ranging from all around the world and from all sorts of
technical and social backgrounds participate in conversations every day on a variety of
topics.&lt;/p&gt;
&lt;h2 id=&#34;why-leave-discord&#34;&gt;Why leave Discord?&lt;/h2&gt;
&lt;p&gt;The current situation sounds pretty good, right?
Well, as Richard Stallman warned, &lt;a href=&#34;https://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html&#34;&gt;proprietary services masquerading as software&lt;/a&gt;
do not necessarily act on behalf of the user.&lt;/p&gt;
&lt;p&gt;In this specific case, despite paying money to Discord for its services, there have been
many instances where it has been transparently obvious to me and the rest of our team
that Discord is not really acting in the interest of our community.
Some examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Discord has banned the accounts of several community members over the past 18 months.
When pressed on the issue, they usually have no viable explanation for why they took
that specific action.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Discord has rolled out automated moderation features which were configured with very
aggressive defaults, and enabled by default.
These features have also had bugs, and when pressing support on those issues, our
mileage has varied.
We have been able to mostly disable the &amp;ldquo;auto-mod&amp;rdquo; features that were obnoxiously
intrusive, however.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Discord&amp;rsquo;s leadership team has speculated on introducing functionality that is not
aligned with the interests of our community, such as NFT support.
They later rolled this speculation back after too many users complained.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On March 27th, Discord plans to roll out a &lt;a href=&#34;https://discord.com/privacy&#34;&gt;new Privacy Policy&lt;/a&gt; which,
among other things, grants them the right to record video calls without consent.
It is very likely that they plan to do this in order to enforce Content ID type
restrictions on the content being shared in Discord-using communities.&lt;/p&gt;
&lt;p&gt;Although the community I started does not frequently share content which would run
afoul of these issues, Content ID systems can be easily fooled into reacting to
content which does not violate any copyright, such as ambient noises.
On top of this, the community in question engages in a lot of activism on various
topics intersectional to the world we exist in.
Discord being able to step into and monitor video calls in our community is
therefore entirely unacceptable from a security point of view.&lt;/p&gt;
&lt;p&gt;Should we have chosen Discord for this community given its security needs?
Most likely not, but at the time that this community was established, the
current reality was not envisioned.
Had we intended to build a community explicitly for the activities that it
has chosen to engage in, we would likely have avoided Discord.
But at the same time, the Discord UX is likely responsible for some of the
success of the community: it is simple and largely optimized for the
multi-device reality we now live in.&lt;/p&gt;
&lt;h2 id=&#34;matrix&#34;&gt;Matrix?&lt;/h2&gt;
&lt;p&gt;During the height of the pandemic, FOSDEM was held virtually on Matrix-based
infrastructure.
The combination of multimedia experiences and text-based chat looked to be
marginally competitive with the formula that Discord provides communities.
For this reason, Matrix is at the top of a short list of alternatives we are
considering.&lt;/p&gt;
&lt;p&gt;But we have questions and concerns.
This is where a helpful advocate from the Matrix community who is familiar
with the Trust &amp;amp; Safety aspects of the protocol would be welcome.
I would even be willing to pay a reasonable consultancy fee for real answers
to these concerns.&lt;/p&gt;
&lt;p&gt;The main concern we have is one of safety.
Much like with the fediverse, there are homeservers run by persons known to
be a security threat to our community.
We need a robust solution for keeping those homeservers defederated from the
rooms hosted on our own homeserver.&lt;/p&gt;
&lt;p&gt;We are told that Synapse has an integration with Mjolnir to allow for this,
but we would prefer to use Dendrite.
Does Dendrite offer a similar integration with Mjolnir?
Also, with a bot making real-time policy decisions on what room invitations
and joins are allowed, is it possible to have physical redundancy for
Mjolnir?&lt;/p&gt;
&lt;p&gt;Building on that question, availability in general is another major area
of concern.
Is it possible to have multiple instances of Dendrite running at once in
a geo-distributed fashion?
It is okay to assume that we would be using Spanner or similar to manage
the database replication to support that.&lt;/p&gt;
&lt;p&gt;The next major concern is CSAM.
When we launched our Mastodon instance, we had an incident where a user
uploaded CSAM to our instance.
We have heard that it is possible to convince homeservers to blindly
cache CSAM from other homeservers.
What mitigations exist for this issue?&lt;/p&gt;
&lt;p&gt;Finally, the last remaining question is how to integrate this into our
infrastructure.
For reference, we use Kubernetes to manage our services, with Traefik
acting both as Ingress and as a Service Mesh.
Previously we used Knative to manage some services such as Mastodon Web.
Is there any advantage to using Knative to manage a Matrix homeserver?
(We assume there is not.)&lt;/p&gt;
&lt;h2 id=&#34;something-else&#34;&gt;Something else?&lt;/h2&gt;
&lt;p&gt;We are also open to using something other than Matrix.
But it needs to be something that we can manage ourselves at the
infrastructure level.
We have already been burned by Discord; we&amp;rsquo;re not interested in being
burned by another service.
It also needs to provide an end-to-end user experience similar to
Discord&amp;rsquo;s.
From what I can find, Matrix is the only project out there that is
able to meet those requirements.&lt;/p&gt;
&lt;p&gt;But I would be happy to hear about alternatives which have done
similarly well at getting over the hump, where network effect is no
longer a serious concern.&lt;/p&gt;
&lt;p&gt;Reach me at &lt;a href=&#34;mailto:ariadne@dereferenced.org&#34;&gt;ariadne@dereferenced.org&lt;/a&gt; if you have answers to any of
the above questions.  Thanks!&lt;/p&gt;
</description>
      <source:markdown>
During the height of the pandemic, I set up a community using Discord.
Since then, it has evolved into being one of the most active (yet tight-knit) technical
communities on Discord: members from all around the world and from all sorts of
technical and social backgrounds participate in conversations every day on a variety of
topics.

## Why leave Discord?

The current situation sounds pretty good, right?
Well, as Richard Stallman warned, [proprietary services masquerading as software][gnu-saass]
do not necessarily act on behalf of the user.

   [gnu-saass]: https://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html

In this specific case, despite paying money to Discord for its services, there have been
many instances where it has been transparently obvious to me and the rest of our team
that Discord is not really acting in the interest of our community.
Some examples:

* Discord has banned the accounts of several community members over the past 18 months.
  When pressed on the issue, they usually have no viable explanation for why they took
  that specific action.

* Discord has rolled out automated moderation features which were configured with very
  aggressive defaults, and enabled by default.
  These features have also had bugs, and when pressing support on those issues, our
  mileage has varied.
  We have been able to mostly disable the &#34;auto-mod&#34; features that were obnoxiously
  intrusive, however.

* Discord&#39;s leadership team has speculated on introducing functionality that is not
  aligned with the interests of our community, such as NFT support.
  They later rolled this speculation back after too many users complained.

On March 27th, Discord plans to roll out a [new Privacy Policy][new-pp] which,
among other things, grants them the right to record video calls without consent.
It is very likely that they plan to do this in order to enforce Content ID type
restrictions on the content being shared in Discord-using communities.

Although the community I started does not frequently share content which would run
afoul of these issues, Content ID systems can be easily fooled into reacting to
content which does not violate any copyright, such as ambient noises.
On top of this, the community in question engages in a lot of activism on various
topics intersectional to the world we exist in.
Discord being able to step into and monitor video calls in our community is
therefore entirely unacceptable from a security point of view.

   [new-pp]: https://discord.com/privacy

Should we have chosen Discord for this community given its security needs?
Most likely not, but at the time that this community was established, the
current reality was not envisioned.
Had we intended to build a community explicitly for the activities that it
has chosen to engage in, we would likely have avoided Discord.
But at the same time, the Discord UX is likely responsible for some of the
success of the community: it is simple and largely optimized for the
multi-device reality we now live in.

## Matrix?

During the height of the pandemic, FOSDEM was held virtually on Matrix-based
infrastructure.
The combination of multimedia experiences and text-based chat looked to be
marginally competitive with the formula that Discord provides communities.
For this reason, Matrix is at the top of a short list of alternatives we are
considering.

But we have questions and concerns.
This is where a helpful advocate from the Matrix community who is familiar
with the Trust &amp; Safety aspects of the protocol would be welcome.
I would even be willing to pay a reasonable consultancy fee for real answers
to these concerns.

The main concern we have is one of safety.
Much like with the fediverse, there are homeservers run by persons known to
be a security threat to our community.
We need a robust solution for keeping those homeservers defederated from the
rooms hosted on our own homeserver.

We are told that Synapse has an integration with Mjolnir to allow for this,
but we would prefer to use Dendrite.
Does Dendrite offer a similar integration with Mjolnir?
Also, with a bot making real-time policy decisions on what room invitations
and joins are allowed, is it possible to have physical redundancy for
Mjolnir?

Building on that question, availability in general is another major area
of concern.
Is it possible to have multiple instances of Dendrite running at once in
a geo-distributed fashion?
It is okay to assume that we would be using Spanner or similar to manage
the database replication to support that.

The next major concern is CSAM.
When we launched our Mastodon instance, we had an incident where a user
uploaded CSAM to our instance.
We have heard that it is possible to convince homeservers to blindly
cache CSAM from other homeservers.
What mitigations exist for this issue?

Finally, the last remaining question is how to integrate this into our
infrastructure.
For reference, we use Kubernetes to manage our services, with Traefik
acting both as Ingress and as a Service Mesh.
Previously we used Knative to manage some services such as Mastodon Web.
Is there any advantage to using Knative to manage a Matrix homeserver?
(We assume there is not.)

## Something else?

We are also open to using something other than Matrix.
But it needs to be something that we can manage ourselves at the
infrastructure level.
We have already been burned by Discord; we&#39;re not interested in being
burned by another service.
It also needs to provide an end-to-end user experience similar to
Discord&#39;s.
From what I can find, Matrix is the only project out there that is
able to meet those requirements.

But I would be happy to hear about alternatives which have done
similarly well at getting over the hump, where network effect is no
longer a serious concern.

Reach me at &lt;ariadne@dereferenced.org&gt; if you have answers to any of
the above questions.  Thanks!
</source:markdown>
    </item>
    
    <item>
      <title>pkgconf, CVE-2023-24056 and disinformation</title>
      <link>https://ariadne.space/2023/01/23/pkgconf-cve-and-disinformation.html</link>
      <pubDate>Mon, 23 Jan 2023 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2023/01/24/pkgconf-cve-and-disinformation.html</guid>
      <description>&lt;p&gt;Readers will have noticed that two maintenance releases of pkgconf were cut over the weekend,
1.9.4 and 1.8.1 respectively, to address &lt;a href=&#34;https://nvd.nist.gov/vuln/detail/CVE-2023-24056&#34;&gt;CVE-2023-24056&lt;/a&gt;, a pkg-config specific variation
of the now-classic &amp;ldquo;&lt;a href=&#34;https://en.wikipedia.org/wiki/Billion_laughs_attack&#34;&gt;billion laughs attack&lt;/a&gt;&amp;rdquo;.  While fixing software defects is important,
a lot went wrong with how this CVE was reported and the motivations behind its disclosure, and
for my own catharsis, I want to talk about this.&lt;/p&gt;
&lt;h2 id=&#34;the-origin-of-pkgconf&#34;&gt;The origin of &lt;code&gt;pkgconf&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;To hopefully explain why I am so bothered by all of this, let&amp;rsquo;s first understand the history of
pkgconf: a project I began noodling on in March 2011.&lt;/p&gt;
&lt;p&gt;2011 was a particularly rough year for me.  In January, my father was diagnosed with pancreatic
cancer, and declined to disclose this to anyone.  When I came back to Oklahoma to visit my
parents in early March, I walked into my dad&amp;rsquo;s house and found him jaundiced.  I drove him to
the emergency room, and was informed that he only had a few months to live due to the pancreatic
cancer he allowed to progress to stage 4.  This was &lt;em&gt;shocking&lt;/em&gt; to me, especially considering I
was 23 at the time.  The stress of it led to me breaking up with my boyfriend at the time.&lt;/p&gt;
&lt;p&gt;I did the only thing I could do given the situation: spent as much time with him as possible.
The hospital had installed Wi-Fi earlier that year, so I was able to take my computer and work
on my projects while I spent time with him.  This worked out well, because it gave us a common
ground of subjects to talk about: my dad was the person who originally pushed me into getting
involved with software engineering as a profession in the first place.  While he himself never
worked as a software engineer, he developed a number of small utilities and demo programs for
MS-DOS.  Later, he became heavily interested in BSD, and then Slackware.&lt;/p&gt;
&lt;p&gt;During this time period, pkg-config 0.26 was released, which required either a complicated
bootstrap procedure to satisfy the glib2 requirements by hand, or a pre-existing copy of
pkg-config to exist.  Alpine was impacted by this bootstrap problem, and we ultimately decided
to hold back pkg-config on the 0.25 version because the bootstrapping problem was too complex
to solve for the pending release.&lt;/p&gt;
&lt;p&gt;At the same time, I was looking for something, &lt;em&gt;anything&lt;/em&gt; to work on that would serve as a
distraction and conversation piece.  This created an opportunity: I could work on a replacement
pkg-config implementation that did not have the bootstrap requirement that the freedesktop
implementation required.  I began working on pkgconf, specifically the .pc file parsing and
dependency graph walking code, while my dad was in the hospital.  He found talking about it
&lt;em&gt;fascinating&lt;/em&gt;, and so we discussed the various aspects of implementing a parser, and walking
dependency graphs in C.  In a limited way, it was a project we collaborated on, in that I would
write code, tell him about it, and he&amp;rsquo;d point out ways my assumptions probably didn&amp;rsquo;t hold
true.&lt;/p&gt;
&lt;p&gt;After my father passed away in early April, I quit working on it for a while,
until a few friends of mine decided to pick it up and experiment with it in
Gentoo and FreeBSD.  Sadly, he didn&amp;rsquo;t get to see the first viable release,
or to see pkgconf integrated into Linux distributions.&lt;/p&gt;
&lt;h2 id=&#34;maintaining-a-production-quality-build-tool-at-scale&#34;&gt;Maintaining a production-quality build tool at scale&lt;/h2&gt;
&lt;p&gt;These days, pkgconf is basically everywhere.  It is the default pkg-config implementation in
every mainstream Linux distribution except Ubuntu.  It is used heavily in embedded Linux
development and in plenty of other scenarios.  My distfiles server, &lt;code&gt;distfiles.dereferenced.org&lt;/code&gt;,
logs dozens of pkgconf downloads every second of the day.&lt;/p&gt;
&lt;p&gt;The success of pkgconf is not without its problems though.  There are aspects of the software
which, given what I know today, I would probably implement substantially differently.  The
technical debt is real.  I&amp;rsquo;ve been working, however, as time permits, to improve these problems
in the &lt;code&gt;pkgconf-1.9.x&lt;/code&gt; release series.&lt;/p&gt;
&lt;p&gt;But when pkgconf does something which is unexpected, and breaks a user&amp;rsquo;s build&amp;hellip; those
interactions are rarely fun.  Many times, the user with the issue shows up on the issue
tracker, or worse, my personal inbox in a bad mood, which results in a triage experience
that is suboptimal for everyone involved.  Thankfully, this doesn&amp;rsquo;t happen so much
anymore, as we have worked hard to balance compatibility and developer-friendly output
from the tool.&lt;/p&gt;
&lt;p&gt;But as smooth as things are these days, maintaining a production build tool imposes a lot
of burden that you cannot begin to expect until you&amp;rsquo;ve done it before.  It is not enough
to simply tell a user that the framework he is using is doing things wrong, for example,
underspecifying its dependencies.  You must consider &amp;ldquo;self-service&amp;rdquo; features: ones which
allow the user to diagnose the issues in his build and correct them himself.  By doing
so, you provide the user with a good experience, and keep support requests from annoyed
users much lower.  All of this has to be designed and implemented in production build
tools.&lt;/p&gt;
&lt;h2 id=&#34;the-appearance-of-competition&#34;&gt;The appearance of &amp;ldquo;competition&amp;rdquo;&lt;/h2&gt;
&lt;p&gt;The past weekend has been a wild ride for me.  I recently moved to Seattle, and have
been getting settled in.  A few people brought &lt;a href=&#34;https://nullprogram.com/blog/2023/01/18/&#34;&gt;u-config: a new, lean pkg-config clone&lt;/a&gt;
to my attention.  At first, I shrugged it off, and mostly would have continued to do
so.  An implementation of pkg-config on Windows would be good for me, personally, as
I do not develop pkgconf on Windows, and different people who contribute to the
maintenance of pkgconf&amp;rsquo;s Windows support have different goals.  This has led to some
significant fragmentation of pkgconf on the Windows side, with different tools bundling
it supporting specific aspects of the pkg-config format in different ways.&lt;/p&gt;
&lt;p&gt;I have a number of social and technical observations about u-config.  Some good,
some not so good.  To start off with the social aspects: I don&amp;rsquo;t particularly
appreciate the level of aggression directed toward pkgconf.  While that alone would
not normally be a turn-off for me (one has to have a reasonably thick skin when
being a FOSS maintainer), casually dropping the &amp;ldquo;billion laughs&amp;rdquo; 0day with a snide
comment about how we should use ASan (we do) when developing pkgconf was too much,
and the bug itself (a mistake in accounting for available buffer space during variable
expansion) was overstated.&lt;/p&gt;
&lt;p&gt;There are a lot of good things about u-config.  By focusing on only the minimally
required functionality, the author was able to write an excellent tool which has
the potential to someday be a replacement for pkgconf.  I am open to talking about
such a deprecation, even.&lt;/p&gt;
&lt;p&gt;However, after the initial blogpost (which contained disinformation about both
freedesktop pkg-config &lt;em&gt;and&lt;/em&gt; pkgconf), there was additional disinformation from
another person who is enthusiastic about the u-config project.  Notably, he
submitted a patch, which amongst other things, could be misinterpreted by readers
to conclude that &lt;code&gt;pkgconf&lt;/code&gt; does not consider &lt;code&gt;/usr/include&lt;/code&gt; as a system include
path.  When configured correctly, it definitely does.  For example, on Alpine Linux:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pestilence:~$ pkgconf --dump-personality
Triplet: default
DefaultSearchPaths: /usr/lib/pkgconfig /usr/share/pkgconfig
SystemIncludePaths: /usr/include
SystemLibraryPaths: /usr/lib 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But this &lt;a href=&#34;https://github.com/skeeto/u-config/commit/c069c94d77e1381cf7d67b8283601c5e79a91534#diff-c1f8e1880984a1a513fbb1c1191ea62910de9f1656c89f30d41609fb7317080bR1563&#34;&gt;particular disinformation was merged by the author of the software&lt;/a&gt;, without
regard for checking the comment for disinformation, despite how absurd it would be
if it were true.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update (28 January 2023):&lt;/em&gt; Since the initial publication of this blog, the comment
introduced in the above patch has been corrected to reflect a specific edge case
relating to &lt;code&gt;-I/usr/include&lt;/code&gt; versus &lt;code&gt;-I /usr/include&lt;/code&gt;.  I believe the discrepancy
in the handling of both fragments to be a bug, one which was not reported to me,
but rather discussed only in the source code comment.  The contributor of the patch
in question to u-config, in particular, has pointed to the fact that they later changed
the source code comment to clarify the issue, as part of an attempt to deflect from
the point of this blog: discussing how the u-config author and contributors have
chosen to engage in bad faith with other pkg-config implementations (especially
pkgconf) from the beginning of their project.  While I plan to fix the non-reported
discrepancy in the next pkgconf release, I will note that the u-config authors have
so far &lt;a href=&#34;https://github.com/skeeto/u-config/blob/7b5d32f/u-config.c#L1679-L1686&#34;&gt;chosen to not handle this edge case&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;pkg-config-implementations-do-specific-things-for-a-reason&#34;&gt;&lt;code&gt;pkg-config&lt;/code&gt; implementations do specific things for a reason&lt;/h2&gt;
&lt;p&gt;In the UNIX environment, the behavior of the system toolchain is static and
must be well-defined.  Tools which act adjacently to the system C toolchain
must behave in ways which are aware of how the C toolchain is configured
to behave.  This is why &lt;code&gt;pkgconf&lt;/code&gt; checks several different environment
variables to learn about how the system toolchain has been configured, and
what deviations, if any, have been configured via the environment.&lt;/p&gt;
&lt;p&gt;A frequent pattern in UNIX pkg-config files is to write things like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;prefix=/usr
includedir=${prefix}/include
libdir=${prefix}/lib
Package: whatever
Version: 0
Cflags: -I${includedir}
Libs: -L${libdir} -lwhatever
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;On Windows, &lt;code&gt;pkg-config&lt;/code&gt; implementations have &lt;code&gt;--define-prefix&lt;/code&gt;, which is
used to override the &lt;code&gt;${prefix}&lt;/code&gt; variable for this reason.&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;pkg-config&lt;/code&gt; is not aware of &lt;code&gt;/usr/include&lt;/code&gt; being a &lt;em&gt;system&lt;/em&gt; include path,
then a disaster can happen when querying for multiple dependencies at the same
time.  Consider this other pkg-config file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;prefix=/usr
includedir=${prefix}/include/OtherLib
libdir=${prefix}/lib
Package: OtherLib
Version: 0
Cflags: -I${includedir}
Libs: -L${libdir} -lother
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now let&amp;rsquo;s say that &lt;code&gt;OtherLib&lt;/code&gt; has a &lt;code&gt;/usr/include/OtherLib/math.h&lt;/code&gt; file which
uses &lt;code&gt;#include_next&lt;/code&gt; to enhance the &lt;code&gt;math.h&lt;/code&gt; header.  A real-world example of
a library which does this is &lt;code&gt;libbsd&lt;/code&gt;.  Well, if you query pkg-config with
&lt;code&gt;pkg-config --cflags --libs whatever OtherLib&lt;/code&gt;, then you will get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pestilence:~$ pkgconf --with-path=examples/ --personality=examples/broken.personality whatever OtherLib
-I/usr/include -I/usr/include/OtherLib -lwhatever -lother
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This means that &lt;code&gt;/usr/include/math.h&lt;/code&gt; will be preferred over &lt;code&gt;/usr/include/OtherLib/math.h&lt;/code&gt;,
and your build will fail.&lt;/p&gt;
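&lt;p&gt;For contrast, when &lt;code&gt;/usr/include&lt;/code&gt; is correctly registered as a system
include path, pkgconf filters the redundant fragment out of the same query; the
output would look something along these lines (a sketch, since the exact flags
depend on the configured personality):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pestilence:~$ pkgconf --with-path=examples/ whatever OtherLib
-I/usr/include/OtherLib -lwhatever -lother
&lt;/code&gt;&lt;/pre&gt;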
&lt;p&gt;So this type of filtering, and the other types of filtering that pkgconf does, are very important
in the UNIX environment.  The author of u-config will unfortunately have to learn these things
one by one as users come to him with bug reports.&lt;/p&gt;
&lt;p&gt;There is probably an alternate reality where u-config and pkgconf work together to deprecate pkgconf,
and someday I hope that will be the reality here.  But until the disinformation and putdowns are
addressed, it will unfortunately be impossible to collaborate.&lt;/p&gt;
&lt;p&gt;Anyway, if you got through all of this, thanks for reading, I guess.&lt;/p&gt;
</description>
      <source:markdown>
Readers will have noticed that two maintenance releases of pkgconf were cut over the weekend,
1.9.4 and 1.8.1 respectively, to address [CVE-2023-24056][cve], a pkg-config specific variation
of the now-classic &#34;[billion laughs attack][bla]&#34;.  While fixing software defects is important,
a lot went wrong with how this CVE was reported and the motivations behind its disclosure, and
for my own catharsis, I want to talk about this.

  [cve]: https://nvd.nist.gov/vuln/detail/CVE-2023-24056
  [bla]: https://en.wikipedia.org/wiki/Billion_laughs_attack

## The origin of `pkgconf`

To hopefully explain why I am so bothered by all of this, let&#39;s first understand the history of
pkgconf: a project I began noodling on in March 2011.

2011 was a particularly rough year for me.  In January, my father was diagnosed with pancreatic
cancer, and declined to disclose this to anyone.  When I came back to Oklahoma to visit my
parents in early March, I walked into my dad&#39;s house and found him jaundiced.  I drove him to
the emergency room, and was informed that he only had a few months to live due to the pancreatic
cancer he allowed to progress to stage 4.  This was *shocking* to me, especially considering I
was 23 at the time.  The stress of it led to me breaking up with my boyfriend at the time.

I did the only thing I could do given the situation: spent as much time with him as possible.
The hospital had installed Wi-Fi earlier that year, so I was able to take my computer and work
on my projects while I spent time with him.  This worked out well, because it gave us a common
ground of subjects to talk about: my dad was the person who originally pushed me into getting
involved with software engineering as a profession in the first place.  While he himself never
worked as a software engineer, he developed a number of small utilities and demo programs for
MS-DOS.  Later, he became heavily interested in BSD, and then Slackware.

During this time period, pkg-config 0.26 was released, which required either a complicated
bootstrap procedure to satisfy the glib2 requirements by hand, or a pre-existing copy of
pkg-config to exist.  Alpine was impacted by this bootstrap problem, and we ultimately decided
to hold back pkg-config on the 0.25 version because the bootstrapping problem was too complex
to solve for the pending release.

At the same time, I was looking for something, *anything* to work on that would serve as a
distraction and conversation piece.  This created an opportunity: I could work on a replacement
pkg-config implementation that did not have the bootstrap requirement that the freedesktop
implementation required.  I began working on pkgconf, specifically the .pc file parsing and
dependency graph walking code, while my dad was in the hospital.  He found talking about it
*fascinating*, and so we discussed the various aspects of implementing a parser, and walking
dependency graphs in C.  In a limited way, it was a project we collaborated on, in that I would
write code, tell him about it, and he&#39;d point out ways my assumptions probably didn&#39;t hold
true.

After my father passed away in early April, I quit working on it for a while,
until a few friends of mine decided to pick it up and experiment with it in
Gentoo and FreeBSD.  Sadly, he didn&#39;t get to see the first viable release,
or to see pkgconf integrated into Linux distributions.

## Maintaining a production-quality build tool at scale

These days, pkgconf is basically everywhere.  It is the default pkg-config implementation in
every mainstream Linux distribution except Ubuntu.  It is used heavily in embedded Linux
development and in plenty of other scenarios.  My distfiles server, `distfiles.dereferenced.org`,
logs dozens of pkgconf downloads every second of the day.

The success of pkgconf is not without its problems though.  There are aspects of the software
which, given what I know today, I would probably implement substantially differently.  The
technical debt is real.  I&#39;ve been working, however, as time permits, to improve these problems
in the `pkgconf-1.9.x` release series.

But when pkgconf does something which is unexpected, and breaks a user&#39;s build... those
interactions are rarely fun.  Many times, the user with the issue shows up on the issue
tracker, or worse, my personal inbox in a bad mood, which results in a triage experience
that is suboptimal for everyone involved.  Thankfully, this doesn&#39;t happen so much
anymore, as we have worked hard to balance compatibility and developer-friendly output
from the tool.

But as smooth as things are these days, maintaining a production build tool imposes a lot
of burden that you cannot begin to expect until you&#39;ve done it before.  It is not enough
to simply tell a user that the framework he is using is doing things wrong, for example,
underspecifying its dependencies.  You must consider &#34;self-service&#34; features: ones which
allow the user to diagnose the issues in his build and correct them himself.  By doing
so, you provide the user with a good experience, and keep support requests from annoyed
users much lower.  All of this has to be designed and implemented in production build
tools.

## The appearance of &#34;competition&#34;

The past weekend has been a wild ride for me.  I recently moved to Seattle, and have
been getting settled in.  A few people brought [u-config: a new, lean pkg-config clone][ucblog]
to my attention.  At first, I shrugged it off, and mostly would have continued to do
so.  An implementation of pkg-config on Windows would be good for me, personally, as
I do not develop pkgconf on Windows, and different people who contribute to the
maintenance of pkgconf&#39;s Windows support have different goals.  This has led to some
significant fragmentation of pkgconf on the Windows side, with different tools bundling
it supporting specific aspects of the pkg-config format in different ways.

   [ucblog]: https://nullprogram.com/blog/2023/01/18/

I have a number of social and technical observations about u-config.  Some good,
some not so good.  To start off with the social aspects: I don&#39;t particularly
appreciate the level of aggression directed toward pkgconf.  While that alone would
not normally be a turn-off for me (one has to have a reasonably thick skin when
being a FOSS maintainer), casually dropping the &#34;billion laughs&#34; 0day with a snide
comment about how we should use ASan (we do) when developing pkgconf was too much,
and the bug itself (a mistake in accounting for available buffer space during variable
expansion) was overstated.

There are a lot of good things about u-config.  By focusing on only the minimally
required functionality, the author was able to write an excellent tool which has
the potential to someday be a replacement for pkgconf.  I am open to talking about
such a deprecation, even.

However, after the initial blogpost (which contained disinformation about both
freedesktop pkg-config *and* pkgconf), there was additional disinformation from
another person who is enthusiastic about the u-config project.  Notably, he
submitted a patch which, amongst other things, could be misinterpreted by readers
to conclude that `pkgconf` does not consider `/usr/include` as a system include
path.  When configured correctly, it definitely does.  For example, on Alpine Linux:

    pestilence:~$ pkgconf --dump-personality
    Triplet: default
    DefaultSearchPaths: /usr/lib/pkgconfig /usr/share/pkgconfig
    SystemIncludePaths: /usr/include
    SystemLibraryPaths: /usr/lib 

But this [particular disinformation was merged by the author of the software][uc-disinfo]
without any attempt to verify the claim, despite how absurd it would be
if it were true.

  [uc-disinfo]: https://github.com/skeeto/u-config/commit/c069c94d77e1381cf7d67b8283601c5e79a91534#diff-c1f8e1880984a1a513fbb1c1191ea62910de9f1656c89f30d41609fb7317080bR1563

*Update (28 January 2023):* Since the initial publication of this blog, the comment
introduced in the above patch has been corrected to reflect a specific edge case
relating to `-I/usr/include` versus `-I /usr/include`.  I believe the discrepancy
in the handling of both fragments to be a bug, one which was not reported to me,
but rather discussed only in the source code comment.  The contributor of the patch
in question to u-config, in particular, has pointed to the fact that they later changed
the source code comment to clarify the issue, as part of an attempt to deflect from
the point of this blog: discussing how the u-config author and contributors have
chosen to engage in bad faith with other pkg-config implementations (especially
pkgconf) from the beginning of their project.  While I plan to fix the non-reported
discrepancy in the next pkgconf release, I will note that the u-config authors have
so far [chosen to not handle this edge case][uc-comment-2].

  [uc-comment-2]: https://github.com/skeeto/u-config/blob/7b5d32f/u-config.c#L1679-L1686

## `pkg-config` implementations do specific things for a reason

In the UNIX environment, the behavior of the system toolchain is static and
must be well-defined.  Tools which act adjacently to the system C toolchain
must behave in ways which are aware of how the C toolchain is configured
to behave.  This is why `pkgconf` checks several different environment
variables to learn about how the system toolchain has been configured, and
what deviations, if any, have been configured via the environment.

A frequent pattern in UNIX pkg-config files is to write things like:

    prefix=/usr
    includedir=${prefix}/include
    libdir=${prefix}/lib
    Package: whatever
    Version: 0
    Cflags: -I${includedir}
    Libs: -L${libdir} -lwhatever

On Windows, `pkg-config` implementations have `--define-prefix`, which is
used to override the `${prefix}` variable for this reason.
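
Overriding `${prefix}` is sufficient because the other variables are defined in
terms of it and are expanded recursively; a minimal sketch of that expansion
(illustrative only, not any implementation's actual code, and the paths are
made up):

```python
import re

# Sketch of pkg-config-style ${var} expansion: variables may
# reference other variables, so expansion recurses.
def expand(value: str, variables: dict[str, str]) -> str:
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: expand(variables[m.group(1)], variables),
                  value)

# Redefining just "prefix" (as --define-prefix does) relocates every
# derived path.
variables = {"prefix": "C:/deps", "includedir": "${prefix}/include"}
print(expand("-I${includedir}", variables))
# → -IC:/deps/include
```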

If `pkg-config` is not aware of `/usr/include` being a *system* include path,
then a disaster can happen when querying for multiple dependencies at the same
time.  Consider this other pkg-config file:

    prefix=/usr
    includedir=${prefix}/include/OtherLib
    libdir=${prefix}/lib
    Package: OtherLib
    Version: 0
    Cflags: -I${includedir}
    Libs: -L${libdir} -lother

Now let&#39;s say that `OtherLib` has a `/usr/include/OtherLib/math.h` file which
uses `#include_next` to enhance the `math.h` header.  A real-world example of
a library which does this is `libbsd`.  Well, if you query pkg-config with
`pkg-config --cflags --libs whatever OtherLib`, then you will get:

    pestilence:~$ pkgconf --with-path=examples/ --personality=examples/broken.personality whatever OtherLib
    -I/usr/include -I/usr/include/OtherLib -lwhatever -lother

This means that `/usr/include/math.h` will be preferred over `/usr/include/OtherLib/math.h`,
and your build will fail.
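
This failure mode is exactly what system include path filtering prevents: when
the implementation knows `/usr/include` is a system path, the redundant
`-I/usr/include` fragment is dropped, so `-I/usr/include/OtherLib` is searched
first and the `#include_next` chain resolves correctly.  A rough sketch of such
filtering (not pkgconf's actual code):

```python
# Hypothetical sketch: drop -I fragments pointing at configured
# system include paths, which the compiler already searches last.
SYSTEM_INCLUDE_PATHS = {"/usr/include"}  # e.g. from the personality

def filter_cflags(fragments: list[str]) -> list[str]:
    kept = []
    for fragment in fragments:
        path = fragment[2:].strip() if fragment.startswith("-I") else None
        if path in SYSTEM_INCLUDE_PATHS:
            continue  # redundant: already a default search path
        kept.append(fragment)
    return kept

print(" ".join(filter_cflags(
    ["-I/usr/include", "-I/usr/include/OtherLib", "-lwhatever"])))
# → -I/usr/include/OtherLib -lwhatever
```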

So this type of filtering, and the other types of filtering that pkgconf does, are very important
in the UNIX environment.  The author of u-config will unfortunately have to learn these things
one by one as users come to him with bug reports.

There is probably an alternate reality where u-config and pkgconf work together to deprecate pkgconf,
and someday I hope that will be the reality here.  But until the disinformation and putdowns are
addressed, it will unfortunately be impossible to collaborate.

Anyway, if you got through all of this, thanks for reading, I guess.
</source:markdown>
    </item>
    
    <item>
      <title>Building fair webs of trust by leveraging the OCAP model</title>
      <link>https://ariadne.space/2022/12/02/building-fair-webs-of-trust.html</link>
      <pubDate>Fri, 02 Dec 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/12/03/building-fair-webs-of-trust.html</guid>
      <description>&lt;p&gt;Since the beginning of the Internet, determining the trustworthiness
of participants and published information has been a significant
point of contention.
Many systems have been proposed to solve these underlying concerns,
usually pertaining to specific niches and communities, but these
pre-existing solutions are nebulous at best.
How can we build infrastructure for truly democratic Webs of Trust?&lt;/p&gt;
&lt;h2 id=&#34;fairness-in-reputation-based-systems&#34;&gt;Fairness in reputation-based systems&lt;/h2&gt;
&lt;p&gt;When considering the design of a reputation-based system, &lt;em&gt;fairness&lt;/em&gt;
must be paramount, but what is &lt;em&gt;fairness&lt;/em&gt; in this context?
A reputation-based system can be considered &lt;em&gt;fair&lt;/em&gt; if it appropriately
balances the concerns of the data publisher, the data subject, and
the data consumer.
Regulatory frameworks such as the GDPR attempt to provide guidance
concerning how this balance can be accomplished in the general sense
of building internet services, but these frameworks are large and
complicated, and as such make it difficult to provide a definition
which is adequate for a reputation-based trust system.&lt;/p&gt;
&lt;p&gt;To understand how these concerns must be balanced, we must understand
the underlying risks for each participant in a reputation-based system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;data subject&lt;/strong&gt; is at risk of harm to their professional
reputation due to annotations they did not consent to, and mistakes
in those annotations.
This is a problem which has already captured regulatory ire, as I
will explain later.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;data publisher&lt;/strong&gt; is at risk of being sued for defamation due to
the annotations they publish.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;data consumer&lt;/strong&gt; is at risk of being misled by inaccurate
annotations they consume.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A &lt;em&gt;fair&lt;/em&gt; reputation-based system must attempt to provide an adequate
balance between these concerns through active harm reduction in its
design:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The harm to the &lt;strong&gt;data subject&lt;/strong&gt; from misleading annotations can be
reduced by blinding the identity of the data subject.&lt;/li&gt;
&lt;li&gt;The harm to the &lt;strong&gt;data publisher&lt;/strong&gt; from misleading annotations can
also be reduced by blinding the identity of the data subject.&lt;/li&gt;
&lt;li&gt;The harm to the &lt;strong&gt;data consumer&lt;/strong&gt; from misleading annotations can be
reduced by allowing them to consume annotations from multiple sources.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;shinigami-eyes-or-how-designing-for-fairness-can-be-difficult&#34;&gt;Shinigami Eyes, or how designing for fairness can be difficult&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;https://shinigami-eyes.github.io/&#34;&gt;Shinigami Eyes&lt;/a&gt; browser extension was designed to help people
establish trust in various web resources using a reputation-based system.
In general, the author attempted to make thoughtful choices to ensure
the system was reasonably fair in its design.
However, the system has &lt;a href=&#34;https://eyereaper.evelyn.moe/&#34;&gt;a number of flaws, both technical and social&lt;/a&gt;,
which highlight how building systems of trust requires a detailed
understanding concerning how the underlying primitives interact and
the consequences of those interactions.&lt;/p&gt;
&lt;h3 id=&#34;shinigami-eyes-and-blinding&#34;&gt;Shinigami Eyes and Blinding&lt;/h3&gt;
&lt;p&gt;As already noted, a &lt;em&gt;fair&lt;/em&gt; reputation-based system must blind the identity
of the data subject to protect both the data subject and data publisher.
The approach used by Shinigami Eyes was to use a bloom filter constructed
with a 32-bit &lt;a href=&#34;http://www.isthe.com/chongo/tech/comp/fnv/index.html&#34;&gt;&lt;code&gt;FNV-1a&lt;/code&gt; hash&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The FNV family of hashes is non-cryptographic and scales up to 1024 bits.
The original FNV-1 works by multiplying the current hash value by the
designated FNV prime, then XORing in the current byte&amp;rsquo;s value.
The FNV-1a variant, which is the one Shinigami Eyes uses, swaps the
multiplication and XOR steps.&lt;/p&gt;
&lt;p&gt;The use of a bloom filter is an acceptable blinding method, assuming that
the underlying hash provides sufficient resolution, such as a 256-bit
or 512-bit hash.
Presumably, due to the constraints of having to run as a JavaScript extension,
the weak 32-bit &lt;code&gt;FNV-1a&lt;/code&gt; hash was used instead.
Because of this, while the reputation lists used by Shinigami Eyes were
acceptably blinded, there was an extremely &lt;a href=&#34;https://twitter.com/x0s1jpnq2sk2&#34;&gt;high risk of false positives
caused by hash collisions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Concerns about the technical implementation of the Shinigami Eyes extension
led Datatilsynet, the Norwegian GDPR regulatory agency, to &lt;a href=&#34;https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/&#34;&gt;ban the extension&lt;/a&gt;
at the end of 2021, and development of the extension appears to have
ended as a result of their initial inquiry.&lt;/p&gt;
&lt;h2 id=&#34;can-we-build-systems-like-shinigami-eyes-more-robustly&#34;&gt;Can we build systems like Shinigami Eyes more robustly?&lt;/h2&gt;
&lt;p&gt;The main reason Shinigami Eyes gained the attention of Datatilsynet was
the centralized nature of the data processing.
Can we build a system which avoids centralized data processing and promotes
democratic participation?
Yes, it is quite easy, but like most things, the challenge will be delivering
a good user experience.&lt;/p&gt;
&lt;h3 id=&#34;leveraging-the-ocap-model-to-build-a-robust-solution&#34;&gt;Leveraging the OCAP model to build a robust solution&lt;/h3&gt;
&lt;p&gt;The largest problem in building this system is ensuring that the published
reputation data is reliably blinded.
To this end, I propose that feeds be a simple dataset containing a set of
blinded hashes and annotations.
The physical representation of the dataset does not matter, though keeping
it as simple as possible will expand the number of places where the data
can be consumed.&lt;/p&gt;
&lt;p&gt;In the Object Capability model, we can think of the physical feed as an
&lt;em&gt;object&lt;/em&gt;, and a blinding key as a &lt;em&gt;capability&lt;/em&gt; to access that object in a
useful way.
You have to have both in order for either to be useful.&lt;/p&gt;
&lt;p&gt;A participant can publish multiple copies of their feed, with different
blinding keys for each friend they wish to share it with, or they can
choose to publish a single key and share the same key with every friend,
or even the public at large.
Users can then choose which feeds they want to use when making trust
decisions from the collection of feeds and blinding keys they have been
given.&lt;/p&gt;
&lt;p&gt;By comparison to Shinigami Eyes, this better satisfies the conditions for
&lt;em&gt;fairness&lt;/em&gt;: there is effectively no risk of a false positive, the contents of the
reputation lists remain private, and publishers can choose to consent to
data sharing requests however they wish.&lt;/p&gt;
&lt;h3 id=&#34;choosing-a-reasonable-set-of-primitives&#34;&gt;Choosing a reasonable set of primitives&lt;/h3&gt;
&lt;p&gt;To build such a system, I would probably personally choose to use
&lt;code&gt;HMAC-SHA3-256&lt;/code&gt; as the blinding primitive.
This provides a good balance between collision protection,
cryptographic strength, and hash resolution.
A scheme which provides less than 256 bits of hash resolution should
be avoided due to the risk of collisions.&lt;/p&gt;
&lt;p&gt;I would distribute the feeds as CSV files.
This would give users the most flexibility in managing feeds: they
could distribute different feeds with different meanings, and include
extended data alongside the blinded hash as a form of annotation.&lt;/p&gt;
&lt;p&gt;On the client side, I would calculate sets of blinded hashes for each
possible subset of the URI, all the way to the parent domain.
By doing so, it would be possible for feeds to match against a large
number of child URIs instead of having to list them all manually.&lt;/p&gt;
&lt;p&gt;Implementations should store the learned hashes in a &lt;a href=&#34;https://en.wikipedia.org/wiki/Radix_tree&#34;&gt;radix trie&lt;/a&gt;.
This allows the hash lookups to be done in constant time, as well
as allowing for automatic bucketing, which can be helpful for
implementing quorum requirements.&lt;/p&gt;
&lt;h2 id=&#34;things-we-can-build-with-this&#34;&gt;Things we can build with this&lt;/h2&gt;
&lt;p&gt;The use of friend-to-friend reputation-based systems can be powerful.
They provide accountability (as you know who you are getting your
data from) and collaboration (your friends can consume your data in
exchange).&lt;/p&gt;
&lt;p&gt;They can be used in the way Shinigami Eyes was used: to allow interested
parties to identify resources they should trust or distrust, but they can
also be used to enable collaborative blocking amongst friends and system
administrators.&lt;/p&gt;
&lt;p&gt;They can also be used to determine if e-mail domains or URLs inside e-mails
are actually trustworthy.
The possibilities are truly endless.&lt;/p&gt;
</description>
      <source:markdown>
Since the beginning of the Internet, determining the trustworthiness
of participants and published information has been a significant
point of contention.
Many systems have been proposed to solve these underlying concerns,
usually pertaining to specific niches and communities, but these
pre-existing solutions are nebulous at best.
How can we build infrastructure for truly democratic Webs of Trust?

## Fairness in reputation-based systems

When considering the design of a reputation-based system, *fairness*
must be paramount, but what is *fairness* in this context?
A reputation-based system can be considered *fair* if it appropriately
balances the concerns of the data publisher, the data subject, and
the data consumer.
Regulatory frameworks such as the GDPR attempt to provide guidance
concerning how this balance can be accomplished in the general sense
of building internet services, but these frameworks are large and
complicated, and as such make it difficult to provide a definition
which is adequate for a reputation-based trust system.

To understand how these concerns must be balanced, we must understand
the underlying risks for each participant in a reputation-based system:

- The **data subject** is at risk of harm to their professional
  reputation due to annotations they did not consent to, and mistakes
  in those annotations.
  This is a problem which has already captured regulatory ire, as I
  will explain later.
- The **data publisher** is at risk of being sued for defamation due to
  the annotations they publish.
- The **data consumer** is at risk of being misled by inaccurate
  annotations they consume.

A *fair* reputation-based system must attempt to provide an adequate
balance between these concerns through active harm reduction in its
design:

- The harm to the **data subject** from misleading annotations can be
  reduced by blinding the identity of the data subject.
- The harm to the **data publisher** from misleading annotations can
  also be reduced by blinding the identity of the data subject.
- The harm to the **data consumer** from misleading annotations can be
  reduced by allowing them to consume annotations from multiple sources.

## Shinigami Eyes, or how designing for fairness can be difficult

The [Shinigami Eyes][se] browser extension was designed to help people
establish trust in various web resources using a reputation-based system.
In general, the author attempted to make thoughtful choices to ensure
the system was reasonably fair in its design.
However, the system has [a number of flaws, both technical and social][er],
which highlight how building systems of trust requires a detailed
understanding concerning how the underlying primitives interact and
the consequences of those interactions.

   [se]: https://shinigami-eyes.github.io/
   [er]: https://eyereaper.evelyn.moe/

### Shinigami Eyes and Blinding

As already noted, a *fair* reputation-based system must blind the identity
of the data subject to protect both the data subject and data publisher.
The approach used by Shinigami Eyes was to use a bloom filter constructed
with a 32-bit [`FNV-1a` hash][fnv].

   [fnv]: http://www.isthe.com/chongo/tech/comp/fnv/index.html

The FNV family of hashes is non-cryptographic and scales up to 1024 bits.
The original FNV-1 works by multiplying the current hash value by the
designated FNV prime, then XORing in the current byte&#39;s value.
The FNV-1a variant, which is the one Shinigami Eyes uses, swaps the
multiplication and XOR steps.
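
For reference, a minimal sketch of the 32-bit FNV-1a used by the extension
(illustrative, not the extension's actual code):

```python
# 32-bit FNV-1a: XOR in each input byte, then multiply by the FNV
# prime, truncating to 32 bits after each step.
FNV32_PRIME = 0x01000193
FNV32_OFFSET_BASIS = 0x811C9DC5

def fnv1a_32(data: bytes) -> int:
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h

print(hex(fnv1a_32(b"a")))
# → 0xe40c292c
```

With only 32 bits of output, the birthday bound means collisions between
unrelated inputs become likely after only tens of thousands of entries.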

The use of a bloom filter is an acceptable blinding method, assuming that
the underlying hash provides sufficient resolution, such as a 256-bit
or 512-bit hash.
Presumably, due to the constraints of having to run as a JavaScript extension,
the weak 32-bit `FNV-1a` hash was used instead.
Because of this, while the reputation lists used by Shinigami Eyes were
acceptably blinded, there was an extremely [high risk of false positives
caused by hash collisions][collided-account].

   [collided-account]: https://twitter.com/x0s1jpnq2sk2

Concerns about the technical implementation of the Shinigami Eyes extension
led Datatilsynet, the Norwegian GDPR regulatory agency, to [ban the extension][se-ban]
at the end of 2021, and development of the extension appears to have
ended as a result of their initial inquiry.

   [se-ban]: https://www.datatilsynet.no/en/news/2021/varsler-forbud-mot-nettleserutvidelsen-shinigami-eyes-i-norge/

## Can we build systems like Shinigami Eyes more robustly?

The main reason Shinigami Eyes gained the attention of Datatilsynet was
the centralized nature of the data processing.
Can we build a system which avoids centralized data processing and promotes
democratic participation?
Yes, it is quite easy, but like most things, the challenge will be delivering
a good user experience.

### Leveraging the OCAP model to build a robust solution

The largest problem in building this system is ensuring that the published
reputation data is reliably blinded.
To this end, I propose that feeds be a simple dataset containing a set of
blinded hashes and annotations.
The physical representation of the dataset does not matter, though keeping
it as simple as possible will expand the number of places where the data
can be consumed.

In the Object Capability model, we can think of the physical feed as an
*object*, and a blinding key as a *capability* to access that object in a
useful way.
You have to have both in order for either to be useful.

A participant can publish multiple copies of their feed, with different
blinding keys for each friend they wish to share it with, or they can
choose to publish a single key and share the same key with every friend,
or even the public at large.
Users can then choose which feeds they want to use when making trust
decisions from the collection of feeds and blinding keys they have been
given.

By comparison to Shinigami Eyes, this better satisfies the conditions for
*fairness*: there is effectively no risk of a false positive, the contents of the
reputation lists remain private, and publishers can choose to consent to
data sharing requests however they wish.

### Choosing a reasonable set of primitives

To build such a system, I would probably personally choose to use
`HMAC-SHA3-256` as the blinding primitive.
This provides a good balance between collision protection,
cryptographic strength, and hash resolution.
A scheme which provides less than 256 bits of hash resolution should
be avoided due to the risk of collisions.
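
A blinding step along these lines could be sketched as follows (the key and
URI here are illustrative assumptions, not part of any specification):

```python
import hashlib
import hmac

def blind(feed_key: bytes, subject_uri: str) -> str:
    """Blind a subject identifier with HMAC-SHA3-256: only holders
    of feed_key can recompute the hash and test feed membership."""
    return hmac.new(feed_key, subject_uri.encode("utf-8"),
                    hashlib.sha3_256).hexdigest()

key = b"per-feed secret shared only with trusted friends"  # illustrative
print(blind(key, "https://example.com/some/resource"))
```

Publishing the same annotations under several different keys is then just a
matter of re-running the blinding step once per recipient.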

I would distribute the feeds as CSV files.
This would give users the most flexibility in managing feeds: they
could distribute different feeds with different meanings, and include
extended data alongside the blinded hash as a form of annotation.

On the client side, I would calculate sets of blinded hashes for each 
possible subset of the URI, all the way to the parent domain.
By doing so, it would be possible for feeds to match against a large
number of child URIs instead of having to list them all manually.
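
One plausible reading of that client-side step, sketched below (the exact
prefix rules are my assumption, not a specification):

```python
from urllib.parse import urlsplit

def uri_prefixes(uri: str) -> list[str]:
    """Enumerate the ancestor prefixes of a URI down to the bare
    domain, so a feed entry for a parent also matches its children."""
    parts = urlsplit(uri)
    base = f"{parts.scheme}://{parts.netloc}"
    prefixes = [base]
    path = ""
    for segment in filter(None, parts.path.split("/")):
        path += "/" + segment
        prefixes.append(base + path)
    return prefixes

print(uri_prefixes("https://example.com/a/b"))
# → ['https://example.com', 'https://example.com/a',
#    'https://example.com/a/b']
```

The client would then blind each prefix with the feed's key and look every
one of them up in the feed.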

Implementations should store the learned hashes in a [radix trie][rt].
This allows the hash lookups to be done in constant time, as well
as allowing for automatic bucketing, which can be helpful for
implementing quorum requirements.

   [rt]: https://en.wikipedia.org/wiki/Radix_tree
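
As a sketch of that lookup structure, even a plain (uncompressed) trie over
the hex digits of the blinded hash shows the bucketing behavior; a production
radix trie would additionally compress shared prefixes.  This code is
illustrative only, and the hash value is made up:

```python
# Minimal uncompressed trie keyed by hex digits of the blinded hash.
# Lookup walks one node per digit, so for fixed-length hashes the
# cost is constant; shared prefixes naturally bucket entries.
def trie_insert(root: dict, key: str, value: str) -> None:
    node = root
    for ch in key:
        node = node.setdefault(ch, {})
    node["$value"] = value

def trie_get(root: dict, key: str):
    node = root
    for ch in key:
        node = node.get(ch)
        if node is None:
            return None
    return node.get("$value")

index = {}
trie_insert(index, "3f2a9c", "distrust")  # hypothetical blinded hash
print(trie_get(index, "3f2a9c"))
# → distrust
```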

## Things we can build with this

The use of friend-to-friend reputation-based systems can be powerful.
They provide accountability (as you know who you are getting your
data from) and collaboration (your friends can consume your data in
exchange).

They can be used in the way Shinigami Eyes was used: to allow interested
parties to identify resources they should trust or distrust, but they can
also be used to enable collaborative blocking amongst friends and system
administrators.

They can also be used to determine if e-mail domains or URLs inside e-mails
are actually trustworthy.
The possibilities are truly endless.
</source:markdown>
    </item>
    
    <item>
      <title>Twitter&#39;s demise is ActivityPub&#39;s future</title>
      <link>https://ariadne.space/2022/11/11/twitters-demise-is-activitypubs-future.html</link>
      <pubDate>Fri, 11 Nov 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/11/12/twitters-demise-is-activitypubs-future.html</guid>
      <description>&lt;p&gt;Earlier today, I deleted all of my tweets and left Twitter forever.  While I
plan on leaving a nightlight thread for a while, I will eventually close my
account, assuming Elon doesn&amp;rsquo;t do it for me.&lt;/p&gt;
&lt;p&gt;The past week has been an emotional rollercoaster for me as I have watched
everything play out.&lt;/p&gt;
&lt;p&gt;I was one of the original fediverse users when Indymedia UK stood up the
&lt;code&gt;indy.im&lt;/code&gt; StatusNet instance at the end of 2010.
After some time, Evan Prodromou got bored with the StatusNet code base
and started Pump instead, with the network losing the largest instance
at that time, &lt;code&gt;identi.ca&lt;/code&gt;.
With the network fragmented as a result of that switch, I got bored of
it and started using Twitter instead.&lt;/p&gt;
&lt;p&gt;Eventually StatusNet was forked by Matt Lee and a few other FSF staffers and
became GNU Social.
I was not really around during this time, but it was around that time that
GamerGate happened, which created a network where half of the users were
Indymedia contributors and the other half were the initial seeds of the
alt-right.&lt;/p&gt;
&lt;p&gt;While I was not heavily involved from a development perspective in the early
days of what we now call the fediverse, this began to change in late 2016 when
Eugen Rochko started Mastodon.
I was an early adopter of Mastodon, deploying Mastodon 0.6 on Heroku, using the
&lt;code&gt;mastodon.dereferenced.org&lt;/code&gt; domain for my account.
But running Mastodon on Heroku (and later Scalingo) was expensive.  I did not
want to manage a Rails application by hand, and I hadn&amp;rsquo;t started using Docker or
Kubernetes yet.&lt;/p&gt;
&lt;p&gt;In early 2018, a developer pseudonymously known as lain began adding ActivityPub
federation support to Pleroma, and he convinced me to try it out as an alternative
to running Mastodon.
I found Pleroma and developing with Elixir to be exciting and fresh, compared to
other technology I was working with at the time.  I felt empowered to start doing
serious hacking on ActivityPub as a result of writing patches to Pleroma and
sending them to lain.&lt;/p&gt;
&lt;p&gt;After a while, I became a Pleroma developer with commit rights.
I felt like we could use the same strategy I used to promote Alpine to promote Pleroma:
build a coalition of willing influencers to demonstrate the value proposition of
self-hosted social networks for user freedom, and so I started working on building
a group around it.
Because I was showing it to friends I already had, Pleroma grew into being a
project where many of the contributors were from queer and marginalized backgrounds
similar to mine.
Everything was going fine.  As a team, we built a lot of features that are still
innovative in this space, such as MRF and building the LitePub profile of ActivityPub,
which shifted the protocol from being a Content &lt;em&gt;Distribution&lt;/em&gt; protocol to being a
Content &lt;em&gt;Advertisement&lt;/em&gt; protocol.&lt;/p&gt;
&lt;p&gt;Towards the end of 2019, it started going to shit.  By that time, I was running a
public instance, and the database kept having index corruption issues on a daily
basis.
Around the same time, the Soapbox project was launched, and they decided to use
Pleroma as their backend.  This led to a lot of friction inside the project, because
the Soapbox author had a tendency to share &lt;a href=&#34;https://blog.alexgleason.me/trans/&#34;&gt;his ideological positions&lt;/a&gt;
inside the project space as part of his anti-trans activism.
I wound up leaving Pleroma toward the middle of 2020 because of the scalability
issues in the database with Pleroma 2.0 and the lack of any effort to maintain a
welcoming space for everyone.&lt;/p&gt;
&lt;p&gt;I decided to take a break from the fediverse because of that decision, because I
felt a break was warranted.  I decided to try Twitter in earnest during that time,
but to be honest, I&amp;rsquo;ve never found using Twitter to be enjoyable in the same way
as I found the fediverse to be enjoyable.&lt;/p&gt;
&lt;p&gt;As I said a few weeks ago, I think that &lt;a href=&#34;https://ariadne.space/2022/10/27/the-internet-is-broken-due-to-structural-injustice/&#34;&gt;commercial microblogging&lt;/a&gt; has been
an absolute disaster for our society.  Relationships on Twitter are parasocial
and transactional, which leads to poisonous behavior, while relationships in the
fediverse are largely grounded and mutual.&lt;/p&gt;
&lt;p&gt;In April of this year, Elon Musk announced his intention to buy Twitter.  Based
on the experience of watching a &lt;a href=&#34;https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/&#34;&gt;rich fanatic purchase and then ruin something he
deeply cared about&lt;/a&gt; and my experience of being a Tesla owner, I thought it
would be relevant to set up an &lt;a href=&#34;https://social.treehouse.systems/&#34;&gt;escape hatch&lt;/a&gt;.  Others were of the same
mind, and we shared notes.&lt;/p&gt;
&lt;p&gt;With the events of the past few weeks, I strongly believe that Twitter&amp;rsquo;s demise is
going to bring all of the proprietary social silos crashing down.  People are starting
to realize that trading freedom for the alleged convenience of using a proprietary
network isn&amp;rsquo;t worth it.  Although not perfect, ActivityPub is eating the world: there&amp;rsquo;s
now a million new users a week and this number is growing.&lt;/p&gt;
&lt;h3 id=&#34;-which-brings-me-to-the-not-so-fun-part-the-things-that-arent-going-so-well&#34;&gt;&amp;hellip; which brings me to the not so fun part, the things that aren&amp;rsquo;t going so well.&lt;/h3&gt;
&lt;p&gt;Although the fediverse is a decentralized and disparate network with many different
groups with their own cultural norms, some of them have tried to enforce their cultural
norms on the new users.  This is normal and to be expected to some extent, as people
don&amp;rsquo;t like big changes.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t want to get into the nuances of some of these conversations.  What I do want
to say is that the fediverse is a diverse network of different people who bring their
own styles and approaches to posting and content curation.  It is entirely fine to
bring your whole self to the conversation in uncensored form if that is what you feel
is right to do.  Do what &lt;em&gt;you&lt;/em&gt; feel is right, and don&amp;rsquo;t worry about people muting or
blocking your account, because you&amp;rsquo;re not here for &lt;em&gt;them&lt;/em&gt;, you&amp;rsquo;re here for &lt;em&gt;yourself&lt;/em&gt;
and you will meet likeminded people regardless of who blocks you.&lt;/p&gt;
&lt;p&gt;The other problem is, of course, a question of scaling anti-abuse tools.  Many have
posted screenshots of abuse they have received, and it comes from a segment of the
larger network where the culture is most diplomatically described as &amp;ldquo;player vs
player.&amp;rdquo;  It is fine for those instances to exist, but we need to build better tools
so that newcomers can be aware of segments of the network that they may want to
exclude themselves from: what we have today where admins informally share threat data
with each other is hard to scale upwards.&lt;/p&gt;
&lt;p&gt;In general these are good problems to have, because they are easy to overcome.  Overall
the future is looking bright.&lt;/p&gt;
&lt;h3 id=&#34;which-instances-are-you-recommending-right-now&#34;&gt;Which instances are you recommending right now?&lt;/h3&gt;
&lt;p&gt;At the moment, I am trying to recommend instances which have a moderation policy
aligned with providing a safe space for marginalized identities like mine which
are also targeted at technical people.&lt;/p&gt;
&lt;p&gt;Some recommendations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://hachyderm.io&#34;&gt;hachyderm.io&lt;/a&gt;, running Mastodon 3.5.3 and administrated
by Kris Nova and other volunteers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://social.restless.systems&#34;&gt;social.restless.systems&lt;/a&gt;, running Mastodon 4.0 and
administrated by NCommander, a tech YouTuber.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://social.treehouse.systems&#34;&gt;social.treehouse.systems&lt;/a&gt;, run by me and other
volunteers.  It also runs Mastodon 4.0.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I recommend these instances because their administrative capabilities far
exceed those required by the Mastodon Server Covenant: they are run by teams
with marginalized backgrounds and extensive SRE experience.&lt;/p&gt;
&lt;p&gt;I am planning to put together a larger tool for finding instances which have been stood
up as part of this new wave of SRE-backed quasi-professional instances.&lt;/p&gt;
&lt;h3 id=&#34;what-next&#34;&gt;What next?&lt;/h3&gt;
&lt;p&gt;Next time, I will write a bit about how my own instance is put together and how it has
evolved over the past few months.  Stay tuned for that one.&lt;/p&gt;
</description>
      <source:markdown>
Earlier today, I deleted all of my tweets and left Twitter forever.  While I
plan on leaving a nightlight thread for a while, I will eventually close my
account, assuming Elon doesn&#39;t do it for me.

The past week has been an emotional rollercoaster for me as I have watched
everything play out.

I was one of the original fediverse users when Indymedia UK stood up the
`indy.im` StatusNet instance at the end of 2010.
After some time, Evan Prodromou got bored with the StatusNet code base
and started Pump instead, with the network losing the largest instance
at that time, `identi.ca`.
With the network fragmented as a result of that switch, I got bored of
it and started using Twitter instead.

Eventually StatusNet was forked by Matt Lee and a few other FSF staffers and
became GNU Social.
I was not really around during this time, but it was around that time that
GamerGate happened, which created a network where half of the users were
Indymedia contributors and the other half were the initial seeds of the
alt-right.

While I was not heavily involved from a development perspective in the early
days of what we now call the fediverse, this began to change in late 2016 when
Eugen Rochko started Mastodon.
I was an early adopter of Mastodon, deploying Mastodon 0.6 on Heroku, using the
`mastodon.dereferenced.org` domain for my account.
But running Mastodon on Heroku (and later Scalingo) was expensive.  I did not
want to manage a Rails application by hand, and I hadn&#39;t started using Docker or
Kubernetes yet.

In early 2018, a developer pseudonymously known as lain began adding ActivityPub
federation support to Pleroma, and he convinced me to try it out as an alternative
to running Mastodon.
I found Pleroma and developing with Elixir to be exciting and fresh, compared to
other technology I was working with at the time.  I felt empowered to start doing
serious hacking on ActivityPub as a result of writing patches to Pleroma and
sending them to lain.

After a while, I became a Pleroma developer with commit rights.
I felt like we could use the same strategy I used to promote Alpine to promote Pleroma:
build a coalition of willing influencers to demonstrate the value proposition of
self-hosted social networks for user freedom, and so I started working on building
a group around it.
Because I was showing it to friends I already had, Pleroma grew into being a
project where many of the contributors were from queer and marginalized backgrounds
similar to mine.
Everything was going fine.  As a team, we built a lot of features that are still
innovative in this space, such as MRF and building the LitePub profile of ActivityPub,
which shifted the protocol from being a Content *Distribution* protocol to being a
Content *Advertisement* protocol.

Towards the end of 2019, it started going to shit.  By that time, I was running a
public instance, and the database kept having index corruption issues on a daily
basis.
Around the same time, the Soapbox project was launched, and they decided to use
Pleroma as their backend.  This led to a lot of friction inside the project, because
the Soapbox author had a tendency to share [his ideological positions][ag-trans]
inside the project space as part of his anti-trans activism.
I wound up leaving Pleroma toward the middle of 2020 because of the scalability
issues in the database with Pleroma 2.0 and the lack of any effort to maintain a
welcoming space for everyone.

   [ag-trans]: https://blog.alexgleason.me/trans/

After leaving, I took a break from the fediverse, as I felt one was
warranted.  I decided to try Twitter in earnest during that time,
but to be honest, I&#39;ve never found using Twitter to be enjoyable in the same way
as I found the fediverse to be enjoyable.

As I said a few weeks ago, I think that [commercial microblogging][cmb] has been
an absolute disaster for our society.  Relationships on Twitter are parasocial
and transactional, which leads to poisonous behavior, while relationships in the
fediverse are largely grounded and mutual.

   [cmb]: https://ariadne.space/2022/10/27/the-internet-is-broken-due-to-structural-injustice/

In April of this year, Elon Musk announced his intention to buy Twitter.  Based
on the experience of watching a [rich fanatic purchase and then ruin something he
deeply cared about][leenode] and my experience of being a Tesla owner, I thought it
would be relevant to set up an [escape hatch][th-masto].  Others were of the same
mind, and we shared notes.

   [leenode]: https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/
   [th-masto]: https://social.treehouse.systems/

With the events of the past few weeks, I strongly believe that Twitter&#39;s demise is
going to bring all of the proprietary social silos crashing down.  People are starting
to realize that trading freedom for the alleged convenience of using a proprietary
network isn&#39;t worth it.  Although not perfect, ActivityPub is eating the world: there&#39;s
now a million new users a week and this number is growing.

### ... which brings me to the not so fun part, the things that aren&#39;t going so well.

Although the fediverse is a decentralized and disparate network with many different
groups with their own cultural norms, some of them have tried to enforce their cultural
norms on the new users.  This is normal and to be expected to some extent, as people
don&#39;t like big changes.

I don&#39;t want to get into the nuances of some of these conversations.  What I do want
to say is that the fediverse is a diverse network of different people who bring their
own styles and approaches to posting and content curation.  It is entirely fine to
bring your whole self to the conversation in uncensored form if that is what you feel
is right to do.  Do what *you* feel is right, and don&#39;t worry about people muting or
blocking your account, because you&#39;re not here for *them*, you&#39;re here for *yourself*
and you will meet likeminded people regardless of who blocks you.

The other problem is, of course, a question of scaling anti-abuse tools.  Many have
posted screenshots of abuse they have received, and it comes from a segment of the
larger network where the culture is most diplomatically described as &#34;player vs
player.&#34;  It is fine for those instances to exist, but we need to build better tools
so that newcomers can be aware of segments of the network that they may want to
exclude themselves from: the informal sharing of threat data between
admins that we have today is hard to scale.

In general, these are good problems to have, because they are easy to overcome.  Overall
the future is looking bright.

### Which instances are you recommending right now?

At the moment, I am trying to recommend instances that have a moderation policy
aligned with providing a safe space for marginalized identities like mine, and
that are also aimed at technical people.

Some recommendations:

- [hachyderm.io](https://hachyderm.io), running Mastodon 3.5.3 and administered
  by Kris Nova and other volunteers.

- [social.restless.systems](https://social.restless.systems), running Mastodon 4.0 and
  administered by NCommander, a tech YouTuber.

- [social.treehouse.systems](https://social.treehouse.systems), run by me and other
  volunteers.  It also runs Mastodon 4.0.

I recommend these instances because their administrative capabilities far exceed
those required by the Mastodon Server Covenant: they are run by teams with
marginalized backgrounds and extensive SRE experience.

I am planning to put together a larger tool for finding instances which have been stood
up as part of this new wave of SRE-backed quasi-professional instances.

### What next?

Next time, I will write a bit about how my own instance is put together and how it has
evolved over the past few months.  Stay tuned for that one.
</source:markdown>
    </item>
    
    <item>
      <title>The internet is broken due to structural injustice</title>
      <link>https://ariadne.space/2022/10/26/the-internet-is-broken-due.html</link>
      <pubDate>Wed, 26 Oct 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/10/27/the-internet-is-broken-due.html</guid>
      <description>&lt;p&gt;Over the past few years, I&amp;rsquo;ve come to realize that the Internet as we know
it is utterly broken.  Lately, I&amp;rsquo;ve also been pondering how participants
in the modern Internet have enabled and perpetuated harm to society at
large.  Repeatedly, we have seen the independence of the commons chipped
away by powerful men who wish for participants to serve their own whims,
while those who raise concerns with these developments are either shunned,
banned or doxed.&lt;/p&gt;
&lt;p&gt;On Friday, October 28th, we will see another demonstration of these structural
injustices where the commons takes another loss to the whims of a powerful man.
Last time, &lt;a href=&#34;https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/&#34;&gt;it was freenode&amp;rsquo;s takeover by Andrew Lee&lt;/a&gt;, and this time it
will be Twitter&amp;rsquo;s takeover by Elon Musk.  No, really, the deal is already
concluded: &lt;a href=&#34;https://seekingalpha.com/news/3896099-twitter-delisting-from-nyse-effective-on-friday-after-musk-completes-deal&#34;&gt;TWTR will be delisted from the NYSE on Friday&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Will this be the end of Twitter?  Probably not, but it will be the end of the
current relationship the commons shares with Twitter.  Instead of acting as
a self-described &amp;ldquo;public square,&amp;rdquo; it will further evolve into a chaotic
cacophony of trolling and counter-trolling driven in the name of algorithmic
engagement.  Some will move to other microblogging services and networks,
and will likely discover that everything which made Twitter horrible likely
applies in some way to the replacement.&lt;/p&gt;
&lt;h2 id=&#34;are-social-platforms-working-as-designed&#34;&gt;Are social platforms working as designed?&lt;/h2&gt;
&lt;p&gt;The reality is that &lt;strong&gt;microblogging sucks&lt;/strong&gt;, but Twitter managed to make it
addictive for a few reasons, &lt;a href=&#34;https://joinmastodon.org/&#34;&gt;which is &lt;em&gt;why&lt;/em&gt; the most popular alternative,
Mastodon&lt;/a&gt;, is basically a copy of the underlying formula, but tweaked to
work on the ActivityPub federated network (the so-called fediverse).&lt;/p&gt;
&lt;p&gt;The formula is not that hard to grasp if you understand how people
think and react to stimulation.  People are inherently social creatures,
and because of the formula used by Twitter, have tried their best to use
Twitter &lt;em&gt;despite&lt;/em&gt; the inherent conceptual flaws behind microblogging.&lt;/p&gt;
&lt;p&gt;If you&amp;rsquo;ve ever sat down at a slot machine, you will likely note that they are
constantly making noises as you interact with them.  These sounds are
designed to stimulate the reward center in your brain and thus cause it
to release endorphins.  In the same way, microblogging and other social
platform formulas have built rich notification systems to ensure that
users experience pleasure from being online.  Don&amp;rsquo;t believe me?  Try
muting the notifications from Twitter or Mastodon and see if you remain
interested in it: odds are, after a while, you won&amp;rsquo;t.&lt;/p&gt;
&lt;p&gt;The other key part of the formula: sow discord amongst the users.  This
can be done organically (by users) or algorithmically.  People have an
inherent desire to be &lt;em&gt;right&lt;/em&gt;, and this keeps the engagement loop going
as people fight over stupid things like whether Android or iPhones are
better.  The things being argued over do not even have to have any basis
in reality: people are more than happy to hold positions which falter under
any modicum of dialectical analysis, such as whether &lt;em&gt;furries are actually
shitting in litter boxes in schools&lt;/em&gt; (obviously this is bullshit if you think
about it for more than 10 seconds).&lt;/p&gt;
&lt;p&gt;Eventually these pointless arguments evolve into arguments which have
actual societal impact: &lt;em&gt;are trans people legitimate&lt;/em&gt; and &lt;em&gt;do they deserve
rights&lt;/em&gt;?  Obviously, they are, and they do, but in a world where
microblogging discourse is the primary form of media ingestion, the consumer
is manipulated with fight-or-flight challenges to make their own
280-character thought piece on the discourse of the day, which leads them
to consider the possibility that &lt;em&gt;perhaps&lt;/em&gt; Chudlord18 &lt;em&gt;might&lt;/em&gt; be on to something
when he points out that George Soros was seen at the last Bilderberg
meeting, entirely ignoring the part where Chudlord18&amp;rsquo;s post was
disinformation.&lt;/p&gt;
&lt;p&gt;Sadly, as we see in the world today, it turns out that fascism is the
most optimized ideology available given the limited cognitive bandwidth
constraints of a 280-character post.  This is because the answer is always
simple with fascism: generally a death threat towards the marginalized group
of the day will do just fine, which easily fits into 280 characters:
&lt;em&gt;&amp;ldquo;Storm the Capitol building!&amp;rdquo;&lt;/em&gt;?  &lt;em&gt;&amp;ldquo;Hang Mike Pence!&amp;rdquo;&lt;/em&gt;?
Yep, even congressional members and vice presidents can be marginalized
under the right circumstances, &lt;em&gt;and&lt;/em&gt; it&amp;rsquo;s under 280 characters.&lt;/p&gt;
&lt;h2 id=&#34;spamming-and-scamming&#34;&gt;Spamming and scamming&lt;/h2&gt;
&lt;p&gt;Fascism is hardly the only problem that these networks face.  Almost every
day I get spam like this on either Twitter or Mastodon:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://ariadne.micro.blog/uploads/2025/15c86079bf.jpg&#34; alt=&#34;Spam messages for an affinity-fundraising scam from a Mastodon user&#34;&gt;&lt;/p&gt;
&lt;p&gt;Spam like this is a huge problem with Mastodon, but not with Pleroma, another
ActivityPub server, which provides a robust message filtering facility.
However, due to the combination of mismanagement of the Pleroma project and
an absolutely absurd fediverse turf war, admins of Pleroma instances are
written off by some Mastodon admins as being evil, even if they are otherwise
harmless.&lt;/p&gt;
&lt;p&gt;Between this and the architectural complexity of deploying a BEAM application
like Pleroma on Kubernetes, by comparison to how easy it is to deploy Mastodon
using Knative on Kubernetes, I am using Mastodon.  Since the project mismanagement
issues are largely resolved now, I might suck it up and convert the instance to
Pleroma in the near future just so I can deal with the spam in a more automated
way.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ll probably continue to use Mastodon (or maybe Pleroma if I switch my instance
to it), but lately I&amp;rsquo;ve been using microblogging platforms less and less, as I
have realized that ultimately the format doesn&amp;rsquo;t provide the sense of community
I am looking for.&lt;/p&gt;
&lt;p&gt;And this is ultimately the problem with the fediverse: everything on the fediverse
is a clone of a proprietary platform, with basically the same social downsides.
It turns out when you take something useful, and turn it into a &amp;ldquo;social experience,&amp;rdquo;
you basically ruin its utility.&lt;/p&gt;
&lt;h2 id=&#34;social-tools-which-are-actually-respectful&#34;&gt;Social tools which are actually respectful&lt;/h2&gt;
&lt;p&gt;To me, social tools exist to facilitate communication with my friends, and perhaps
expansion of my friend group to others who have the same interests.  It turns out
that we already had good social tools for this all along: blogs and IRC.  Because of
certain realities &amp;ndash; it is inherently easier to clone an open protocol and turn it
into a proprietary service &amp;ndash; for most people, these social tools turned into
centralized platforms like Dreamwidth and Discord.&lt;/p&gt;
&lt;p&gt;Microblogging forces you to shout at people, while IRC (now for the most part Discord)
facilitates thoughtful conversation.  Social photo sharing encourages the editing of
photographs to make people appear more attractive for additional likes, while posting
photos of yourself to your blog removes that dopamine loop and lets you just focus on
living and occasionally documenting your life.&lt;/p&gt;
&lt;p&gt;Yes, the point is that these tools are largely boring.  They aren&amp;rsquo;t &lt;em&gt;meant&lt;/em&gt; to dominate
your life, they are meant to facilitate communication with your friends.  They exist to
serve the needs of the commons.&lt;/p&gt;
&lt;p&gt;Maybe somebody will eventually build the tools I am ultimately looking for.  In the
meantime, I&amp;rsquo;ve expanded my list of contact points to include services I previously
kept mostly private.&lt;/p&gt;
&lt;p&gt;But either way, for the most part, I won&amp;rsquo;t be investing my time in microblogging anymore,
be it on Twitter or Mastodon.&lt;/p&gt;
</description>
      <source:markdown>
Over the past few years, I&#39;ve come to realize that the Internet as we know
it is utterly broken.  Lately, I&#39;ve also been pondering how participants
in the modern Internet have enabled and perpetuated harm to society at
large.  Repeatedly, we have seen the independence of the commons chipped
away by powerful men who wish for participants to serve their own whims,
while those who raise concerns with these developments are either shunned,
banned or doxed.

On Friday, October 28th, we will see another demonstration of these structural
injustices where the commons takes another loss to the whims of a powerful man.
Last time, [it was freenode&#39;s takeover by Andrew Lee][fn], and this time it
will be Twitter&#39;s takeover by Elon Musk.  No, really, the deal is already
concluded: [TWTR will be delisted from the NYSE on Friday][twtr-delisting].

   [fn]: https://ariadne.space/2021/05/20/the-whole-freenode-kerfluffle/
   [twtr-delisting]: https://seekingalpha.com/news/3896099-twitter-delisting-from-nyse-effective-on-friday-after-musk-completes-deal

Will this be the end of Twitter?  Probably not, but it will be the end of the
current relationship the commons shares with Twitter.  Instead of acting as
a self-described &#34;public square,&#34; it will further evolve into a chaotic
cacophony of trolling and counter-trolling driven in the name of algorithmic
engagement.  Some will move to other microblogging services and networks,
and will likely discover that everything which made Twitter horrible likely
applies in some way to the replacement.

## Are social platforms working as designed?

The reality is that **microblogging sucks**, but Twitter managed to make it
addictive for a few reasons, [which is *why* the most popular alternative,
Mastodon][masto], is basically a copy of the underlying formula, but tweaked to
work on the ActivityPub federated network (the so-called fediverse).

   [masto]: https://joinmastodon.org/

The formula is not that hard to grasp if you understand how people
think and react to stimulation.  People are inherently social creatures,
and because of the formula used by Twitter, have tried their best to use
Twitter *despite* the inherent conceptual flaws behind microblogging.

If you&#39;ve ever sat down at a slot machine, you will likely note that they are
constantly making noises as you interact with them.  These sounds are
designed to stimulate the reward center in your brain and thus cause it
to release endorphins.  In the same way, microblogging and other social
platform formulas have built rich notification systems to ensure that
users experience pleasure from being online.  Don&#39;t believe me?  Try
muting the notifications from Twitter or Mastodon and see if you remain
interested in it: odds are, after a while, you won&#39;t.

The other key part of the formula: sow discord amongst the users.  This
can be done organically (by users) or algorithmically.  People have an
inherent desire to be *right*, and this keeps the engagement loop going
as people fight over stupid things like whether Android or iPhones are
better.  The things being argued over do not even have to have any basis
in reality: people are more than happy to hold positions which falter under
any modicum of dialectical analysis, such as whether *furries are actually
shitting in litter boxes in schools* (obviously this is bullshit if you think
about it for more than 10 seconds).

Eventually these pointless arguments evolve into arguments which have
actual societal impact: *are trans people legitimate* and *do they deserve
rights*?  Obviously, they are, and they do, but in a world where
microblogging discourse is the primary form of media ingestion, the consumer
is manipulated with fight-or-flight challenges to make their own
280-character thought piece on the discourse of the day, which leads them
to consider the possibility that *perhaps* Chudlord18 *might* be on to something
when he points out that George Soros was seen at the last Bilderberg
meeting, entirely ignoring the part where Chudlord18&#39;s post was
disinformation.

Sadly, as we see in the world today, it turns out that fascism is the
most optimized ideology available given the limited cognitive bandwidth
constraints of a 280-character post.  This is because the answer is always
simple with fascism: generally a death threat towards the marginalized group
of the day will do just fine, which easily fits into 280 characters:
*&#34;Storm the Capitol building!&#34;*?  *&#34;Hang Mike Pence!&#34;*?
Yep, even congressional members and vice presidents can be marginalized
under the right circumstances, *and* it&#39;s under 280 characters.

## Spamming and scamming

Fascism is hardly the only problem that these networks face.  Almost every
day I get spam like this on either Twitter or Mastodon:

![Spam messages for an affinity-fundraising scam from a Mastodon user](https://ariadne.micro.blog/uploads/2025/15c86079bf.jpg)

Spam like this is a huge problem with Mastodon, but not with Pleroma, another
ActivityPub server, which provides a robust message filtering facility.
However, due to the combination of mismanagement of the Pleroma project and
an absolutely absurd fediverse turf war, admins of Pleroma instances are
written off by some Mastodon admins as being evil, even if they are otherwise
harmless.

Between this and the architectural complexity of deploying a BEAM application
like Pleroma on Kubernetes, by comparison to how easy it is to deploy Mastodon
using Knative on Kubernetes, I am using Mastodon.  Since the project mismanagement
issues are largely resolved now, I might suck it up and convert the instance to
Pleroma in the near future just so I can deal with the spam in a more automated
way.

I&#39;ll probably continue to use Mastodon (or maybe Pleroma if I switch my instance
to it), but lately I&#39;ve been using microblogging platforms less and less, as I
have realized that ultimately the format doesn&#39;t provide the sense of community
I am looking for.

And this is ultimately the problem with the fediverse: everything on the fediverse
is a clone of a proprietary platform, with basically the same social downsides.
It turns out when you take something useful, and turn it into a &#34;social experience,&#34;
you basically ruin its utility.

## Social tools which are actually respectful

To me, social tools exist to facilitate communication with my friends, and perhaps
expansion of my friend group to others who have the same interests.  It turns out
that we already had good social tools for this all along: blogs and IRC.  Because of
certain realities -- it is inherently easier to clone an open protocol and turn it
into a proprietary service -- for most people, these social tools turned into
centralized platforms like Dreamwidth and Discord.

Microblogging forces you to shout at people, while IRC (now for the most part Discord)
facilitates thoughtful conversation.  Social photo sharing encourages the editing of
photographs to make people appear more attractive for additional likes, while posting
photos of yourself to your blog removes that dopamine loop and lets you just focus on
living and occasionally documenting your life.

Yes, the point is that these tools are largely boring.  They aren&#39;t *meant* to dominate
your life, they are meant to facilitate communication with your friends.  They exist to
serve the needs of the commons.

Maybe somebody will eventually build the tools I am ultimately looking for.  In the
meantime, I&#39;ve expanded my list of contact points to include services I previously
kept mostly private.

But either way, for the most part, I won&#39;t be investing my time in microblogging anymore,
be it on Twitter or Mastodon.
</source:markdown>
    </item>
    
    <item>
      <title>So you&#39;ve decided to start a free software consultancy...</title>
      <link>https://ariadne.space/2022/08/10/so-youve-decided-to-start.html</link>
      <pubDate>Wed, 10 Aug 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/08/11/so-youve-decided-to-start.html</guid>
      <description>&lt;p&gt;Recently a friend of mine told me that he was planning to start a
free software consultancy, and asked for my advice, as I have an
extensive background doing free software consulting for a living.
While I have already given him some advice on how to proceed, I
thought it might be nice to write a blog expanding on my answer,
so that others who are interested in pursuing free software
consulting may benefit.&lt;/p&gt;
&lt;h2 id=&#34;framing-the-value-proposition&#34;&gt;Framing the value proposition&lt;/h2&gt;
&lt;p&gt;There are many things to consider when launching a free software
consultancy, but the key one is how you frame the
value proposition of your consultancy.  A common mistake that new
founders make when starting their free software consultancies is
to frame the value proposition toward developers.  Rather than
doing this, you should frame your value proposition towards
management.&lt;/p&gt;
&lt;p&gt;For example, my friend described the value proposition of his
consultancy like this:&lt;/p&gt;
&lt;p&gt;&amp;ldquo;I help people manage their open source server stuff for money.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;This is not a good way to frame the value proposition of a
consultancy, because the manager will inevitably ask a question
like:&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Why can&amp;rsquo;t we just hire an intern to manage that?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;In this case, the manager is right to ask a question like that,
because the value proposition is not correctly framed.
The purpose of a free software consultancy is to &lt;em&gt;augment&lt;/em&gt; the
business&amp;rsquo; IT competencies by leveraging the consultant&amp;rsquo;s
experience working in FOSS.  When you frame your value proposition
this way, it becomes clearer to management why they
should engage with your consultancy.&lt;/p&gt;
&lt;p&gt;When pitching your value proposition to a prospective client,
you should try to empathise with the needs of the client, and
tailor your value proposition around how your consultancy can
satisfy their needs.&lt;/p&gt;
&lt;h2 id=&#34;pricing-for-services&#34;&gt;Pricing for services&lt;/h2&gt;
&lt;p&gt;For serious engagements, pricing should be defined as a function
of the value gained for the client from the engagement.  For
example, if the client saves $250k as a result of the engagement,
then you should charge a percentage of that savings.&lt;/p&gt;
&lt;p&gt;Proof of concept engagements should be priced lower than your
standard rate, as they represent higher risk for the client.
Since they are priced lower, the scope of work should also be
reduced versus a normal engagement.  A common strategy is to
split proof of concept engagements into phases, so that the
client does not have to commit budget for the entire engagement
up front, which can provide an opportunity to charge a little
more for the overall engagement.&lt;/p&gt;
&lt;p&gt;When you are starting out, you will also want to focus on
recurring revenue.  This provides two key benefits: first,
you have a bottom line greater than $0 if you aren&amp;rsquo;t able
to close larger engagements, which happens from time to time,
especially during the summer months, as managers tend to go
on holiday.  Second, the recurring revenue customers, assuming
that you provide them with good service, will recommend your
consultancy to others, including large businesses.&lt;/p&gt;
&lt;p&gt;These types of engagements should be priced according to what
you believe to be a fair value for X hours of your time per
month.  As a general rule, most consultancies charge at least
$100 per hour for prepaid consultancy services.&lt;/p&gt;
&lt;p&gt;An example of a recurring service would be something like
server maintenance for a small business.  In this case, you
are augmenting the business with IT services, but the engagement
is likely to not require a large amount of time, meaning that
with automation, you can build out a customer portfolio of a
few hundred of these engagements.&lt;/p&gt;
&lt;h2 id=&#34;professional-services-networks&#34;&gt;Professional services networks&lt;/h2&gt;
&lt;p&gt;It is critical to pursue certifications like the RHCE.  The
value in these certifications is the access to the professional
services networks they provide.  They will also help with
customers who have compliance requirements that state that
the engineers working on a project have to be certified.&lt;/p&gt;
&lt;p&gt;Larger firms like Red Hat largely outsource their professional
services engagements to consultancies which have passed their
certification and joined their partner network.  These types of
relationships are critical: you get to leverage the power of
the larger firm&amp;rsquo;s sales capability to acquire new engagements
by bidding on them.&lt;/p&gt;
&lt;p&gt;Similarly, you should seek out partnerships with other
consultancies, as doing so will expand the range of capabilities
that your consultancy has.  For example, you might not have
familiarity with enterprise networking equipment, but if you have
a relationship with a consultancy that does have the ability to
take on managing enterprise networking equipment, then you can
join forces and bid on contracts which have that requirement
in their success criteria.&lt;/p&gt;
&lt;p&gt;All of the companies from Red Hat to AWS have professional
services networks.  Find the ones relevant to the skills your
consultancy has and join them.&lt;/p&gt;
&lt;h2 id=&#34;invoicing-and-payment&#34;&gt;Invoicing and payment&lt;/h2&gt;
&lt;p&gt;Larger engagements will &lt;strong&gt;always&lt;/strong&gt; be NET-30 at the least, where
NET-X means payment is due within X days of invoicing.  This gives the client
time to check your work and ensure they are satisfied with
what you have delivered.&lt;/p&gt;
&lt;p&gt;If you need the money sooner, there are a few options.  First,
you can offer a discount for paying early; an industry standard
is a 10% early payment discount.  Another option is to use a
factoring company.  Factoring works by selling the obligation
to a third party, which collects on your behalf for a fee and
advances you the payment.  Payment platforms such as QuickBooks
and Bill.com integrate with factoring companies, allowing you to
receive payment sooner.&lt;/p&gt;
&lt;h2 id=&#34;negotiation&#34;&gt;Negotiation&lt;/h2&gt;
&lt;p&gt;An engagement will always consist of a written contract with a
Statement of Work, frequently called an SOW.  The SOW lays out
the success criteria for the engagement.  SOWs can be open-ended
or they can be highly precise.  There are advantages to both
approaches when authoring an SOW, but an open-ended SOW can wind
up creating problems during the engagement, as the flexibility it
provides extends to both you &lt;em&gt;and&lt;/em&gt; your client, who may use
it to push the scope beyond what you planned.&lt;/p&gt;
&lt;p&gt;Always negotiate deals in writing; never take an engagement on
an oral promise alone.  If a deal requires a third party to provide
some of the success criteria, get their commitment in writing,
or you may be left holding the bag.&lt;/p&gt;
&lt;h2 id=&#34;following-up&#34;&gt;Following up&lt;/h2&gt;
&lt;p&gt;An engagement should ideally be thought of as a free-flowing
conversation that results in the resolution of the success
criteria stated in the SOW.  Accordingly, it is &lt;em&gt;vital&lt;/em&gt; to
keep the conversation going.&lt;/p&gt;
&lt;p&gt;This means that you should follow up with the client on a
regular basis to keep them informed of the progress of the
work being done as part of the engagement, and to solicit
feedback early.  It is far easier to change the course of
an engagement early on than after hundreds of hours have
gone into the work.&lt;/p&gt;
&lt;p&gt;When discussing the engagement, it should be considered an
active listening exercise: you lay out what your team is
building, and then the client provides feedback based on
your presentation.  From there, the conversation moves into
defining what forward progress looks like.&lt;/p&gt;
&lt;h2 id=&#34;takeaways&#34;&gt;Takeaways&lt;/h2&gt;
&lt;p&gt;These are just my observations from nearly 20 years of doing
professional consulting around FOSS.  There is no singular
right way of running a consultancy, but these are the key
aspects that helped me to maintain good working relationships
with my customers.&lt;/p&gt;
&lt;p&gt;Running a FOSS consultancy is hard work, but it can result in
a sustainable business if you are willing to put in the
work.&lt;/p&gt;
</description>
      <source:markdown>
Recently a friend of mine told me that he was planning to start a
free software consultancy, and asked for my advice, as I have an
extensive background doing free software consulting for a living.
While I have already given him some advice on how to proceed, I
thought it might be nice to write a blog post expanding on my answer,
so that others who are interested in pursuing free software
consulting may benefit.

## Framing the value proposition

There are many things to consider when launching a free software
consultancy, but the key aspect is how you frame the
value proposition of your consultancy.  A common mistake that new
founders make when starting their free software consultancies is
to frame the value proposition toward developers.  Rather than
doing this, you should frame your value proposition towards
management.

For example, my friend described the value proposition of his
consultancy like this:

&#34;I help people manage their open source server stuff for money.&#34;

This is not a good way to frame the value proposition of a
consultancy, because the manager will inevitably ask a question
like:

&#34;Why can&#39;t we just hire an intern to manage that?&#34;

In this case, the manager is right to ask a question like that,
because the value proposition is not correctly framed.
The purpose of a free software consultancy is to *augment* the
business&#39;s IT competencies by leveraging the consultant&#39;s
experience working in FOSS.  When you frame your value proposition
this way, it becomes clearer to management why they
should engage with your consultancy.

When pitching your value proposition to a prospective client,
you should try to empathise with the needs of the client, and
tailor your value proposition around how your consultancy can
satisfy their needs.

## Pricing for services

For serious engagements, pricing should be defined as a function
of the value gained for the client from the engagement.  For
example, if the client saves $250k as a result of the engagement,
then you should charge a percentage of that savings.

Proof of concept engagements should be priced lower than your
standard rate, as they represent higher risk for the client.
Since they are priced lower, the scope of work should also be
reduced versus a normal engagement.  A common strategy is to
split proof of concept engagements into phases, so that the
client does not have to commit budget for the entire engagement
up front, which can provide an opportunity to charge a little
more for the overall engagement.

When you are starting out, you will also want to focus on
recurring revenue.  This provides two key benefits: first,
you have a bottom line greater than $0 if you aren&#39;t able
to close larger engagements, which happens from time to time,
especially during the summer months, as managers tend to go
on holiday.  Second, the recurring revenue customers, assuming
that you provide them with good service, will recommend your
consultancy to others, including large businesses.

These types of engagements should be priced according to what
you believe to be a fair value for X hours of your time per
month.  As a general rule, most consultancies charge at least
$100 per hour for prepaid consultancy services.
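
To make the arithmetic concrete, here is a minimal sketch of the two
pricing models.  The 10% share and the 20 retainer hours in the
comments are hypothetical figures for illustration, not a rate card:

```c
/* A minimal sketch of the two pricing models described above.
 * All figures in the comments are hypothetical examples. */

/* Value-based pricing: charge a share of the savings that the
 * engagement produces for the client. */
double value_based_fee(double client_savings, double share)
{
    return client_savings * share;
}

/* Recurring revenue: a fair value for X hours of your time per
 * month, prepaid. */
double monthly_retainer(double hours_per_month, double hourly_rate)
{
    return hours_per_month * hourly_rate;
}

/* value_based_fee(250000.0, 0.10) is a $25k fee on $250k saved;
 * monthly_retainer(20.0, 100.0) is a $2,000-per-month retainer. */
```

The point is that value-based fees scale with the outcome you
deliver, while retainers scale with your time.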

An example of a recurring service would be something like
server maintenance for a small business.  In this case, you
are augmenting the business with IT services, but the engagement
is likely to not require a large amount of time, meaning that
with automation, you can build out a customer portfolio of a
few hundred of these engagements.

## Professional services networks

It is critical to pursue certifications like the RHCE.  The
value in these certifications is the access to the professional
services networks they provide.  They will also help with
customers whose compliance requirements state that the
engineers working on a project must be certified.

Larger firms like Red Hat largely outsource their professional
services engagements to consultancies which have passed their
certification and joined their partner network.  These types of
relationships are critical: you get to leverage the power of
the larger firm&#39;s sales capability to acquire new engagements
by bidding on them.

Similarly, you should seek out partnerships with other
consultancies, as doing so will expand the range of capabilities
your consultancy can offer.  For example, you might not have
familiarity with enterprise networking equipment, but if you have
a relationship with a consultancy that does, then you can
join forces and bid on contracts which have that requirement
in their success criteria.

Companies from Red Hat to AWS have professional
services networks.  Find the ones relevant to the skills your
consultancy has and join them.

## Invoicing and payment

Larger engagements will **always** be NET-30 at the least, where
NET-X means payment is due within X days of the invoice date.  This
gives the client time to check your work and ensure they are
satisfied with what you have delivered.

If you need the money sooner, there are a few options.  First,
you can offer a discount for paying early; an industry standard
is a 10% early payment discount.  Another option is to use a
factoring company.  Factoring works by selling the obligation
to a third party, which collects on your behalf for a fee and
advances you the payment.  If you use a payment platform such
as QuickBooks or Bill.com, these platforms have integrations with
factoring companies, allowing you to get paid sooner.
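
For a rough sense of the trade-off, here is a minimal sketch of the
net proceeds under each option.  The $50k invoice and the 3%
factoring fee are made-up figures (factoring fees vary by provider
and terms); the 10% discount is the industry-standard figure above:

```c
/* A minimal sketch of what actually lands in your account under
 * each option.  The $50k invoice and 3% factoring fee below are
 * made-up figures; real factoring fees vary by provider and terms. */

/* Net proceeds after the client takes an early payment discount. */
double net_after_discount(double invoice, double discount_rate)
{
    return invoice * (1.0 - discount_rate);
}

/* Net proceeds after selling the invoice to a factoring company,
 * which advances payment and collects for a fee. */
double net_after_factoring(double invoice, double factoring_fee)
{
    return invoice * (1.0 - factoring_fee);
}

/* On a $50k invoice, a 10% early payment discount nets $45k,
 * while factoring at a 3% fee nets $48.5k. */
```

Which option comes out ahead depends on the discount and fee you can
actually negotiate, and on how quickly you need the cash.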

## Negotiation

An engagement will always consist of a written contract with a
Statement of Work, frequently called an SOW.  The SOW lays out
the success criteria for the engagement.  SOWs can be open-ended
or they can be highly precise.  There are advantages to both
approaches when authoring an SOW, but an open-ended SOW can wind
up creating problems during the engagement, as it provides
flexibility for both you *and* your client.

Always negotiate deals in writing; never take an engagement on
an oral promise alone.  If a deal requires a third party to provide
some of the success criteria, get their commitment in writing,
or you may be left holding the bag.

## Following up

An engagement should ideally be thought of as a free-flowing
conversation that results in the resolution of the success
criteria stated in the SOW.  Accordingly, it is *vital* to
keep the conversation going.

This means that you should follow up with the client on a
regular basis to keep them informed of the progress of the
work being done as part of the engagement, and to solicit
feedback early.  It is far easier to change the course of
an engagement early on than after hundreds of hours have
gone into the work.

When discussing the engagement, it should be considered an
active listening exercise: you lay out what your team is
building, and then the client provides feedback based on
your presentation.  From there, the conversation moves into
defining what forward progress looks like.

## Takeaways

These are just my observations from nearly 20 years of doing
professional consulting around FOSS.  There is no singular
right way of running a consultancy, but these are the key
aspects that helped me to maintain good working relationships
with my customers.

Running a FOSS consultancy is hard work, but it can result in
a sustainable business if you are willing to put in the
work.
</source:markdown>
    </item>
    
    <item>
      <title>Free software grows as a function of social utility</title>
      <link>https://ariadne.space/2022/08/05/free-software-grows-as-a.html</link>
      <pubDate>Fri, 05 Aug 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/08/06/free-software-grows-as-a.html</guid>
      <description>&lt;p&gt;A frequent complaint I see from users and inexperienced contributors
concerning free software projects is that they are allegedly not doing
enough to grow the userbase, sometimes even asserting that a fork is
necessary to right the course of the project.&lt;/p&gt;
&lt;p&gt;Are these complaints missing the point, or do they have merit?
How do free software projects grow their userbase into thriving
communities?&lt;/p&gt;
&lt;p&gt;In general, these complaints go something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[PROJECT] developers have explicitly said they do not want the
project to grow.  The [PROJECT] is its own worst enemy, and this
is just the latest example of it I&amp;rsquo;ve seen.  I don&amp;rsquo;t trust the
direction of [PROJECT], and neither should you.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The experienced maintainer understands that we must play the long game,
not the short game.  Tactics such as &lt;em&gt;embrace, extend, extinguish&lt;/em&gt;
are largely only effective when maintainers are looking at the short
term picture.  This is because organic growth in the use of a free
software package is a function of that package&amp;rsquo;s social utility: a
software package which provides utility to its community will experience
growth in adoption because its users will recommend that others try
the software package and join the project&amp;rsquo;s community.&lt;/p&gt;
&lt;p&gt;The social utility of a given software package is not necessarily
tied to mass-market adoption.  It is possible for a software package
to be extremely popular in a tight-knit community, while holding
very little social utility for the mass-market, and this is totally
fine.  In fact, this is the case for most software packages which
exist in the world.&lt;/p&gt;
&lt;p&gt;Likewise, it is possible to pursue new feature development as a
gamble on obtaining mass-market adoption, and destroy the social
utility of the product for its current userbase.  An experienced
maintainer will recognize that such gambles rarely pay off, and
usually wind up damaging the project rather than growing it.&lt;/p&gt;
&lt;p&gt;Unfortunately, society teaches us that we should grow at any cost,
which means that inexperienced maintainers can be swayed by such
arguments into making decisions that harm their project.  But if we
recognize that these types of arguments are inherently defective,
we can help maintainers to avoid taking them seriously.&lt;/p&gt;
</description>
      <source:markdown>
A frequent complaint I see from users and inexperienced contributors
concerning free software projects is that they are allegedly not doing
enough to grow the userbase, sometimes even asserting that a fork is
necessary to right the course of the project.

Are these complaints missing the point, or do they have merit?
How do free software projects grow their userbase into thriving
communities?

In general, these complaints go something like this:

&gt; [PROJECT] developers have explicitly said they do not want the
&gt; project to grow.  The [PROJECT] is its own worst enemy, and this
&gt; is just the latest example of it I&#39;ve seen.  I don&#39;t trust the
&gt; direction of [PROJECT], and neither should you.

The experienced maintainer understands that we must play the long game,
not the short game.  Tactics such as *embrace, extend, extinguish*
are largely only effective when maintainers are looking at the short
term picture.  This is because organic growth in the use of a free
software package is a function of that package&#39;s social utility: a
software package which provides utility to its community will experience
growth in adoption because its users will recommend that others try
the software package and join the project&#39;s community.

The social utility of a given software package is not necessarily
tied to mass-market adoption.  It is possible for a software package
to be extremely popular in a tight-knit community, while holding
very little social utility for the mass-market, and this is totally
fine.  In fact, this is the case for most software packages which
exist in the world.

Likewise, it is possible to pursue new feature development as a
gamble on obtaining mass-market adoption, and destroy the social
utility of the product for its current userbase.  An experienced
maintainer will recognize that such gambles rarely pay off, and
usually wind up damaging the project rather than growing it.

Unfortunately, society teaches us that we should grow at any cost,
which means that inexperienced maintainers can be swayed by such
arguments into making decisions that harm their project.  But if we
recognize that these types of arguments are inherently defective,
we can help maintainers to avoid taking them seriously.
</source:markdown>
    </item>
    
    <item>
      <title>Migrating away from WordPress</title>
      <link>https://ariadne.space/2022/08/03/migrating-away-from-wordpress.html</link>
      <pubDate>Wed, 03 Aug 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/08/04/migrating-away-from-wordpress.html</guid>
      <description>&lt;p&gt;Astute followers of this blog might have noticed that the layout has
dramatically changed.  This is because I migrated away from WordPress
last weekend, switching &lt;a href=&#34;https://gohugo.io/&#34;&gt;back to Hugo&lt;/a&gt; after a few years.  This
time around, the blog is fully self-hosted, rather than depending on
GitHub Pages, and the deployment pipeline is reasonably secure.
Perhaps we can call it a &amp;ldquo;secure blog factory&amp;rdquo; with some further work,
even.&lt;/p&gt;
&lt;p&gt;These days, when most people deploy static websites, they use a service
like Netlify or GitHub Pages.  These services are reasonable,
but when you do not own your own infrastructure, you are dependent on
a third party continuing to offer the service.  With the latest news
that GitLab has decided to &lt;a href=&#34;https://www.theregister.com/2022/08/04/gitlab_data_retention_policy/&#34;&gt;delete user data that has not been touched
in over a year&lt;/a&gt;, depending on third party services may be
something to start considering in your security and reliability posture.&lt;/p&gt;
&lt;h2 id=&#34;migrating-back-to-self-hosting&#34;&gt;Migrating back to self-hosting&lt;/h2&gt;
&lt;p&gt;Because I cannot really depend on third-party
services to &lt;a href=&#34;https://github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer/&#34;&gt;conduct themselves in alignment with my own personal ethics&lt;/a&gt;
and expectations for service reliability, even when paying them to do
so, I started to self-host the majority of my services over the past
year.  This has had significant benefit, enabling me to
have actual visibility into the behavior of my services and to tune
their performance as needed.  Software such as Knative has
enabled me to work with my own infrastructure as if it were a managed
cloud service at one of the big providers.&lt;/p&gt;
&lt;p&gt;There was one problem, however.  Some of the services I adopted when I
decided to start seriously hosting my own services again, such as
WordPress, have a less than stellar security record.  While the core of
WordPress itself has an acceptable security record, as soon as you use
basically any plugin, it goes out the window.  Much of this was
mitigated by the fact that I ran a custom WordPress image as a Knative
service, which meant that if any of the plugins got compromised, I could
just restart the pod, and I would be back to normal, but I have always
thought that it could be done better.&lt;/p&gt;
&lt;h2 id=&#34;setting-up-gitea-because-github-introduced-copilot&#34;&gt;Setting up Gitea because GitHub introduced Copilot&lt;/h2&gt;
&lt;p&gt;Last year, GitHub announced Copilot, a neural model that was trained on
the entire corpus of publicly available source code posted to GitHub.
While Microsoft claims that this is allowed under fair use, the overwhelming
majority of experts disagree so far.  My personal opinion is that this
was a &lt;a href=&#34;https://ariadne.space/2022/07/01/a-silo-can-never-provide-digital-autonomy-to-its-users/&#34;&gt;breach of the public trust&lt;/a&gt; that the FOSS community
originally placed in GitHub, and a lesson that we must own our own
infrastructure in order to maintain &lt;a href=&#34;https://techautonomy.org/&#34;&gt;our autonomy in the digital world&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;As a result of all of this, I wound up setting up my own &lt;a href=&#34;https://gitea.treehouse.systems/&#34;&gt;Gitea instance&lt;/a&gt;,
which I use to maintain my own source code.  In addition to Gitea, because
I needed a CI service, I deployed a &lt;a href=&#34;https://woodpecker.treehouse.systems/&#34;&gt;Woodpecker instance&lt;/a&gt;.  Both
of these services are very easy to deploy if you are already using Kubernetes,
and come highly recommended (it is also possible to use Tekton for CI with
Gitea, but it requires more work at the moment).&lt;/p&gt;
&lt;h2 id=&#34;automatically-publishing-the-blog-with-woodpecker&#34;&gt;Automatically publishing the blog with Woodpecker&lt;/h2&gt;
&lt;p&gt;If you look at the &lt;a href=&#34;https://gitea.treehouse.systems/ariadne/ariadne.space&#34;&gt;source code for my blog&lt;/a&gt;, you will notice that
it is largely a normal Hugo site with some basic plumbing around
Woodpecker.  I also use &lt;a href=&#34;https://github.com/chainguard-dev/apko&#34;&gt;apko&lt;/a&gt; to build a custom image that has
all of the tools needed to build and deploy the website, which is
self-hosted on the new OCI registry implemented in Gitea 1.17.  For those
interested, you can look at the &lt;a href=&#34;https://woodpecker.treehouse.systems/ariadne/ariadne.space/build/16&#34;&gt;logs of the deploy job&lt;/a&gt;
used to post this article!&lt;/p&gt;
&lt;p&gt;Almost all of the interesting stuff is in &lt;a href=&#34;https://gitea.treehouse.systems/ariadne/ariadne.space/src/branch/main/.woodpecker.yml&#34;&gt;the &lt;code&gt;woodpecker.yml&lt;/code&gt; file&lt;/a&gt;,
however, which does the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Builds an up-to-date Hugo image from scratch on every deploy using
apko.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Builds the new site.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Fetches the contents of the last announcement post (&lt;code&gt;newswire.txt&lt;/code&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deploys the new site using an SSH key stored as a secret, and a pinned
known SSH key also stored as a secret.  The latter is largely so I can
just update the secret if I change SSH keys on that host.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Checks if the last announcement post is different from the new one,
and if so, sends off a post &lt;a href=&#34;https://social.treehouse.systems/@ariadne&#34;&gt;to my Mastodon account&lt;/a&gt; using a
personal access token.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last point is the big deal (to me).  While ultimately it was not
difficult to set up, my original reason for using WordPress was that it
could do this type of automation out of the box.  Paying off the technical
debt of having to worry about WordPress being compromised has certainly
been worth it, however.&lt;/p&gt;
&lt;h2 id=&#34;future-improvements&#34;&gt;Future improvements&lt;/h2&gt;
&lt;p&gt;There are some future improvements I would like to do.  For example, I
would like to sign the Hugo image I create with &lt;code&gt;cosign&lt;/code&gt;.  The current
blocker on this is that Gitea does not support a configurable &lt;code&gt;audience&lt;/code&gt;
setting on its OpenID Connect implementation.  Once that is done, it
should be possible to start working towards allowing Gitea instances to
work as OpenID Connect identities for use with the Sigstore
infrastructure.  This will be very powerful when combined with the
new OCI registry support introduced in Gitea 1.17!&lt;/p&gt;
&lt;p&gt;I also have some opinions on Woodpecker and other self-hosted CI systems,
which I plan to cover in more detail in a near-future blog post.&lt;/p&gt;
&lt;p&gt;Hopefully the above provides some inspiration to play with self-hosting
your own website, or perhaps playing with apko outside of the GitHub
ecosystem.  Thanks to the &lt;a href=&#34;https://gitea.io/&#34;&gt;Gitea&lt;/a&gt; and &lt;a href=&#34;https://woodpecker-ci.org/&#34;&gt;Woodpecker&lt;/a&gt; developers for
making software that is easy to deploy, as well.&lt;/p&gt;
</description>
      <source:markdown>
Astute followers of this blog might have noticed that the layout has
dramatically changed.  This is because I migrated away from WordPress
last weekend, switching [back to Hugo][hugo] after a few years.  This
time around, the blog is fully self-hosted, rather than depending on
GitHub Pages, and the deployment pipeline is reasonably secure.
Perhaps we can call it a &#34;secure blog factory&#34; with some further work,
even.

   [hugo]: https://gohugo.io/

These days, when most people deploy static websites, they use a service
like Netlify or GitHub Pages.  These services are reasonable,
but when you do not own your own infrastructure, you are dependent on
a third party continuing to offer the service.  With the latest news
that GitLab has decided to [delete user data that has not been touched
in over a year][gl-deletion], depending on third party services may be
something to start considering in your security and reliability posture.

   [gl-deletion]: https://www.theregister.com/2022/08/04/gitlab_data_retention_policy/

## Migrating back to self-hosting

Because I cannot really depend on third-party
services to [conduct themselves in alignment with my own personal ethics][copilot]
and expectations for service reliability, even when paying them to do
so, I started to self-host the majority of my services over the past
year.  This has had significant benefit, enabling me to
have actual visibility into the behavior of my services and to tune
their performance as needed.  Software such as Knative has
enabled me to work with my own infrastructure as if it were a managed
cloud service at one of the big providers.

   [copilot]: https://github.blog/2021-06-29-introducing-github-copilot-ai-pair-programmer/

There was one problem, however.  Some of the services I adopted when I
decided to start seriously hosting my own services again, such as
WordPress, have a less than stellar security record.  While the core of
WordPress itself has an acceptable security record, as soon as you use
basically any plugin, it goes out the window.  Much of this was
mitigated by the fact that I ran a custom WordPress image as a Knative
service, which meant that if any of the plugins got compromised, I could
just restart the pod, and I would be back to normal, but I have always
thought that it could be done better.

## Setting up Gitea because GitHub introduced Copilot

Last year, GitHub announced Copilot, a neural model that was trained on
the entire corpus of publicly available source code posted to GitHub.
While Microsoft claims that this is allowed under fair use, the overwhelming
majority of experts disagree so far.  My personal opinion is that this
was a [breach of the public trust][gh-cohost] that the FOSS community
originally placed in GitHub, and a lesson that we must own our own
infrastructure in order to maintain [our autonomy in the digital world][da].

   [gh-cohost]: https://ariadne.space/2022/07/01/a-silo-can-never-provide-digital-autonomy-to-its-users/
   [da]: https://techautonomy.org/

As a result of all of this, I wound up setting up my own [Gitea instance][gitea],
which I use to maintain my own source code.  In addition to Gitea, because
I needed a CI service, I deployed a [Woodpecker instance][woodpecker].  Both
of these services are very easy to deploy if you are already using Kubernetes,
and come highly recommended (it is also possible to use Tekton for CI with
Gitea, but it requires more work at the moment).

   [gitea]: https://gitea.treehouse.systems/
   [woodpecker]: https://woodpecker.treehouse.systems/

## Automatically publishing the blog with Woodpecker

If you look at the [source code for my blog][src], you will notice that
it is largely a normal Hugo site with some basic plumbing around
Woodpecker.  I also use [apko][apko] to build a custom image that has
all of the tools needed to build and deploy the website, which is
self-hosted on the new OCI registry implemented in Gitea 1.17.  For those
interested, you can look at the [logs of the deploy job][deploy-logs]
used to post this article!

   [src]: https://gitea.treehouse.systems/ariadne/ariadne.space
   [apko]: https://github.com/chainguard-dev/apko
   [deploy-logs]: https://woodpecker.treehouse.systems/ariadne/ariadne.space/build/16

Almost all of the interesting stuff is in [the `woodpecker.yml` file][wpcfg],
however, which does the following:

   [wpcfg]: https://gitea.treehouse.systems/ariadne/ariadne.space/src/branch/main/.woodpecker.yml

 * Builds an up-to-date Hugo image from scratch on every deploy using
   apko.

 * Builds the new site.

 * Fetches the contents of the last announcement post (`newswire.txt`).

 * Deploys the new site using an SSH key stored as a secret, and a pinned
   known SSH key also stored as a secret.  The latter is largely so I can
   just update the secret if I change SSH keys on that host.

 * Checks if the last announcement post is different from the new one,
   and if so, sends off a post [to my Mastodon account][masto] using a
   personal access token.

   [masto]: https://social.treehouse.systems/@ariadne

The last point is the big deal (to me).  While ultimately it was not
difficult to set up, my original reason for using WordPress was that it
could do this type of automation out of the box.  Paying off the technical
debt of having to worry about WordPress being compromised has certainly
been worth it, however.

## Future improvements

There are some future improvements I would like to do.  For example, I
would like to sign the Hugo image I create with `cosign`.  The current
blocker on this is that Gitea does not support a configurable `audience`
setting on its OpenID Connect implementation.  Once that is done, it
should be possible to start working towards allowing Gitea instances to
work as OpenID Connect identities for use with the Sigstore
infrastructure.  This will be very powerful when combined with the
new OCI registry support introduced in Gitea 1.17!

I also have some opinions on Woodpecker and other self-hosted CI systems,
which I plan to cover in more detail in a near-future blog post.

Hopefully the above provides some inspiration to play with self-hosting
your own website, or perhaps playing with apko outside of the GitHub
ecosystem.  Thanks to the [Gitea][gt] and [Woodpecker][wp] developers for
making software that is easy to deploy, as well.

   [gt]: https://gitea.io/
   [wp]: https://woodpecker-ci.org/
</source:markdown>
    </item>
    
    <item>
      <title>How efficient can cat(1) be?</title>
      <link>https://ariadne.space/2022/07/16/how-efficient-can-cat-be.html</link>
      <pubDate>Sat, 16 Jul 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/07/17/how-efficient-can-cat-be.html</guid>
      <description>&lt;p&gt;There have been a few initiatives in recent years to implement
a new userspace base system for Linux distributions as an
alternative to the GNU coreutils and BusyBox.  Recently, one of
the authors of one of these proposed implementations made the
pitch in a few IRC channels &lt;a href=&#34;https://vimuser.org/cat.c.txt&#34;&gt;that her cat implementation&lt;/a&gt;,
which was derived from OpenBSD’s implementation, was the most
efficient.  But is it actually?&lt;/p&gt;
&lt;h2 id=&#34;understanding-what-cat-actually-does&#34;&gt;Understanding what &lt;code&gt;cat&lt;/code&gt; actually does&lt;/h2&gt;
&lt;p&gt;At the most basic level, &lt;code&gt;cat&lt;/code&gt; takes one or more files and
dumps them to &lt;code&gt;stdout&lt;/code&gt;.  But do we need to actually use &lt;code&gt;stdio&lt;/code&gt;
for this?  In fact, we don’t, and most competent &lt;code&gt;cat&lt;/code&gt;
implementations at least use &lt;code&gt;read(2)&lt;/code&gt; and &lt;code&gt;write(2)&lt;/code&gt; if not
more advanced approaches.&lt;/p&gt;
&lt;p&gt;If we consider &lt;code&gt;cat&lt;/code&gt; as a form of buffer copy between an
arbitrary file descriptor and &lt;code&gt;STDOUT_FILENO&lt;/code&gt;, we can understand
what the most efficient strategy to use for &lt;code&gt;cat&lt;/code&gt; would be: splicing.
Anything which isn’t doing splicing, after all, involves unnecessary
buffer copies, and thus cannot be the most efficient.&lt;/p&gt;
&lt;p&gt;To get the best performance out of spliced I/O, we have to have
some prerequisites:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The source and destination file descriptors should be unbuffered.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Any intermediate buffer should be a multiple of the filesystem
block size.  In general, to avoid doing a &lt;code&gt;stat&lt;/code&gt; syscall, we can
assume that a multiple of &lt;code&gt;PAGE_SIZE&lt;/code&gt; is likely acceptable.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;a-simple-cat-implementation&#34;&gt;A simple &lt;code&gt;cat&lt;/code&gt; implementation&lt;/h2&gt;
&lt;p&gt;The simplest way to implement &lt;code&gt;cat&lt;/code&gt; is the way that it is done in
BSD: using &lt;code&gt;read&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; on an intermediate buffer.  This
results in two buffer copies, but has the best portability.  Most
implementations of &lt;code&gt;cat&lt;/code&gt; work this way, as it generally offers
good enough performance.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;/* This program is released into the public domain. */&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;err.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;errno.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;limits.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;dumpfile&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; STDIN_FILENO;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; buf[PAGE_SIZE &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;16&lt;/span&gt;];
	ssize_t nread, nwritten;
	size_t offset;

	&lt;span style=&#34;color:#75715e&#34;&gt;/* POSIX allows - to represent stdin. */&lt;/span&gt;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;-&amp;#39;&lt;/span&gt;)
	{
		srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; open(path, O_RDONLY);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
			err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;open %s&amp;#34;&lt;/span&gt;, path);
	}

	&lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; ((nread &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; read(srcfd, buf, &lt;span style=&#34;color:#66d9ef&#34;&gt;sizeof&lt;/span&gt; buf)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;)
	{
		&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;; nread &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;; nread &lt;span style=&#34;color:#f92672&#34;&gt;-=&lt;/span&gt; nwritten, offset &lt;span style=&#34;color:#f92672&#34;&gt;+=&lt;/span&gt; nwritten)
		{
			&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; ((nwritten &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; write(STDOUT_FILENO, buf &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; offset, nread)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
				err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;write stdout&amp;#34;&lt;/span&gt;);
		}
	}

	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; STDIN_FILENO)
		(&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt;) close(srcfd);
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;main&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; argc, &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;argv[])
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; i;

	&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (i &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;; i &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; argc; i&lt;span style=&#34;color:#f92672&#34;&gt;++&lt;/span&gt;)
		dumpfile(argv[i]);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; EXIT_SUCCESS;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;implementing-spliced-io&#34;&gt;Implementing spliced I/O&lt;/h2&gt;
&lt;p&gt;Linux has no shortage of ways to perform spliced I/O.  For our &lt;code&gt;cat&lt;/code&gt;
implementation, we have two possible ways to do it.&lt;/p&gt;
&lt;p&gt;The first option is the venerable &lt;code&gt;sendfile&lt;/code&gt; syscall, which
was &lt;a href=&#34;https://yarchive.net/comp/linux/sendfile.html&#34;&gt;originally added to improve the file serving performance of web
servers&lt;/a&gt;. At first, &lt;code&gt;sendfile&lt;/code&gt; required the destination
file descriptor to be a socket, but this restriction was removed in
Linux 2.6.33.  Unfortunately, &lt;code&gt;sendfile&lt;/code&gt; is not perfect: because the
source file descriptor must support &lt;code&gt;mmap&lt;/code&gt;-like operations, we
must use a different strategy when copying from &lt;code&gt;stdin&lt;/code&gt;, which may be a pipe or a terminal.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;/* This program is released into the public domain. */&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdbool.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;err.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;errno.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;limits.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;sys/sendfile.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;bool&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;spliced_copy&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd)
{
	ssize_t nwritten;
	off_t offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;;

	&lt;span style=&#34;color:#66d9ef&#34;&gt;do&lt;/span&gt;
	{
		nwritten &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; sendfile(STDOUT_FILENO, srcfd, &lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;&lt;/span&gt;offset,
				    PAGE_SIZE &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;16&lt;/span&gt;);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (nwritten &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
			&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; false;
	} &lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; (nwritten &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; true;
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;copy&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; buf[PAGE_SIZE &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;16&lt;/span&gt;];
	ssize_t nread, nwritten;
	size_t offset;

	&lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; ((nread &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; read(srcfd, buf, &lt;span style=&#34;color:#66d9ef&#34;&gt;sizeof&lt;/span&gt; buf)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;)
	{
		&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;; nread &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;;
		     nread &lt;span style=&#34;color:#f92672&#34;&gt;-=&lt;/span&gt; nwritten, offset &lt;span style=&#34;color:#f92672&#34;&gt;+=&lt;/span&gt; nwritten)
		{
			&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; ((nwritten &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; write(STDOUT_FILENO,
					      buf &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; offset, nread)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
				err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;write stdout&amp;#34;&lt;/span&gt;);
		}
	}
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;dumpfile&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; STDIN_FILENO;

	&lt;span style=&#34;color:#75715e&#34;&gt;/* POSIX allows - to represent stdin. */&lt;/span&gt;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;-&amp;#39;&lt;/span&gt;)
	{
		srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; open(path, O_RDONLY);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
			err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;open %s&amp;#34;&lt;/span&gt;, path);
	}

	&lt;span style=&#34;color:#75715e&#34;&gt;/* Fall back to traditional copy if the spliced version fails. */&lt;/span&gt;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;!&lt;/span&gt;spliced_copy(srcfd))
		copy(srcfd);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; STDIN_FILENO)
		(&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt;) close(srcfd);
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;main&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; argc, &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;argv[])
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; i;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; stdout_flags;

	stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; fcntl(STDOUT_FILENO, F_GETFL);
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;fcntl(STDOUT_FILENO, F_GETFL)&amp;#34;&lt;/span&gt;);
	stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;=&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;O_APPEND;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (fcntl(STDOUT_FILENO, F_SETFL, stdout_flags) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;fcntl(STDOUT_FILENO, F_SETFL)&amp;#34;&lt;/span&gt;);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (i &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;; i &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; argc; i&lt;span style=&#34;color:#f92672&#34;&gt;++&lt;/span&gt;)
		dumpfile(argv[i]);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; EXIT_SUCCESS;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Another approach is to use &lt;code&gt;splice&lt;/code&gt; and a pipe.  This allows for true
zero-copy I/O, as the data never enters userspace: a pipe is simply a
ring buffer in the kernel (64KB by default).  In this case, we use two
splice operations per block of data we want to copy: one to move the data to
the pipe and another to move the data from the pipe to the output file.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;/* This program is released into the public domain. */&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#define _GNU_SOURCE
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdbool.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdio.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;stdlib.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;err.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;errno.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;limits.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;fcntl.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;unistd.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#include&lt;/span&gt; &lt;span style=&#34;color:#75715e&#34;&gt;&amp;lt;sys/sendfile.h&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#define BLOCK_SIZE ((PAGE_SIZE * 16) - 1)
&lt;/span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;bool&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;spliced_copy&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; pipefd[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;];
	ssize_t nread, nwritten;
	off_t in_offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;bool&lt;/span&gt; ret &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; true;

	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (pipe(pipefd) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;pipe&amp;#34;&lt;/span&gt;);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;do&lt;/span&gt;
	{
		nread &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; splice(srcfd, &lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;&lt;/span&gt;in_offset, pipefd[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;], NULL,
			       BLOCK_SIZE, SPLICE_F_MOVE &lt;span style=&#34;color:#f92672&#34;&gt;|&lt;/span&gt; SPLICE_F_MORE);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (nread &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		{
			ret &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; nread &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt; false &lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; true;
			&lt;span style=&#34;color:#66d9ef&#34;&gt;goto&lt;/span&gt; out;
		}

		nwritten &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; splice(pipefd[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;], NULL, STDOUT_FILENO, NULL,
				  BLOCK_SIZE, SPLICE_F_MOVE &lt;span style=&#34;color:#f92672&#34;&gt;|&lt;/span&gt; SPLICE_F_MORE);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (nwritten &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		{
			ret &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; false;
			&lt;span style=&#34;color:#66d9ef&#34;&gt;goto&lt;/span&gt; out;
		}
	} &lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; (nwritten &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;);

out:
	close(pipefd[&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;]);
	close(pipefd[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;]);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; ret;
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;copy&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; buf[PAGE_SIZE &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;16&lt;/span&gt;];
	ssize_t nread, nwritten;
	size_t offset;

	&lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; ((nread &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; read(srcfd, buf, &lt;span style=&#34;color:#66d9ef&#34;&gt;sizeof&lt;/span&gt; buf)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;)
	{
		&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;; nread &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;;
		     nread &lt;span style=&#34;color:#f92672&#34;&gt;-=&lt;/span&gt; nwritten, offset &lt;span style=&#34;color:#f92672&#34;&gt;+=&lt;/span&gt; nwritten)
		{
			&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; ((nwritten &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; write(STDOUT_FILENO,
					      buf &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; offset, nread)) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
				err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;write stdout&amp;#34;&lt;/span&gt;);
		}
	}
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;dumpfile&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path)
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; STDIN_FILENO;

	&lt;span style=&#34;color:#75715e&#34;&gt;/* POSIX allows - to represent stdin. */&lt;/span&gt;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;path &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;-&amp;#39;&lt;/span&gt;)
	{
		srcfd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; open(path, O_RDONLY);
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
			err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;open %s&amp;#34;&lt;/span&gt;, path);

		(&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt;) posix_fadvise(srcfd, &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, POSIX_FADV_SEQUENTIAL);
	}

	&lt;span style=&#34;color:#75715e&#34;&gt;/* Fall back to traditional copy if the spliced version fails. */&lt;/span&gt;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;!&lt;/span&gt;spliced_copy(srcfd))
		copy(srcfd);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (srcfd &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; STDIN_FILENO)
		(&lt;span style=&#34;color:#66d9ef&#34;&gt;void&lt;/span&gt;) close(srcfd);
}

&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;main&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; argc, &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;argv[])
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; i;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;int&lt;/span&gt; stdout_flags;

	stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; fcntl(STDOUT_FILENO, F_GETFL);
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;fcntl(STDOUT_FILENO, F_GETFL)&amp;#34;&lt;/span&gt;);
	stdout_flags &lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;=&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;O_APPEND;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (fcntl(STDOUT_FILENO, F_SETFL, stdout_flags) &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
		err(EXIT_FAILURE, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;fcntl(STDOUT_FILENO, F_SETFL)&amp;#34;&lt;/span&gt;);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (i &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;; i &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; argc; i&lt;span style=&#34;color:#f92672&#34;&gt;++&lt;/span&gt;)
		dumpfile(argv[i]);

	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; EXIT_SUCCESS;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&#34;honorable-mention-copy_file_range&#34;&gt;Honorable mention: &lt;code&gt;copy_file_range&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;While &lt;code&gt;copy_file_range&lt;/code&gt; is not often relevant to a &lt;code&gt;cat&lt;/code&gt;
implementation, if both the source and output are regular files, you
can use it to get even faster performance than &lt;code&gt;splice&lt;/code&gt;, as the
kernel handles all of the details on its own.  An optimized &lt;code&gt;cat&lt;/code&gt;
might try this strategy first, then fall back to &lt;code&gt;splice&lt;/code&gt;,
&lt;code&gt;sendfile&lt;/code&gt;, and finally the normal &lt;code&gt;read&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; loop.&lt;/p&gt;
&lt;h2 id=&#34;performance-comparison&#34;&gt;Performance comparison&lt;/h2&gt;
&lt;p&gt;To measure the performance of each strategy, we can simply use
&lt;code&gt;dd&lt;/code&gt; as a sink, piping each &lt;code&gt;cat&lt;/code&gt; program into
&lt;code&gt;dd of=/dev/null bs=64K iflag=fullblock&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The figures in the table below are averaged across 1000 runs on an
8GB RAM Linode, using a 4GB file in tmpfs.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cat-simple&lt;/code&gt; (&lt;code&gt;read&lt;/code&gt; and &lt;code&gt;write&lt;/code&gt; loop)&lt;/td&gt;
&lt;td&gt;3.6 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cat-sendfile&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;6.4 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cat-splice&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;11.6 GB/s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If you are interested in using this code in your own
&lt;code&gt;cat&lt;/code&gt; implementation, you may do so under any license terms you
wish.&lt;/p&gt;
</description>
      <source:markdown>
There have been a few initiatives in recent years to implement
a new userspace base system for Linux distributions as an 
alternative to the GNU coreutils and BusyBox.  Recently, one of
the authors of one of these proposed implementations made the 
pitch in a few IRC channels [that her `cat` implementation][lc],
which was derived from OpenBSD’s implementation, was the most 
efficient.  But is it actually?

   [lc]: https://vimuser.org/cat.c.txt

## Understanding what `cat` actually does

At the most basic level, `cat` takes one or more files and
dumps them to `stdout`.  But do we actually need to use `stdio`
for this?  We don’t, and most competent `cat`
implementations at least use `read(2)` and `write(2)` if not
more advanced approaches.

If we consider `cat` as a form of buffer copy between an
arbitrary file descriptor and `STDOUT_FILENO`, we can understand
what the most efficient strategy to use for `cat` would be: splicing.
Anything which isn’t doing splicing, after all, involves unnecessary 
buffer copies, and thus cannot be the most efficient.

To get the best performance out of spliced I/O, we must satisfy
some prerequisites:

 * The source and destination file descriptors should be unbuffered.

 * Any intermediate buffer should be a multiple of the filesystem
   block size.  In general, to avoid doing a `stat` syscall, we can 
   assume that a multiple of `PAGE_SIZE` is likely acceptable.

## A simple `cat` implementation

The simplest way to implement `cat` is the way that it is done in
BSD: using `read` and `write` on an intermediate buffer.  This
results in two buffer copies, but has the best portability.  Most
implementations of `cat` work this way, as it generally offers
good enough performance.

```c
/* This program is released into the public domain. */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;err.h&gt;
#include &lt;errno.h&gt;
#include &lt;limits.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;

void dumpfile(const char *path)
{
	int srcfd = STDIN_FILENO;
	char buf[PAGE_SIZE * 16];
	ssize_t nread, nwritten;
	size_t offset;

	/* POSIX allows - to represent stdin. */
	if (*path != &#39;-&#39;)
	{
		srcfd = open(path, O_RDONLY);
		if (srcfd &lt; 0)
			err(EXIT_FAILURE, &#34;open %s&#34;, path);
	}

	while ((nread = read(srcfd, buf, sizeof buf)) &gt;= 1)
	{
		for (offset = 0; nread &gt; 0; nread -= nwritten, offset += nwritten)
		{
			if ((nwritten = write(STDOUT_FILENO, buf + offset, nread)) &lt;= 0)
				err(EXIT_FAILURE, &#34;write stdout&#34;);
		}
	}

	if (srcfd != STDIN_FILENO)
		(void) close(srcfd);
}

int main(int argc, const char *argv[])
{
	int i;

	for (i = 1; i &lt; argc; i++)
		dumpfile(argv[i]);

	return EXIT_SUCCESS;
}
```
## Implementing spliced I/O

Linux has no shortage of ways to perform spliced I/O.  For our `cat`
implementation, we have two possible ways to do it.

The first option is the venerable `sendfile` syscall, which
was [originally added to improve the file serving performance of web
servers][sf-origin]. At first, `sendfile` required the destination
file descriptor to be a socket, but this restriction was removed in
Linux 2.6.33.  Unfortunately, `sendfile` is not perfect: because the
source file descriptor must support `mmap`-like operations, we
must use a different strategy when copying from `stdin`, which may be a pipe or a terminal.

   [sf-origin]: https://yarchive.net/comp/linux/sendfile.html

```c
/* This program is released into the public domain. */
#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;err.h&gt;
#include &lt;errno.h&gt;
#include &lt;limits.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/sendfile.h&gt;

bool spliced_copy(int srcfd)
{
	ssize_t nwritten;
	off_t offset = 0;

	do
	{
		nwritten = sendfile(STDOUT_FILENO, srcfd, &amp;offset,
				    PAGE_SIZE * 16);
		if (nwritten &lt; 0)
			return false;
	} while (nwritten &gt; 0);

	return true;
}

void copy(int srcfd)
{
	char buf[PAGE_SIZE * 16];
	ssize_t nread, nwritten;
	size_t offset;

	while ((nread = read(srcfd, buf, sizeof buf)) &gt;= 1)
	{
		for (offset = 0; nread &gt; 0;
		     nread -= nwritten, offset += nwritten)
		{
			if ((nwritten = write(STDOUT_FILENO,
					      buf + offset, nread)) &lt;= 0)
				err(EXIT_FAILURE, &#34;write stdout&#34;);
		}
	}
}

void dumpfile(const char *path)
{
	int srcfd = STDIN_FILENO;

	/* POSIX allows - to represent stdin. */
	if (*path != &#39;-&#39;)
	{
		srcfd = open(path, O_RDONLY);
		if (srcfd &lt; 0)
			err(EXIT_FAILURE, &#34;open %s&#34;, path);
	}

	/* Fall back to traditional copy if the spliced version fails. */
	if (!spliced_copy(srcfd))
		copy(srcfd);

	if (srcfd != STDIN_FILENO)
		(void) close(srcfd);
}

int main(int argc, const char *argv[])
{
	int i;
	int stdout_flags;

	/* sendfile refuses to write to a descriptor with O_APPEND set,
	 * so clear the flag on stdout. */
	stdout_flags = fcntl(STDOUT_FILENO, F_GETFL);
	if (stdout_flags &lt; 0)
		err(EXIT_FAILURE, &#34;fcntl(STDOUT_FILENO, F_GETFL)&#34;);
	stdout_flags &amp;= ~O_APPEND;
	if (fcntl(STDOUT_FILENO, F_SETFL, stdout_flags) &lt; 0)
		err(EXIT_FAILURE, &#34;fcntl(STDOUT_FILENO, F_SETFL)&#34;);

	for (i = 1; i &lt; argc; i++)
		dumpfile(argv[i]);

	return EXIT_SUCCESS;
}
```
Another approach is to use `splice` and a pipe.  This allows for true
zero-copy I/O, as the data never enters userspace at all: a pipe is
simply implemented as a 64KB ring buffer in the kernel.  In this case,
we use two splice operations per block of data we want to copy: one to
move the data into the pipe and another to move it from the pipe to
the output file.

```c
/* This program is released into the public domain. */
#define _GNU_SOURCE
#include &lt;stdbool.h&gt;
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;err.h&gt;
#include &lt;errno.h&gt;
#include &lt;limits.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/sendfile.h&gt;

#define BLOCK_SIZE ((PAGE_SIZE * 16) - 1)

bool spliced_copy(int srcfd)
{
	int pipefd[2];
	ssize_t nread, nwritten;
	off_t in_offset = 0;
	bool ret = true;

	if (pipe(pipefd) &lt; 0)
		err(EXIT_FAILURE, &#34;pipe&#34;);

	for (;;)
	{
		nread = splice(srcfd, &amp;in_offset, pipefd[1], NULL,
			       BLOCK_SIZE, SPLICE_F_MOVE | SPLICE_F_MORE);
		if (nread &lt;= 0)
		{
			ret = nread &lt; 0 ? false : true;
			goto out;
		}

		/* Drain the pipe completely: a single splice may move
		 * fewer bytes than were just buffered. */
		while (nread &gt; 0)
		{
			nwritten = splice(pipefd[0], NULL, STDOUT_FILENO, NULL,
					  nread, SPLICE_F_MOVE | SPLICE_F_MORE);
			if (nwritten &lt;= 0)
			{
				ret = false;
				goto out;
			}

			nread -= nwritten;
		}
	}

out:
	close(pipefd[0]);
	close(pipefd[1]);

	return ret;
}

void copy(int srcfd)
{
	char buf[PAGE_SIZE * 16];
	ssize_t nread, nwritten;
	size_t offset;

	while ((nread = read(srcfd, buf, sizeof buf)) &gt;= 1)
	{
		for (offset = 0; nread &gt; 0;
		     nread -= nwritten, offset += nwritten)
		{
			if ((nwritten = write(STDOUT_FILENO,
					      buf + offset, nread)) &lt;= 0)
				err(EXIT_FAILURE, &#34;write stdout&#34;);
		}
	}
}

void dumpfile(const char *path)
{
	int srcfd = STDIN_FILENO;

	/* POSIX allows - to represent stdin. */
	if (*path != &#39;-&#39;)
	{
		srcfd = open(path, O_RDONLY);
		if (srcfd &lt; 0)
			err(EXIT_FAILURE, &#34;open %s&#34;, path);

		(void) posix_fadvise(srcfd, 0, 0, POSIX_FADV_SEQUENTIAL);
	}

	/* Fall back to traditional copy if the spliced version fails. */
	if (!spliced_copy(srcfd))
		copy(srcfd);

	if (srcfd != STDIN_FILENO)
		(void) close(srcfd);
}

int main(int argc, const char *argv[])
{
	int i;
	int stdout_flags;

	/* splice refuses to write to a descriptor with O_APPEND set,
	 * so clear the flag on stdout. */
	stdout_flags = fcntl(STDOUT_FILENO, F_GETFL);
	if (stdout_flags &lt; 0)
		err(EXIT_FAILURE, &#34;fcntl(STDOUT_FILENO, F_GETFL)&#34;);
	stdout_flags &amp;= ~O_APPEND;
	if (fcntl(STDOUT_FILENO, F_SETFL, stdout_flags) &lt; 0)
		err(EXIT_FAILURE, &#34;fcntl(STDOUT_FILENO, F_SETFL)&#34;);

	for (i = 1; i &lt; argc; i++)
		dumpfile(argv[i]);

	return EXIT_SUCCESS;
}
```
## Honorable mention: `copy_file_range`

While `copy_file_range` is not usually relevant to a `cat`
implementation, when both the source and the output are regular files,
it can be even faster than `splice`, as the kernel handles all of the
details on its own.  An optimized `cat` might try this strategy first,
then fall back to `splice`, `sendfile`, and finally the normal `read`
and `write` loop.

## Performance comparison

To measure the performance of each strategy, we can simply use
`dd` as a sink, running each cat program piped into
`dd of=/dev/null bs=64K iflag=fullblock`.

The figures in the table below are averaged across 1000 runs on an
8GB RAM Linode, using a 4GB file in tmpfs.

| Strategy                               | Throughput |
|----------------------------------------|------------|
| `cat-simple` (`read` and `write` loop) | 3.6 GB/s   |
| `cat-sendfile`                         | 6.4 GB/s   |
| `cat-splice`                           | 11.6 GB/s  |

If you would like to use any of this code in your own `cat`
implementation, you may do so under any license terms you wish.
</source:markdown>
    </item>
    
    <item>
      <title>a silo can never provide digital autonomy to its users</title>
      <link>https://ariadne.space/2022/06/30/a-silo-can-never-provide.html</link>
      <pubDate>Thu, 30 Jun 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/07/01/a-silo-can-never-provide.html</guid>
      <description>&lt;p&gt;Lately there has been a lot of discussion about various silos and their activities, notably GitHub and an up-and-coming alternative to Tumblr called Cohost. By analyzing the behavior of both of these silos, I&amp;rsquo;d like to make the point that silos, by design, do not and cannot elevate user freedoms, even when they are run with the best of intentions.&lt;/p&gt;
&lt;p&gt;It is said that if you are not paying for a service, that you are the product. To look at this, we will start with GitHub, who have had a significant controversy over the past year with their now-commercial Copilot service. Copilot is a paid service which provides code suggestions using a neural network model that was trained using the entirety of publicly posted source code on GitHub as its corpus. As many have noted, this is likely a problem from a copyright point of view.&lt;/p&gt;
&lt;p&gt;Microsoft claims that this use of the GitHub public source code is ethically correct and legal, citing fair use as their justification for data mining the entire GitHub public source corpus. Interestingly, in the EU, there is a &amp;ldquo;text and data mining&amp;rdquo; exception to the copyright directive, &lt;a href=&#34;https://deliverypdf.ssrn.com/delivery.php?ID=380124069122109084081011069119068081059089022064027023064104069125083028119005007123033062000029047123108125065064093118008030058071007053078069071085069007101073030038014010096097074114126065017112027071084124110068123116074098119115105064007068091122&amp;amp;EXT=pdf&amp;amp;INDEX=TRUE&#34;&gt;which may provide for some precedent for this thinking&lt;/a&gt;. While the legal construction they use to justify the way they trained the Copilot model is interesting, it is important to note that we, as consumers of the GitHub service, enabled Microsoft to do this by uploading source code to their service.&lt;/p&gt;
&lt;p&gt;Now let&amp;rsquo;s talk about &lt;a href=&#34;https://cohost.org&#34;&gt;Cohost&lt;/a&gt;, a recently launched alternative to Tumblr which is paid for by its subscribers, and promises that it will never sell out to a third party. While I think that Cohost will likely be one of the more ethically-run silos out there, it is still a silo, and like Microsoft&amp;rsquo;s GitHub, it has business interests (subscriber retention) which &lt;a href=&#34;https://techautonomy.org/&#34;&gt;place it in conflict with the goals of digital autonomy&lt;/a&gt;. Specifically, like all silos, Cohost&amp;rsquo;s platform is designed to keep users inside the Cohost platform, just as GitHub uses the network effect of its own silo to make it difficult to use anything other than GitHub for collaboration on software.&lt;/p&gt;
&lt;p&gt;Some have argued that, due to the network effects of silos, the only thing which can defeat a bad silo is a good silo. The problem with this argument is that it requires one to accept the supposition that there can be a good silo. Silos, by their very nature of being centralized services under the control of the privileged, cannot be good if you look at the power structures imposed by them. Instead, we should use our privilege to lift others up, something that commercial silos, by design, are incapable of doing.&lt;/p&gt;
&lt;p&gt;How do we do this though? One way is to embrace networks of consent. From a technical point of view, the IndieWeb people have worked on a number of simple, easy to implement protocols, which provide the ability for web services to interact openly with each other, but in a way that allows for a website owner to define policy over what content they will accept. From a social point of view, we should avoid commercial silos, such as GitHub, and use our own infrastructure, either through self-hosting or through membership to a cooperative or public society.&lt;/p&gt;
&lt;p&gt;Although I understand that both of these goals can be difficult to achieve, they make more sense than jumping from one silo to the next after they cross the line. You control where you choose to participate &amp;ndash; for me, that means I am shifting my participation so that I only participate in commercial silos when absolutely necessary. We should choose to participate in power structures which value our communal membership, rather than value our ability to generate or pay revenue.&lt;/p&gt;
</description>
      <source:markdown>
Lately there has been a lot of discussion about various silos and their activities, notably GitHub and an up-and-coming alternative to Tumblr called Cohost. By analyzing the behavior of both of these silos, I&#39;d like to make the point that silos, by design, do not and cannot elevate user freedoms, even when they are run with the best of intentions.

It is said that if you are not paying for a service, that you are the product. To look at this, we will start with GitHub, who have had a significant controversy over the past year with their now-commercial Copilot service. Copilot is a paid service which provides code suggestions using a neural network model that was trained using the entirety of publicly posted source code on GitHub as its corpus. As many have noted, this is likely a problem from a copyright point of view.

Microsoft claims that this use of the GitHub public source code is ethically correct and legal, citing fair use as their justification for data mining the entire GitHub public source corpus. Interestingly, in the EU, there is a &#34;text and data mining&#34; exception to the copyright directive, [which may provide for some precedent for this thinking](https://deliverypdf.ssrn.com/delivery.php?ID=380124069122109084081011069119068081059089022064027023064104069125083028119005007123033062000029047123108125065064093118008030058071007053078069071085069007101073030038014010096097074114126065017112027071084124110068123116074098119115105064007068091122&amp;EXT=pdf&amp;INDEX=TRUE). While the legal construction they use to justify the way they trained the Copilot model is interesting, it is important to note that we, as consumers of the GitHub service, enabled Microsoft to do this by uploading source code to their service.

Now let&#39;s talk about [Cohost](https://cohost.org), a recently launched alternative to Tumblr which is paid for by its subscribers, and promises that it will never sell out to a third party. While I think that Cohost will likely be one of the more ethically-run silos out there, it is still a silo, and like Microsoft&#39;s GitHub, it has business interests (subscriber retention) which [place it in conflict with the goals of digital autonomy](https://techautonomy.org/). Specifically, like all silos, Cohost&#39;s platform is designed to keep users inside the Cohost platform, just as GitHub uses the network effect of its own silo to make it difficult to use anything other than GitHub for collaboration on software.

Some have argued that, due to the network effects of silos, the only thing which can defeat a bad silo is a good silo. The problem with this argument is that it requires one to accept the supposition that there can be a good silo. Silos, by their very nature of being centralized services under the control of the privileged, cannot be good if you look at the power structures imposed by them. Instead, we should use our privilege to lift others up, something that commercial silos, by design, are incapable of doing.

How do we do this though? One way is to embrace networks of consent. From a technical point of view, the IndieWeb people have worked on a number of simple, easy to implement protocols, which provide the ability for web services to interact openly with each other, but in a way that allows for a website owner to define policy over what content they will accept. From a social point of view, we should avoid commercial silos, such as GitHub, and use our own infrastructure, either through self-hosting or through membership to a cooperative or public society.

Although I understand that both of these goals can be difficult to achieve, they make more sense than jumping from one silo to the next after they cross the line. You control where you choose to participate -- for me, that means I am shifting my participation so that I only participate in commercial silos when absolutely necessary. We should choose to participate in power structures which value our communal membership, rather than value our ability to generate or pay revenue.
</source:markdown>
    </item>
    
    <item>
      <title>it is correct to refer to GNU/Linux as GNU/Linux</title>
      <link>https://ariadne.space/2022/03/29/it-is-correct-to-refer.html</link>
      <pubDate>Tue, 29 Mar 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/03/30/it-is-correct-to-refer.html</guid>
      <description>&lt;p&gt;You&amp;rsquo;ve probably seen the &amp;ldquo;I&amp;rsquo;d like to interject for a moment&amp;rdquo; quotation that is frequently attributed to Richard Stallman about how Linux should be referred to as GNU/Linux. While I disagree with &lt;em&gt;that&lt;/em&gt; particular assertion, I do believe it is important to refer to GNU/Linux distributions as such, because GNU/Linux is a distinct operating system in the family of operating systems which use the Linux kernel, and it is technically correct to recognize this, especially as different Linux-based operating systems have different behavior, and different advantages and disadvantages.&lt;/p&gt;
&lt;p&gt;For example, besides GNU/Linux, there are the Alpine and OpenWrt ecosystems, and last but not least, Android. All of these operating systems exist outside the GNU/Linux space and have significant differences, both from GNU/Linux and from each other.&lt;/p&gt;
&lt;h2 id=&#34;what-is-gnulinux&#34;&gt;what is GNU/Linux?&lt;/h2&gt;
&lt;p&gt;I believe part of the problem which leads people to be confused about the alternative Linux ecosystems is the lack of a cogent GNU/Linux definition, in part because many GNU/Linux distributions try to downplay that they are, in fact, GNU/Linux distributions. This may be for commercial or marketing reasons, or it may be because they do not wish to be seen as associated with the FSF. Because of this, others, who are fans of the work of the FSF, tend to overreach and claim other Linux ecosystems as being part of the GNU/Linux ecosystem, which is equally harmful.&lt;/p&gt;
&lt;p&gt;It is therefore important to provide a technically accurate definition of GNU/Linux that provides actual useful meaning to consumers, so that they can understand the differences between GNU/Linux-based operating systems and other Linux-based operating systems. To that end, I believe a reasonable definition of the GNU/Linux ecosystem to be distributions which:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;use the GNU C Library (frequently referred to as glibc)&lt;/li&gt;
&lt;li&gt;use the GNU coreutils package for their base UNIX commands (such as &lt;code&gt;/bin/cat&lt;/code&gt; and so on).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From a technical perspective, an easy way to check if you are on a GNU/Linux system would be to attempt to run the &lt;code&gt;/lib/libc.so.6&lt;/code&gt; command. If you are running on a GNU/Linux system, this will print the glibc version that is installed. This technical definition of GNU/Linux also provides value, because some drivers and proprietary applications, such as the nVidia proprietary graphics driver, only support GNU/Linux systems.&lt;/p&gt;
&lt;p&gt;Given this rubric, we can easily test a few popular distributions and make some conclusions about their capabilities:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Debian-based Linux distributions, including Debian itself, and also Ubuntu and elementary, meet the above preconditions and are therefore GNU/Linux distributions.&lt;/li&gt;
&lt;li&gt;Fedora and the other distributions published by Red Hat also meet the same criterion to be defined as a GNU/Linux distribution.&lt;/li&gt;
&lt;li&gt;ArchLinux also meets the above criterion, and therefore is also a GNU/Linux distribution. Indeed, the preferred distribution of the FSF, Parabola, describes itself as GNU/Linux and is derived from Arch.&lt;/li&gt;
&lt;li&gt;Alpine does not use the GNU C library, and therefore is not a GNU/Linux distribution. Compatibility with GNU/Linux programs should not be assumed. More on that in a moment.&lt;/li&gt;
&lt;li&gt;Similarly, OpenWrt is not a GNU/Linux distribution.&lt;/li&gt;
&lt;li&gt;Android is also not a GNU/Linux distribution, nor is Replicant, despite the latter being sponsored by the FSF.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;on-compatibility-between-distros&#34;&gt;on compatibility between distros&lt;/h2&gt;
&lt;p&gt;Even between GNU/Linux distributions, compatibility is difficult. Different GNU/Linux distributions upgrade their components at different times, and due to dynamic linking, a program built against one specific set of components and build configurations may or may not run successfully on another GNU/Linux system. Some amount of binary compatibility is possible, but only if you take care to account for these differences.&lt;/p&gt;
&lt;p&gt;On top of this, there is no binary compatibility between Linux ecosystems at large. GNU/Linux binaries require the gcompat compatibility framework to run on Alpine, and it generally is not possible to run OpenWrt binaries on Alpine or vice versa. The situation is the same with Android: without a compatibility tool (such as Termux), it is not possible to run binaries from other ecosystems there.&lt;/p&gt;
&lt;p&gt;Exacerbating the problem, developers also target specific APIs only available in their respective ecosystems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;systemd makes use of glibc-specific APIs, which are not part of POSIX&lt;/li&gt;
&lt;li&gt;Android makes use of bionic-specific APIs, which are not part of POSIX&lt;/li&gt;
&lt;li&gt;Alpine and OpenWrt both make use of internal frameworks, and these differ between the two ecosystems (although there are active efforts to converge both ecosystems).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As a result, as a developer, it is important to note which ecosystems you are targeting, and it is important to refer to individual ecosystems, rather than saying &amp;ldquo;my program supports Linux.&amp;rdquo; There are dozens of ecosystems which make use of the Linux kernel, and it is unlikely that a program supports all of them, or that the author is even aware of them.&lt;/p&gt;
&lt;p&gt;To conclude, it is both correct and important, to refer to GNU/Linux distributions as GNU/Linux distributions. Likewise, it is important to realize that non-GNU/Linux distributions exist, and are not necessarily compatible with the GNU/Linux ecosystem for your application. Each ecosystem is distinct, with its own strengths and weaknesses.&lt;/p&gt;
</description>
      <source:markdown>
You&#39;ve probably seen the &#34;I&#39;d like to interject for a moment&#34; quotation that is frequently attributed to Richard Stallman about how Linux should be referred to as GNU/Linux. While I disagree with _that_ particular assertion, I do believe it is important to refer to GNU/Linux distributions as such, because GNU/Linux is a distinct operating system in the family of operating systems which use the Linux kernel, and it is technically correct to recognize this, especially as different Linux-based operating systems have different behavior, and different advantages and disadvantages.

For example, besides GNU/Linux, there are the Alpine and OpenWrt ecosystems, and last but not least, Android. All of these operating systems exist outside the GNU/Linux space and have significant differences, both from GNU/Linux and from each other.

## what is GNU/Linux?

I believe part of the problem which leads people to be confused about the alternative Linux ecosystems is the lack of a cogent GNU/Linux definition, in part because many GNU/Linux distributions try to downplay that they are, in fact, GNU/Linux distributions. This may be for commercial or marketing reasons, or it may be because they do not wish to be seen as associated with the FSF. Because of this, others, who are fans of the work of the FSF, tend to overreach and claim other Linux ecosystems as being part of the GNU/Linux ecosystem, which is equally harmful.

It is therefore important to provide a technically accurate definition of GNU/Linux that provides actual useful meaning to consumers, so that they can understand the differences between GNU/Linux-based operating systems and other Linux-based operating systems. To that end, I believe a reasonable definition of the GNU/Linux ecosystem to be distributions which:

- use the GNU C Library (frequently referred to as glibc)
- use the GNU coreutils package for their base UNIX commands (such as `/bin/cat` and so on).

From a technical perspective, an easy way to check if you are on a GNU/Linux system would be to attempt to run the `/lib/libc.so.6` command. If you are running on a GNU/Linux system, this will print the glibc version that is installed. This technical definition of GNU/Linux also provides value, because some drivers and proprietary applications, such as the nVidia proprietary graphics driver, only support GNU/Linux systems.

Given this rubric, we can easily test a few popular distributions and make some conclusions about their capabilities:

- Debian-based Linux distributions, including Debian itself, and also Ubuntu and elementary, meet the above preconditions and are therefore GNU/Linux distributions.
- Fedora and the other distributions published by Red Hat also meet the same criterion to be defined as a GNU/Linux distribution.
- ArchLinux also meets the above criterion, and therefore is also a GNU/Linux distribution. Indeed, the preferred distribution of the FSF, Parabola, describes itself as GNU/Linux and is derived from Arch.
- Alpine does not use the GNU C library, and therefore is not a GNU/Linux distribution. Compatibility with GNU/Linux programs should not be assumed. More on that in a moment.
- Similarly, OpenWrt is not a GNU/Linux distribution.
- Android is also not a GNU/Linux distribution, nor is Replicant, despite the latter being sponsored by the FSF.

## on compatibility between distros

Even between GNU/Linux distributions, compatibility is difficult. Different GNU/Linux distributions upgrade their components at different times, and due to dynamic linking, a program built against one specific set of components and build configurations may or may not run successfully on another GNU/Linux system. Some amount of binary compatibility is possible, but only if you take care to account for these differences.

On top of this, there is no binary compatibility between Linux ecosystems at large. GNU/Linux binaries require the gcompat compatibility framework to run on Alpine, and it generally is not possible to run OpenWrt binaries on Alpine or vice versa. The situation is the same with Android: without a compatibility tool (such as Termux), it is not possible to run binaries from other ecosystems there.

Exacerbating the problem, developers also target specific APIs only available in their respective ecosystems:

- systemd makes use of glibc-specific APIs, which are not part of POSIX
- Android makes use of bionic-specific APIs, which are not part of POSIX
- Alpine and OpenWrt both make use of internal frameworks, and these differ between the two ecosystems (although there are active efforts to converge both ecosystems).

As a result, as a developer, it is important to note which ecosystems you are targeting, and it is important to refer to individual ecosystems, rather than saying &#34;my program supports Linux.&#34; There are dozens of ecosystems which make use of the Linux kernel, and it is unlikely that a program supports all of them, or that the author is even aware of them.

To conclude, it is both correct and important, to refer to GNU/Linux distributions as GNU/Linux distributions. Likewise, it is important to realize that non-GNU/Linux distributions exist, and are not necessarily compatible with the GNU/Linux ecosystem for your application. Each ecosystem is distinct, with its own strengths and weaknesses.
</source:markdown>
    </item>
    
    <item>
      <title>the tragedy of gethostbyname</title>
      <link>https://ariadne.space/2022/03/26/the-tragedy-of-gethostbyname.html</link>
      <pubDate>Sat, 26 Mar 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/03/27/the-tragedy-of-gethostbyname.html</guid>
      <description>&lt;p&gt;A frequent complaint expressed on a certain website about Alpine is related to the deficiencies regarding the musl DNS resolver when querying large zones. In response, it is usually mentioned that applications which are expecting reliable DNS lookups should be using a dedicated DNS library for this task, not the &lt;code&gt;getaddrinfo&lt;/code&gt; or &lt;code&gt;gethostbyname&lt;/code&gt; APIs, but this is usually rebuffed by comments saying that these APIs are fine to use because they are allegedly reliable on GNU/Linux.&lt;/p&gt;
&lt;p&gt;For a number of reasons, the assertion that DNS resolution via these APIs under glibc is more reliable is false, but to understand why, we must look at the history of why a &lt;code&gt;libc&lt;/code&gt; is responsible for shipping these functions to begin with, and how these APIs evolved over the years. For instance, did you know that &lt;code&gt;gethostbyname&lt;/code&gt; originally didn&amp;rsquo;t do DNS queries at all? And, the big question: why are these APIs blocking, when DNS is inherently an asynchronous protocol?&lt;/p&gt;
&lt;p&gt;Before we get into this, it is important to again restate that if you are an application developer, and your application depends on reliable DNS performance, you must absolutely use a dedicated DNS resolver library designed for this task. There are many libraries available that are good for this purpose, such as &lt;a href=&#34;https://c-ares.org/&#34;&gt;c-ares&lt;/a&gt;, &lt;a href=&#34;https://www.gnu.org/software/adns/&#34;&gt;GNU adns&lt;/a&gt;, &lt;a href=&#34;https://skarnet.org/software/s6-dns/&#34;&gt;s6-dns&lt;/a&gt; and &lt;a href=&#34;https://github.com/OpenSMTPD/libasr&#34;&gt;OpenBSD&amp;rsquo;s libasr&lt;/a&gt;. As should hopefully become obvious at the end of this article, the DNS clients included with &lt;code&gt;libc&lt;/code&gt; are designed to provide basic functionality only, and there is no guarantee of portable behavior across client implementations.&lt;/p&gt;
&lt;h2 id=&#34;the-introduction-of-gethostbyname&#34;&gt;the introduction of &lt;code&gt;gethostbyname&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Where did &lt;code&gt;gethostbyname&lt;/code&gt; come from, anyway? Most people believe this function came from BIND, the reference DNS implementation developed by the Berkeley CSRG. In reality, it was introduced to BSD in 1982, alongside the &lt;code&gt;sethostent&lt;/code&gt; and &lt;code&gt;gethostent&lt;/code&gt; APIs. I happen to have a copy of the 4.2BSD source code, so here is the implementation from 4.2BSD, which was released in early 1983:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-c&#34; data-lang=&#34;c&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;struct&lt;/span&gt; hostent &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;
&lt;span style=&#34;color:#a6e22e&#34;&gt;gethostbyname&lt;/span&gt;(name)
	&lt;span style=&#34;color:#66d9ef&#34;&gt;register&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;name;
{
	&lt;span style=&#34;color:#66d9ef&#34;&gt;register&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;struct&lt;/span&gt; hostent &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;p;
	&lt;span style=&#34;color:#66d9ef&#34;&gt;register&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;char&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;**&lt;/span&gt;cp;

	sethostent(&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;);
	&lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; (p &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; gethostent()) {
		&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (strcmp(p&lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt;h_name, name) &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
			&lt;span style=&#34;color:#66d9ef&#34;&gt;break&lt;/span&gt;;
		&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; (cp &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; p&lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt;h_aliases; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;cp &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;; cp&lt;span style=&#34;color:#f92672&#34;&gt;++&lt;/span&gt;)
			&lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (strcmp(&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;cp, name) &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;)
				&lt;span style=&#34;color:#66d9ef&#34;&gt;goto&lt;/span&gt; found;
	}
found:
	endhostent();
	&lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; (p);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;As you can see, the 4.2BSD implementation only checks the &lt;code&gt;/etc/hosts&lt;/code&gt; file and nothing else. This also explains why &lt;code&gt;gethostbyname&lt;/code&gt; and its successor, &lt;code&gt;getaddrinfo&lt;/code&gt;, do DNS queries in a blocking way: DNS support was retrofitted into an API that was already synchronous, rather than introducing an asynchronous replacement for &lt;code&gt;gethostbyname&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;the-introduction-of-dns-to-gethostbyname&#34;&gt;the introduction of DNS to &lt;code&gt;gethostbyname&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;DNS resolution was first added to &lt;code&gt;gethostbyname&lt;/code&gt; in 1984, during the development of what became 4.3BSD. &lt;a href=&#34;https://github.com/dank101/4.3BSD-Reno/blob/00328b5a67ffe35e67baeba8f7ab75af79f7ae64/lib/libc/net/gethostnamadr.c#L213&#34;&gt;This version, which is too long to include here,&lt;/a&gt; also translated dotted-quad IPv4 addresses into a &lt;code&gt;struct hostent&lt;/code&gt;. In essence, the 4.3BSD implementation does the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;If the requested hostname begins with a number, try to parse it as a dotted quad. If this fails, set &lt;code&gt;h_errno&lt;/code&gt; to &lt;code&gt;HOST_NOT_FOUND&lt;/code&gt; and bail. Yes, this means 4.3BSD would fail to resolve hostnames like &lt;code&gt;12-34-56-78.static.example.com&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Attempt to do a DNS query using &lt;code&gt;res_search&lt;/code&gt;. If the query was successful, return the first IP address found as the &lt;code&gt;struct hostent&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If the DNS query failed, fall back to the original &lt;code&gt;/etc/hosts&lt;/code&gt; searching algorithm above, now called &lt;code&gt;_gethtbyname&lt;/code&gt; and using &lt;code&gt;strcasecmp&lt;/code&gt; instead of &lt;code&gt;strcmp&lt;/code&gt; (for consistency with DNS).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A fixed version of this algorithm was also included with BIND&amp;rsquo;s &lt;code&gt;libresolv&lt;/code&gt; as &lt;code&gt;res_gethostbyname&lt;/code&gt;, and the &lt;code&gt;res_search&lt;/code&gt; and related functions were imported into BSD libc from BIND.&lt;/p&gt;
&lt;h2 id=&#34;standardization-of-gethostbyname-in-posix&#34;&gt;standardization of &lt;code&gt;gethostbyname&lt;/code&gt; in POSIX&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;gethostbyname&lt;/code&gt; and &lt;code&gt;getaddrinfo&lt;/code&gt; APIs were first standardized in the X/Open Networking Services Issue 4 specification (commonly referred to as XNS4), which itself was part of the X/Open Single Unix Specification, released in 1995. Of note, X/Open tried to deprecate &lt;code&gt;gethostbyname&lt;/code&gt; in favor of &lt;code&gt;getaddrinfo&lt;/code&gt; as part of the XNS5 specification, &lt;a href=&#34;https://pubs.opengroup.org/onlinepubs/009619199/netdbh.htm#tagcjh_06_02&#34;&gt;removing it entirely except for a mention in their specification for &lt;code&gt;netdb.h&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Later, it returned &lt;a href=&#34;https://pubs.opengroup.org/onlinepubs/009696799/functions/gethostbyaddr.html&#34;&gt;as part of POSIX issue 6, released in 2004&lt;/a&gt;. That version says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; In many cases it is implemented by the Domain Name System, as documented in RFC 1034, RFC 1035, and RFC 1886.&lt;/p&gt;
&lt;p&gt;POSIX issue 6, IEEE 1003.1:2004.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Oh no, what is this about, and do application developers need to care about it? Very simply, it is about the &lt;a href=&#34;https://en.wikipedia.org/wiki/Name_Service_Switch&#34;&gt;Name Service Switch&lt;/a&gt;, frequently referred to as NSS, which allows the &lt;code&gt;gethostbyname&lt;/code&gt; function to have hotpluggable implementations. The Name Service Switch was introduced in Solaris to allow support for Sun&amp;rsquo;s NIS+ directory service.&lt;/p&gt;
&lt;p&gt;As developers of other operating systems wanted to support software like Kerberos and LDAP, it was quickly reimplemented in other systems as well, such as GNU/Linux. These days, systems running systemd frequently use this feature in combination with a custom NSS module named &lt;code&gt;nss-systemd&lt;/code&gt; to force use of &lt;code&gt;systemd-resolved&lt;/code&gt; as the DNS resolver, which has different behavior than the original DNS client derived from BIND that ships in most &lt;code&gt;libc&lt;/code&gt; implementations.&lt;/p&gt;
&lt;p&gt;An administrator can disable support for DNS lookups entirely, simply by removing the &lt;code&gt;dns&lt;/code&gt; module from the &lt;code&gt;/etc/nsswitch.conf&lt;/code&gt; file. Application developers depending on reliable DNS service therefore need to care a lot about this: on systems with NSS, your application cannot depend on &lt;code&gt;gethostbyname&lt;/code&gt; to actually support DNS at all.&lt;/p&gt;
&lt;h2 id=&#34;musl-and-dns&#34;&gt;musl and DNS&lt;/h2&gt;
&lt;p&gt;Given the background above, it should be obvious by now that musl&amp;rsquo;s DNS client was written under the assumption that applications with specific requirements for DNS would use a specialized library for that purpose. The &lt;code&gt;gethostbyname&lt;/code&gt; and &lt;code&gt;getaddrinfo&lt;/code&gt; APIs are not really suitable for this task, since their behavior is entirely implementation-defined and largely built around blocking queries to a directory service.&lt;/p&gt;
&lt;p&gt;Because of this, the DNS client was written to behave as simply as possible. However, the use of DNS for bulk data distribution, such as in DNSSEC, DKIM and other applications, has led to a desire to implement support for DNS over TCP as an extension to the musl DNS client.&lt;/p&gt;
&lt;p&gt;In practice, this will fix the remaining complaints about the musl DNS client once it lands in a musl release, but application authors depending on reliable DNS performance should really use a dedicated DNS client library for that purpose: using APIs that were designed to simply parse &lt;code&gt;/etc/hosts&lt;/code&gt; and had DNS support shoehorned into them will always deliver unreliable results.&lt;/p&gt;
</description>
      <source:markdown>
A frequent complaint expressed on a certain website about Alpine is related to the deficiencies regarding the musl DNS resolver when querying large zones. In response, it is usually mentioned that applications which are expecting reliable DNS lookups should be using a dedicated DNS library for this task, not the `getaddrinfo` or `gethostbyname` APIs, but this is usually rebuffed by comments saying that these APIs are fine to use because they are allegedly reliable on GNU/Linux.

For a number of reasons, the assertion that DNS resolution via these APIs under glibc is more reliable is false, but to understand why, we must look at the history of why a `libc` is responsible for shipping these functions to begin with, and how these APIs evolved over the years. For instance, did you know that `gethostbyname` originally didn&#39;t do DNS queries at all? And, the big question: why are these APIs blocking, when DNS is inherently an asynchronous protocol?

Before we get into this, it is important to again restate that if you are an application developer, and your application depends on reliable DNS performance, you must absolutely use a dedicated DNS resolver library designed for this task. There are many libraries available that are good for this purpose, such as [c-ares](https://c-ares.org/), [GNU adns](https://www.gnu.org/software/adns/), [s6-dns](https://skarnet.org/software/s6-dns/) and [OpenBSD&#39;s libasr](https://github.com/OpenSMTPD/libasr). As should hopefully become obvious at the end of this article, the DNS clients included with `libc` are designed to provide basic functionality only, and there is no guarantee of portable behavior across client implementations.

## the introduction of `gethostbyname`

Where did `gethostbyname` come from, anyway? Most people believe this function came from BIND, the reference DNS implementation developed by the Berkeley CSRG. In reality, it was introduced to BSD in 1982, alongside the `sethostent` and `gethostent` APIs. I happen to have a copy of the 4.2BSD source code, so here is the implementation from 4.2BSD, which was released in early 1983:

```c
struct hostent *
gethostbyname(name)
	register char *name;
{
	register struct hostent *p;
	register char **cp;

	sethostent(0);
	while (p = gethostent()) {
		if (strcmp(p-&gt;h_name, name) == 0)
			break;
		for (cp = p-&gt;h_aliases; *cp != 0; cp++)
			if (strcmp(*cp, name) == 0)
				goto found;
	}
found:
	endhostent();
	return (p);
}
```

As you can see, the 4.2BSD implementation only checks the `/etc/hosts` file and nothing else. This answers the question of why `gethostbyname` and its successor, `getaddrinfo`, do DNS queries in a blocking way: the original function simply read a local file, so it was synchronous by nature, and the BSD developers did not want to introduce an asynchronous replacement API for `gethostbyname` when DNS support was added later.
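
That synchronous shape survives unchanged in the modern API. As a quick sketch (resolving `localhost` purely for illustration), the `getaddrinfo` call below does not return until the lookup has completed, however long that takes:

```c
#include &lt;netdb.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;arpa/inet.h&gt;
#include &lt;netinet/in.h&gt;

int main(void)
{
	struct addrinfo hints, *res;

	memset(&amp;hints, 0, sizeof hints);
	hints.ai_family = AF_INET;	/* IPv4 only, for a predictable result */
	hints.ai_socktype = SOCK_STREAM;

	/* Blocks the calling thread until resolution completes or fails. */
	int rc = getaddrinfo(&#34;localhost&#34;, NULL, &amp;hints, &amp;res);
	if (rc != 0) {
		fprintf(stderr, &#34;getaddrinfo: %s\n&#34;, gai_strerror(rc));
		return 1;
	}

	char buf[INET_ADDRSTRLEN];
	struct sockaddr_in *sin = (struct sockaddr_in *)res-&gt;ai_addr;
	inet_ntop(AF_INET, &amp;sin-&gt;sin_addr, buf, sizeof buf);
	printf(&#34;%s\n&#34;, buf);

	freeaddrinfo(res);
	return 0;
}
```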

## the introduction of DNS to `gethostbyname`

DNS resolution was first introduced to `gethostbyname` in 1984, when support for it was added to BSD. [This version, which is too long to include here](https://github.com/dank101/4.3BSD-Reno/blob/00328b5a67ffe35e67baeba8f7ab75af79f7ae64/lib/libc/net/gethostnamadr.c#L213), also translated dotted-quad IPv4 addresses into a `struct hostent`. In essence, the 4.3BSD implementation does the following:

1. If the requested hostname begins with a number, try to parse it as a dotted quad. If this fails, set `h_errno` to `HOST_NOT_FOUND` and bail. Yes, this means 4.3BSD would fail to resolve hostnames like `12-34-56-78.static.example.com`.
2. Attempt to do a DNS query using `res_search`. If the query was successful, return the first IP address found as the `struct hostent`.
3. If the DNS query failed, fall back to the original `/etc/hosts` searching algorithm above, now called `_gethtbyname` and using `strcasecmp` instead of `strcmp` (for consistency with DNS).

A fixed version of this algorithm was also included with BIND&#39;s `libresolv` as `res_gethostbyname`, and the `res_search` and related functions were imported into BSD libc from BIND.

## standardization of `gethostbyname` in POSIX

The `gethostbyname` and `getaddrinfo` APIs were first standardized in the X/Open Networking Services Issue 4 specification (commonly referred to as XNS4), which itself was part of the X/Open Single Unix Specification, released in 1995. Of note, X/Open tried to deprecate `gethostbyname` in favor of `getaddrinfo` as part of the XNS5 specification, [removing it entirely except for a mention in their specification for `netdb.h`](https://pubs.opengroup.org/onlinepubs/009619199/netdbh.htm#tagcjh_06_02).

Later, it returned [as part of POSIX issue 6, released in 2004](https://pubs.opengroup.org/onlinepubs/009696799/functions/gethostbyaddr.html). That version says:

&gt; **Note:** In many cases it is implemented by the Domain Name System, as documented in RFC 1034, RFC 1035, and RFC 1886.
&gt; 
&gt; POSIX issue 6, IEEE 1003.1:2004.

Oh no, what is this about, and do application developers need to care about it? Very simply, it is about the [Name Service Switch](https://en.wikipedia.org/wiki/Name_Service_Switch), frequently referred to as NSS, which allows the `gethostbyname` function to have hotpluggable implementations. The Name Service Switch was introduced in Solaris to allow support for Sun&#39;s NIS+ directory service.

As developers of other operating systems wanted to support software like Kerberos and LDAP, it was quickly reimplemented in other systems as well, such as GNU/Linux. These days, systems running systemd frequently use this feature in combination with a custom NSS module named `nss-systemd` to force use of `systemd-resolved` as the DNS resolver, which has different behavior than the original DNS client derived from BIND that ships in most `libc` implementations.

An administrator can disable support for DNS lookups entirely, simply by removing the `dns` module from the `/etc/nsswitch.conf` file. Application developers depending on reliable DNS service therefore need to care a lot about this: on systems with NSS, your application cannot depend on `gethostbyname` to actually support DNS at all.
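
For illustration, the relevant line of a typical `/etc/nsswitch.conf` looks like the excerpt below; deleting `dns` from it turns off DNS resolution through `gethostbyname` and `getaddrinfo` on that system:

```
# /etc/nsswitch.conf (excerpt)
hosts: files dns
```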

## musl and DNS

Given the background above, it should be obvious by now that musl&#39;s DNS client was written under the assumption that applications with specific requirements for DNS would use a specialized library for that purpose. The `gethostbyname` and `getaddrinfo` APIs are not really suitable for this task, since their behavior is entirely implementation-defined and largely built around blocking queries to a directory service.

Because of this, the DNS client was written to behave as simply as possible. However, the use of DNS for bulk data distribution, such as in DNSSEC, DKIM and other applications, has led to a desire to implement support for DNS over TCP as an extension to the musl DNS client.

In practice, this will fix the remaining complaints about the musl DNS client once it lands in a musl release, but application authors depending on reliable DNS performance should really use a dedicated DNS client library for that purpose: using APIs that were designed to simply parse `/etc/hosts` and had DNS support shoehorned into them will always deliver unreliable results.
</source:markdown>
    </item>
    
    <item>
      <title>how to refresh older stuffed animals</title>
      <link>https://ariadne.space/2022/02/11/how-to-refresh-older-stuffed.html</link>
      <pubDate>Fri, 11 Feb 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/02/12/how-to-refresh-older-stuffed.html</guid>
      <description>&lt;p&gt;As many of my readers are likely aware, I have a large collection of stuffed animals, but my favorite one is the first generation Jellycat Bashful Bunny that I have had for the past 10 years or so. Recently I noticed that my bunny was starting to turn purple, likely from the purple stain that is applied to my hair, which bleeds onto anything when given the opportunity to do so. As Jellycat no longer makes the first generation bashfuls (they have been replaced with a second generation that uses a different fabric), I decided that my bunny needed to be refreshed, and as there is not really any good documentation on how to clean a high-end stuffed animal, I figured I would write a blog on it.&lt;/p&gt;
&lt;h2 id=&#34;understanding-what-youre-dealing-with&#34;&gt;understanding what you&amp;rsquo;re dealing with&lt;/h2&gt;
&lt;p&gt;What the stuffed animal is made out of is important to know about before coming up with a strategy to refresh it. If the stuffed animal has plastic pellets to help it sit right (which the Jellycat Bashfuls do), then you need to use lower temperatures to ensure the pellets don&amp;rsquo;t melt. If there are glued on components (as is frequently the case with lower-end stuffed animals), forget about trying this and just buy a new one.&lt;/p&gt;
&lt;p&gt;If the stuffed animal has vibrant colors, you should probably avoid using detergent, or, at the very least, you should use less detergent than you would normally. These vibrant colors are created by staining white fabric rather than dyeing it; in other words, the pigment is sitting on the surface of the fabric rather than being part of the fabric itself. As with plastic components, you should use lower temperatures too, as the pigment used in these stains tends to wash away if the water is warm enough (around 40 degrees Celsius or so).&lt;/p&gt;
&lt;h2 id=&#34;the-washing-process&#34;&gt;the washing process&lt;/h2&gt;
&lt;p&gt;Ultimately I decided to play it safe and wash my stuffed bunny with cold water, some fabric softener and a Tide pod. However, the spin cycle was quite concerning to me, as it spins quite fast and with a lot of force. To ensure that the bunny was not harmed by the spin cycle, I put him in a pillowcase and tied the end of it. Put the washing machine on the delicate program to ensure it spends as little time as possible in the spin cycle. Also, I would not recommend washing a stuffed animal with other laundry.&lt;/p&gt;
&lt;p&gt;Come back when the program completes (about 30 minutes), and put the stuffed animal in the dryer. You should remove the stuffed animal from the pillowcase at this time and dry both the animal and the pillowcase separately. Put the dryer on the delicate program again, and be prepared to run it through multiple cycles. In the case of my bunny, it took a total of two 45-minute cycles to completely dry.&lt;/p&gt;
&lt;p&gt;Once done, your stuffed animal should be back to its usual self, and with the tumble drying, it will likely be a little bit fuzzier than it was before, kind of like it came from the factory.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://twitter.com/ariadneconill/status/1492417671966511110&#34;&gt;twitter.com/ariadneco&amp;hellip;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Bonus content: 1 minute of a tumbling bunny.&lt;/p&gt;
</description>
      <source:markdown>
As many of my readers are likely aware, I have a large collection of stuffed animals, but my favorite one is the first generation Jellycat Bashful Bunny that I have had for the past 10 years or so. Recently I noticed that my bunny was starting to turn purple, likely from the purple stain that is applied to my hair, which bleeds onto anything when given the opportunity to do so. As Jellycat no longer makes the first generation bashfuls (they have been replaced with a second generation that uses a different fabric), I decided that my bunny needed to be refreshed. Since there is not really any good documentation on how to clean a high-end stuffed animal, I figured I would write a blog post about it.

## understanding what you&#39;re dealing with

What the stuffed animal is made out of is important to know about before coming up with a strategy to refresh it. If the stuffed animal has plastic pellets to help it sit right (which the Jellycat Bashfuls do), then you need to use lower temperatures to ensure the pellets don&#39;t melt. If there are glued on components (as is frequently the case with lower-end stuffed animals), forget about trying this and just buy a new one.

If the stuffed animal has vibrant colors, you should probably avoid using detergent, or, at the very least, you should use less detergent than you would normally. These vibrant colors are created by staining white fabric rather than dyeing it; in other words, the pigment is sitting on the surface of the fabric rather than being part of the fabric itself. As with plastic components, you should use lower temperatures too, as the pigment used in these stains tends to wash away if the water is warm enough (around 40 degrees Celsius or so).

## the washing process

Ultimately I decided to play it safe and wash my stuffed bunny with cold water, some fabric softener and a Tide pod. However, the spin cycle was quite concerning to me, as it spins quite fast and with a lot of force. To ensure that the bunny was not harmed by the spin cycle, I put him in a pillowcase and tied the end of it. Put the washing machine on the delicate program to ensure it spends as little time as possible in the spin cycle. Also, I would not recommend washing a stuffed animal with other laundry.

Come back when the program completes (about 30 minutes), and put the stuffed animal in the dryer. You should remove the stuffed animal from the pillowcase at this time and dry both the animal and the pillowcase separately. Put the dryer on the delicate program again, and be prepared to run it through multiple cycles. In the case of my bunny, it took a total of two 45-minute cycles to completely dry.

Once done, your stuffed animal should be back to its usual self, and with the tumble drying, it will likely be a little bit fuzzier than it was before, kind of like it came from the factory.

[twitter.com/ariadneco...](https://twitter.com/ariadneconill/status/1492417671966511110)

Bonus content: 1 minute of a tumbling bunny.
</source:markdown>
    </item>
    
    <item>
      <title>JSON-LD is ideal for Cloud Native technologies</title>
      <link>https://ariadne.space/2022/02/10/jsonld-is-ideal-for-cloud.html</link>
      <pubDate>Thu, 10 Feb 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/02/11/jsonld-is-ideal-for-cloud.html</guid>
      <description>&lt;p&gt;Frequently I have been told by developers that it is impossible to have extensible JSON documents underpinning their projects, because there may be collisions later. For those of us who are unaware of more capable graph serializations such as JSON-LD and Turtle, this seems like a reasonable position. Accordingly, I would like to introduce you all to JSON-LD, using a practical real-world deployment as an example, as well as how one might use JSON-LD to extend something like OCI container manifests.&lt;/p&gt;
&lt;p&gt;You might feel compelled to look up JSON-LD on Google before continuing with reading this. My suggestion is to not do that, because &lt;a href=&#34;https://json-ld.org/&#34;&gt;the JSON-LD website&lt;/a&gt; is really aimed towards web developers, and this explanation will hopefully explain how a systems engineer can make use of JSON-LD graphs in practical terms. And, if it doesn&amp;rsquo;t, feel free to DM me on Twitter or something.&lt;/p&gt;
&lt;h2 id=&#34;what-json-ld-can-do-for-you&#34;&gt;what JSON-LD can do for you&lt;/h2&gt;
&lt;p&gt;Have you ever wanted any of the following in the scenarios where you use JSON:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Conflict-free extensibility&lt;/li&gt;
&lt;li&gt;Strong typing&lt;/li&gt;
&lt;li&gt;Compatibility with the RDF ecosystem (e.g. XQuery, SPARQL, etc)&lt;/li&gt;
&lt;li&gt;Self-describing schemas&lt;/li&gt;
&lt;li&gt;Transparent document inclusion&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you answered yes to any of these, then JSON-LD is for you. Some of these capabilities are also provided by the IETF&amp;rsquo;s &lt;a href=&#34;http://json-schema.org/&#34;&gt;JSON Schema project&lt;/a&gt;, but it has a much higher learning curve than JSON-LD.&lt;/p&gt;
&lt;p&gt;This post will be primarily focused on how namespaces and aliases can be used to provide extensibility while also providing backwards compatibility for clients that are not JSON-LD aware. In general, I believe strongly that any open standard built on JSON should actually be built on JSON-LD, and hopefully my examples will demonstrate why I believe this.&lt;/p&gt;
&lt;h2 id=&#34;activitypub-a-real-world-case-study&#34;&gt;ActivityPub: a real-world case study&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://www.w3.org/TR/activitypub&#34;&gt;ActivityPub is a protocol&lt;/a&gt; that is used on the federated social web (thankfully entirely unrelated to Web3), that is built on the ActivityStreams 2.0 specification. Both ActivityPub and ActivityStreams are RDF vocabularies that are represented as JSON-LD documents, but you don&amp;rsquo;t really need to know or care about this part.&lt;/p&gt;
&lt;p&gt;This is a very simplified representation of an ActivityPub actor object:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;{
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@context&amp;#34;&lt;/span&gt;: [
    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.w3.org/ns/activitystreams&amp;#34;&lt;/span&gt;,
    {
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;alsoKnownAs&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;as:alsoKnownAs&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sec&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://w3id.org/security#&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;owner&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:owner&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKey&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:publicKey&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKeyPem&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:publicKeyPem&amp;#34;&lt;/span&gt;
    }
  ],
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;alsoKnownAs&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://corp.example.org/~alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;inbox&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice/inbox&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;name&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Person&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKey&amp;#34;&lt;/span&gt;: {
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice#key&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;owner&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKeyPem&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;...&amp;#34;&lt;/span&gt;
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Pay attention to the &lt;code&gt;@context&lt;/code&gt; variable here, it is doing a few things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It pulls in the entire ActivityStreams and ActivityPub vocabularies by reference. These can be downloaded on the fly or bundled with the application using context preloading.&lt;/li&gt;
&lt;li&gt;It then defines a few terms outside of those vocabularies: &lt;code&gt;alsoKnownAs&lt;/code&gt;, &lt;code&gt;sec&lt;/code&gt;, &lt;code&gt;owner&lt;/code&gt;, &lt;code&gt;publicKey&lt;/code&gt; and &lt;code&gt;publicKeyPem&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;When an application that is JSON-LD aware parses this document, it will receive a document that looks like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;{
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@context&amp;#34;&lt;/span&gt;: [
    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.w3.org/ns/activitystreams&amp;#34;&lt;/span&gt;,
    {
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;alsoKnownAs&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;as:alsoKnownAs&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sec&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://w3id.org/security#&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;owner&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:owner&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKey&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:publicKey&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;
      },
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;publicKeyPem&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sec:publicKeyPem&amp;#34;&lt;/span&gt;
    }
  ],
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Person&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;as:alsoKnownAs&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://corp.example.org/~alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;as:inbox&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice/inbox&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;as:name&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Alice&amp;#34;&lt;/span&gt;,
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sec:publicKey&amp;#34;&lt;/span&gt;: {
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice#key&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sec:owner&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://www.example.com/~alice&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sec:publicKeyPem&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;...&amp;#34;&lt;/span&gt;
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This allows extensions to interoperate with minimal conflicts, as the application is operating on a normalized version of the document that has as many things namespaced as possible, without the user having to worry about it. It also allows a parser to easily ignore things it does not know about: terms that aren&amp;rsquo;t defined in the context (which does not actually have to be inlined; you can preload a root context) aren&amp;rsquo;t placed in a namespace, and so can be safely skipped.&lt;/p&gt;
&lt;p&gt;In other words, that &lt;code&gt;@context&lt;/code&gt; variable can be built into the application, or stored in an S3 bucket somewhere, or whatever you want to do. If you are planning to have an interoperable protocol, however, providing a useful &lt;code&gt;@context&lt;/code&gt; is crucial.&lt;/p&gt;
&lt;h2 id=&#34;how-oci-image-manifests-could-benefit-from-json-ld&#34;&gt;How OCI image manifests could benefit from JSON-LD&lt;/h2&gt;
&lt;p&gt;There was a discussion on Twitter this evening about how extending the OCI image spec with signature references has taken a year. If OCI used JSON-LD (ironically, its JSON vocabulary is already similar to several pre-existing JSON-LD ones), then implementations could just store the pre-existing metadata, mapped to a namespace. In the case of an OCI image, this might look something like:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;{
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@context&amp;#34;&lt;/span&gt;: [
    &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://opencontainers.org/ns&amp;#34;&lt;/span&gt;,
    {
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;sigstore&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;https://sigstore.dev/ns&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;reference&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@type&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;@id&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sigstore:reference&amp;#34;&lt;/span&gt;
      }
    }
  ],
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;config&amp;#34;&lt;/span&gt;: {
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;mediaType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;application/vnd.oci.image.config.v1+json&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;digest&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sha256:d539cd357acb4a6df2a4ef99db5fe70714458349232dad0ec73e1ed65f6a0e13&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;size&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;585&lt;/span&gt;
  },
  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;layers&amp;#34;&lt;/span&gt;: [
    {
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;mediaType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;application/vnd.oci.image.layer.v1.tar+gzip&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;digest&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;size&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;2818413&lt;/span&gt;
    },
    {
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;mediaType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;application/vnd.example.signature+json&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;size&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;3514&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;digest&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sha256:19387f68117dbe07daeef0d99e018f7bbf7a660158d24949ea47bc12a3e4ba17&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;reference&amp;#34;&lt;/span&gt;: {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;mediaType&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;application/vnd.oci.image.layer.v1.tar+gzip&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;digest&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;size&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#ae81ff&#34;&gt;2818413&lt;/span&gt;
      }
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The differences from a current OCI image manifest are minimal. Namely, &lt;code&gt;schemaVersion&lt;/code&gt; has been deleted, because JSON-LD handles this detail automatically, and the signature reference extension has been added as the &lt;code&gt;sigstore:reference&lt;/code&gt; property. Hopefully you can imagine how the rest of the document looks namespace-wise.&lt;/p&gt;
&lt;p&gt;One last thing about this example. You might notice that I am using URIs when I define namespaces in the &lt;code&gt;@context&lt;/code&gt;. This is a great feature of the RDF ecosystem: you can put up a webpage at those URIs defining how to make use of the terms defined in the namespace, meaning that JSON-LD tooling can have rich documentation built in.&lt;/p&gt;
&lt;p&gt;Also, since I am well aware that basically all of these OCI tools are written in Go, it should be noted that Go has an &lt;a href=&#34;https://pkg.go.dev/github.com/go-ap/jsonld&#34;&gt;excellent implementation of JSON-LD&lt;/a&gt;, and for those concerned that W3C proposals are sometimes not in touch with reality, the creator of JSON-LD has &lt;a href=&#34;http://manu.sporny.org/2014/json-ld-origins-2/&#34;&gt;some interesting words about it&lt;/a&gt;. Now, please, use JSON-LD and stop worrying about extensibility in open technology; this problem is totally solved.&lt;/p&gt;
</description>
      <source:markdown>
Frequently I have been told by developers that it is impossible to have extensible JSON documents underpinning their projects, because there may be collisions later. For those who are unaware of more capable graph serializations such as JSON-LD and Turtle, this seems like a reasonable position. Accordingly, I would like to introduce you all to JSON-LD, using a practical real-world deployment as an example, as well as how one might use JSON-LD to extend something like OCI container manifests.

You might feel compelled to look up JSON-LD on Google before continuing with reading this. My suggestion is to not do that, because [the JSON-LD website](https://json-ld.org/) is really aimed towards web developers, and this explanation will hopefully explain how a systems engineer can make use of JSON-LD graphs in practical terms. And, if it doesn&#39;t, feel free to DM me on Twitter or something.

## what JSON-LD can do for you

Have you ever wanted any of the following in the scenarios where you use JSON:

- Conflict-free extensibility
- Strong typing
- Compatibility with the RDF ecosystem (e.g. XQuery, SPARQL, etc)
- Self-describing schemas
- Transparent document inclusion

If you answered yes to any of these, then JSON-LD is for you. Some of these capabilities are also provided by the IETF&#39;s [JSON Schema project](http://json-schema.org/), but it has a much higher learning curve than JSON-LD.

This post will be primarily focused on how namespaces and aliases can be used to provide extensibility while also providing backwards compatibility for clients that are not JSON-LD aware. In general, I believe strongly that any open standard built on JSON should actually be built on JSON-LD, and hopefully my examples will demonstrate why I believe this.

## ActivityPub: a real-world case study

[ActivityPub is a protocol](https://www.w3.org/TR/activitypub) that is used on the federated social web (thankfully entirely unrelated to Web3), that is built on the ActivityStreams 2.0 specification. Both ActivityPub and ActivityStreams are RDF vocabularies that are represented as JSON-LD documents, but you don&#39;t really need to know or care about this part.

This is a very simplified representation of an ActivityPub actor object:

```json
{
  &#34;@context&#34;: [
    &#34;https://www.w3.org/ns/activitystreams&#34;,
    {
      &#34;alsoKnownAs&#34;: {
        &#34;@id&#34;: &#34;as:alsoKnownAs&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;sec&#34;: &#34;https://w3id.org/security#&#34;,
      &#34;owner&#34;: {
        &#34;@id&#34;: &#34;sec:owner&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;publicKey&#34;: {
        &#34;@id&#34;: &#34;sec:publicKey&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;publicKeyPem&#34;: &#34;sec:publicKeyPem&#34;
    }
  ],
  &#34;alsoKnownAs&#34;: &#34;https://corp.example.org/~alice&#34;,
  &#34;id&#34;: &#34;https://www.example.com/~alice&#34;,
  &#34;inbox&#34;: &#34;https://www.example.com/~alice/inbox&#34;,
  &#34;name&#34;: &#34;Alice&#34;,
  &#34;type&#34;: &#34;Person&#34;,
  &#34;publicKey&#34;: {
    &#34;id&#34;: &#34;https://www.example.com/~alice#key&#34;,
    &#34;owner&#34;: &#34;https://www.example.com/~alice&#34;,
    &#34;publicKeyPem&#34;: &#34;...&#34;
  }
}
```

Pay attention to the `@context` variable here, it is doing a few things:

1. It pulls in the entire ActivityStreams and ActivityPub vocabularies by reference. These can be downloaded on the fly or bundled with the application using context preloading.
2. It then defines a few terms outside of those vocabularies: `alsoKnownAs`, `sec`, `owner`, `publicKey` and `publicKeyPem`.

When an application that is JSON-LD aware parses this document, it will receive a document that looks like this:

```json
{
  &#34;@context&#34;: [
    &#34;https://www.w3.org/ns/activitystreams&#34;,
    {
      &#34;alsoKnownAs&#34;: {
        &#34;@id&#34;: &#34;as:alsoKnownAs&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;sec&#34;: &#34;https://w3id.org/security#&#34;,
      &#34;owner&#34;: {
        &#34;@id&#34;: &#34;sec:owner&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;publicKey&#34;: {
        &#34;@id&#34;: &#34;sec:publicKey&#34;,
        &#34;@type&#34;: &#34;@id&#34;
      },
      &#34;publicKeyPem&#34;: &#34;sec:publicKeyPem&#34;
    }
  ],
  &#34;@id&#34;: &#34;https://www.example.com/~alice&#34;,
  &#34;@type&#34;: &#34;Person&#34;,
  &#34;as:alsoKnownAs&#34;: &#34;https://corp.example.org/~alice&#34;,
  &#34;as:inbox&#34;: &#34;https://www.example.com/~alice/inbox&#34;,
  &#34;as:name&#34;: &#34;Alice&#34;,
  &#34;sec:publicKey&#34;: {
    &#34;@id&#34;: &#34;https://www.example.com/~alice#key&#34;,
    &#34;sec:owner&#34;: &#34;https://www.example.com/~alice&#34;,
    &#34;sec:publicKeyPem&#34;: &#34;...&#34;
  }
}
```

This allows extensions to interoperate with minimal conflicts: the application operates on a normalized version of the document in which as many terms as possible are namespaced, without the user having to worry about it. It also lets a parser safely ignore terms it does not know about: anything not defined in the context is never placed in a namespace, and so it simply drops out of the expanded document. (The context itself does not have to be embedded in the document; you can preload a root context.)

In other words, that `@context` variable can be built into the application, or stored in an S3 bucket somewhere, or whatever you want to do. If you are planning to have an interoperable protocol, however, providing a useful `@context` is crucial.
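
To make that normalization step concrete, here is a toy sketch, very much *not* a conforming JSON-LD processor (real code should use a proper JSON-LD library), of how a context maps short terms into namespaces and drops terms it does not know about; the `favoriteColor` extension term is hypothetical:

```python
# Toy sketch of JSON-LD-style term expansion. NOT a conforming
# processor; it only illustrates how a context maps short terms
# into namespaced ones, and how unknown terms drop out.
context = {
    "name": "as:name",
    "inbox": "as:inbox",
    "publicKeyPem": "sec:publicKeyPem",
}

def expand(doc, ctx):
    out = {}
    for key, value in doc.items():
        if key in ("id", "type"):   # aliases for JSON-LD keywords
            out["@" + key] = value
        elif key in ctx:            # term defined in the context
            out[ctx[key]] = value
        # terms absent from the context are silently ignored
    return out

doc = {
    "id": "https://www.example.com/~alice",
    "name": "Alice",
    "inbox": "https://www.example.com/~alice/inbox",
    "favoriteColor": "teal",        # hypothetical extension term
}

expanded = expand(doc, context)
# expanded now has @id, as:name and as:inbox; favoriteColor is gone
```
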

## How OCI image manifests could benefit from JSON-LD

There was a discussion on Twitter this evening about how extending the OCI image spec with signature references has taken a year. If OCI used JSON-LD (ironically, its JSON vocabulary is already similar to several pre-existing JSON-LD ones), then implementations could just store the pre-existing metadata, mapped to a namespace. In the case of an OCI image, this might look something like:

```json
{
  &#34;@context&#34;: [
    &#34;https://opencontainers.org/ns&#34;,
    {
      &#34;sigstore&#34;: &#34;https://sigstore.dev/ns&#34;,
      &#34;reference&#34;: {
        &#34;@type&#34;: &#34;@id&#34;,
        &#34;@id&#34;: &#34;sigstore:reference&#34;
      }
    }
  ],
  &#34;config&#34;: {
    &#34;mediaType&#34;: &#34;application/vnd.oci.image.config.v1+json&#34;,
    &#34;digest&#34;: &#34;sha256:d539cd357acb4a6df2a4ef99db5fe70714458349232dad0ec73e1ed65f6a0e13&#34;,
    &#34;size&#34;: 585
  },
  &#34;layers&#34;: [
    {
      &#34;mediaType&#34;: &#34;application/vnd.oci.image.layer.v1.tar+gzip&#34;,
      &#34;digest&#34;: &#34;sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3&#34;,
      &#34;size&#34;: 2818413
    },
    {
      &#34;mediaType&#34;: &#34;application/vnd.example.signature+json&#34;,
      &#34;size&#34;: 3514,
      &#34;digest&#34;: &#34;sha256:19387f68117dbe07daeef0d99e018f7bbf7a660158d24949ea47bc12a3e4ba17&#34;,
      &#34;reference&#34;: {
        &#34;mediaType&#34;: &#34;application/vnd.oci.image.layer.v1.tar+gzip&#34;,
        &#34;digest&#34;: &#34;sha256:59bf1c3509f33515622619af21ed55bbe26d24913cedbca106468a5fb37a50c3&#34;,
        &#34;size&#34;: 2818413
      }
    }
  ]
}
```

The differences from a current OCI image manifest are minimal. Namely, `schemaVersion` has been deleted, because JSON-LD handles this detail automatically, and the signature reference extension has been added as the `sigstore:reference` property. Hopefully you can imagine how the rest of the document looks namespace-wise.
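
Note also that a client which is not JSON-LD aware loses nothing here: it can parse the manifest as ordinary JSON and treat `@context` as just another unrecognized key. A small sketch (the digests below are truncated placeholders, not real values):

```python
import json

# A trimmed manifest in the shape shown above; digests are
# placeholder values, shortened for brevity.
manifest_json = """
{
  "@context": ["https://opencontainers.org/ns"],
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:d539...",
    "size": 585
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:59bf...",
      "size": 2818413
    }
  ]
}
"""

manifest = json.loads(manifest_json)

# Legacy code paths keep working: config and layers are where they
# have always been, and "@context" is simply an unknown key to skip.
for layer in manifest["layers"]:
    print(layer["mediaType"], layer["size"])
```
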

One last thing about this example. You might notice that I am using URIs when I define namespaces in the `@context`. This is a great feature of the RDF ecosystem: you can put up a webpage at those URIs defining how to make use of the terms defined in the namespace, meaning that JSON-LD tooling can have rich documentation built in.

Also, since I am well aware that basically all of these OCI tools are written in Go, it should be noted that Go has an [excellent implementation of JSON-LD](https://pkg.go.dev/github.com/go-ap/jsonld), and for those concerned that W3C proposals are sometimes not in touch with reality, the creator of JSON-LD has [some interesting words about it](http://manu.sporny.org/2014/json-ld-origins-2/). Now, please, use JSON-LD and stop worrying about extensibility in open technology; this problem is totally solved.
</source:markdown>
    </item>
    
    <item>
      <title>how I wound up causing a major outage of my services and destroying my home directory by accident</title>
      <link>https://ariadne.space/2022/02/03/how-i-wound-up-causing.html</link>
      <pubDate>Thu, 03 Feb 2022 17:00:00 -0700</pubDate>
      
      <guid>http://ariadne.micro.blog/2022/02/04/how-i-wound-up-causing.html</guid>
      <description>&lt;p&gt;As a result of my FOSS maintenance and activism work, I have a significant IT footprint, to support the services and development environments needed to facilitate everything I do. Unfortunately, I am also my own system administrator, and I am quite terrible at this. This is a story about how I wound up knocking most of my services offline and wiping out my home directory, because of a combination of Linux mdraid bugs and a faulty SSD. Hopefully this will be helpful to somebody in the future, but if not, you can at least share in some catharsis.&lt;/p&gt;
&lt;h2 id=&#34;a-brief-overview-of-the-setup&#34;&gt;A brief overview of the setup&lt;/h2&gt;
&lt;p&gt;As noted, I have a cluster of multiple servers, ranging from AMD EPYC machines to ARM machines to a significant interest in a System z mainframe which I talked about at AlpineConf last year. These are used to host various services in virtual machine and container form, with the majority of the containers being managed by kubernetes in the current iteration of my setup. Most of these workloads are backed by an Isilon NAS, but some workloads run on local storage instead, typically for performance reasons.&lt;/p&gt;
&lt;p&gt;Using kubernetes seemed like a no-brainer at the time because it would allow me to have a unified control plane for all of my workloads, regardless of where (and on what architecture) they would be running. Since then, I’ve realized that the complexity of managing my services with kubernetes was not justified by the benefits I was getting from using kubernetes for my workloads, and so I started migrating away from kubernetes back to a traditional way of managing systems and containers, but many services are still managed as kubernetes containers.&lt;/p&gt;
&lt;h2 id=&#34;a-samsung-ssd-failure-on-the-primary-development-server&#34;&gt;A Samsung SSD failure on the primary development server&lt;/h2&gt;
&lt;p&gt;My primary development server is named treefort. It is an x86 box with AMD EPYC processors and 256 GB of RAM. It had a 3-way RAID-1 setup using Linux mdraid on 4TB Samsung 860 EVO SSDs. I use KVM with libvirt to manage various VMs on this server, but most of the server’s resources are dedicated to the treefort environment. This environment also acts as a kubernetes worker, and is also the kubernetes controller for the entire cluster.&lt;/p&gt;
&lt;p&gt;Recently I had a stick of RAM fail on treefort. I ordered a replacement stick and had a friend replace it. All seemed well, but then I decided to improve my monitoring so that I could be alerted to any future hardware failures, as having random things crash on the machine due to uncorrected ECC errors is not fun. In the process of implementing this monitoring, I learned that one of the SSDs had fallen out of the RAID.&lt;/p&gt;
&lt;p&gt;I thought it was a little weird that one drive out of the three had failed, so I assumed it was just due to maintenance; perhaps the drive had been reseated after the RAM stick was replaced. As the price of a replacement 4TB Samsung SSD is presently around $700 retail, I thought I would re-add the drive to the array, assuming that if it had actually failed, it would fall out of the array again during the rebuild.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# mdadm --manage /dev/md2 --add /dev/sdb3
mdadm: added /dev/sdb3
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I then checked /proc/mdstat, and it reported the array as healthy. I thought nothing of it, though in retrospect I should have found this suspicious: there was no indication of the array being in a recovery state; instead it was reported as healthy, with three drives present. Unfortunately, I figured “ok, I guess it’s fine” and left it at that.&lt;/p&gt;
&lt;h2 id=&#34;silent-data-corruption&#34;&gt;Silent data corruption&lt;/h2&gt;
&lt;p&gt;Meanwhile, the filesystem in the treefort environment, which is backed by local SSD storage for speed reasons, began to silently corrupt itself. Because most of my services, such as my mail server, DNS and network monitoring, are running on other hosts, there wasn’t really any indicator of anything wrong. Things seemed to be basically working fine: I had been compiling kernels all week long as I tested various mitigations for the execve(2) issue. What I didn’t know at the time was that with each kernel compile I was slowly corrupting the disk more and more.&lt;/p&gt;
&lt;p&gt;I was not aware of the data corruption issue until today, when I logged into the treefort environment and decided to fire up nano to finish up some work I had been doing that needed to be resolved this week. That led me to a rude surprise:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;treefort:~$ nano  
Segmentation fault
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This worried me: after all, why would nano crash if it was working yesterday and nothing had changed? So, I used apk fix to reinstall nano, making it work again. At this point, I was quite suspicious that something was up with the server, so I immediately killed all the guests running on it and focused on the bare metal host environment (what we would call the dom0 if we were still using Xen).&lt;/p&gt;
&lt;p&gt;I ran e2fsck -f on the treefort volumes and hoped for the best. Instead of a clean bill of health, I got lots of filesystem errors. But this still didn’t make any sense to me: I checked the array again, and it was still showing as fully healthy. Accordingly, I decided to run e2fsck -fy on the volumes and hope for the best. This took out the majority of the volume storing my home directory.&lt;/p&gt;
&lt;h2 id=&#34;the-loss-of-the-kubernetes-controller&#34;&gt;The loss of the kubernetes controller&lt;/h2&gt;
&lt;p&gt;Kubernetes is a fickle beast: it assumes you have set everything up with redundancy, including, of course, redundant controllers. I found this out the hard way when I took treefort offline, and the worker nodes got confused and took the services they were running offline as well, presumably because they were unable to talk to the controller.&lt;/p&gt;
&lt;p&gt;Eventually, with some help from friends, I was able to recover enough of the volume that the system could boot and the controller could come back up long enough to restore the services on the workers other than treefort. Much like the data in my home directory, however, the services that were running on treefort are likely permanently lost.&lt;/p&gt;
&lt;h2 id=&#34;some-thoughts&#34;&gt;Some thoughts&lt;/h2&gt;
&lt;p&gt;First of all, it is obvious I need to upgrade my backup strategy to something other than “I’ll figure it out later”. I &lt;a href=&#34;https://github.com/richfelker/bakelite&#34;&gt;plan on packaging Rich Felker’s bakelite tool&lt;/a&gt; to do just that.&lt;/p&gt;
&lt;p&gt;The other big elephant in the room, of course, is “why weren’t you using ZFS in the first place”. While it is true that Alpine has supported ZFS for years, I’ve been hesitant to use it due to the CDDL licensing. In other words, I chose the mantra about GPL compatibility, instilled in me since my GNU/Linux days, over pragmatism. And my prize for that decision was this mess. While I think Oracle and the Illumos and OpenZFS contributors should come together to relicense the ZFS codebase under MPLv2 to solve the GPL compatibility problem, I am starting to think that I should care more about having a storage technology I can actually trust.&lt;/p&gt;
&lt;p&gt;I’m also quite certain that the issue I hit is a bug in mdraid, but perhaps I am wrong. I am told that there is a dirty bitmap system and perhaps if all bitmaps are marked clean on both the good pair of drives and the bad drive, it can cause this kind of split-brain issue, but I feel like there should be timestamping on those bitmaps to prevent something like this. It’s better to have an unnecessary rebuild because of clock skew than to go split brain and have 33% of all reads causing silent data corruption due to being out of sync with the other disks.&lt;/p&gt;
&lt;p&gt;Nonetheless, my plans are to rebuild treefort with ZFS and SSDs from another vendor. Whatever happened with the Samsung SSDs has made me anxious enough that I don’t want to trust them for continued production use.&lt;/p&gt;
</description>
      <source:markdown>
As a result of my FOSS maintenance and activism work, I have a significant IT footprint, to support the services and development environments needed to facilitate everything I do. Unfortunately, I am also my own system administrator, and I am quite terrible at this. This is a story about how I wound up knocking most of my services offline and wiping out my home directory, because of a combination of Linux mdraid bugs and a faulty SSD. Hopefully this will be helpful to somebody in the future, but if not, you can at least share in some catharsis.

## A brief overview of the setup

As noted, I have a cluster of multiple servers, ranging from AMD EPYC machines to ARM machines to a significant interest in a System z mainframe which I talked about at AlpineConf last year. These are used to host various services in virtual machine and container form, with the majority of the containers being managed by kubernetes in the current iteration of my setup. Most of these workloads are backed by an Isilon NAS, but some workloads run on local storage instead, typically for performance reasons.

Using kubernetes seemed like a no-brainer at the time because it would allow me to have a unified control plane for all of my workloads, regardless of where (and on what architecture) they would be running. Since then, I’ve realized that the complexity of managing my services with kubernetes was not justified by the benefits I was getting from using kubernetes for my workloads, and so I started migrating away from kubernetes back to a traditional way of managing systems and containers, but many services are still managed as kubernetes containers.

## A Samsung SSD failure on the primary development server

My primary development server is named treefort. It is an x86 box with AMD EPYC processors and 256 GB of RAM. It had a 3-way RAID-1 setup using Linux mdraid on 4TB Samsung 860 EVO SSDs. I use KVM with libvirt to manage various VMs on this server, but most of the server’s resources are dedicated to the treefort environment. This environment also acts as a kubernetes worker, and is also the kubernetes controller for the entire cluster.

Recently I had a stick of RAM fail on treefort. I ordered a replacement stick and had a friend replace it. All seemed well, but then I decided to improve my monitoring so that I could be alerted to any future hardware failures, as having random things crash on the machine due to uncorrected ECC errors is not fun. In the process of implementing this monitoring, I learned that one of the SSDs had fallen out of the RAID.

I thought it was a little weird that one drive out of the three had failed, so I assumed it was just due to maintenance; perhaps the drive had been reseated after the RAM stick was replaced. As the price of a replacement 4TB Samsung SSD is presently around $700 retail, I thought I would re-add the drive to the array, assuming that if it had actually failed, it would fall out of the array again during the rebuild.

```
# mdadm --manage /dev/md2 --add /dev/sdb3
mdadm: added /dev/sdb3
```

I then checked /proc/mdstat, and it reported the array as healthy. I thought nothing of it, though in retrospect I should have found this suspicious: there was no indication of the array being in a recovery state; instead it was reported as healthy, with three drives present. Unfortunately, I figured “ok, I guess it’s fine” and left it at that.
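
In hindsight, one thing that can catch this kind of divergence, rather than trusting the array metadata, is an explicit consistency check (a "scrub") that actually re-reads every mirror and compares them. A sketch, assuming the array in question is md2:

```shell
# Ask md to re-read every mirror and compare them (a "scrub").
# Assumes the array is md2; adjust to your layout.
echo check > /sys/block/md2/md/sync_action

# Watch progress, then see how many sectors disagreed across mirrors;
# a non-zero mismatch_cnt means the mirrors have silently diverged.
cat /proc/mdstat
cat /sys/block/md2/md/mismatch_cnt
```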

## Silent data corruption

Meanwhile, the filesystem in the treefort environment, which is backed by local SSD storage for speed reasons, began to silently corrupt itself. Because most of my services, such as my mail server, DNS and network monitoring, are running on other hosts, there wasn’t really any indicator of anything wrong. Things seemed to be basically working fine: I had been compiling kernels all week long as I tested various mitigations for the execve(2) issue. What I didn’t know at the time was that with each kernel compile I was slowly corrupting the disk more and more.

I was not aware of the data corruption issue until today, when I logged into the treefort environment and decided to fire up nano to finish up some work I had been doing that needed to be resolved this week. That led me to a rude surprise:

```
treefort:~$ nano  
Segmentation fault
```

This worried me: after all, why would nano crash if it was working yesterday and nothing had changed? So, I used apk fix to reinstall nano, making it work again. At this point, I was quite suspicious that something was up with the server, so I immediately killed all the guests running on it and focused on the bare metal host environment (what we would call the dom0 if we were still using Xen).

I ran e2fsck -f on the treefort volumes and hoped for the best. Instead of a clean bill of health, I got lots of filesystem errors. But this still didn’t make any sense to me: I checked the array again, and it was still showing as fully healthy. Accordingly, I decided to run e2fsck -fy on the volumes and hope for the best. This took out the majority of the volume storing my home directory.

## The loss of the kubernetes controller

Kubernetes is a fickle beast: it assumes you have set everything up with redundancy, including, of course, redundant controllers. I found this out the hard way when I took treefort offline, and the worker nodes got confused and took the services they were running offline as well, presumably because they were unable to talk to the controller.

Eventually, with some help from friends, I was able to recover enough of the volume that the system could boot and the controller could come back up long enough to restore the services on the workers other than treefort. Much like the data in my home directory, however, the services that were running on treefort are likely permanently lost.

## Some thoughts

First of all, it is obvious I need to upgrade my backup strategy to something other than “I’ll figure it out later”. I [plan on packaging Rich Felker’s bakelite tool](https://github.com/richfelker/bakelite) to do just that.

The other big elephant in the room, of course, is “why weren’t you using ZFS in the first place”. While it is true that Alpine has supported ZFS for years, I’ve been hesitant to use it due to the CDDL licensing. In other words, I chose the mantra about GPL compatibility, instilled in me since my GNU/Linux days, over pragmatism. And my prize for that decision was this mess. While I think Oracle and the Illumos and OpenZFS contributors should come together to relicense the ZFS codebase under MPLv2 to solve the GPL compatibility problem, I am starting to think that I should care more about having a storage technology I can actually trust.

I’m also quite certain that the issue I hit is a bug in mdraid, but perhaps I am wrong. I am told that there is a dirty bitmap system and perhaps if all bitmaps are marked clean on both the good pair of drives and the bad drive, it can cause this kind of split-brain issue, but I feel like there should be timestamping on those bitmaps to prevent something like this. It’s better to have an unnecessary rebuild because of clock skew than to go split brain and have 33% of all reads causing silent data corruption due to being out of sync with the other disks.

Nonetheless, my plans are to rebuild treefort with ZFS and SSDs from another vendor. Whatever happened with the Samsung SSDs has made me anxious enough that I don’t want to trust them for continued production use.
</source:markdown>
    </item>
    
  </channel>
</rss>
