Migrating away from WordPress

Astute followers of this blog might have noticed that the layout has changed dramatically. That is because I migrated away from WordPress last weekend, switching back to Hugo after a few years. This time around, the blog is fully self-hosted rather than depending on GitHub Pages, and the deployment pipeline is reasonably secure. With some further work, we could perhaps even call it a “secure blog factory”.

These days, most people deploy static websites with a service like Netlify or GitHub Pages. These services are reasonable, but when you do not own your own infrastructure, you are dependent on a third party continuing to offer the service. With the recent news that GitLab decided to delete user data that has not been touched in over a year, dependence on third-party services is something worth factoring into your security and reliability posture.

Migrating back to self-hosting

Because I cannot really depend on third-party services to conduct themselves in alignment with my personal ethics and expectations for service reliability, even when paying them to do so, I have moved the majority of my services to self-hosting over the past year. This has given me real visibility into how my services behave, and the ability to tune their performance as needed. Software such as Knative has enabled me to work with my own infrastructure as if it were a managed cloud service at one of the big providers.

There was one problem, however: some of the software I adopted when I started seriously self-hosting again, such as WordPress, has a less than stellar security record. While WordPress core itself has an acceptable security record, that goes out the window as soon as you install basically any plugin. Much of this risk was mitigated by running a custom WordPress image as a Knative service, which meant that if any of the plugins were compromised, I could simply restart the pod and be back to normal, but I have always thought it could be done better.

Setting up Gitea because GitHub introduced Copilot

Last year, GitHub announced Copilot, a neural model that was trained on the entire corpus of publicly available source code posted to GitHub. While Microsoft claims that this is allowed under fair use, the overwhelming majority of experts disagree so far. My personal opinion is that this was a breach of the public trust that the FOSS community originally placed in GitHub, and a lesson that we must own our own infrastructure in order to maintain our autonomy in the digital world.

As a result of all of this, I wound up setting up my own Gitea instance, which I use to maintain my own source code. In addition to Gitea, because I needed a CI service, I deployed a Woodpecker instance. Both of these services are very easy to deploy if you are already using Kubernetes, and come highly recommended (it is also possible to use Tekton for CI with Gitea, but it requires more work at the moment).

Automatically publishing the blog with Woodpecker

If you look at the source code for my blog, you will notice that it is largely a normal Hugo site with some basic plumbing around Woodpecker. I also use apko to build a custom image containing all of the tools needed to build and deploy the website; that image is hosted on the new OCI registry introduced in Gitea 1.17. For those interested, you can look at the logs of the deploy job used to post this article!
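For illustration, an apko configuration for such a builder image might look roughly like the sketch below. The repository URLs and package list are assumptions for the example, not copied from my actual configuration:

```yaml
# Hypothetical apko.yaml for a minimal site-builder image.
contents:
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
    - https://dl-cdn.alpinelinux.org/alpine/edge/community
  packages:
    - alpine-baselayout
    - busybox
    - hugo            # builds the site
    - openssh-client  # used by the deploy step
    - curl            # used by the announce step

cmd: hugo version
```

Because apko builds images declaratively from a package list rather than running Dockerfile steps, rebuilding this image on every deploy is cheap and reproducible.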

Almost all of the interesting stuff is in the woodpecker.yml file, however, which does the following:

  • Builds an up-to-date Hugo image from scratch on every deploy using apko.

  • Builds the new site.

  • Fetches the contents of the last announcement post (newswire.txt).

  • Deploys the new site using an SSH key stored as a secret, along with a pinned SSH host key, also stored as a secret. The latter is largely so that I can simply update the secret if I change the SSH keys on that host.

  • Checks whether the last announcement post differs from the new one, and if so, sends a post to my Mastodon account using a personal access token.
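The steps above can be sketched as a Woodpecker pipeline. The step names, image reference, hostnames, and secret names below are illustrative assumptions, not my actual configuration:

```yaml
# Hypothetical woodpecker.yml sketch of the pipeline described above.
pipeline:
  build-site:
    image: git.example.org/me/hugo-builder:latest   # image built with apko
    commands:
      - hugo --minify

  deploy:
    image: git.example.org/me/hugo-builder:latest
    secrets: [ deploy_ssh_key, deploy_known_hosts ]
    commands:
      # Pin the host key from a secret rather than trusting on first use.
      - mkdir -p ~/.ssh
      - echo "$DEPLOY_KNOWN_HOSTS" > ~/.ssh/known_hosts
      - echo "$DEPLOY_SSH_KEY" > ~/.ssh/id_ed25519
      - chmod 600 ~/.ssh/id_ed25519
      - rsync -a public/ deploy@www.example.org:/srv/www/

  announce:
    image: git.example.org/me/hugo-builder:latest
    secrets: [ mastodon_token ]
    commands:
      - ./announce.sh   # compares newswire.txt and posts if it changed
```

Woodpecker exposes each secret to the step as an uppercase environment variable, which is why the deploy step can write `$DEPLOY_KNOWN_HOSTS` straight into `known_hosts`.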

The last point is the big deal (to me). While ultimately it was not difficult to set up, my original reason for using WordPress was that it could do this type of automation out of the box. Paying off the technical debt of having to worry about WordPress being compromised has certainly been worth it, however.
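The announcement logic itself is simple. Here is a sketch in Python: the status-posting call follows the standard Mastodon REST API (`POST /api/v1/statuses` with a bearer token), but the file names, instance hostname, and helper names are assumptions for the example:

```python
import urllib.parse
import urllib.request


def should_announce(previous: str, current: str) -> bool:
    """Announce only when there is a new, non-empty announcement text."""
    return bool(current.strip()) and current.strip() != previous.strip()


def post_status(instance: str, token: str, text: str) -> None:
    """Publish a status via the Mastodon REST API (POST /api/v1/statuses)."""
    data = urllib.parse.urlencode({"status": text}).encode()
    req = urllib.request.Request(
        f"https://{instance}/api/v1/statuses",
        data=data,
        headers={"Authorization": f"Bearer {token}"},
    )
    urllib.request.urlopen(req)


# In the CI step, the glue might look like (hypothetical file names):
#   old = open("last-announced.txt").read()
#   new = open("newswire.txt").read()
#   if should_announce(old, new):
#       post_status("mastodon.example.org", os.environ["MASTODON_TOKEN"], new.strip())
```

Keeping the comparison in a pure function makes the “did the announcement change?” decision trivially testable without touching the network.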

Future improvements

There are some future improvements I would like to make. For example, I would like to sign the Hugo image I build with cosign. The current blocker is that Gitea does not support a configurable audience setting in its OpenID Connect implementation. Once that is resolved, it should be possible to start working toward Gitea instances acting as OpenID Connect identities for use with the Sigstore infrastructure. This will be very powerful when combined with the new OCI registry support introduced in Gitea 1.17!
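Until keyless signing via Gitea's OIDC is possible, an interim key-based signing step could be added to the pipeline. This is a hypothetical sketch, not something currently in my configuration; the image reference and secret names are made up:

```yaml
# Hypothetical Woodpecker step: key-based cosign signing as an interim measure.
  sign-image:
    image: gcr.io/projectsigstore/cosign:latest
    secrets: [ cosign_key, cosign_password ]
    commands:
      # cosign reads the private key from the environment via env://,
      # and COSIGN_PASSWORD decrypts it non-interactively.
      - cosign sign --key env://COSIGN_KEY git.example.org/me/hugo-builder:latest
```

The tradeoff is key management: a long-lived signing key in CI secrets is exactly what keyless OIDC-based signing is designed to eliminate.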

I also have some opinions on Woodpecker and other self-hosted CI systems that I plan to go into more detail on in a near-future blog post.

Hopefully the above provides some inspiration to self-host your own website, or perhaps to play with apko outside of the GitHub ecosystem. Thanks as well to the Gitea and Woodpecker developers for making software that is easy to deploy.