How We Deployed PostHog on Kubernetes
Deploying PostHog on Kubernetes is challenging. Here's how to do it on your own and how we streamlined the deployment with Plural.
Last month, PostHog announced that it was discontinuing support for its product on Kubernetes. The timing was coincidental, since I had just spent a considerable amount of time figuring out how to add it to our marketplace for our users.
It's not surprising that this news came out, considering how many moving parts users need to figure out when self-hosting it. However, this puts me in a position to offer an educational look at the processes involved in deploying a complex application on Kubernetes.
I’ll go into why PostHog was difficult to deploy on Kubernetes, how you would do it by yourself, and how we packaged it for deployment on Plural.
Why deploying PostHog on Kubernetes is challenging
If you aren’t familiar with PostHog, it is an open-source product analytics platform and an alternative to products like Amplitude, Heap, and Mixpanel, which can be expensive.
While a reduction in cost is definitely a huge benefit of self-hosting PostHog, users also regain the ability to own all the data and customize the application to better serve their stakeholders. Even internally, every department at Plural relies on our own PostHog deployment to drive decision-making for the future of our platform.
When PostHog announced that it was sunsetting support for Kubernetes, it highlighted that maintaining applications on Kubernetes is an intensely manual process with a lot of dependencies:
“Hosting PostHog at scale is complex. With our Kubernetes users, we've seen issues crop up in every part of the stack. In event ingestion, Kafka, ClickHouse, Postgres, Redis and within the application itself. Sometimes the fix is simple (‘increase disk space’), but often the issue is something a couple of layers deep and very hard to debug, involving long calls with expensive engineers on both sides. Even something as simple as a full disk would cause their instance of PostHog to be down for hours or days.”
It’s not just PostHog that is complicated to deploy. Most open-source applications are notoriously challenging to deploy and maintain on Kubernetes. To do so effectively, developers will need to understand things like:
- How readiness and liveness probes work
- How resource requests and limits work
- How horizontal or vertical pod autoscaling works
- How all these things influence each other
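To make the last point concrete, here is a minimal Python sketch of how resource requests feed into horizontal pod autoscaling. The container spec, probe paths, and port are hypothetical and not taken from the PostHog chart. The scaling function uses the basic formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), and leaves out details such as the tolerance band and the handling of unready pods.

```python
import math

# Hypothetical container spec, shaped like the dict a Deployment manifest holds.
# Probe paths and the port are assumptions for illustration only.
container = {
    "name": "web",
    "resources": {
        "requests": {"cpu": "500m", "memory": "512Mi"},
        "limits": {"cpu": "1", "memory": "1Gi"},
    },
    "readinessProbe": {"httpGet": {"path": "/ready", "port": 8000}},
    "livenessProbe": {"httpGet": {"path": "/live", "port": 8000}},
}


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '1' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def cpu_utilization_percent(usage_millicores: int, request: str) -> float:
    """CPU utilization as the HPA sees it: usage relative to the request."""
    return 100 * usage_millicores / parse_cpu_millicores(request)


def desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Simplified HPA formula, ignoring tolerance and pod readiness."""
    return math.ceil(current_replicas * current_util / target_util)


if __name__ == "__main__":
    request = container["resources"]["requests"]["cpu"]
    utilization = cpu_utilization_percent(450, request)
    print(f"utilization: {utilization}%")
    print(f"desired replicas: {desired_replicas(3, utilization, 60)}")
```

With three replicas each using 450m of CPU against a 500m request, utilization is 90% and a 60% target scales the deployment to five replicas. Doubling the request to one full CPU halves the reported utilization, and the same workload stays at three replicas, which is one way these settings influence each other.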
If we now add a service mesh like Istio to the mix, the number of technologies developers need to understand and be able to debug grows quickly.
As you begin to scale PostHog in production, it becomes a very data-intensive application with a complex system architecture. So when our customers heard that we were able to deploy PostHog on Kubernetes, they were quite impressed.
An application like PostHog requires access to cloud object storage, a PostgreSQL database, Redis, Kafka, and ClickHouse (which requires a ZooKeeper installation that PostHog also needs to access).
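One way to reason about these dependencies is as a graph that has to be brought up in order. The sketch below is an illustration only: the component names are labels chosen here, and the edges encode just the relationships described above. It uses Python's standard `graphlib` module (Python 3.9+) to derive a valid installation order.

```python
from graphlib import TopologicalSorter

# Each key maps a component to the set of components it depends on.
# Only the relationships described in the text are encoded here.
DEPENDENCIES = {
    "posthog": {
        "object-storage",
        "postgresql",
        "redis",
        "kafka",
        "clickhouse",
        "zookeeper",
    },
    "clickhouse": {"zookeeper"},
}


def install_order(dependencies: dict) -> list:
    """Return an order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, component in enumerate(install_order(DEPENDENCIES), start=1):
        print(step, component)
```

The exact order among independent components may vary between runs, but ZooKeeper always precedes ClickHouse and PostHog always comes last.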
The PostHog engineering team was spending a good chunk of its time troubleshooting issues, which ultimately kept the team from working on its main infrastructure priorities, such as its cloud offering and its open-source Docker Compose deployment.
This exact scenario from PostHog sums up most of what we hear on a daily basis when we talk to developers. Here is what the PostHog team had to say in their announcement that they were sunsetting PostHog on Kubernetes:
“We also learned that the tools to do that automation just don't exist. We kept finding new failure modes. When onboarding a new customer we would have to vet their engineering team for Kubernetes experience so that we'd be confident they could help us debug issues in their PostHog deploy. Folks that didn't have infra experience would often be able to get something set up, only to get stuck when something went wrong.”
Deploying and maintaining applications on Kubernetes is very complex and takes a good amount of time.
Steps to self-host PostHog on Kubernetes
So what would you have to do if you really wanted to do this by yourself? First off, you will need to ensure that your Kubernetes cluster contains some essential components:
- An ingress controller like ingress-nginx to allow external traffic to reach the applications you host on that Kubernetes cluster
- cert-manager for issuing TLS certificates for your applications
- ExternalDNS to ensure that the DNS records for your applications are always up to date
The first step is to deploy a Kubernetes cluster on your preferred cloud provider. While there are numerous ways to do this, infrastructure-as-code (IaC) solutions like Terraform or Pulumi are the most robust path, with GitOps to manage changes to the code used to deploy the cluster.
Next, you would need to install the cluster-level dependencies, followed by PostHog and its own dependencies.
Cluster-level dependencies such as the ingress controller, cert-manager, and ExternalDNS can each be installed with a single Helm command. However, to maintain and run things in a production environment, you will want a GitOps tool like Flux or Argo CD.
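As a rough illustration of what those Helm commands look like, the Python sketch below builds (but does not run) one `helm upgrade --install` invocation per component. The release names and namespaces are choices made for this example, and the chart repository URLs are the ones commonly documented by each project at the time of writing, so verify them before use. No chart values are set here; cert-manager, for example, also needs its CRDs installed as described in its own documentation.

```python
import shlex

# Release names and namespaces are assumptions for this example.
# Verify the repository URLs against each project's documentation.
CLUSTER_CHARTS = [
    {
        "release": "ingress-nginx",
        "chart": "ingress-nginx",
        "repo": "https://kubernetes.github.io/ingress-nginx",
        "namespace": "ingress-nginx",
    },
    {
        "release": "cert-manager",
        "chart": "cert-manager",
        "repo": "https://charts.jetstack.io",
        "namespace": "cert-manager",
    },
    {
        "release": "external-dns",
        "chart": "external-dns",
        "repo": "https://kubernetes-sigs.github.io/external-dns/",
        "namespace": "external-dns",
    },
]


def helm_install_command(chart: dict) -> list:
    """Build the argument list for an idempotent Helm install of one chart."""
    return [
        "helm", "upgrade", "--install",
        chart["release"], chart["chart"],
        "--repo", chart["repo"],
        "--namespace", chart["namespace"],
        "--create-namespace",
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for chart in CLUSTER_CHARTS:
        print(shlex.join(helm_install_command(chart)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere. In a GitOps setup, the same release, chart, and repository information would live in Flux or Argo CD resources instead of being run by hand.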
For PostHog and its primary dependencies, the PostHog Helm chart can deploy everything you need in a non-production environment. However, in a production environment, you likely wouldn’t want to use a dedicated MinIO installation for object storage and would rather have PostHog integrate directly with your cloud provider's object storage.
For PostHog’s secondary dependencies like Kafka and a Postgres database, a Helm chart will get you up and running quickly. However, to maintain the software stack over time, you would want to either integrate with cloud-managed services or use the Strimzi Kafka operator and the Zalando Postgres operator to manage Kafka and Postgres deployments on Kubernetes in production.
You can use the Altinity ClickHouse operator for running and managing ClickHouse on Kubernetes. The PostHog Helm chart bundles this operator as well for its ClickHouse deployment. An enterprising cluster administrator would make several improvements here, such as removing the need for an external ZooKeeper by using the built-in ClickHouse-Keeper.
While the above steps might lay the groundwork for deploying PostHog on Kubernetes, other complications will inevitably arise, such as keeping secrets out of IaC repositories but still managing the entire deployment in a reproducible manner. PostHog themselves admit this, hence their recent decision.
How we deployed PostHog at Plural
At Plural, part of our bootstrapping involves deploying a Kubernetes cluster, cluster-level dependencies, and a system for secret management. Plural allows you to easily track dependencies between applications. When we deployed PostHog, two of its dependencies were our baked-in Kafka and Postgres applications that deploy the Kafka and Postgres operators in our production cluster.
We can also depend on our Redis application, which allows us to share a single Redis cluster among different applications to reduce cost and management overhead.
Because Plural allows application publishers to bundle both Terraform and Helm code for deploying an application, it is easy for us to create an object storage bucket in the cloud provider using Terraform and set up the PostHog Helm chart to connect to that bucket.
That leaves us with two large dependencies: ClickHouse and ZooKeeper.
For ClickHouse, we use the Altinity ClickHouse operator that we recommended above for self-deployment. It lets you easily deploy and maintain ClickHouse deployments on Kubernetes, so we did not need to reinvent the wheel here. To reduce cost and the number of dependencies needed for ClickHouse and PostHog, we decided to remove the need for ZooKeeper and instead use the ClickHouse-Keeper that's built into ClickHouse.
Since the Altinity ClickHouse operator does not have built-in support for integrating ClickHouse-Keeper in the ClickHouse deployments it manages, we resorted to some creative problem-solving. The ClickHouse operator allows users to define pod and service templates that it uses to deploy ClickHouse on Kubernetes.
By creating special templates for the first three replicas of a ClickHouse deployment that configure the internal ClickHouse-Keeper on those replicas, we were able to remove the need for an external ZooKeeper installation along with the resource usage and management overhead that come with it.
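To give a feel for what such a per-replica configuration involves, the Python sketch below generates a ClickHouse config fragment that enables the embedded Keeper on one of three replicas and points ClickHouse at the Keeper ensemble. This is not the actual Plural template. The hostnames are hypothetical, and the ports and storage paths follow the example values in the ClickHouse Keeper documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical hostnames for the first three ClickHouse replicas.
KEEPER_HOSTS = [
    "clickhouse-replica-0",
    "clickhouse-replica-1",
    "clickhouse-replica-2",
]
KEEPER_CLIENT_PORT = 9181  # port ClickHouse uses to talk to Keeper
KEEPER_RAFT_PORT = 9234    # port Keeper nodes use to talk to each other


def keeper_config(server_id: int, hostnames: list) -> str:
    """Render the config fragment for the replica with the given server_id."""
    root = ET.Element("clickhouse")

    # Embedded Keeper settings for this replica.
    keeper = ET.SubElement(root, "keeper_server")
    ET.SubElement(keeper, "tcp_port").text = str(KEEPER_CLIENT_PORT)
    ET.SubElement(keeper, "server_id").text = str(server_id)
    ET.SubElement(keeper, "log_storage_path").text = (
        "/var/lib/clickhouse/coordination/log"
    )
    ET.SubElement(keeper, "snapshot_storage_path").text = (
        "/var/lib/clickhouse/coordination/snapshots"
    )

    # Every Keeper node lists all members of the Raft ensemble.
    raft = ET.SubElement(keeper, "raft_configuration")
    for node_id, host in enumerate(hostnames, start=1):
        server = ET.SubElement(raft, "server")
        ET.SubElement(server, "id").text = str(node_id)
        ET.SubElement(server, "hostname").text = host
        ET.SubElement(server, "port").text = str(KEEPER_RAFT_PORT)

    # ClickHouse reaches Keeper through its ZooKeeper-compatible client section.
    zookeeper = ET.SubElement(root, "zookeeper")
    for host in hostnames:
        node = ET.SubElement(zookeeper, "node")
        ET.SubElement(node, "host").text = host
        ET.SubElement(node, "port").text = str(KEEPER_CLIENT_PORT)

    ET.indent(root)  # requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(keeper_config(1, KEEPER_HOSTS))
```

Each of the three replicas would get the same ensemble list but a different `server_id`, which is why a separate template per replica is needed. Replicas beyond the first three would only need the client section.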
Deploy PostHog on Kubernetes using Plural
Ultimately, PostHog’s concerns about their self-hosted offering were fair, but I feel like we’ve made the experience of deploying PostHog on Kubernetes more approachable.
We offer either an in-browser cloud-shell experience with all the dependencies preloaded or a command-line interface. Simply follow along with our documentation to get up and running in under 30 minutes.
If you happen to get stuck, check out our Troubleshooting documentation or reach out to us on our community Discord for help.