Chris Hronek, the Director of Data Engineering at Linqto, a marketplace specializing in private equity investments, initially opted for Amazon Managed Workflows for Apache Airflow (MWAA).
He did so because Apache Airflow is the de facto standard for orchestrating data pipelines. When deploying Airflow in production, the two most popular options are to deploy it yourself via Helm charts or to use a managed service.
Despite the initial promise, Linqto's data operations ran into challenges as they scaled. MWAA's costs became a concern, as did its feature limitations and its delays in shipping Airflow updates.
In this post, we’ll cover Linqto’s journey from MWAA to self-hosting Apache Airflow on Kubernetes (K8s) in AWS via Plural, and highlight the cost savings the switch achieved.
Why it was time to migrate off MWAA
While MWAA seemed promising for Chris and his team, they soon reached a breaking point driven by several significant concerns that impacted their efficiency, flexibility, and costs.
Hidden Costs of CloudWatch Logging
One of the first challenges Chris and his team faced was the hidden cost of CloudWatch logging. MWAA handled the operational aspects of running Airflow, but the CloudWatch logging bill grew unexpectedly high as their data operations expanded. It was a financial burden they had not fully anticipated.
MWAA's standard environment pricing conveniently omits CloudWatch from the discussion because logging isn’t enabled by default. You could run tasks without task logs, but then you wouldn’t know what went wrong when they failed.
Missing Features and Limitations
A major drawback of managed services is that they are out-of-the-box, one-size-fits-all solutions, so they are often not fully customizable for your use cases.
For Linqto, features like Deferrable Operators and the Airflow Stable REST API were absent, limiting their ability to design and execute workflows to fit their needs. These missing features added complexity to their operations.
Delayed Airflow Updates
Since MWAA depended on AWS infrastructure, Linqto couldn’t use the latest Airflow version until AWS officially released it for MWAA. This delay was often weeks to months, limiting their access to new features that were desperately needed.
These issues pushed Chris and his team to their breaking point with MWAA, and they began looking at alternatives that could give them greater control and flexibility while reducing their costs. The team at Linqto concluded that self-hosting Airflow would best meet their growing needs, and they chose Plural to self-host Airflow in their own cloud environment.
Why use Plural to self-host applications?
For Chris and his team, there were three options for deploying Airflow:
- Pay for a white-glove service like Astronomer
- Continue using a managed service like MWAA
- Self-host Airflow on a Kubernetes cluster
With cost and flexibility being major concerns, Chris opted to self-host Airflow on a Kubernetes cluster. However, self-hosting Airflow on your own is a daunting task: it is a fairly complicated stateful application, with a SQL database and a Redis cache, which makes for a tricky setup.
Airflow is not only complicated to get up and running but also challenging to maintain over time, especially if you don’t have prior experience self-hosting applications on Kubernetes.
With Plural, Chris and his team didn’t have to be experts in Kubernetes to self-host applications. “We were able to manage infrastructure and deploy applications without giving up control, portability, privacy, and cost-effectiveness for the sake of convenience,” said Chris.
How Linqto’s Data Stack Evolves Around Plural
Once Chris and his team had switched from MWAA to Airflow on Plural, they began to explore moving their entire data stack over to Plural.
After reviewing the applications available in our marketplace, they decided that they could:
- Switch from Talend Stitch to OSS Airbyte (Stitch's pricing, like Fivetran's, was billed by the row, so the cost increased as their data grew).
- Self-host DataHub for cataloging and lineage, giving them an end-to-end view of their data, instead of paying a premium for a managed data catalog.
“Everything seemed too good to be true because we were killing three birds with one stone,” said Chris, referring to Data Ingestion, Data Orchestration, and Data Cataloging. To prepare for this switchover, Chris and his team put together a plan that would allow them to test out Plural without entirely deprecating MWAA and Stitch.
The first step was to create a 1:1 copy of their MWAA Airflow in Plural. They spun up an EKS cluster in their AWS account by following Plural's CLI quickstart guide.
After installing an Airflow environment in their Plural cluster, they completed the following:
- Created their own custom Airflow image using AWS ECS. The custom image allowed them to install the pip dependencies needed in Airflow and initialize a Python virtual environment to keep their dbt workloads isolated from Airflow.
- Copied their MWAA DAGs and supporting files to the repo that Plural was syncing to the newly created Airflow environment.
- Fine-tuned the Plural Airflow configuration settings so that:
  - The cluster could authenticate to AWS Secrets Manager to pull the secrets for their Airflow Connections/Variables.
  - Airflow used the KubernetesExecutor with proper requests/limits for Airflow Tasks.
  - Airflow Tasks ran on a custom node group of Spot instances to reduce their AWS costs.
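The three settings above can be sketched roughly as follows. This is a minimal illustration rather than Linqto's actual configuration: the helper names `airflow_env` and `task_pod_spec`, the secret prefixes, and the CPU/memory values are all assumptions, and the `eks.amazonaws.com/capacityType` label only applies if the Spot capacity comes from an EKS managed node group.

```python
import json


def airflow_env(connections_prefix="airflow/connections",
                variables_prefix="airflow/variables"):
    """Return Airflow settings as environment variables (hypothetical helper).

    The prefixes are assumed values; they tell the Secrets Manager backend
    where to look up Connections and Variables.
    """
    backend_kwargs = {
        "connections_prefix": connections_prefix,
        "variables_prefix": variables_prefix,
    }
    return {
        # Run every task in its own Kubernetes pod.
        "AIRFLOW__CORE__EXECUTOR": "KubernetesExecutor",
        # Resolve Connections/Variables from AWS Secrets Manager.
        "AIRFLOW__SECRETS__BACKEND": (
            "airflow.providers.amazon.aws.secrets.secrets_manager."
            "SecretsManagerBackend"
        ),
        "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(backend_kwargs),
    }


def task_pod_spec(cpu_request="250m", memory_request="512Mi",
                  cpu_limit="1", memory_limit="1Gi"):
    """Return a task pod spec fragment as a plain dict (hypothetical helper).

    The resource values are placeholders, not measured numbers.
    """
    return {
        # Schedule task pods onto Spot capacity (EKS managed node group label).
        "nodeSelector": {"eks.amazonaws.com/capacityType": "SPOT"},
        "containers": [
            {
                # The KubernetesExecutor expects the task container to be
                # named "base".
                "name": "base",
                "resources": {
                    "requests": {"cpu": cpu_request, "memory": memory_request},
                    "limits": {"cpu": cpu_limit, "memory": memory_limit},
                },
            }
        ],
    }
```

In practice these values would live in the Helm values or a pod template file rather than in Python; the dicts only show which keys are involved.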
After doing so, Chris and his team tested each Airflow DAG individually to ensure it worked. Once all of their DAGs had passed testing, they paused the DAGs in MWAA and unpaused them in their Plural Airflow.
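On the self-hosted side, where the Airflow stable REST API is available, the unpause step could be scripted instead of clicked through in the UI. The sketch below builds the request for the Airflow 2.x `PATCH /api/v1/dags/{dag_id}` endpoint. It assumes basic authentication is enabled on the API; the helper name `build_pause_request`, the host, and the credentials are hypothetical, and the source does not say how Linqto actually toggled their DAGs.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_pause_request(base_url, dag_id, paused, username, password):
    """Build (but do not send) a request that pauses or unpauses one DAG.

    Hypothetical helper targeting the Airflow 2.x stable REST API.
    """
    quoted_id = urllib.parse.quote(dag_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v1/dags/{quoted_id}?update_mask=is_paused"
    body = json.dumps({"is_paused": paused}).encode("utf-8")
    # Basic auth is an assumption; it must be enabled in the API auth settings.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; looping over a list of DAG IDs with `paused=False` would unpause them one by one.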
After pausing the MWAA DAGs on 10/10/23, you can see that their CloudWatch costs disappeared from AWS (because the tasks were no longer running and generating logs). After about a week of successful DAG Runs in Plural, they completely deleted their MWAA environment on 10/17/23.
As you can see from the AWS Cost Explorer graph, their daily costs were almost cut in half. So, for roughly half the cost of running MWAA, the Linqto team can run Airflow, eliminate Stitch costs (approximately $300/mo), and avoid paying for a managed data cataloging service.
Getting started with Plural
Are you looking to deploy open-source applications on Kubernetes without having to reinvent the wheel?
If you are interested in learning more about how Plural works and how we can help your engineering team deploy open-source applications in a cloud production environment, reach out to me and the rest of the team over at Plural.