Oriel Vergara is the Head of Data at Cayena, a Brazilian marketplace that connects food suppliers with restaurants, bars, hotels, and dark kitchens. Oriel leads a five-person data team that is hungry for results.
“We want to democratize data for everybody,” said Oriel.
His data team consists of a data scientist, a data engineer, an analytics engineer, and a machine learning engineer. The core of Cayena’s business strategy is supplying data and intelligence to a market that lacks access to new technologies. This is where Oriel and his team come into play.
As Head of Data, one of Oriel’s main responsibilities is to develop a data strategy for the broader organization and help his team deploy data products like automatic food labeling, food recommendation systems, credit risk assessment, and analytics for internal stakeholders.
Oriel and his team realize how valuable it is to deliver results promptly while being able to do more with fewer resources. “The food service is a very big market and we are a very ambitious team,” added Oriel. “We need to deliver our product fast and iterate on our applications even faster.”
The Cayena team delivers thousands of ingredients every day, ranging from black beans for feijoada (a local favorite) to fresh cheese for pizza shops.
“Each single kitchen needs to get hundreds of ingredients with ease to focus on cooking. So this dynamic demands agility,” said Oriel. “Our team needs to be able to anticipate prices and know where and when deliveries have to go out. We are aware that our data applications are helping restaurants cook wonderful dishes.”
Why Cayena chose Plural
Cayena digests data from a variety of sources in real time to better serve its customers. The engineering team at Cayena is an open-source-first team, which is why they decided to use Airbyte, an open-source ELT tool, to handle data ingestion. However, most open-source tools are server-side applications, meaning you are responsible for deploying and maintaining the application yourself.
While Airbyte could solve their data ingestion problems, they still needed a way to quickly and consistently deploy open-source applications like Airbyte into production environments. “Our problem is not deploying on Kubernetes,” said Oriel. “Our problem is deploying applications fast enough and maintaining them over time.”
It’s no secret that Kubernetes is hard to maintain, especially if you have little knowledge of the software and even less time to deploy and maintain applications on it properly. Simply put, time is far too valuable for data professionals to spend setting up and maintaining data infrastructure.
“I’m a data scientist at heart. I know about models and machine learning but less about deploying applications,” said Oriel. “To us, the most important thing is to deliver fast and powerful applications onto Kubernetes without being specialists in it.”
To deploy Airbyte into production, Cayena chose Plural as their open-source deployment platform. Because Cayena likes to work fast, Oriel was impressed with how quickly they were able to deploy Airbyte into production using Plural: “When we decided to use Plural we were able to get all of our data into production in one day like WOW.”
Before using Plural, Cayena came close to hiring a Kubernetes specialist to handle the deployment and monitoring of applications on Kubernetes. While that hire didn’t work out, Cayena’s data applications are operating fine thanks to Plural, which Oriel compared to having a DevOps specialist on board.
“We also have deployed Kubecost and the Plural console to monitor how exactly we are spending our resources on each application we deploy,” said Oriel.
How Plural fits into Cayena’s future
Currently, Cayena’s data stack consists of the following tools:
- Airbyte, Hevo, and custom Python scripts for data ingestion.
- S3 and Amazon Redshift for the data storage layer.
- Custom Python scripts (soon to be Apache Spark) and dbt for data transformation.
- Apache Airflow for data orchestration, and Metabase for a data visualization layer.
While they currently deploy two layers of their data stack on Plural (Airbyte and Metabase), they plan to deploy the following data tools on Plural soon:
- Apache Spark: For batch processing of large amounts of data
- Airflow: For scheduling and executing workflows (they plan to migrate off a fully-managed Airflow instance soon to allow for further customization)
- DataHub: For data discovery and to better understand data lineage
Getting started with Plural
Does your organization's story sound similar to Cayena’s? Are you looking to deploy your open-source applications into a production environment quickly?
If you’re interested in learning more about how Plural works and how we can help your engineering team deploy open-source applications in a cloud production environment, reach out to me and the rest of the team over at Plural.
Join us on our Discord channel for questions, discussions, and to meet the rest of the community.