
Mage

🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data.

Available providers

Why use Mage on Plural?

Plural helps you deploy and manage the lifecycle of open-source applications on Kubernetes. Our platform combines the scalability and observability benefits of managed SaaS with the data security, governance, and compliance benefits of self-hosting Mage.

If you need more than just Mage, look for other cloud-native and open-source tools in our marketplace of curated applications to leapfrog complex deployments and get started quickly.

Mage’s website · GitHub · License · Installing Mage docs

Deploying Mage is a matter of executing these 3 commands:

```bash
plural bundle install mage mage-aws
plural build
plural deploy --commit "deploying mage"
```
Read the install documentation

Mage

🧙 A modern replacement for Airflow.

Documentation 🌪️ · Get a 5 min overview 🌊 · Play with live tool 🔥 · Get instant help

Give your data team magical powers

Integrate and synchronize data from 3rd party sources

Build real-time and batch pipelines to transform data using Python, SQL, and R

Run, monitor, and orchestrate thousands of pipelines without losing sleep


1️⃣ 🏗️

Build

Have you met anyone who said they loved developing in Airflow?
That’s why we designed an easy developer experience that you’ll enjoy.

**Easy developer experience**
Start developing locally with a single command or launch a dev environment in your cloud using Terraform.

**Language of choice**
Write code in Python, SQL, or R in the same data pipeline for ultimate flexibility.

**Engineering best practices built-in**
Each step in your pipeline is a standalone file containing modular code that’s reusable and testable with data validations. No more DAGs with spaghetti code.

2️⃣ 🔮

Preview

Stop wasting time waiting around for your DAGs to finish testing.
Get instant feedback from your code each time you run it.

**Interactive code**
Immediately see results from your code’s output with an interactive notebook UI.

**Data is a first-class citizen**
Each block of code in your pipeline produces data that can be versioned, partitioned, and cataloged for future use.

**Collaborate on cloud**
Develop collaboratively on cloud resources, version control with Git, and test pipelines without waiting for an available shared staging environment.

3️⃣ 🚀

Launch

Don’t have a large team dedicated to Airflow?
Mage makes it easy for a single developer or small team to scale up and manage thousands of pipelines.

**Fast deploy**
Deploy Mage to AWS, GCP, or Azure with only 2 commands using maintained Terraform templates.

**Scaling made simple**
Transform very large datasets directly in your data warehouse or through a native integration with Spark.

**Observability**
Operationalize your pipelines with built-in monitoring, alerting, and observability through an intuitive UI.


🧙 Intro

Mage is an open-source data pipeline tool for transforming and integrating data.

  1. Quick start
  2. Demo
  3. Tutorials
  4. Documentation
  5. Features
  6. Core design principles
  7. Core abstractions
  8. Contributing

🏃‍♀️ Quick start

You can install and run Mage using Docker (recommended), pip, or conda.

Install using Docker

  1. Create a new project and launch the tool (change `demo_project` to any other name if you want):

    ```bash
    docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai \
      /app/run_app.sh mage start demo_project
    ```

    • If you want to run Mage locally on a different port, change the first port after `-p` in the command above. For example, to change the port to 6790, run:

    ```bash
    docker run -it -p 6790:6789 -v $(pwd):/home/src mageai/mageai \
      /app/run_app.sh mage start demo_project
    ```

    Want to use Spark or other integrations? Read more about integrations.

  2. Open http://localhost:6789 in your browser and build a pipeline.

  • If you changed the Docker port, open http://127.0.0.1:[port] (e.g., http://127.0.0.1:6790) in your browser to view the pipelines dashboard.

Using pip or conda

  1. Install Mage

    (a) To the current virtual environment:

    ```bash
    pip install mage-ai
    ```

    or

    ```bash
    conda install -c conda-forge mage-ai
    ```

    (b) To a new virtual environment (e.g., myenv):

    ```bash
    python3 -m venv myenv
    source myenv/bin/activate
    pip install mage-ai
    ```

    or

    ```bash
    conda create -n myenv -c conda-forge mage-ai
    conda activate myenv
    ```

    For additional packages (e.g. spark, postgres, etc), please see Installing extra packages.

    If you run into errors, please see Install errors.

  2. Create a new project and launch the tool (change `demo_project` to any other name if you want):

    ```bash
    mage start demo_project
    ```
  3. Open http://localhost:6789 in your browser and build a pipeline.


🎮 Demo

Live demo

Build and run a data pipeline with our demo app.

WARNING

The live demo is public to everyone, so please don’t save anything sensitive (e.g., passwords, secrets).

Demo video (5 min)

Mage quick start demo



👩‍🏫 Tutorials


🔮 Features

| | | |
| --- | --- | --- |
| 🎶 | **Orchestration** | Schedule and manage data pipelines with observability. |
| 📓 | **Notebook** | Interactive Python, SQL, & R editor for coding data pipelines. |
| 🏗️ | **Data integrations** | Synchronize data from 3rd party sources to your internal destinations. |
| 🚰 | **Streaming pipelines** | Ingest and transform real-time data. |
| ❎ | **dbt** | Build, run, and manage your dbt models with Mage. |

A sample data pipeline defined across 3 files ➝

  1. Load data ➝

    ```python
    @data_loader
    def load_csv_from_file():
        return pd.read_csv('default_repo/titanic.csv')
    ```

  2. Transform data ➝

    ```python
    @transformer
    def select_columns_from_df(df, *args):
        return df[['Age', 'Fare', 'Survived']]
    ```

  3. Export data ➝

    ```python
    @data_exporter
    def export_titanic_data_to_disk(df) -> None:
        df.to_csv('default_repo/titanic_transformed.csv')
    ```
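Conceptually, Mage runs these blocks in order and passes each block's output to the next. That load ➝ transform ➝ export flow can be sketched in plain Python (here the stdlib `csv` module stands in for pandas, and the inlined data is illustrative):

```python
import csv
import io

# Plain-Python sketch of the load -> transform -> export flow above.
# The csv stdlib stands in for pandas; the data is inlined for illustration.

RAW = "Age,Fare,Survived,Name\n22,7.25,0,Braund\n38,71.28,1,Cumings\n"

def load_csv(text):
    """Load step: parse CSV text into a list of dicts (DataFrame stand-in)."""
    return list(csv.DictReader(io.StringIO(text)))

def select_columns(rows, columns):
    """Transform step: keep only the requested columns in each row."""
    return [{c: row[c] for c in columns} for row in rows]

def export_csv(rows, columns):
    """Export step: serialize the transformed rows back to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

cols = ['Age', 'Fare', 'Survived']
out = export_csv(select_columns(load_csv(RAW), cols), cols)
print(out)
```

Each step only consumes the previous step's output, which is what lets Mage version and catalog the intermediate data between blocks.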

What the data pipeline looks like in the UI ➝

data pipeline overview

New? We recommend reading about blocks and learning from a hands-on tutorial.

Ask us questions on Slack


🏔️ Core design principles

Every user experience and technical design decision adheres to these principles.

| | | |
| --- | --- | --- |
| 💻 | **Easy developer experience** | Open-source engine that comes with a custom notebook UI for building data pipelines. |
| 🚢 | **Engineering best practices built-in** | Build and deploy data pipelines using modular code. No more writing throwaway code or trying to turn notebooks into scripts. |
| 💳 | **Data is a first-class citizen** | Designed from the ground up specifically for running data-intensive workflows. |
| 🪐 | **Scaling is made simple** | Analyze and process large data quickly for rapid iteration. |


🛸 Core abstractions

These are the fundamental concepts that Mage uses to operate.

| | |
| --- | --- |
| **Project** | Like a repository on GitHub; this is where you write all your code. |
| **Pipeline** | Contains references to all the blocks of code you want to run, charts for visualizing data, and organizes the dependency between each block of code. |
| **Block** | A file with code that can be executed independently or within a pipeline. |
| **Data product** | Every block produces data after it’s been executed. These are called data products in Mage. |
| **Trigger** | A set of instructions that determine when or how a pipeline should run. |
| **Run** | Stores information about when it was started, its status, when it was completed, any runtime variables used in the execution of the pipeline or block, etc. |
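To make these abstractions concrete, here is a rough sketch of how they can map onto files in a project created by `mage start`. The exact layout varies by version, so treat the names below as illustrative rather than definitive:

```
demo_project/              # Project: where all your code lives
├── data_loaders/          # Blocks that load data
├── transformers/          # Blocks that transform data
├── data_exporters/        # Blocks that export data
├── pipelines/
│   └── example_pipeline/
│       └── metadata.yaml  # Pipeline: block references and dependencies
├── io_config.yaml         # Connection settings for sources/destinations
└── metadata.yaml          # Project-level settings
```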


🙋‍♀️ Contributing and developing

Add features and instantly improve the experience for everyone.

Check out the contributing guide to set up your development environment and start building.


👨‍👩‍👧‍👦 Community

Individually, we’re a mage.

🧙 Mage

Magic is indistinguishable from advanced technology. A mage is someone who uses magic (aka advanced technology). Together, we’re Magers!

🧙‍♂️🧙 Magers (/ˈmājər/)

A group of mages who help each other realize their full potential! Let’s hang out and chat together ➝

Hang out on Slack

For real-time news, fun memes, data engineering topics, and more, join us on ➝

Twitter · LinkedIn · GitHub · Slack


🤔 Frequently Asked Questions (FAQs)

Check out our FAQ page to find answers to some of our most asked questions.


🪪 License

See the LICENSE file for licensing information.



How Plural works

We make it easy to securely deploy and manage open-source applications in your cloud.

Select from 90+ open-source applications

Get any stack you want running in minutes, and never think about upgrades again.

Securely deployed on your cloud with your git

You control everything. No need to share your cloud account, keys, or data.

Designed to be fully customizable

Built on Kubernetes and using standard infrastructure as code with Terraform and Helm.

Maintain & Scale with Plural Console

Interactive runbooks, dashboards, and Kubernetes API visualizers give you an easy-to-use toolset for managing application operations.

Learn more

Build your custom stack with Plural

Build your custom stack with 90+ apps in the Plural Marketplace.

Data stack apps:

  • Airbyte
  • Clickhouse
  • Dagster
  • Datahub
  • Growthbook
  • Jitsu
  • Lightdash
  • Posthog

Explore the Marketplace

Used by fast-moving teams at

  • CoachHub
  • Digitas
  • Fnatic
  • FSN Capital
  • Justos
  • Mott Mac

What companies are saying about us

We no longer needed a dedicated DevOps team; instead, we actively participated in the industrialization and deployment of our applications through Plural. Additionally, it allowed us to quickly gain proficiency in Terraform and Helm.

Walid El Bouchikhi
Data Engineer at Beamy

I have neither the patience nor the talent for DevOps/SysAdmin work, and yet I've deployed four enterprise-caliber open-source apps on Kubernetes... since 9am today. Bonkers.

Sawyer Waugh
Head of Engineering at Justifi

This is awesome. You saved me hours of further DevOps work for our v1 release. Just to say, I really love Plural.

Ismael Goulani
CTO & Data Engineer at Modeo

Wow! First of all I want to say thank you for creating Plural! It solves a lot of problems coming from a non-DevOps background. You guys are amazing!

Joey Taleño
Head of Data at Poplar Homes

We have been using Plural for complex Kubernetes deployments of Kubeflow and are excited with the possibilities it provides in making our workflows simpler and more efficient.

Jürgen Stary
Engineering Manager @ Alexander Thamm

Plural has been awesome, it’s super fast and intuitive to get going and there is zero-to-no overhead of the app management.

Richard Freling
CTO and Co-Founder at Commandbar

Case Study: How Fnatic Deploys Their Data Stack with Plural

Fnatic is a leading global esports performance brand headquartered in London, focused on leveling up gamers. At the core of Fnatic’s success is its best-in-class data team, which relies on third-party applications to serve different business functions, with every member of the organization using data daily. While having access to an abundance of data is great, it adds complexity when it comes to answering critical business questions and delivering in-game analytics for gaming members.

To answer these questions, the data team began constructing a data stack to solve these use cases. Since the team at Fnatic are big fans of open source, they elected to build their stack with popular open-source technologies.

Fnatic’s Data Stack

Airbyte
Airflow
Clickhouse
Grafana
Metabase
PostgreSQL

FAQ

Plural is open-source and self-hosted. You retain full control over your deployments in your cloud. We perform automated testing and upgrades and provide out-of-the-box Day 2 operational workflows. Monitor, manage, and scale your configuration with ease to meet the changing demands of your business. Read more.

We support deploying on all major cloud providers, including AWS, Azure, and GCP. We also support all on-prem Kubernetes clusters, including OpenShift, Tanzu, Rancher, and others.

No, Plural does not have access to any cloud environments when deployed through the CLI. We generate deployment manifests in the Plural Git repository and then use your configured cloud provider's CLI on your behalf. We cannot perform anything outside of deploying and managing the manifests that are created in your Plural Git repository. However, Plural does have access to your cloud credentials when deployed through the Cloud Shell. Read more.