
How to deploy MLflow on Kubernetes
Setting up MLflow components locally is a tedious task. It involves installing MLflow, getting a tracking server running, setting up databases, and then configuring everything just right.
In a typical machine learning project, model performance fluctuates during experimentation, so you’ll want the ability to keep track of every experiment you run. In practice this gets messy quickly, and it’s common to end up with model names like:
`model_1`, `model_2`, `model_final`, `model_final_final`
On top of this, you also want to benchmark machine learning models while managing and iterating on the process. MLflow is a compelling solution for these exact problems, offering an integrated platform for centrally tracking experiments, packaging models, and managing the model lifecycle from development to production.
The open-source tool streamlines the machine-learning lifecycle by breaking it into organized, reproducible components, enabling both individuals and teams to develop models faster and collaborate more effectively.
In this article, we’ll provide an overview of MLflow’s architecture and components, how to set up MLflow locally, and how to deploy it on Kubernetes using Plural.
What is MLflow?
MLflow provides a central hub for tracking machine learning experiments, models, code, parameters, and results. It helps data scientists and ML engineers organize and optimize their workflows.
The key components of MLflow include:
Experiment Tracking - Without clear organization and logging of runs, model development descends into chaos: teams lose reproducibility, benchmarking becomes difficult, insights get lost in notebook clutter, and productionizing models becomes precarious at best. MLflow lets you track runs under experiments for organization and analysis; each run records the code, metadata, configuration, and results needed to reproduce the workflow.
Models and Artifacts - A core part of MLflow is its ability to cleanly package the models and artifacts produced during runs. Trained models can be stored in MLflow and shared between runs, and artifacts like images and data files can be logged alongside them.
MLflow Projects - MLflow Projects modularize workflows into reusable components, enabling collaboration at scale within organizations. A project packages an entire workflow as an auditable, repeatable, and shareable unit.
Model Registry - Once your models are trained, managing and monitoring them collaboratively during the transition to production becomes critical. The MLflow Model Registry introduces an organized approach to governing and storing models post-development for a smooth handoff to applications.
Why use MLflow?
Unlike commercial alternatives such as Weights & Biases, Neptune, and Comet ML, MLflow is a fully open-source platform.
MLflow provides a valuable solution to the core challenges of managing machine learning workflows and models. By introducing systems, conventions, and tools tailored to the ML lifecycle, MLflow streamlines the development, collaboration, and deployment of ML applications.
The platform also excels at tracking experiments. MLflow introduces the concept of Runs, which capture the inputs and outputs of a workflow by logging parameters, code versions, metrics, and output files; with `mlflow.autolog()` enabled, much of this is logged automatically. Runs can be organized under Experiments for easy search and comparison, and the combination of experiment tracking and model packaging enables reproducible research.
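For instance, here is a minimal sketch of autologging with scikit-learn, using toy data and the local default store:

```python
import mlflow
import numpy as np
from sklearn.linear_model import LinearRegression

mlflow.autolog()  # patches supported libraries so params, metrics, and models are logged automatically

X, y = np.random.rand(100, 3), np.random.rand(100)
with mlflow.start_run():
    LinearRegression().fit(X, y)  # the fit call is captured as part of the run
```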
Finally, MLflow Projects are self-contained environments that allow workflows to be reliably reproduced across platforms. They encapsulate the code, dependencies, and parameters of end-to-end pipelines, which keeps collaboration between data scientists productive by avoiding dependency conflicts. Projects also integrate nicely with automation and orchestration tools.
MLflow also provides a much-needed central hub for teams to collaboratively manage models post-development. Versioning, stage transitions, annotations, access control, and integration with deployment tools bring rigor to model handoff and governance.
For these reasons, MLflow has quickly gained popularity at companies doing machine learning at any substantial scale.
How to Set Up MLflow Locally
MLflow can be installed on your local machine with a few simple steps:
1. Install MLflow - Use pip to install the MLflow Python package:
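```bash
pip install mlflow
```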
2. Start local server - Launch the MLflow tracking server on localhost:
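```bash
mlflow server --host 127.0.0.1 --port 5000
```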
This starts a tracking server on port 5000; by default, tracking data and artifacts are stored in an mlruns/ subdirectory of the directory where you ran the command.
3. Run experiments - In your Python scripts, import mlflow and set the tracking URI to point at the server:
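```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
```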
Then start a run and call the logging APIs; the metrics, parameters, and artifacts you log will be recorded on the tracking server:
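```python
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)       # hyperparameters
    mlflow.log_metric("accuracy", 0.92)           # evaluation results
    mlflow.log_artifact("confusion_matrix.png")   # any local output file (must exist)
```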
4. Launch UI - Go to http://localhost:5000 in your browser to open the MLflow UI and see logged experiments, runs, artifacts, models, and more. You can visually compare runs.
That covers the basics of a local MLflow setup. As you can see, it's tedious: you have to install MLflow, get a tracking server running, set up databases, and configure everything just right.
And while all of this is doable, managing production-stage operations and monitoring still requires manual intervention, and the process varies drastically across cloud providers and platforms.
How to Set Up MLflow on Plural
Next, we'll cover how to set up a fresh Kubernetes cluster and install MLflow onto it using Plural, an open-source Kubernetes DevOps platform that lets you deploy and maintain open-source applications in your own cloud with little to no management experience necessary.
Requirements:
- Your preferred cloud admin credentials (AWS, GCP, Azure). We do provide a six-hour complimentary GCP demo to test out Plural.
- A GitHub or GitLab account
- An account with Plural. To sign up, visit the Plural App and follow the on-screen instructions to set up an account.
1. Create and Configure a Kubernetes Cluster
After going through onboarding, you'll create and configure a Kubernetes cluster to deploy MLflow on with Plural. For this tutorial, we'll be deploying it in an AWS cloud environment.
2. Create either a GitHub or GitLab repository
3. Continue configuring your cloud credentials
You'll then enter your preferred GitHub or GitLab repository name >> input or paste your AWS credentials (Access Key ID, Secret Access Key, and preferred region) >> set a preferred name for your cluster, S3 bucket, and subdomain.
Plural sets up the infrastructure on your cloud platform's account. To be able to perform the necessary steps, the provided credentials must have sufficient permissions to create, update, and delete all of the required resources and services.
Once you're done with the workspace configuration, it's time to install MLflow on the Kubernetes cluster we just created.
4. Install MLflow on Kubernetes
After configuring your workspace, you can explore the list of open-source applications available in our marketplace. For this tutorial, we'll be installing MLflow; you can search for and select MLflow on the left side of your screen.
Next, configure your dependencies and enter your preferred names for the VPC and the MLflow hostname.
Additionally, enabling OIDC (OpenID Connect) requires all users to sign in with their Plural credentials before they can reach the application. This is helpful for organizations that want to centralize authentication and authorization across their applications.
OIDC is a good way to prevent unauthorized access to your MLflow dashboard. However, if you don't want single sign-on, you can leave OIDC disabled.
After the configuration process is done, you'll get a URL/endpoint for both MLflow and the Plural Console.
Next, let's scale our MLflow resources from the console.
5. Scale Up MLflow Resources
To upgrade or update your MLflow resources, click Launch Console or navigate to your console.
Navigate to the console >> click on MLflow >> open Configuration in the right panel.
You can make changes to the values.yaml file of the Helm chart in your deployment repo. Alternatively, you can navigate to the runbook and change the allocated resources there.
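For example, a resource bump in values.yaml might look like the sketch below. The exact keys depend on the chart version, so treat these names as illustrative rather than the chart's actual schema:

```yaml
# Hypothetical values.yaml override; key names vary by chart.
resources:
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "2"
    memory: 4Gi
```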
6. Simple MLflow Logging
Now that we have successfully set up MLflow using Plural, let's log an experiment from a model-training run.
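Here's a minimal sketch, assuming your tracking server is reachable at mlflow.your-domain.com (a placeholder; substitute the URL you received during configuration):

```python
import mlflow
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Point at the deployed tracking server; the hostname below is a placeholder.
mlflow.set_tracking_uri("https://mlflow.your-domain.com")
mlflow.set_experiment("iris-classification")  # created automatically if it doesn't exist

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 100, "max_depth": 5}
    model = RandomForestClassifier(**params).fit(X_train, y_train)
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, "model")  # package the trained model as an artifact
```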
7. Viewing the Results
Open the MLflow URL in your browser. If you're prompted to sign in, that means you enabled OIDC.
Alternatively, you can use the Plural CLI to install MLflow
To set up MLflow with the Plural CLI directly through your deployment repository, run through the steps below:
Step 1: Install the Plural CLI. Follow our CLI quickstart guide to get the Plural CLI installed and configured appropriately.
Step 2: Run the command below to initialize and configure your environment:
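```bash
plural init
```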
Running the plural init command starts a configuration wizard that helps you set up the GitHub repository and cloud provider to use with Plural.
The wizard will ask you a few questions about your project, such as your cloud provider, your preferred GitHub repository name, your project's name, and your hostname.
Once you have answered these questions, the wizard will create a configuration file, workspace.yaml, that stores your project configuration.
Then you can install any application, along with its required dependencies, using a Plural bundle. Bundle names are provider-specific, so list the available bundles for the application first; the AWS bundle name in the sketch below is an assumption, so use whatever name the list command actually returns:
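```bash
plural bundle list mlflow                  # discover the available bundles for MLflow
plural bundle install mlflow mlflow-aws    # assumed AWS bundle name; substitute the one listed
```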
After running plural bundle install, you'll be asked to approve some configuration, such as whether to enable OIDC.
Finally, run the commands below to generate all deployment artifacts in the repo and then deploy them in dependency order:
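```bash
plural build                                # generate deployment artifacts in the repo
plural deploy --commit "deploying mlflow"   # deploy them in dependency order
```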
Additionally, if you want to delete an application, use the command below:
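```bash
plural destroy mlflow   # tear down the application and its provisioned resources
```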
With these simple steps, you can be productively running MLflow at scale on Plural.
Wrapping Up
Throughout this article, we provided an overview of MLflow's capabilities, discussed the multifaceted importance of MLflow for machine learning teams today, and covered hands-on steps for setting it up locally and on Plural.
MLflow has evolved into an essential standard for organizing, reusing, and scaling machine learning workflows. Every practitioner should be familiar with leveraging MLflow for their experiments, models, and pipelines.
Integrating MLflow into your team's process will speed up iteration, improve collaboration, and sustain innovation as your projects grow.
Are you looking to get your MLflow instance up and running on Kubernetes with minimal effort?
Make sure to join our Discord community for deployment help, discussion, and meeting other Plural users.
Ready to effortlessly deploy and operate open-source applications in minutes? Get started with Plural today.