/ better AI pipelines by design

Infrastructure for AI, ML & Data

For developers managing AI, ML, and data workflows in production, the challenges extend well beyond scheduling and orchestrating DAGs. Union.ai addresses these complexities by offering a comprehensive infrastructure management platform designed for the nuances of such environments.

Union optimizes resources across teams and implements cost-effective strategies that can reduce expenses by up to 66%. Moreover, it’s engineered to fit within your own cloud ecosystem, ensuring a robust and tailored infrastructure that scales with your technical demands.

View product
/ Union: just bring your compute, we bring Flyte

Powerful DAGs, observability & cost-efficient engineering

Union is a fully-managed Flyte platform deployed in your VPC that provides a single-endpoint workflow orchestration and compute service to engineers building data and ML products.

Get built-in dashboards, live logging, and task-level resource monitoring that help users identify resource bottlenecks and simplify debugging, leading to optimized infrastructure and faster experimentation.

Get a demo
/ from engineers for engineers

AI engineering for engineers

Union is an open AI orchestration platform that simplifies AI infrastructure so you can develop, deploy, and innovate faster. Unlike popular—but simple—AI engineering orchestrators, Union wrangles the infrastructure setup and management as well.

Write your code in Python, collaborate across departments, and enjoy full reproducibility and auditability. Union lets you focus on what matters.

Explore docs
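import pandas as pd
from flytekit import task, workflow
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

@task
def get_data() -> pd.DataFrame:
    # Load the scikit-learn digits dataset as a single DataFrame.
    return load_digits(as_frame=True).frame

@task
def train_model(data: pd.DataFrame) -> MLPClassifier:
    # Separate the label column from the feature columns, then fit the model.
    features = data.drop("target", axis="columns")
    target = data["target"]
    return MLPClassifier().fit(features, target)

@workflow
def training_workflow() -> MLPClassifier:
    # Chain the two tasks; the DataFrame output feeds the training task.
    data = get_data()
    return train_model(data=data)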

The Union AI orchestration partner network

Available now on the AWS Marketplace

Read announcement
Available now on the GCP Marketplace

Read announcement
Member of the Nvidia Inception Program

/ the better replacement for Airflow & Kubeflow

Purpose-built for lineage-aware pipeline orchestration

Bring your own Airflow code (BYOAC) and take advantage of modern AI orchestration features out of the box! Get full reproducibility, auditability, experiment tracking, cross-team task sharing, compile-time error checking, and automatic artifact capture.

Explore features
Versioning

Easily experiment and iterate in isolation with versioned tasks and workflows.

Multi-tenancy

A centralized infrastructure for your team and organization enables multiple users to share the same platform while maintaining their own distinct data and configurations.

Type checking

Strongly typed inputs and outputs can simplify data validation and highlight incompatibilities between tasks, making it easier to identify and troubleshoot errors before launching the workflow.
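
As a rough illustration of the idea in plain Python (not Union's or Flyte's actual implementation), the sketch below compares one task's declared return type with the next task's declared parameter type before either task runs. The names `check_connection`, `load_numbers`, `mean`, and `shout` are hypothetical and exist only for this example.

```python
from typing import get_type_hints


def load_numbers() -> list:
    return [1, 2, 3]


def mean(values: list) -> float:
    return sum(values) / len(values)


def shout(text: str) -> str:
    return text.upper()


def check_connection(upstream, downstream, param: str) -> None:
    """Raise TypeError if upstream's return type differs from downstream's parameter type."""
    produced = get_type_hints(upstream)["return"]
    expected = get_type_hints(downstream)[param]
    if produced is not expected:
        raise TypeError(
            f"{upstream.__name__} returns {produced.__name__}, "
            f"but {downstream.__name__}({param}) expects {expected.__name__}"
        )


check_connection(load_numbers, mean, "values")  # compatible, so nothing is raised

try:
    check_connection(load_numbers, shout, "text")
    mismatch = None
except TypeError as err:
    mismatch = str(err)  # the incompatibility is reported without running either task
```

The check only reads annotations, so a mismatch surfaces before any compute is spent.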

Caching

Caching the output of task executions can accelerate subsequent executions and prevent wasted resources.
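
One way to picture output caching is a lookup keyed on the task name, a version string, and the inputs. The following is a minimal plain-Python sketch under that assumption; it is not the platform's implementation, and `cached`, `square`, and the in-memory `_cache` store are hypothetical.

```python
import hashlib
import json

_cache = {}  # in-memory store for this sketch; a real platform would persist outputs
calls = []   # records which invocations actually executed the function body


def cached(version: str):
    """Memoize a function on its name, a version string, and its keyword inputs."""
    def decorator(fn):
        def wrapper(**inputs):
            payload = json.dumps([fn.__name__, version, inputs], sort_keys=True)
            key = hashlib.sha256(payload.encode()).hexdigest()
            if key not in _cache:
                calls.append(fn.__name__)
                _cache[key] = fn(**inputs)
            return _cache[key]
        return wrapper
    return decorator


@cached(version="1")
def square(x: int) -> int:
    return x * x


first = square(x=4)   # executes the function body
second = square(x=4)  # same inputs and version, so the stored output is reused
third = square(x=5)   # different input, so the function runs again
```

Including the version in the key means that changing the version string invalidates earlier results for that function.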

Data lineage

As a data-aware platform, Union can simplify rollbacks and error tracking.

Immutability

Immutable executions help ensure reproducibility by preventing any changes to the state of an execution.

Recovery

Rerun only the failed tasks in a workflow to save time and resources and to debug more easily.
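
The recovery idea can be sketched in plain Python as a runner that reuses results from an earlier attempt and executes only what is missing. This is an illustration under simplifying assumptions (independent tasks, results held in a dict), not how the platform implements recovery; `run_workflow`, `extract`, and `transform` are hypothetical names.

```python
def run_workflow(tasks, previous=None):
    """Run tasks in order, reusing results from a previous attempt where they exist.

    Returns (results, failed), where failed is the name of the task that raised, or None.
    """
    results = dict(previous or {})
    for name, fn in tasks:
        if name in results:
            continue  # succeeded in an earlier attempt; do not rerun
        try:
            results[name] = fn()
        except Exception:
            return results, name
    return results, None


executed = []                  # order in which task bodies actually ran
flaky_state = {"attempts": 0}  # lets transform fail once, then succeed


def extract():
    executed.append("extract")
    return [1, 2, 3]


def transform():
    executed.append("transform")
    flaky_state["attempts"] += 1
    if flaky_state["attempts"] == 1:
        raise RuntimeError("transient failure")
    return "done"


tasks = [("extract", extract), ("transform", transform)]
partial, failed = run_workflow(tasks)               # transform fails on the first attempt
final, failed_again = run_workflow(tasks, partial)  # only transform is rerun
```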

Human-in-the-loop

Enable human intervention to supervise, tune, and test workflows, resulting in improved accuracy and safety.

Intra-task checkpointing

Checkpoint progress within a task execution in order to save time and resources in the event of task failure.
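
A minimal plain-Python sketch of checkpointing inside a single task: progress is written to a file after each item, so a retry resumes where the failed attempt stopped. The function `sum_with_checkpoint`, the JSON file layout, and the `fail_at` switch are assumptions made for this example, not the platform's checkpoint API.

```python
import json
import os
import tempfile


def sum_with_checkpoint(values, path, fail_at=None):
    """Sum values, saving progress to `path` after each item so a retry can resume."""
    start, total = 0, 0
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
        start, total = state["next_index"], state["total"]
    for i in range(start, len(values)):
        if i == fail_at:
            raise RuntimeError(f"simulated failure at index {i}")
        total += values[i]
        with open(path, "w") as f:
            json.dump({"next_index": i + 1, "total": total}, f)
    return total


checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
data = [10, 20, 30, 40]

try:
    sum_with_checkpoint(data, checkpoint_path, fail_at=2)  # fails after two items
except RuntimeError:
    pass

with open(checkpoint_path) as f:
    saved = json.load(f)  # progress recorded before the failure

result = sum_with_checkpoint(data, checkpoint_path)  # resumes at index 2
```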

Reproducibility

Every task is versioned and every dependency set is captured, making it easy to share workflows across teams and reproduce results.

/ AI orchestration: the essential fabric for rapid Data, ML, & AI development

The best teams choose Union & Flyte

Across Data, ML, and AI, Flyte has established a stellar reputation as the most scalable AI orchestrator. It manages and executes workflows with over 10,000 CPUs and tens of thousands of pipelines, all powered by Python code. Union brings the powerful Flyte platform to your team in a managed environment, so you don’t have to set it up. Discover why the Flyte-powered Union is a game-changer.

Faster time-to-market

In today’s fast-paced business environment, the ability to quickly develop and deploy machine learning models can be the difference between success and failure.

Union helps businesses accelerate their ML projects by automating many of the processes involved in model development and deployment, reducing the time and effort required to get models into production.

View Union features

Scalable ML workflows

Scaling machine learning efforts can be challenging due to the need for specialized infrastructure, in-house expertise in distributed systems management, and tools to handle large-scale data processing and model training.

Union enables reproducibility and observability at the workflow, task, and data level, and provides plugins for model deployment and for distributed model training tools and frameworks.

Read MethaneSAT case study

Reduce ML technical debt

Without standardized operations and processes in place, many teams struggle to promote models to production, resulting in sunk costs and wasted compute resources.

Union enables more efficient and accurate workflows through automated validation and optimization throughout the development and deployment process.

Read ML use case

Integrate with existing tooling

Whether you are working with ML frameworks like TensorFlow and PyTorch, or using tools like Jupyter notebooks and Apache Spark, Union is designed with an extensible plugin system that spans both data science and infrastructure stacks.

This allows users to leverage the power of a managed platform without disrupting existing processes.

View Union integrations

Globally trusted & tested

10k+ community members
1m+ downloads per month
30+ Fortune 100 companies

Join our developer community

“FlyteFile is a really nice abstraction on a distributed platform. [I can say,] ‘I need this file,’ and Flyte™ takes care of downloading it, uploading it and only accessing it when we need to. We generate large binary files in netcdf format, so not having to worry about transferring and copying those files has been really nice.”

Nicholas LoFaso, Senior Platform Software Engineer at MethaneSAT

“Versioning, caching and the different domains we can have in Flyte™ prompted us to move from Airflow to Flyte™ because you don’t really need to think about them and they are … available out of the box in Flyte™.”

Stephen Batifol, Machine Learning Engineer at Wolt

“The multi-tenancy that Flyte™ provides is obviously important in regulated spaces where you need to separate users and resources and things like amongst each other within the same organization.”

Jake Neyer, Software Engineer at Striveworks

“With Flyte™, we want to give the power back to biologists. We want to stand up something that they can play around with different parameters for their models because not every … parameter is fixed. We want to make sure we are giving them the power to run the analyses.”

Krishna Yeramsetty, Principal Data Scientist at Infinome

“We're going to have 10,000-plus CPUs that we plan to use every day to process the raw data. There'll be 30 different targets approximately that we're collecting data on every day. That's about 200 GB of raw data and probably 2 TB or so on the output — a lot of data process. We're leaning heavily on Flyte™ to make that happen.”

Nicholas LoFaso, Senior Platform Software Engineer at MethaneSAT

“Union Cloud solves our operational complexity problems across diverse workloads, whether it is running data cleaning & pre-processing workflows or protein structure ML predictions for low-volume, high-complexity scientific workloads to large-scale scientific simulations. Additionally, the platform can drive down the relative cost of protein production by orders of magnitude. With Union Cloud as our standardized workflow orchestration platform, we can stop managing our own systems and infrastructure, and instead focus on antibody discovery and development.”

Alex Ford, Head of Data Platform at AbCellera Biologics

“When you write Python scripts, everything runs and takes a certain amount of time, whereas now for free we get parallelism across tasks. Our data scientists think that's really cool.”

Dylan Wilder, Engineering Manager at Spotify

“One thing that I really like compared to my previous experience with some of these tools: the local dev experience with pyflyte and the sandbox are super, super nice to reduce friction between production and dev environment.”

Krishna Yeramsetty, Principal Data Scientist at Infinome

“We’ve migrated about 50% of all training pipelines over to Flyte™ from Kubeflow. In several cases, we saw an 80% reduction in boilerplate between workflows and tasks vs. the Kubeflow pipeline and components. Overall, Flyte™ is a far simpler system to reason about with respect to how the code actually executes, and it’s more self-serve for our research team to handle.”

Rahul Mehta, ML Infrastructure/Platform Lead at Theorem LP