Sage Elliott

March 3, 2025

Reproducible Workflows for Compound AI: Reliable and Scalable AI Development

In AI and machine learning, reproducibility is essential to the reliability, transparency, and trustworthiness of models and experiments. Reproducible workflows let teams verify, track, and iterate on complex pipelines across code, data, and multiple environments, which is necessary for consistency, accuracy, and scalability as you build compound AI systems.

Union’s AI platform provides tools and enforces best practices to make building reproducible workflows an integrated part of your ML and data pipeline lifecycle. In this guide, we’ll explore the value of reproducibility and how to implement it effectively in your ML organization.

If you have questions about implementing reproducible workflows in your organization, schedule time with one of our AI Engineers!

<div class="button-group is-center"><a class="button" href="https://www.union.ai/consultation">Book a free consultation</a></div>

Why Reproducibility Matters in AI Pipelines

Reproducibility is foundational to the reliability and credibility of compound AI systems. Without it, insights, models, and experimentation are susceptible to inconsistencies and confusion, making scaling complex AI applications difficult. Here are some key reasons reproducibility is essential in AI workflows:

Enhanced Collaboration and Debugging: Reproducibility allows teams to share versioned workflows and artifacts, so others can pick up where someone left off or troubleshoot issues with the exact same configuration and data.
Consistency Across Environments: Code that works on a developer's local setup often fails when deployed to production. Reproducible workflows ensure that code, data, and dependencies are consistent across environments, from local to remote clusters.
Auditability and Compliance: Many organizations require documented steps to ensure transparency and compliance with standards. Reproducible workflows make every part of the ML pipeline traceable by versioning code, artifacts, environments, and data lineage.
Scalability and Future-Proofing: When workflows are reproducible, they can be easily adapted and scaled. This flexibility allows teams to expand their capabilities without re-engineering pipelines from scratch.

Best Practices for Building Reproducible AI Pipelines:

Achieving reproducibility in AI workflows requires implementing some best practices. Here’s how to make reproducibility a standard part of your workflow design.

Let’s break the process down and look at how Union enables these best practices for machine learning and data teams. Union is an AI platform designed to help engineering teams build, deploy, and manage reproducible AI workflows at scale.

History of AI Workflow Executions in Union

Versioned Workflows for Consistent Tracking & Sharing:

Versioning AI workflows ensures you can trace back to the exact code, hyperparameters, datasets, and configurations used for a particular execution, making debugging, collaboration, and compliance much more straightforward. By providing a shared, consistent framework for experiments and deployments, versioning allows better communication and alignment across development teams.

Union automatically versions workflows, data, and models by capturing a snapshot of your workflow each time it’s executed. Because each execution is versioned, a change to a dataset or model is logged with the workflow, so the lineage of data and changes is preserved.

The versioned workflow is assigned a unique ID and is accessible via the platform UI or remote API.
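To make the idea of a snapshot-derived ID concrete, here is a minimal sketch using only the Python standard library. It is an illustration of content-based versioning in general, not a description of how Union computes its IDs; the helper name `snapshot_version` and the sample inputs are hypothetical.

```python
import hashlib
import json


def snapshot_version(source_code: str, dependencies: list[str], config: dict) -> str:
    """Return a short, deterministic ID for one workflow snapshot (hypothetical helper)."""
    payload = json.dumps(
        {
            "source": source_code,
            "dependencies": sorted(dependencies),  # listing order should not change the ID
            "config": config,
        },
        sort_keys=True,  # stable key order keeps the hash reproducible
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


code = "def train(lr): ..."
v1 = snapshot_version(code, ["scikit-learn==1.4.2", "pandas==2.2.1"], {"lr": 0.01})
v2 = snapshot_version(code, ["pandas==2.2.1", "scikit-learn==1.4.2"], {"lr": 0.01})
v3 = snapshot_version(code, ["pandas==2.2.1", "scikit-learn==1.4.2"], {"lr": 0.02})
```

Here `v1` and `v2` are identical because nothing meaningful changed, while `v3` differs because a hyperparameter changed. That is the property that lets an ID point back to one exact combination of code, dependencies, and configuration.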

Versioned ML workflow in Union: code, dependencies, inputs, and data visualization

Model and data lineage and versioning can be extended further by explicitly using Union Artifacts.

Containerized Execution & Declarative Infrastructure for Environment Consistency

Containers allow you to execute tasks and workflows with the same versions of libraries, operating systems, and packages across all environments. Using declarative configurations makes workflows transparent and portable. By managing configuration in tasks, it becomes easier to track changes and avoid accidental modifications that may affect results. This also allows teams to monitor resource usage and adjust accordingly.

Union uses declarative infrastructure to set containers and resources at the task level, allowing for a consistent runtime environment for each task and efficient use of compute resources, such as only using a GPU when needed for an individual task.

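# Assumption: flytekit import path; the Union SDK may expose the same names.
from flytekit import Resources, task

# hosted_data_image and hosted_model_image are image references defined elsewhere.

@task(
    container_image=hosted_data_image,
    requests=Resources(cpu="2", mem="4Gi"),  # CPU-only task
)
def get_data():  # signature and body elided in the original
    ...

@task(
    container_image=hosted_model_image,
    requests=Resources(cpu="1", mem="16Gi", gpu="1"),  # GPU requested only for training
)
def train_model():  # signature and body elided in the original
    ...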

Task-level resource and container management

Union’s ImageSpec and image builder features let you define, manage, and version the runtime environments for your workflows by specifying the base image, dependencies, and configurations directly in your Python code. This makes it even easier to build and manage containers.

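# Assumption: flytekit import path; the Union SDK may expose the same name.
from flytekit import ImageSpec

image = ImageSpec(
    apt_packages=["git", "wget"],  # system packages installed with apt
    requirements="requirements.lock.txt",  # pinned Python dependencies
    env={"GIT_PYTHON_REFRESH": "quiet"},  # environment variables set in the image
    cuda="11.8",  # CUDA version to install
)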

When a task references this image, Union checks at run time whether the image already exists. If it does not, Union builds it (securely in your cloud).
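The check-then-build behavior can be pictured with a small standard-library sketch. This is a hypothetical model of the general pattern, assuming the image tag is derived from the spec's contents; `EnvSpec`, `ensure_image`, and the in-memory `registry` are invented for illustration and are not Union's image builder.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EnvSpec:
    """Hypothetical stand-in for a declarative image definition."""

    apt_packages: tuple[str, ...] = ()
    requirements: str = ""
    cuda: str = ""

    def tag(self) -> str:
        # Identical specs always map to the same tag.
        blob = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()[:16]


def ensure_image(spec: EnvSpec, registry: set[str]) -> bool:
    """Build only when the tag is missing; return True if a build happened."""
    tag = spec.tag()
    if tag in registry:
        return False
    registry.add(tag)  # placeholder for a real build-and-push step
    return True


registry: set[str] = set()
spec = EnvSpec(apt_packages=("git", "wget"), requirements="requirements.lock.txt", cuda="11.8")
first_run_built = ensure_image(spec, registry)
second_run_built = ensure_image(spec, registry)
```

The first call triggers a build and the second one reuses the existing image, so an unchanged environment definition never produces a second, slightly different container.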

Launch Forms and Parameterization for Flexibility

Union provides launch forms, enabling parameterization of workflows without altering the underlying code. This means you can re-run workflows and tasks with new parameters, test different models, and log the version of the results in real time, which is critical for experimentation. You can also define a custom launch plan to start the workflow while passing the inputs as parameters.

Workflows & tasks can easily be relaunched from the UI, API, or terminal.

Using launch forms to relaunch an AI workflow

Python API example using Union Remote to relaunch workflows or entities.

from union.remote import UnionRemote

remote = UnionRemote(
    default_project="my-project",
    default_domain="my-domain",
)
some_entity = ...  # one of FlyteTask, FlyteWorkflow, or FlyteLaunchPlan
execution = remote.execute(
    some_entity,
    inputs={...},
    execution_name="my_execution",
    wait=True,
)
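The principle behind launch forms, changing inputs without touching workflow code, can be sketched in plain Python. The example below is an assumption-laden illustration rather than Union's implementation: `resolve_inputs` and the `train` stand-in are hypothetical names.

```python
def resolve_inputs(defaults: dict, overrides: dict) -> dict:
    """Merge launch-time overrides onto defaults, rejecting unknown parameters."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**defaults, **overrides}


def train(n_neighbors: int, test_size: float) -> str:
    # Stand-in for a workflow; the body never changes between launches.
    return f"knn(k={n_neighbors}, test_size={test_size})"


defaults = {"n_neighbors": 5, "test_size": 0.2}
baseline = train(**resolve_inputs(defaults, {}))
experiment = train(**resolve_inputs(defaults, {"n_neighbors": 11}))
```

Each launch records which inputs it ran with, and the defaults stay untouched, so a baseline run and an experiment remain directly comparable.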

Define Clear Data Types for Tasks & Workflows

Clearly defined data types make it possible to check compatibility and correctness in your data flow, reducing runtime errors. They also serve as a form of documentation when reusing or sharing tasks.

In Union, each task must specify the type of inputs and outputs explicitly, allowing Union to validate them automatically. This is done by adding type hinting annotations to your task functions.

A typed input catches a mismatched data type returned by an upstream task

The Union SDK enforces strongly typed inputs and outputs for each task and workflow.

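# Assumption: flytekit import path; the Union SDK may expose the same name.
from flytekit import workflow
from sklearn.neighbors import KNeighborsClassifier

# KnnModelArtifact and batch_knn_predict are defined elsewhere in the project.

@workflow
def batch_prediction_knn(
    model: KNeighborsClassifier = KnnModelArtifact.query(),  # latest matching model artifact
    pred_data: list[list[float]] = [[1, 2, 3, 4], [5, 6, 7, 8]],
) -> list[int]:
    pred = batch_knn_predict(
        pred_data=pred_data,
        model=model,
    )
    return pred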
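To show what validating inputs and outputs against type hints means in practice, here is a minimal standard-library sketch. It only handles plain classes such as `int` and `str` (not generics like `list[int]`), and it is not how the Union SDK performs its checks; `check_call`, `count_rows`, and `broken_count` are hypothetical names.

```python
import inspect
from typing import get_type_hints


def check_call(func, **kwargs):
    """Validate keyword arguments and the return value against func's type hints."""
    hints = get_type_hints(func)
    for name in inspect.signature(func).parameters:
        if not isinstance(kwargs[name], hints[name]):
            raise TypeError(
                f"{name} expected {hints[name].__name__}, got {type(kwargs[name]).__name__}"
            )
    result = func(**kwargs)
    if not isinstance(result, hints["return"]):
        raise TypeError(
            f"return expected {hints['return'].__name__}, got {type(result).__name__}"
        )
    return result


def count_rows(path: str, limit: int) -> int:
    return limit


def broken_count(path: str, limit: int) -> int:
    return str(limit)  # wrong on purpose: returns str instead of int


ok = check_call(count_rows, path="data.csv", limit=10)
try:
    check_call(broken_count, path="data.csv", limit=10)
    error = ""
except TypeError as exc:
    error = str(exc)
```

The mismatch in `broken_count` is reported at the task boundary instead of surfacing later as a confusing failure in a downstream step.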

Benefits of Building Reproducible Workflows with Union

Here’s how Union’s focus on reproducibility can directly impact AI projects:

Accelerated Development: With composable and reproducible workflows, experimentation cycles are faster. Teams can modularize workflow components, enabling rapid iteration while avoiding unnecessary time spent on configuration.
Improved Collaboration: Reproducibility makes workflows shareable and understandable, enabling seamless team collaboration without worrying about mismatched dependencies or data discrepancies.
Reduced Operational Overhead: By reducing inconsistencies across environments, reproducibility eliminates “environment drift” issues, which leads to fewer production incidents and a more efficient operations pipeline.
Increased Trust and Reliability: Union’s infrastructure allows users to trust that models and workflows will perform consistently every time they’re run, instilling confidence in both development and production stages.

“We want to simplify and not have to think about and manage different technology stacks. We want to write everything in a Union workflow and have one platform for orchestrating these jobs; that’s awesome and less stuff for us to worry about.” —Thomas Busath, ML Engineer at Porch

Conclusion: Reproducible ML Pipelines

Reproducibility is more than a best practice—it’s a foundational principle for successful AI and ML development. With Union’s AI workflow platform, building end-to-end reproducible workflows is simplified and integrated into every part of the development process. From versioning and containerization to data caching and parameterization, Union empowers teams to make reproducibility an effortless part of their workflows, leading to faster iteration, greater reliability, and a stronger foundation for scaling AI applications.

Union’s dedication to reproducible workflows translates into tangible benefits for AI teams, allowing them to focus on innovation and building without compromising on reliability. By embracing these tools and practices, your organization can build compound AI systems with confidence and control.

Try Union serverless: There are examples you can run right from a notebook.
Join the community: Say hello and ask questions in our community Slack.

If you have questions about implementing reproducible workflows in your organization, schedule time with one of our AI Engineers!
