Sage Elliott

December 2, 2024

Union: The Unified AI Platform

Developing and deploying AI models at scale is challenging. Many teams face obstacles like disconnected workflows, runaway infrastructure costs, and slow time-to-market for AI solutions. These hurdles sap productivity and keep organizations from experimenting quickly and getting models into production.

We built Union’s unified AI platform to tackle these challenges and optimize your AI development lifecycle. Our platform unifies data, models, libraries, and compute, enabling seamless collaboration, cost-effective infrastructure use, and rapid innovation.

The Pain Points of AI Development:

Disconnected Development Teams: The complexity of AI workflows leads to silos between data engineers, ML engineers, and product teams. Misaligned tools and fragmented pipelines result in poor collaboration and slower progress.
Runaway GPU/Cloud Costs: Cloud costs, especially GPU usage, often spiral out of control due to inefficient resource allocation and lack of transparency. Teams struggle to optimize resource utilization, leading to wasted budgets.
Slow Time-to-Market: Long iteration cycles and clunky workflows delay innovation. AI teams need a way to rapidly experiment, iterate, and deploy solutions without compromising on quality or scalability.

Nearly 4,000 organizations already use Union’s open-source orchestration layer to solve these problems by building reproducible AI workflows.

How Union Transforms the AI Development Cycle

Union Provides a Single Pane of Glass for Teams to Build, Manage, and Deploy AI Products.

Union’s Unified AI Platform:

Enables teams to share and automate reproducible AI workflows.
Controls runaway costs with task-level resource management and observability, so workloads run as efficiently as possible.
Centralizes the entire AI lifecycle, from data processing and artifact management to model deployment, into a cohesive experience.

New Features

We’re excited to announce new features that extend Union’s capabilities even further, creating a completely flexible end-to-end AI platform for efficient AI workflows and inference.

Cost Observability

Understanding and managing costs in AI workflows is a tedious manual process, especially when scaling across multiple projects and teams. This responsibility often falls on engineers, distracting them from more meaningful work.

Union users already love task-level resource management and observability for building cost-effective AI workflows. Now we’re excited to announce our cost observability dashboard, which makes it easy to see invoices and expenses across all projects and executions from one centralized interface.

Click on “Cost” in the upper right menu to get started.

This newly launched feature gives full visibility into your resource utilization with detailed cost metrics for your AI projects.

Invoices: Monthly invoices from your cloud providers on BYOC.
Compute Costs: Node-level resource utilization, uptime, and cost per cluster.
Workload Costs: Breakdown of costs for specific executions within a given time interval.

By bringing transparency to your workflows, Union empowers teams to allocate budgets wisely, ensuring financial sustainability as you scale.
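To make the workload-cost breakdown concrete, here is a minimal sketch of how per-execution costs might be rolled up by project within a time window. It is not Union's API: the `ExecutionCost` record, its field names, the project names, and the dollar figures are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ExecutionCost:
    """Hypothetical cost record for one workflow execution (not a Union type)."""
    project: str
    execution_id: str
    finished_at: datetime
    cost_usd: float


def cost_by_project(records, start, end):
    """Sum execution costs per project for executions finishing in [start, end)."""
    totals = defaultdict(float)
    for record in records:
        if start <= record.finished_at < end:
            totals[record.project] += record.cost_usd
    return dict(totals)


# Illustrative, made-up data.
records = [
    ExecutionCost("fraud-model", "exec-1", datetime(2024, 11, 3), 12.50),
    ExecutionCost("fraud-model", "exec-2", datetime(2024, 11, 20), 7.25),
    ExecutionCost("search-ranking", "exec-3", datetime(2024, 11, 21), 30.00),
    # Finishes on December 1, so it falls outside the November window.
    ExecutionCost("search-ranking", "exec-4", datetime(2024, 12, 1), 99.00),
]

november = cost_by_project(records, datetime(2024, 11, 1), datetime(2024, 12, 1))
print(november)
```

The half-open interval keeps an execution that finishes exactly at a window boundary from being counted in two adjacent windows.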

Actors - Long-Running Stateful Containers

Containers for repeated tasks or stateful operations often encounter delays from spinning up, loading dependencies, initializing models, and setting up data. Actors address this by maintaining long-running stateful containers, enabling efficient task execution and model serving without repeated initialization. This can reduce cold-start time by 99% in AI workflows.

Actors allow tasks to relaunch almost instantly

For example, you may want to reuse a container with an AI model to return predictions. Actors enable near real-time inference. By avoiding repeated setup costs, Actors ensure faster execution for both high-frequency and on-demand tasks.
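A back-of-the-envelope sketch shows why avoiding repeated setup matters. The figures below are assumptions chosen for illustration, not measurements from Union: every call does a fixed amount of work, and setup (pulling the image, loading dependencies, initializing the model) is paid either on every call or only once.

```python
def total_seconds(calls: int, setup_s: float, work_s: float, reuse: bool) -> float:
    """Total wall-clock time for `calls` sequential task runs.

    With reuse, setup is paid once; without it, every call pays setup again.
    """
    setups = 1 if reuse and calls > 0 else calls
    return setups * setup_s + calls * work_s


# Hypothetical figures: 30 s of setup, 0.5 s of actual inference, 100 calls.
cold = total_seconds(calls=100, setup_s=30.0, work_s=0.5, reuse=False)
warm = total_seconds(calls=100, setup_s=30.0, work_s=0.5, reuse=True)

print(f"cold: {cold:.0f}s, warm: {warm:.0f}s")
```

Under these assumed numbers the cold path takes 3,050 seconds and the warm path 80 seconds. The gap widens as setup grows relative to the work each call performs.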

Leverage Actors to reduce costs and improve execution efficiency:

Reusable Containers: Share environments between tasks, cutting down redundant setup times and resource usage.
Near Real-Time Inference: Deploy tasks instantly, paving the way for iterative experimentation and agile AI development.
Stateful Task Execution: Keep context between tasks, simplifying workflows like data enrichment, streaming data processing, and dynamic model updates.

Whether you're running batch jobs or serving AI models, Actors deliver significant time and cost savings. They empower teams to scale seamlessly while keeping infrastructure overhead low.

Example of using Actors to run a prediction task:

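# Import paths are an assumption about the Union SDK; adjust to your installed version.
from typing import List

from flytekit import Resources
from sklearn.neighbors import KNeighborsClassifier
from union.actor import ActorEnvironment

# `image` is assumed to be defined earlier, for example as a flytekit
# ImageSpec that includes scikit-learn.
actor = ActorEnvironment(
    name="my-actor",
    container_image=image,
    replica_count=1,   # number of long-running containers to keep
    ttl_seconds=120,   # how long the environment stays alive
    requests=Resources(
        cpu="2",
        mem="500Mi",
    ),
)

# Tasks declared with @actor.task run inside the actor's container.
@actor.task
def actor_knn_predict(
    model: KNeighborsClassifier, pred_data: List[List[float]]
) -> List[int]:
    predictions = model.predict(pred_data)
    # Convert the NumPy array to a plain list so it matches the return type.
    return predictions.tolist()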

In this example:

ActorEnvironment initializes a stateful container.
The environment remains live for the specified TTL (time-to-live).
Tasks like actor_knn_predict reuse the same environment, making inference faster and more efficient.
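The reuse-until-TTL behavior described above can be pictured with a small plain-Python analogy. It is only a sketch of the idea, not how Union implements Actors: `WarmEnvironment`, its `run` method, and the injected `clock` are hypothetical names invented for this example, and it assumes the TTL counts idle time since the last task.

```python
import time
from typing import Any, Callable, Optional


class WarmEnvironment:
    """Toy stand-in for a long-running container (hypothetical, not Union's API).

    The expensive `setup` runs on first use, and again only if the
    environment has sat idle for longer than `ttl_seconds`.
    """

    def __init__(
        self,
        setup: Callable[[], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._setup = setup
        self._ttl = ttl_seconds
        self._clock = clock
        self._state: Any = None
        self._last_used: Optional[float] = None
        self.setup_count = 0

    def run(self, task: Callable[[Any], Any]) -> Any:
        now = self._clock()
        expired = self._last_used is None or now - self._last_used > self._ttl
        if expired:
            self._state = self._setup()  # cold start: pay the setup cost
            self.setup_count += 1
        self._last_used = now
        return task(self._state)


# Drive the sketch with a fake clock so the behavior is deterministic.
fake_now = [0.0]
env = WarmEnvironment(
    setup=lambda: {"model": "loaded"},
    ttl_seconds=120,
    clock=lambda: fake_now[0],
)

env.run(lambda state: state["model"])  # t=0: cold start
fake_now[0] = 60.0
env.run(lambda state: state["model"])  # t=60: within TTL, state is reused
fake_now[0] = 300.0
env.run(lambda state: state["model"])  # t=300: idle for 240 s, cold start again
print(env.setup_count)
```

Injecting the clock rather than calling `time.monotonic` directly is a design choice for the sketch: it lets the expiry logic be exercised without actually waiting for the TTL to elapse.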

Try Actors now in both BYOC and Serverless! Read the docs or book a free demo to learn more.

Serving

Deploying AI applications involves navigating complex infrastructure, managing scaling, and ensuring seamless updates—all of which slow down delivery. Union's serving feature eliminates these complications, making deploying AI models and applications effortless and efficient.

With Union, all you need to do is create an interface application with libraries you may already know and love, such as Streamlit or Gradio, and run a simple deployment command. Almost instantly, you'll get an endpoint to host and serve your application—no additional setup required.

Deploying AI artifacts and Streamlit on Union

Why use Union’s serving feature?

Abstracts Infrastructure Complexity: Focus on building your application, not managing servers. Union handles the heavy lifting for hosting and serving your AI models.
Dynamic Scaling: Automatically scales to handle varying workloads, from a few users to thousands, without requiring manual intervention or infrastructure tuning.
Seamless Model Updates: Consume versioned artifacts directly from your existing Union workflows, making it easy to keep your applications up to date with the latest AI models and improvements.

This capability enables teams to deliver robust, scalable AI applications faster than ever before.

union deploy apps nim_app.py nim-llama-3-1-8b

This feature will be generally available soon, but if you’d like to get a demo before then, let us know!

Unify Your AI Development Today

With Union, disconnected teams transform into collaborative powerhouses, runaway costs become manageable, and slow iteration cycles give way to rapid experimentation. Unify your AI development on a single, end-to-end platform.

If you’re at AWS re:Invent this week, stop by the NVIDIA booth #1620 to see Union in action.
