GenAI & LLMs

Take generative AI applications to production faster

Union provides a full-stack ecosystem equipped with everything you need to deploy your generative AI solutions at scale.

Build production-grade RAG data pipelines

RAG data pipelines typically involve document pre-processing, ingestion, and embedding generation. You may want to run several such pipelines and iterate on them quickly. Union lets you build reproducible RAG pipelines with built-in versioning, caching, and lineage, so a re-run only recomputes the steps whose inputs or logic actually changed.

Parallelize LLM batch inference using map tasks

Generating predictions from an LLM one request at a time is slow and leaves compute sitting idle. By using map tasks to fan inference out across parallel workers, you can significantly reduce end-to-end latency and keep your hardware busy. Declaratively specify the resources needed for each inference call to maximize the utilization of your compute resources.

Optimize RAG workloads with long-running actors*

Union’s long-running actors provide reusable containers that pre-load models and data into readily available execution environments. Warm containers cut out container startup time, enabling near real-time responsiveness for RAG workloads.

*Coming soon
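Since the actors API is still forthcoming, here is only a conceptual plain-Python sketch of the warm-container idea: a long-lived worker pays the model-load cost once, then serves every subsequent request from memory. The class and its stand-in model are illustrative assumptions:

```python
class WarmWorker:
    """Long-lived worker: loads the model once, then serves many requests."""

    def __init__(self) -> None:
        self.load_count = 0
        self.model = None

    def _ensure_loaded(self) -> None:
        # In a cold-start setup this expensive load would happen per request;
        # in a warm worker it happens once for the worker's lifetime.
        if self.model is None:
            self.load_count += 1
            self.model = lambda query: f"answer to: {query}"  # stand-in for a heavy model

    def answer(self, query: str) -> str:
        self._ensure_loaded()
        return self.model(query)
```

The same principle applies to retrieval indexes and embedding tables: anything loaded into the warm environment is amortized across every query it serves.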

Build repeatable and observable AI pipelines

Reproducibility and observability are core strengths of Union. Leverage Union to implement advanced AI pipelines, including agentic workflows. Use raw containers for agent sandboxing, OpenAI agents for controlled communication with LLMs, and dynamic and eager constructs to refine your workflows. Customize hardware, memory, and image requirements for each pipeline component to ensure optimal performance and resource utilization.
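An agentic workflow built with eager-style constructs boils down to control flow decided at runtime by a model's output. This plain-Python sketch shows that loop shape; the `llm` stub, its stopping rule, and the step limit are illustrative assumptions, not Union or OpenAI APIs:

```python
def llm(prompt: str) -> str:
    """Stand-in for an LLM call; pretends the answer is 'done' once the prompt is long enough."""
    return "done" if len(prompt) > 20 else "refine"

def agentic_pipeline(question: str, max_steps: int = 5) -> str:
    # Eager-style control flow: each iteration inspects the model's output
    # and decides at runtime whether to keep refining or stop.
    prompt = question
    for _ in range(max_steps):
        if llm(prompt) == "done":
            return prompt
        prompt += " (refined)"
    return prompt  # step budget exhausted; return best effort
```

In a real pipeline, each iteration could be its own task with its own container image and resource requests, giving the sandboxing and per-step hardware customization described above.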