Flyte features

The modern framework for orchestrating complex, data-intensive workflows

By combining a powerful compute backend with an elegant Pythonic interface, Flyte brings software engineering best practices to every step of the AI lifecycle, enabling teams to build resilient, reproducible pipelines.

Built with AI in mind

Flyte was built from first principles to solve the unique challenges presented by AI development. Containerized workloads, automatically versioned entities, and data management abstractions ensure that work can be reproduced and seamlessly promoted from development to production. Native compute and first-class integrations with external systems allow for fast, efficient distribution of workloads across a shared backend. An API-driven architecture promotes interoperability and reuse of data and model artifacts across the organization.

Containerized tasks

Easily manage complex, heterogeneous dependencies within the same workflow

Automatic versioning

Seamlessly integrate with GitOps and never lose historical workflow executions

Abstracted data flow

Avoid data management in code and easily recover from the point of failure

Native compute

Instantly run workloads on GPU, TPU, and spot instances (and scale to zero afterwards)

Agents

Easily manage authentication and control flow through external services

Notebook support

Execute workloads and pull down results directly using a Python-based SDK
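
To make the idea of automatically versioned entities concrete, here is a minimal sketch in plain Python. It does not use Flyte's SDK and does not describe how Flyte computes versions. The `entity_version` helper, the hashing scheme, and the image names are all assumptions made for illustration. The point is only that a version derived from a task's code and its container image changes exactly when either of them changes.

```python
import hashlib
from typing import Callable


def entity_version(fn: Callable, image: str) -> str:
    """Hypothetical helper: derive a deterministic version string from a
    task's compiled code and the container image it runs in."""
    code = fn.__code__
    digest = hashlib.sha256()
    digest.update(code.co_code)                    # compiled function body
    digest.update(repr(code.co_consts).encode())   # literals used by the body
    digest.update(repr(code.co_names).encode())    # global names it references
    digest.update(image.encode())                  # execution environment
    return digest.hexdigest()[:12]


def normalize(x: float) -> float:
    return x / 10.0


def normalize_changed(x: float) -> float:
    return x / 100.0


# Image names below are made up for the example.
v_a = entity_version(normalize, "example.io/train:py311")
v_b = entity_version(normalize, "example.io/train:py311")
v_c = entity_version(normalize_changed, "example.io/train:py311")
v_d = entity_version(normalize, "example.io/train:py312")

print(v_a == v_b)  # same code and image give the same version
print(v_a != v_c)  # changing the code changes the version
print(v_a != v_d)  # changing the image changes the version
```

Because the version is a function of the inputs alone, re-registering unchanged work is a no-op, while any change yields a new version and leaves earlier ones intact.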

Built for rapid iteration

Flyte was designed to help AI developers rapidly prototype, test, and ship complex, data- and compute-intensive workflows. Single-task executions, image management in code, and declarative infrastructure allow workflow authors to incrementally develop pipelines one step at a time. Local-to-remote parity enables teams to test workflows in CI while running the same logic at scale in remote environments. Composable tasks and workflows, together with dynamic workflows, support complex AI-specific use cases such as hyperparameter tuning.

Single-task executions

Incrementally develop workflows one step at a time using ad-hoc task executions

Dynamic workflows

Alter the shape of DAGs at run time based on data

Local-to-remote parity

Locally test workflows in CI and seamlessly ship to remote

Dynamic image management

Automatically build container images without writing Dockerfiles

Task & workflow reusability

Easily build on pre-existing work simply by running an import

Declarative infrastructure

Adjust resources on the fly to right-size infrastructure for the job at hand
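
The hyperparameter tuning use case can be sketched in plain Python as a fan-out whose width is decided by data. This is an illustration only, not Flyte's SDK. The `train` and `tune` functions are hypothetical, and the scoring formula is made up so that the example is deterministic.

```python
from itertools import product
from typing import Dict, List


def train(learning_rate: float, depth: int) -> float:
    """Stand-in task: returns a made-up validation score (illustration only)."""
    return round(1.0 - abs(learning_rate - 0.1) - 0.01 * abs(depth - 4), 4)


def tune(grid: Dict[str, List]) -> Dict:
    """Run one `train` call per grid point. The grid, which is plain data,
    determines how many steps the resulting DAG would contain."""
    names = sorted(grid)
    trials = [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]
    scored = [(train(**params), params) for params in trials]
    best_score, best_params = max(scored, key=lambda item: item[0])
    return {
        "trials": len(trials),
        "best_params": best_params,
        "best_score": best_score,
    }


result = tune({"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4]})
print(result["trials"], result["best_params"], result["best_score"])
```

Because `train` is an ordinary function, it can also be called on its own with a single set of parameters, which mirrors developing a workflow one step at a time before composing the steps.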

Production-ready resiliency

Flyte was conceived by a team of distributed systems experts to provide extreme failure resiliency and ease of debugging. Caching and automatic recovery facilitate self-healing workflows. Type safety and error-driven branching increase the probability that a given workflow succeeds. Native multi-tenancy and deep integration with IAM ensure secure yet efficient sharing of resources.

Fault tolerance

Automatically retry failed workloads according to user-defined policies

Caching

Cache results of intermediate executions to recover from the point of failure

Native multi-tenancy

Efficiently share resources while protecting important workloads from resource starvation

Type safety

Catch type errors at compile time, before kicking off long-running workflows

Robust error handling

Dynamically handle different types of errors during execution

Isolation & security

Define independent IAM permissions at the workflow level
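
How retries and caching combine into recovery from the point of failure can be sketched in a few lines of plain Python. This is not Flyte's implementation or API. The `run_task` helper, its `retries` and `cache_version` parameters, and the in-memory cache are assumptions chosen for the example.

```python
from typing import Any, Callable, Dict, Tuple

_cache: Dict[Tuple, Any] = {}            # hypothetical in-memory result cache
calls = {"extract": 0, "flaky_transform": 0}


def run_task(fn: Callable, *args, retries: int = 0, cache_version: str = "") -> Any:
    """Run `fn`, retrying on RuntimeError and caching successful results."""
    key = (fn.__name__, cache_version, args)
    if cache_version and key in _cache:
        return _cache[key]               # cache hit: skip execution entirely
    attempts = 0
    while True:
        try:
            result = fn(*args)
            break
        except RuntimeError:
            attempts += 1
            if attempts > retries:       # retry budget exhausted
                raise
    if cache_version:
        _cache[key] = result
    return result


def extract(n: int) -> tuple:
    calls["extract"] += 1
    return tuple(range(n))


def flaky_transform(rows: tuple) -> int:
    calls["flaky_transform"] += 1
    if calls["flaky_transform"] < 3:     # fails on its first two invocations
        raise RuntimeError("transient failure")
    return sum(rows)


def pipeline(n: int) -> int:
    rows = run_task(extract, n, cache_version="v1")
    return run_task(flaky_transform, rows, retries=1, cache_version="v1")


try:
    pipeline(5)
    first_run_failed = False
except RuntimeError:
    first_run_failed = True              # one retry was not enough

total = pipeline(5)                      # rerun resumes after the cached step
print(first_run_failed, total, calls)
```

On the rerun, `extract` is served from the cache and only the step that failed is executed again, so the work completed before the failure is not repeated.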

Get started

Try Union, the only Flyte-native AI platform.

Explore Union

Learn how Union extends Flyte for enterprise scale and performance.