Union Cloud Features
Simplify the orchestration of data and ML workflows with Union Cloud’s well-designed architecture, extensible plugin system, and robust features that boost your team’s productivity and adapt to your changing needs and evolving workflows.
Many of these features are available in both the open source project, Flyte™, and the managed solution, Union Cloud. However, standing up and managing Flyte™ can be complex, and often requires dedicated infrastructure specialists to maintain the cluster and associated resources. Union Cloud is a managed solution that runs in your cloud environment and provides additional observability, built-in security, integrated monitoring, and authorization.
Monitoring & Visualization
Features in this category enable comprehensive monitoring and visualization of your workflows, allowing you to stay informed about their state and performance.
Data lineage provides an essential means to trace the origin of errors within your workflows. By monitoring the data's journey and transformations across the entire lifecycle of your workflows, it becomes easier to identify the source of any issues. This efficient approach to debugging and troubleshooting saves time and resources, enabling rapid resolution of problems as they occur.
Visualize your data, monitor your models, and view training history through plots at every step of your workflow.
Union Cloud provides task-level monitoring that gives entire teams the insight they need to optimize their workflows. It reduces task execution time and provides system feedback comparable to a local environment. You get detailed, individualized data rather than just aggregated information, which allows for more precise analysis and better-informed decisions across your organization.
Resource Management & Security
This category includes features that facilitate effective resource management, organization, and collaboration, allowing for seamless teamwork and efficient use of available resources.
Role-based access control (RBAC) manages security and access within your organization. It lets you assign permissions and control access to resources based on users' roles, so that each team member has the level of access needed to perform their tasks while data privacy and security are maintained.
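The idea behind role-based access can be sketched in a few lines of plain Python. This is a concept illustration only, not Union Cloud's or Flyte's authorization API: the role names, the `Permission` values, and the `is_allowed` helper are all hypothetical.

```python
from enum import Enum


class Permission(Enum):
    VIEW = "view"
    EXECUTE = "execute"
    ADMIN = "admin"


# Hypothetical roles; a real deployment defines its own.
ROLE_PERMISSIONS = {
    "viewer": {Permission.VIEW},
    "contributor": {Permission.VIEW, Permission.EXECUTE},
    "admin": {Permission.VIEW, Permission.EXECUTE, Permission.ADMIN},
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True if the role grants the permission; unknown roles get nothing."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Permissions attach to roles rather than to individual users, so changing what a team member can do means changing their role, not editing a per-user list.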
Union Cloud is designed around data ownership and control: it is installed only as control plane services, while the customer runs and owns the compute and data plane, including user code, datasets, and secrets. This architecture gives customers full control over their data and resources, maintaining privacy, security, and compliance with organizational policies.
Union Cloud’s multi-tenancy feature allows multiple users to share the same platform while maintaining their own unique data and configurations. This centralized infrastructure enables effective resource management and organization, while also facilitating seamless team collaboration within your organization.
Set a specified cadence for your workflows by scheduling them to run at regular intervals. This ensures that your workflows execute automatically at the desired frequency, minimizing manual intervention and optimizing efficiency.
Performance & Accuracy
Features in this category help improve the performance of your workflows by leveraging GPU processing, enabling parallelism, and optimizing resource allocation.
Strongly typed interfaces
Ensure the integrity of your data throughout your workflow by establishing data guardrails. Typed inputs and outputs help keep data errors from slipping through the cracks, while also keeping your workflow informed of how the data evolves at each step.
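As a rough illustration of what a typed boundary buys you, the plain-Python sketch below checks arguments and return values against a function's annotations. It is not how Flyte or Union Cloud implement typing; the `typed_task` decorator and the `normalize` function are hypothetical, and the check only handles plain classes such as `float` or `str`, not generics like `list[int]`.

```python
import functools
import inspect
from typing import get_type_hints


def typed_task(fn):
    """Reject calls whose arguments or return value do not match the annotations."""
    hints = get_type_hints(fn)
    signature = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(f"{fn.__name__}: {name} must be {expected.__name__}")
        result = fn(*args, **kwargs)
        expected = hints.get("return")
        if expected is not None and not isinstance(result, expected):
            raise TypeError(f"{fn.__name__}: return value must be {expected.__name__}")
        return result

    return wrapper


@typed_task
def normalize(value: float, scale: float) -> float:
    return value / scale
```

Calling `normalize("3", 2.0)` raises a `TypeError` at the boundary instead of failing somewhere downstream. Note that this strict check also rejects `normalize(3, 2.0)`, because an `int` is not a `float`.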
Schedule your tasks to run on GPUs. Leverage the power of GPU processing, providing faster execution times and enhanced performance for ML and data-intensive workloads.
Tasks that do not depend on each other run in parallel by default, which optimizes resource consumption and improves performance. You don't have to do anything special to enable parallelism.
Signaling allows manual actions to influence the course of a workflow: a human can intercept a running workflow and either redirect it or approve its tasks. This is helpful for labeling, supervised learning, and data curation.
These features focus on optimizing the efficiency of your workflows by providing tools to minimize resource wastage, reduce execution times, and streamline the debugging process.
Task boundaries provide natural checkpoints for your workflow, but in certain scenarios, such as training a model, they can be expensive. Training can be both time-consuming and resource-intensive, making it critical to ensure that progress is regularly saved. Intra-task checkpoints provide a solution by allowing you to checkpoint progress within a task execution, minimizing resource waste and optimizing performance.
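The checkpoint-and-resume pattern can be sketched in plain Python as follows. This is a toy under stated assumptions, not the Flyte or Union Cloud checkpoint API: the `train` function, its `fail_at` parameter (used only to simulate a crash), and the JSON file layout are all hypothetical.

```python
import json
import os
from typing import Optional


def train(total_epochs: int, checkpoint_path: str, fail_at: Optional[int] = None) -> dict:
    """Toy training loop that saves its state after every epoch and resumes from it."""
    state = {"epoch": 0, "loss": 1.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved epoch

    for epoch in range(state["epoch"], total_epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError(f"simulated failure during epoch {epoch}")
        # Stand-in for a real training step.
        state = {"epoch": epoch + 1, "loss": state["loss"] * 0.5}
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        # Atomic swap, so a crash never leaves a partially written checkpoint.
        os.replace(tmp_path, checkpoint_path)
    return state
```

If a run fails partway through, calling `train` again with the same `checkpoint_path` picks up at the first unfinished epoch rather than starting from zero.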
Recover from failures
Rerun a single task
Optimize your workflow's execution time by caching task outputs. When a task's signature and inputs are unchanged, its cached output is reused instead of rerunning a long-running execution, which prevents unnecessary resource usage and significantly speeds up your workflow.
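A minimal sketch of signature-keyed caching, in plain Python, looks like this. It illustrates the concept only and is not the Flyte or Union Cloud cache: `cache_key`, `run_cached`, and the in-memory `_cache` dictionary are hypothetical, and the inputs are assumed to be JSON-serializable.

```python
import hashlib
import json

_cache = {}


def cache_key(task_name: str, cache_version: str, inputs: dict) -> str:
    """Derive a key from the task name, a user-chosen version, and the input values."""
    payload = json.dumps([task_name, cache_version, inputs], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_cached(fn, cache_version: str, **inputs):
    """Run fn(**inputs) once per key; later calls with the same key reuse the stored output."""
    key = cache_key(fn.__name__, cache_version, inputs)
    if key not in _cache:
        _cache[key] = fn(**inputs)
    return _cache[key]
```

Changing either the inputs or the version string produces a new key, so the function runs again. Bumping the version is how you would invalidate old results after changing the task's logic.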
Immutable executions are critical for ensuring reproducibility, as they prevent any changes to the state of execution. This provides the flexibility to completely restructure a data or ML workflow between versions without fear of any negative impact on production. With immutable executions, you can confidently iterate and experiment while maintaining the integrity of your data and workflow.
Spot or preemptible instances
Take advantage of spot instances with ease by scheduling your workflows to run on them and significantly reduce your costs. Effortlessly optimize your workflow's efficiency while minimizing expenses.
Keep tasks from running indefinitely by setting timeouts. A timeout specifies the maximum amount of time a task may run; a task that exceeds it is stopped rather than left to consume resources.
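The behavior of a timeout can be sketched with Python's standard `subprocess` module. This is an illustration of the concept, not how Flyte or Union Cloud enforce timeouts; the `run_with_timeout` helper is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(python_code: str, timeout_seconds: float) -> bool:
    """Run code in a child process; return True if it finished in time, False if it was killed."""
    try:
        subprocess.run(
            [sys.executable, "-c", python_code],
            timeout=timeout_seconds,
            check=True,  # a non-zero exit code raises CalledProcessError
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return False
```

For example, `run_with_timeout("import time; time.sleep(30)", 0.5)` returns `False` after about half a second, and the sleeping process is terminated instead of running to completion.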
Dynamic resource allocation
Dynamic resource allocation is a key feature in optimizing workflow efficiency. With Union Cloud, resources required for a task can be adjusted on the fly based on user-provided inputs or real-time calculations. This adaptability ensures that your tasks have the necessary resources to run efficiently, enhancing overall performance and resource utilization.
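As a sketch of sizing resources from an input, the plain-Python function below derives a request from a row count. Everything here is an assumption for illustration: the `ResourceRequest` type, the `resources_for` helper, and the sizing rule itself are hypothetical and are not part of the Flyte or Union Cloud API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    cpu: int
    memory_gib: int


def resources_for(num_rows: int) -> ResourceRequest:
    """Hypothetical rule: 1 GiB per million rows (minimum 2), 1 CPU per 4 GiB (maximum 16)."""
    memory_gib = max(2, math.ceil(num_rows / 1_000_000))
    cpu = min(16, max(1, math.ceil(memory_gib / 4)))
    return ResourceRequest(cpu=cpu, memory_gib=memory_gib)
```

With this rule, a 10,000-row input gets 1 CPU and 2 GiB, while a 20-million-row input gets 5 CPUs and 20 GiB. The request is computed at run time from the data rather than fixed when the task is written.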
Stay up to date on your workflow's state by configuring notifications through popular platforms such as Slack, PagerDuty, or email. Receive real-time updates and alerts so you can quickly address any issues that arise and maintain control over your workflow's performance.
This category emphasizes flexibility, supporting various programming languages, allowing multiple users to work on the same platform, and isolating dependencies for seamless integration.
Compose your workflows in the language that best suits your team's expertise, with support for both SDK and raw containers. This flexibility allows you to write workflows in any programming language you prefer.
Data and ML practitioners require the ability to experiment and iterate without disrupting their workflow. With versioning, they can work in isolation and reproduce their results, as well as roll back to a previous version of their workflow at any time. This provides the necessary flexibility to experiment and iterate with confidence.
Dependency isolation via containers
Different tasks within a workflow may have varying resource requirements and library dependencies. Without careful management, this can cause conflicts that can negatively impact your workflow's performance. Union Cloud lets you maintain separate sets of dependencies for your tasks, ensuring that no conflicts arise and that your workflow runs smoothly and efficiently.
Union Cloud’s multi-cloud support enables users to work across multiple cloud providers (currently AWS and GCP, with more to come), as well as on-premises infrastructure on request. This flexibility allows organizations to choose the cloud solution that best suits their requirements and preferences, with a consistent workflow experience regardless of the underlying infrastructure.