Running your code
Set up your development environment
If you have not already done so, follow the Getting started section to sign in to Union.ai and set up your local environment.
CLI commands for running your code
The Union CLI and Uctl CLI provide commands that allow you to deploy and run your code at different stages of the development cycle:
- union run: For deploying and running a single script immediately in your local Python environment.
- union run --remote: For deploying and running a single script immediately in the cloud on Union.ai.
- union register: For deploying multiple scripts to Union.ai and running them from the Web interface.
- union package and uctl register: For deploying workflows to production and for scripting within a CI/CD pipeline.
In some cases, you may want to test your code in a local cluster before deploying it to Union.ai. This step corresponds to using the second, third, or fourth of the options above (union run --remote, union register, or union package and uctl register), but targeting your local cluster instead of Union.ai. For more details, see Running in a local cluster.
Running a script in local Python with union run
During the development cycle you will want to run a specific workflow or task in your local Python environment to test it.
To quickly try out the code locally, use union run:
$ union run workflows/example.py wf --name 'Albert'
Here you are invoking union run and passing the name of the Python file and the name of the workflow within that file that you want to run.
In addition, you are passing the named parameter name and its value.
This command is useful for quickly testing a workflow locally to check for basic errors. For more details see union run details.
Running a script on Union.ai with union run --remote
To quickly run a workflow on Union.ai, use union run --remote:
$ union run --remote --project basic-example --domain development workflows/example.py wf --name 'Albert'
Here we are invoking union run --remote and passing:
- The project, basic-example
- The domain, development
- The Python file, workflows/example.py
- The workflow within that file that you want to run, wf
- The named parameter name, and its value
This command will:
- Build the container image defined in your ImageSpec.
- Package up your code and deploy it to the specified project and domain in Union.ai.
- Run the workflow on Union.ai.
This command is useful for quickly deploying and running a specific workflow on Union.ai. For more details see union run details.
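The two forms of union run differ only in the options placed before the script path. As a minimal sketch for scripting these invocations, the helper below assembles the argument list for either form. The function name build_union_run_command is hypothetical and is not part of the Union SDK; it only reproduces the option order used in the examples in this guide.

```python
import shlex


def build_union_run_command(script, workflow, inputs, remote=False, project=None, domain=None):
    """Assemble the argument list for `union run` (hypothetical helper, not a Union API)."""
    command = ["union", "run"]
    if remote:
        command.append("--remote")
        # --project and --domain appear only in the remote example in this guide.
        if project is not None:
            command += ["--project", project]
        if domain is not None:
            command += ["--domain", domain]
    # The script path and the workflow name are positional and precede the workflow inputs.
    command += [script, workflow]
    for name, value in inputs.items():
        command += [f"--{name}", str(value)]
    return command


local_cmd = build_union_run_command("workflows/example.py", "wf", {"name": "Albert"})
remote_cmd = build_union_run_command(
    "workflows/example.py",
    "wf",
    {"name": "Albert"},
    remote=True,
    project="basic-example",
    domain="development",
)

# shlex.join renders each argument list as a shell-safe command line for logging.
print(shlex.join(local_cmd))
print(shlex.join(remote_cmd))
```

Passing the list to subprocess.run instead of printing it would execute the command without going through a shell.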
Running tasks through uctl
This is a multi-step process where we create an execution spec file, update the spec file, and then create the execution.
Generate execution spec file
$ uctl launch task --project flytesnacks --domain development --name workflows.example.generate_normal_df --version v1
Update the input spec file for arguments to the task
iamRoleARN: 'arn:aws:iam::12345678:role/defaultrole'
inputs:
  n: 200
  mean: 0.0
  sigma: 1.0
kubeServiceAcct: ""
targetDomain: ""
targetProject: ""
task: workflows.example.generate_normal_df
version: "v1"
Create execution using the exec spec file
$ uctl create execution -p flytesnacks -d development --execFile exec_spec.yaml
Monitor the execution by providing the execution ID returned by the create command
$ uctl get execution -p flytesnacks -d development <execid>
Running workflows through uctl
Workflows are not directly runnable on their own. However, every workflow has at least one launch plan bound to it (the automatically created default launch plan), and you can use launch plans to launch a workflow. The default launch plan for a workflow has the same name as the workflow, and all of its argument defaults are identical.
Tasks can also be executed using the launch command. One difference between running a task and running a workflow is that launch plans cannot be associated with a task. This is to avoid triggers and scheduling on tasks.
Generate an execution spec file
$ uctl get launchplan -p flytesnacks -d development myapp.workflows.example.my_wf --execFile exec_spec.yaml
Update the input spec file for arguments to the workflow
inputs:
  name: "adam"
Create execution using the exec spec file
$ uctl create execution -p flytesnacks -d development --execFile exec_spec.yaml
Monitor the execution by providing the execution ID returned by the create command
$ uctl get execution -p flytesnacks -d development <execid>
Deploying your code to Union.ai with union register
$ union register workflows --project basic-example --domain development
Here we are registering all the code in the workflows directory to the project basic-example in the domain development.
This command will:
- Build the container image defined in your ImageSpec.
- Package up your code and deploy it to the specified project and domain in Union.ai. The package will contain the code in the Python package located in the workflows directory. Note that the presence of the __init__.py file in this directory is necessary in order to make it a Python package.
The command will not run the workflow. You can run it from the Web interface.
This command is useful for deploying your full set of workflows to Union.ai for testing.
Fast registration
union register packages up your code through a mechanism called fast registration.
Fast registration is useful when you already have a container image that’s hosted in your container registry of choice, and you change your workflow/task code without any changes in your system-level/Python dependencies. At a high level, fast registration:
- Packages and zips up the directory/file that you specify as the argument to union register, along with any files in the root directory of your project. The result of this is a tarball that is packaged into a .tar.gz file, which also includes the serialized task (in protobuf format) and workflow specifications defined in your workflow code.
- Registers the package to the specified cluster and uploads the tarball containing the user-defined code into the configured blob store (e.g. S3, GCS).
At workflow execution time, Union.ai knows to automatically inject the zipped up task/workflow code into the running container, thereby overriding the user-defined tasks/workflows that were originally baked into the image.
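To make the packaging step more concrete, here is a minimal sketch of bundling a workflows directory into a .tar.gz archive using only the Python standard library. This is not Union's actual packaging code: it leaves out the serialized task and workflow specifications and the upload to the blob store, and the archive name fast-package.tar.gz is a hypothetical placeholder.

```python
import tarfile
import tempfile
from pathlib import Path


def package_directory(source_dir: Path, archive_path: Path) -> list[str]:
    """Write source_dir into a gzip-compressed tarball and return its member names."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the package directory name,
        # so the archive unpacks to workflows/... wherever it is extracted.
        archive.add(source_dir, arcname=source_dir.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    workflows = project_root / "workflows"
    workflows.mkdir()
    # __init__.py makes the directory a Python package, as required above.
    (workflows / "__init__.py").write_text("")
    (workflows / "example.py").write_text("# task and workflow code goes here\n")

    members = package_directory(workflows, project_root / "fast-package.tar.gz")

print(members)
```

Because the archive stores relative paths, the extracted code lands under whatever directory the container uses as its working directory.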
WORKDIR, PYTHONPATH, and PATH
When you execute any of the above commands, the archive that gets created is extracted wherever WORKDIR is set.
You can set this directly via the WORKDIR directive in a Dockerfile, or specify it via source_root if you are using ImageSpec.
This is important for discovering code and executables via PATH or PYTHONPATH.
A common pattern for making your Python packages fully discoverable is to have a top-level src folder, add it to your PYTHONPATH, and make all your imports absolute.
This avoids having to “install” your Python project in the image at any point, e.g. via pip install -e.
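The src pattern can be illustrated with a small, self-contained sketch. It builds a temporary project layout, puts the src folder on the module search path (which has a similar effect to adding it to PYTHONPATH before starting Python), and resolves an absolute import without installing anything. The package name my_project and the function greet are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical layout: <workdir>/src/my_project/tasks.py
    src = Path(workdir) / "src"
    package = src / "my_project"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    (package / "tasks.py").write_text(
        "def greet(name):\n    return f'Hello, {name}!'\n"
    )

    # Similar in effect to adding <workdir>/src to PYTHONPATH.
    sys.path.insert(0, str(src))
    try:
        importlib.invalidate_caches()
        # Absolute import: resolved from src, no `pip install -e` needed.
        tasks = importlib.import_module("my_project.tasks")
        greeting = tasks.greet("Albert")
    finally:
        sys.path.remove(str(src))

print(greeting)
```

The same import would fail with a ModuleNotFoundError if src were not on the search path, which is the usual symptom of a mismatch between WORKDIR and PYTHONPATH.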
Inspecting executions
Uctl supports inspecting an execution by retrieving its details. For a deeper dive, refer to the API reference guide.
Monitor the execution by providing the execution ID returned by the create command. This can be a task execution or a workflow execution.
$ uctl get execution -p flytesnacks -d development <execid>
For more details, use the --details flag, which shows node executions along with the task executions on them.
$ uctl get execution -p flytesnacks -d development <execid> --details
If you prefer a YAML or JSON view of the details, change the output format using the -o flag.
$ uctl get execution -p flytesnacks -d development <execid> --details -o yaml
To see the results of the execution, inspect the node closure outputUri in the detailed YAML output.
"outputUri": "s3://my-s3-bucket/metadata/propeller/flytesnacks-development-<execid>/n0/data/0/outputs.pb"
Deploying your code to production
Package your code with union package
The combination of union package and uctl register is the standard way of deploying your code to production.
This method is often used in scripts to build and deploy workflows in a CI/CD pipeline.
First, package your workflows:
$ union --pkgs workflows package
This will create a tar file called flyte-package.tgz of the Python package located in the workflows directory.
Note that the presence of the __init__.py file in this directory is necessary in order to make it a Python package.
You can specify multiple workflow directories using the following command:
$ union --pkgs DIR1 --pkgs DIR2 package ...
This is useful in cases where you want to register two different projects that you maintain in a single place.
If you encounter a ModuleNotFoundError when packaging, use the --source option to include the correct source paths. For instance:
$ union --pkgs <dir1> package --source ./src -f
Register the package with uctl register
Once the code is packaged, you register it using the uctl CLI:
$ uctl register files \
    --project basic-example \
    --domain development \
    --archive flyte-package.tgz \
    --version "$(git rev-parse HEAD)"
Let’s break down what each flag is doing here:
- --project: The target Union.ai project.
- --domain: The target domain. Usually one of development, staging, or production.
- --archive: This argument allows you to pass in a package file, which in this case is the flyte-package.tgz produced earlier.
- --version: This is a version string that can be any string, but we recommend using the Git SHA in general, especially in production use cases.
See Uctl CLI for more details.
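In a CI/CD script, the same registration call is often assembled programmatically. The following is a minimal sketch, assuming the script runs inside a Git checkout with uctl on the PATH. The helper names build_register_command and current_git_sha are hypothetical; the flags are the ones described above.

```python
import subprocess


def build_register_command(project, domain, archive, version):
    """Return the `uctl register files` argument list (hypothetical helper)."""
    return [
        "uctl", "register", "files",
        "--project", project,
        "--domain", domain,
        "--archive", archive,
        "--version", version,
    ]


def current_git_sha():
    """Return the current commit SHA, mirroring $(git rev-parse HEAD)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


# Placeholder version for illustration; a pipeline would pass current_git_sha().
command = build_register_command(
    "basic-example", "development", "flyte-package.tgz", version="0123abc"
)
print(" ".join(command))

# In a real pipeline, the command would then be executed, for example:
# subprocess.run(
#     build_register_command("basic-example", "development",
#                            "flyte-package.tgz", current_git_sha()),
#     check=True,
# )
```

Passing an argument list rather than a single shell string avoids the line-continuation and quoting pitfalls of a multi-line shell command.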
Using union register versus union package + uctl register
As a rule of thumb, union register works well when you are working on a single cluster and iterating quickly on your task/workflow code.
On the other hand, union package and uctl register are appropriate if you are:
- Working with multiple clusters, since this approach uses a portable package.
- Deploying workflows to a production context.
- Testing your workflows in your CI/CD infrastructure.
You can also perform the equivalent of the three methods of registration using a UnionRemote object.
Image management and registration method
The ImageSpec construct available in union also has a mechanism to copy files into the image being built.
Its behavior depends on the type of registration used:
- If fast register is used, then it’s assumed that you don’t also want to copy source files into the built image.
- If fast register is not used (which is the default for union package, or if union register --copy none is specified), then it’s assumed that you do want source files copied into the built image.
- If your ImageSpec constructor specifies a source_root and the copy argument is set to something other than CopyFileDetection.NO_COPY, then files will be copied regardless of fast registration status.
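The three rules above can be restated as a small decision function. This is only a plain-Python summary of the behavior described in this section, not a call into the union library; the function and parameter names are hypothetical.

```python
def sources_copied_into_image(fast_register: bool,
                              has_source_root: bool,
                              copy_is_no_copy: bool) -> bool:
    """Return True if source files end up copied into the built image.

    Hypothetical restatement of the rules in this section:
    - fast_register: whether fast registration is used.
    - has_source_root: whether the ImageSpec constructor specifies source_root.
    - copy_is_no_copy: whether the copy argument is CopyFileDetection.NO_COPY.
    """
    # Third rule: an explicit source_root with a copy mode other than
    # NO_COPY copies files regardless of fast registration status.
    if has_source_root and not copy_is_no_copy:
        return True
    # First and second rules: otherwise, files are copied only when
    # fast registration is not used.
    return not fast_register


# Fast registration, no explicit copy settings: nothing is copied.
fast_default = sources_copied_into_image(True, False, True)
# No fast registration (for example union package): sources are copied.
non_fast = sources_copied_into_image(False, False, True)
# Explicit source_root and a copy mode other than NO_COPY: always copied.
explicit_copy = sources_copied_into_image(True, True, False)
print(fast_default, non_fast, explicit_copy)
```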
Building your own images
Instead of using ImageSpec and the union cloud image builder, you can, if you wish, build and deploy your own images.
You can start with union init --template basic-template-dockerfile. The resulting template project includes a docker_build.sh script that you can use to build and tag a container according to the recommended practice:
$ ./docker_build.sh
By default, the docker_build.sh script:
- Uses the PROJECT_NAME specified in the union command, which in this case is my_project.
- Will not use any remote registry.
- Uses the Git SHA to version your tasks and workflows.
You can override the default values with the following flags:
$ ./docker_build.sh -p PROJECT_NAME -r REGISTRY -v VERSION
For example, if you want to push your Docker image to GitHub’s container registry, you can specify the -r ghcr.io flag.
The docker_build.sh script is purely for convenience; you can always roll your own way of building Docker containers.
Once you’ve built the image, you can push it to the specified registry. For example, if you’re using the GitHub container registry, do the following:
$ docker login ghcr.io
$ docker push TAG
CI/CD with Flyte and GitHub Actions
You can use any of the commands we learned in this guide to register, execute, or test Union.ai workflows in your CI/CD process. Union.ai provides two GitHub actions that facilitate this:
- flyte-setup-action: This action handles the installation of uctl in your action runner.
- flyte-register-action: This action uses uctl register under the hood to handle registration of packages, for example, the .tgz archives that are created by union package.
Some CI/CD best practices
In the case where workflows are registered on each commit in your build pipelines, you can consider the following recommendations and approach:
- Versioning Strategy: Determining the version of the build for different types of commits makes them consistent and identifiable. For commits on feature branches, use {branch-name}-{short-commit-hash}, and for commits on main branches, use main-{short-commit-hash}. Use version numbers for released (tagged) versions.
- Workflow Serialization and Registration: Workflows should be serialized and registered based on the versioning of the build and the container image. Depending on whether the build is for a feature branch or main, the registration domain should be adjusted accordingly.
- Container Image Specification: When managing multiple images across tasks within a workflow, use the --image flag during registration to specify which image to use. This avoids hardcoding the image within the task definition, promoting reusability and flexibility in workflows.
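The versioning strategy above can be sketched as a small helper that a CI job might call before registration. The function name build_version is hypothetical, and replacing characters such as / in branch names is an added assumption, made so that a branch like feature/add-logging yields a single clean version token.

```python
import re


def build_version(branch, short_sha, tag=None, main_branches=("main",)):
    """Derive a registration version string (hypothetical helper).

    - Released (tagged) builds use the version number from the tag.
    - Builds on a main branch use main-{short-commit-hash}.
    - Feature-branch builds use {branch-name}-{short-commit-hash}.
    """
    if tag:
        return tag
    if branch in main_branches:
        return f"main-{short_sha}"
    # Assumption: collapse characters outside [A-Za-z0-9._-] into hyphens.
    safe_branch = re.sub(r"[^A-Za-z0-9._-]+", "-", branch).strip("-")
    return f"{safe_branch}-{short_sha}"


main_version = build_version("main", "a1b2c3d")
feature_version = build_version("feature/add-logging", "a1b2c3d")
release_version = build_version("main", "a1b2c3d", tag="v1.2.0")
print(main_version, feature_version, release_version)
```

The resulting string can be passed as the --version value at registration, with the target domain chosen according to whether the build came from a feature branch or main.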