Deploy an agent as a service
Once you’ve built an agent (with pure Python, the Agent harness, or a third-party framework), how you run it is an independent choice. The same agent object can be deployed in several ways:
| Pattern | When to use it | What invokes the agent |
|---|---|---|
| As a task | On-demand runs from the CLI, a notebook, or another service | flyte.run(...) |
| As a scheduled task | Recurring autonomous wakeups (triage, monitoring, reports) | A flyte.Trigger (cron or fixed-rate) |
| Behind a webhook | React to external events (GitHub, paging tools, CI) | An HTTP POST to an AppEnvironment |
All three wrap the agent loop in a regular Flyte task, so every run is durable, retryable, and observable in the Union.ai dashboard. The examples below use the Agent harness, but the pattern is identical for any agent — just call your agent’s entry point inside the task.
As a task
The simplest deployment: put the agent loop in an @env.task and invoke it on demand. This works for any agent.
```python
import flyte
from flyte.ai.agents import Agent

env = flyte.TaskEnvironment(
    name="concierge-agent",
    image=flyte.Image.from_debian_base().with_pip_packages("litellm"),
    secrets=[flyte.Secret(key="internal-anthropic-api-key", as_env_var="ANTHROPIC_API_KEY")],
)

agent = Agent(
    name="customer-concierge",
    instructions="You are a customer-service concierge.",
    tools=[...],
)


@env.task(report=True)
async def concierge(request: str) -> str:
    """Run the agent for a single request."""
    result = await agent.run.aio(request)
    return result.summary or result.error
```

Run it on demand:

```shell
flyte run agent.py concierge --request "Refund order #12345 to the customer."
```

Or from Python with flyte.run(concierge, request="..."). To register a stable, deployed version of the task, use flyte deploy agent.py env.
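The task shape above is not specific to the Agent harness. As a minimal sketch, assuming a hand-written agent whose entry point is an async function, the wrapper could look like the following. The names `AgentResult`, `run_my_agent`, and `concierge_task` are hypothetical and exist only for illustration; in a real deployment `concierge_task` would carry the @env.task decorator, which is omitted here so the snippet runs without Flyte installed.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentResult:
    """Hypothetical result type with the summary/error fields used above."""

    summary: Optional[str] = None
    error: Optional[str] = None


async def run_my_agent(request: str) -> AgentResult:
    """Hypothetical entry point for a hand-written agent loop."""
    if not request.strip():
        return AgentResult(error="empty request")
    # A real agent would call a model and its tools here.
    return AgentResult(summary=f"Handled: {request}")


# In a real deployment this function would be decorated with @env.task.
async def concierge_task(request: str) -> str:
    """Call the agent's entry point and return its summary, or its error."""
    result = await run_my_agent(request)
    return result.summary or result.error or "no output"


if __name__ == "__main__":
    print(asyncio.run(concierge_task("Refund order #12345 to the customer.")))
```

The wrapper only awaits the entry point and maps the result to a string, so swapping in a different agent implementation should not change how the task is invoked.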
As a scheduled task (via Trigger)
To run an agent autonomously on a schedule, attach a flyte.Trigger to the task. The “wakeup” is a regular Flyte task: the agent loop runs inside it, so every tool call is durable, observable, and retryable. Pair this with agent memory so the agent resumes prior context on each wakeup.
```python
from datetime import datetime

agent = Agent(
    name="github-triage",
    instructions=(
        "You are a GitHub issue triager. For each wakeup: list open issues for "
        "the configured repo, classify each one, group them by severity, and "
        "post a concise digest to the team channel. Always end by calling post_digest."
    ),
    model="claude-haiku-4-5",
    tools=[list_open_issues, classify_issue, post_digest],
    max_turns=20,
)


@env.task(
    triggers=flyte.Trigger(
        "daily-triage",
        flyte.Cron("0 9 * * *"),  # every day at 09:00
        inputs={"trigger_time": flyte.TriggerTime, "repo": "flyteorg/flyte", "channel": "#flyte-triage"},
    ),
    report=True,
)
async def triage_repo(trigger_time: datetime, repo: str, channel: str) -> str:
    """Scheduled wakeup that runs the triage agent end-to-end."""
    message = f"It is {trigger_time.isoformat()}. Triage the open issues in {repo} and post a digest to {channel}."
    with flyte.group("triage-loop"):
        result = await agent.run.aio(message)
    return result.summary or f"[triage failed] {result.error}"
```
The agent’s tools (list_open_issues, classify_issue, post_digest) are durable @env.tasks; see the full example for their definitions.
Deploying the task registers the trigger; from then on Union.ai wakes the agent on schedule. Use flyte.Cron(...) for calendar schedules or flyte.FixedRate(...) for fixed intervals. The flyte.TriggerTime input is filled with the scheduled fire time. See Triggers for the full schedule reference.
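To make the difference between a calendar schedule and a fixed interval concrete, the sketch below computes fire times with the standard library only. It is not how the Union.ai scheduler is implemented. The helper names are hypothetical, and both the timezone handling and the choice to anchor the fixed-rate series at its start time are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def next_daily_fire(now: datetime, hour: int = 9, minute: int = 0) -> datetime:
    """Next fire time of a daily calendar schedule such as "0 9 * * *"."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has passed (or is exactly now), move to tomorrow.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def fixed_rate_fires(start: datetime, interval: timedelta, count: int) -> list[datetime]:
    """First `count` fire times of a fixed-interval schedule.

    Assumption: the series is anchored at `start` and fires there first.
    """
    return [start + i * interval for i in range(count)]


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 8, 30, tzinfo=timezone.utc)
    print(next_daily_fire(now).isoformat())
    for fire in fixed_rate_fires(now, timedelta(hours=6), 3):
        print(fire.isoformat())
```

A calendar schedule stays pinned to the wall clock (always 09:00), while a fixed interval drifts with whatever anchor it starts from, which is usually the deciding factor between the two.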
Behind a webhook (AppEnvironment)
To kick off an agent run in response to an external event, deploy a small FastAPI app via an AppEnvironment that exposes an HTTP endpoint. The endpoint launches the agent task with flyte.run.aio(...), so the long-running agent loop executes durably in the background while the webhook returns immediately with a run URL.
```python
@tool_env.task(report=True)
async def review_pr(repo: str, pr_number: int, event: str) -> str:
    """Durable task that runs the agent for a single webhook event."""
    message = f"GitHub webhook fired for {repo}#{pr_number} (event={event}). Review the PR."
    result = await agent.run.aio(message)
    return result.summary or result.error


def _build_app():
    from fastapi import FastAPI

    api = FastAPI(title="flyte-agent-webhook")

    @api.post("/trigger")
    async def trigger(payload: dict) -> dict[str, str]:
        repo = payload.get("repository")
        pr_number = int(payload.get("pull_request", {}).get("number", 0))
        event = payload.get("action")
        run = await flyte.run.aio(review_pr, repo=repo, pr_number=pr_number, event=event)
        return {"run_url": run.url, "name": run.name}

    return api


webhook_env = flyte.app.AppEnvironment(
    name="flyte-agent-webhook",
    image=flyte.Image.from_debian_base().with_pip_packages("fastapi", "uvicorn", "litellm"),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
    depends_on=[tool_env],
)


@webhook_env.server
async def serve():
    import uvicorn

    config = uvicorn.Config(_build_app(), host="0.0.0.0", port=webhook_env.get_port().port)
    await uvicorn.Server(config).serve()
```
The agent and its tools (fetch_pr, post_comment) are defined in the full example.
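The /trigger handler above forwards a missing repository as None and falls back to PR number 0, so a malformed event would still launch a run. One option is to validate the payload before submitting anything. The sketch below assumes the flat payload shape used in this page's example (a string repository, a pull_request object with a number, and an action); `PullRequestEvent` and `parse_event` are hypothetical names, not part of Flyte or FastAPI.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PullRequestEvent:
    """Validated fields extracted from a webhook payload."""

    repo: str
    pr_number: int
    event: str


def parse_event(payload: dict[str, Any]) -> PullRequestEvent:
    """Validate the payload; raise ValueError on missing or malformed fields."""
    repo = payload.get("repository")
    if not isinstance(repo, str) or not repo:
        raise ValueError("'repository' must be a non-empty string")

    pull_request = payload.get("pull_request")
    number = pull_request.get("number") if isinstance(pull_request, dict) else None
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(number, int) or isinstance(number, bool) or number <= 0:
        raise ValueError("'pull_request.number' must be a positive integer")

    event = payload.get("action")
    if not isinstance(event, str) or not event:
        raise ValueError("'action' must be a non-empty string")

    return PullRequestEvent(repo=repo, pr_number=number, event=event)
```

Inside the handler, a ValueError from `parse_event` could be translated into an HTTP 4xx response (for example with FastAPI's HTTPException) so that no agent run is started for an unusable event.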
Once deployed, point your external system at the /trigger URL:
```shell
curl -X POST -H "Content-Type: application/json" \
  -d '{"repository": "flyteorg/flyte", "pull_request": {"number": 123}, "action": "opened"}' \
  https://<subdomain>.apps.<endpoint>/trigger
```

When the webhook app submits runs on behalf of incoming requests, it needs valid Union.ai credentials. Use passthrough auth (a FastAPIPassthroughAuthMiddleware and flyte.init_passthrough) so the run is submitted with the caller’s identity. See FastAPI apps.
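The same request can be sent from Python when the caller is another service rather than a shell. This is a minimal sketch using only the standard library; `build_trigger_request` and `trigger_review` are hypothetical helpers, and `base_url` stands for your deployed app URL. Authentication headers are omitted because they depend on how your deployment is configured, so add whatever credentials it requires.

```python
import json
import urllib.request


def build_trigger_request(base_url: str, repo: str, pr_number: int, action: str) -> urllib.request.Request:
    """Build the POST request for the /trigger endpoint without sending it."""
    body = json.dumps(
        {"repository": repo, "pull_request": {"number": pr_number}, "action": action}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/trigger",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_review(base_url: str, repo: str, pr_number: int, action: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_trigger_request(base_url, repo, pr_number, action)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Given the handler shown earlier, the returned dictionary should contain the run_url and name keys, which a caller can log or poll.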
Chat and other app patterns
- Chat UI: To let users converse with the agent in a browser, serve it behind a chat interface. See Add a chat UI.
- FastAPI endpoint: For API-first agents, expose your agent behind a REST endpoint with FastAPIAppEnvironment so other services or agents can call it programmatically.
- Model serving: Serve open-weight LLMs on GPUs behind an OpenAI-compatible API with VLLMAppEnvironment or SGLangAppEnvironment.
See Build Apps and Configure Apps for more details on hosting services on Union.ai.