Niels Bantilan

UnionML 0.2.0 Integrates with BentoML

Ship cloud-agnostic model prediction services

One of the most challenging aspects of building machine learning-driven applications is what I like to call “the deployment chasm.”

I remember one of the first major ML models I deployed: It was a fairly complex beast, requiring substantial data- and feature-engineering work, not to mention the model-training process itself. When it came time to deploy the model, my team and I needed to figure out how to do it in a way we could maintain while retaining visibility into model and endpoint health metrics. We decided to use SageMaker inference endpoints, since the rest of the company used AWS and our product required on-demand, low-latency predictions.

We had to rewrite many parts of our research code to conform to SageMaker’s API (par for the course in many ML projects). That wasn’t ideal because it created two separate implementations of the code: the research version and the production version. If we weren’t careful about how we organized our codebase, we’d end up with code skew: any revision to the inference logic would have to be remembered and made in both places.

But what if I told you that you can write that code once and deploy it to a wide variety of cloud platforms?

I’m excited to announce that the ✨ 0.2.0 Harmony release of UnionML ✨ is out, and that we’ve integrated with BentoML to give you a seamless deployment experience. UnionML reduces the boilerplate code needed to build models and mitigates the risk of code skew when transitioning from development to production.

How Does It Work?

UnionML organizes machine learning systems as apps. When you define a UnionML app, you need to implement a few core components. As you can see in the diagram below, UnionML then bundles these components together into meaningful services that you can deploy to some target infrastructure.

With UnionML and BentoML, you can easily create web-based and serverless prediction APIs

The core abstraction of BentoML is the Bento, which is a file archive containing all the source code, models, data and configuration needed to run a model prediction service. Using the BentoML integration is simple: first, you need to bind a `BentoMLService` to a `unionml.Model`:

# unionml_app.py
from typing import List

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

from unionml import Dataset, Model
from unionml.services.bentoml import BentoMLService

dataset = Dataset(name="digits_dataset", test_size=0.2, shuffle=True, targets=["target"])
model = Model(name="digits_classifier", init=LogisticRegression, dataset=dataset)
service = BentoMLService(model, framework="sklearn")

# define app components
@dataset.reader
def reader() -> pd.DataFrame:
    # load the digits dataset as a single dataframe
    return load_digits(as_frame=True).frame

@model.trainer
def trainer(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> LogisticRegression:
    # fit the estimator on the training split
    return estimator.fit(features, target.squeeze())

@model.predictor
def predictor(estimator: LogisticRegression, features: pd.DataFrame) -> List[float]:
    # generate predictions from raw features
    return [float(x) for x in estimator.predict(features)]

@model.evaluator
def evaluator(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> float:
    # score the model on the held-out split
    return accuracy_score(target.squeeze(), predictor(estimator, features))

Then, you can use the `model` object to train and save a model locally:

if __name__ == "__main__":
    model_object, metrics = model.train(hyperparameters={"C": 1.0, "max_iter": 10000})
    saved_model = service.save_model(model.artifact.model_object)
    print(f"BentoML saved model: {saved_model}")

Finally, you can create a `service.py` file that defines the underlying `bentoml.Service` object you’ll ultimately use to deploy the prediction service. The neat thing is that since the UnionML app already defines the feature-processing and prediction logic, creating the service only takes a few lines of code ✨

# service.py
from unionml_app import service

service.load_model("latest")
service.configure(
    enable_async=False,
    supported_resources=("cpu",),
    supports_cpu_multi_threading=False,
    runnable_method_kwargs={"batchable": False},
)

From here, all you need to do is build a Bento and deploy it in your infrastructure of choice!
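To make that last step concrete: BentoML builds a Bento from a `bentofile.yaml` in the project root. The sketch below is a hypothetical configuration for this app — the `service` target attribute, included paths and package list are assumptions you’d adapt to your own project, not part of the UnionML example above:

```yaml
# bentofile.yaml (hypothetical build config for the digits app)
service: "service.py:service.svc"  # points at the underlying bentoml.Service; attribute name is an assumption
include:
  - "service.py"
  - "unionml_app.py"
python:
  packages:
    - unionml
    - scikit-learn
    - pandas
```

With a config like this in place, `bentoml build` produces a versioned Bento, `bentoml serve` lets you smoke-test the prediction endpoint locally, and `bentoml containerize` packages it as a container image for your deployment target.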

Get Started with UnionML

With the 0.2.0 release of UnionML, you can now train a model locally or at scale on a Flyte cluster, then deploy it to any of the supported cloud targets like AWS Lambda, SageMaker, Google Cloud Run and Azure Functions. To learn more, check out the following resources:

Join the Slack community if you have any questions!
