David Espejo

Building Secure AI Systems with Union's Defense-in-Depth Approach

Introduction 

As machine learning systems become deeply embedded in software infrastructure, their security vulnerabilities pose critical risks. This post explores two prominent attack vectors—pickle-based model exploits and data poisoning—and demonstrates how Union mitigates these threats.

<div class="button-group is-center"><a class="button" href="https://www.union.ai/consultation">Book a free consultation with an AI engineer from Union to explore how to adopt security best practices in your AI pipelines.</a></div>

The Model Supply Chain Challenge: Insecure Deserialization

Follow the example Notebook

ML engineers must recognize that models are code in three key ways: they execute programs (processing inputs to outputs), persist as code (often serialized via formats like pickle), and, in the case of LLMs, generate code. These traits create exploitable attack surfaces.

The pickle vulnerability is particularly dangerous. While serialization enables essential workflows like model reuse and sharing, Python’s default pickle format can permit remote code execution. Popular frameworks like PyTorch use it by default, creating widespread risk.

  • In the example Notebook, the attacker serializes a malicious payload into a model file:
import joblib  # joblib uses pickle under the hood to serialize Python objects


class PickleAttack:
    def __reduce__(self):
        # __reduce__ tells pickle how to rebuild this object; returning
        # (os.system, args) makes deserialization run a shell command
        import os
        return (os.system, ('echo "👋 Hello there, I\'m a pickle attack! 🥒"',))


fake_model = PickleAttack()  # the payload fires when this object is unpickled
fake_model_path = "./fake_model.joblib"
with open(fake_model_path, "wb") as f:
    joblib.dump(fake_model, f)  # pickle the malicious object into a "model" file
  • When loaded during inference, the compromised model executes arbitrary code:
# submit the batch prediction workflow to Union Serverless; the task will
# deserialize the "model" file and trigger the payload
execution = serverless.execute(
    batch_predict, inputs={"model": fake_model_path, "data": features}
)
execution
  • Resulting in Remote Code Execution:
The batch prediction task ends up executing the arbitrary code embedded in the pickled file, as sketched below.
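To see why merely loading the file is enough, here is a minimal sketch of what a vulnerable prediction task could look like; the task body and names are illustrative assumptions, not the exact code from the Notebook:

import joblib
import pandas as pd
import union
from flytekit.types.file import FlyteFile


@union.task
def batch_predict(model: FlyteFile, data: pd.DataFrame) -> pd.DataFrame:
    # joblib.load unpickles the file; pickle invokes the object's __reduce__,
    # so os.system runs here, before a single prediction is made
    loaded_model = joblib.load(model.download())
    return pd.DataFrame({"prediction": loaded_model.predict(data)})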

The simplified anatomy of the pickle attack is described in the following diagram:

Anatomy of a pickle attack

This attack becomes especially dangerous in production environments where model files are automatically loaded on a schedule. If an attacker identifies where model files are stored in cloud storage and gains write access, they could replace legitimate models with malicious ones, creating a persistent threat.

Hugging Face, one of the most popular collaboration platforms for AI developers, recently mitigated a vulnerability that allowed users to craft and upload a pickle-serialized model capable of executing code remotely and compromising the platform's serving infrastructure.

Mitigation techniques

Hashing

Simple yet effective, hashing verifies model integrity before a model file is consumed by downstream tasks.

Follow the example Notebook

  1. We start by defining a custom Model type based on FlyteFile, a Union-native abstraction that automates the lifecycle and transformations for files that are passed between tasks in a pipeline. The Model type includes a method to calculate and validate the hash of the input file:
import hashlib
from dataclasses import dataclass

from flytekit.types.file import FlyteFile


@dataclass
class Model:
    file: FlyteFile
    md5hash: str

    def __post_init__(self):
        # recompute the hash of the underlying file and compare it with the
        # expected value recorded when the model was produced
        with open(self.file, "rb") as f:
            md5hash = hashlib.md5(f.read()).hexdigest()
        if md5hash != self.md5hash:
            raise ValueError(
                "⛔️ Model md5hash mismatch: expected "
                f"{self.md5hash}, found {md5hash}."
            )
  2. Then, using the Model dataclass as the output type of the training task and retrying the pickle attack, you get:
Message:

    ValueError: ⛔️ Model md5hash mismatch: expected b087efd0595a961982db5d35bce8a690, found ba5ae6d89cf25e4c33d78041062ba110.

Hashing prevents corrupted or tampered models from consuming compute resources.
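For completeness, here is a minimal sketch, with task and variable names that are assumptions rather than the exact Notebook code, of how a training task could populate the dataclass so that every downstream consumer re-verifies the hash:

import hashlib

import joblib
import pandas as pd
import union
from flytekit.types.file import FlyteFile
from sklearn.ensemble import RandomForestClassifier


@union.task
def train_model(X_train: pd.DataFrame, y_train: pd.Series) -> Model:
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)
    model_path = "./model.joblib"
    joblib.dump(clf, model_path)
    # record the hash at creation time; Model.__post_init__ re-verifies it
    # every time the object is reconstructed by a downstream task
    with open(model_path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    return Model(file=FlyteFile(model_path), md5hash=digest)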

Use secure serialization formats

Adopt formats like ONNX, which avoid arbitrary code execution by design. ONNX stores only model weights and configurations in a protobuf structure.

Union enables you to easily convert PyTorch, scikit-learn, or TensorFlow models to the ONNX format with a plugin.

Follow the example Notebook

Migrating our pipeline to use the ONNX format involves a few steps:

  1. Update dependencies:
#---existing imports minus joblib (not needed anymore)---#
...
import onnxruntime as rt
from flytekitplugins.onnxscikitlearn import ScikitLearn2ONNX, ScikitLearn2ONNXConfig
from skl2onnx.common.data_types import FloatTensorType
from typing_extensions import Annotated

image = union.ImageSpec.from_env(
    name="secure-formats",
    packages=[
        "bandit",
        "flytekit>=1.14.0",
        "joblib",
        "openai",
        "pandas",
        "pyarrow",
        "scikit-learn",
        "union==0.1.138",
        "flytekitplugins-onnxscikitlearn", # Union-maintained plugin for scikit-learn 
        "onnxruntime", # needed to run inference on an ONNX model 
        "numpy",  
    ],
)
  2. Add metadata to the model training output tuple using Annotated:
TrainingOutput = NamedTuple(
    "TrainingOutput",
    [
        (
            "TrainedModel",
            Annotated[
                ScikitLearn2ONNX, #indicates the data type
                ScikitLearn2ONNXConfig( #configures the conversion
                    initial_types=[("float_input", FloatTensorType([None, None]))],
                    target_opset=12,
                ),
            ],
        ),
        ("accuracy", float),
    ],
)
  3. Update the model training task:
@task
def train_model(X_train: pd.DataFrame, y_train: pd.Series) -> TrainingOutput:
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    # instead of pickling the model to a file, wrap it in ScikitLearn2ONNX;
    # the plugin converts it to ONNX when it is passed to downstream tasks
    return TrainingOutput(TrainedModel=ScikitLearn2ONNX(model), accuracy=0.0)
  4. Update the model evaluation task:
@task(enable_deck=True)
def evaluate_model(model: ONNXFile, X_test: pd.DataFrame, y_test: pd.Series) -> float:
    # start an inference session on the ONNX-formatted model
    session = rt.InferenceSession(model.download())
    input_name = session.get_inputs()[0].name
    label_name = session.get_outputs()[0].name
    y_pred = session.run([label_name], {input_name: X_test.values.astype(np.float32)})[0]
    accuracy = accuracy_score(y_test, y_pred)
    deck = union.Deck(name="Accuracy Report", html=MarkdownRenderer().to_html(f"# Test accuracy: {accuracy}"))
    union.current_context().decks.insert(0, deck)
    return accuracy
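The conversion itself happens at the boundary between the two tasks: when the Annotated ScikitLearn2ONNX output is passed to a task that expects an ONNXFile, the plugin's type transformer serializes the model to ONNX. A minimal workflow sketch, with a workflow name and argument wiring that are illustrative assumptions, could look like this:

@union.workflow
def secure_training_pipeline(
    X_train: pd.DataFrame, y_train: pd.Series,
    X_test: pd.DataFrame, y_test: pd.Series,
) -> float:
    training_output = train_model(X_train=X_train, y_train=y_train)
    # TrainedModel (ScikitLearn2ONNX) is materialized as an ONNXFile here
    return evaluate_model(model=training_output.TrainedModel, X_test=X_test, y_test=y_test)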

With the above changes, the pickle attack is now prevented: the malicious file is not compatible with what the pipeline expects, a protobuf-serialized file containing only the trained model's weights and configuration:

Message:

    InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from /tmp/flyteydhl4_8o/local_flytekit/3b56143dc3a02373c197731f6cf6084f/model.joblib failed:Protobuf parsing failed.

The Water Tastes Weird: Model and Data Poisoning

It was March 2016 when Microsoft released Tay, a Twitter chatbot meant to emulate the style and slang of a teenage girl. However, it came with a caveat: it would use its interactions with Twitter users as training data to improve its conversations. 

Unfortunately, without appropriate guardrails, the bot learned the Internet's language and values. 

Multiple Twitter users coordinated to exploit the bot's feedback loop by inundating it with offensive and discriminatory language. As a result, Microsoft had to take it down less than 24 hours after launch: a textbook case of data poisoning.

The now-deactivated Tay bot account. Credit: NY Times

According to OWASP, data and model poisoning is one of the top 10 security vulnerabilities for LLM applications. It occurs when pre-training, fine-tuning, or embedding data is manipulated to introduce backdoors or biases that degrade model performance, make the model produce toxic or biased content, or expose downstream systems to exploitation.

While LLMs trained on vast web data are resilient to small-scale attacks, techniques like model distillation (which uses up to 80% less data) heighten vulnerability.

Mitigation techniques

Data validation

Follow the example Notebook

Poisoning happens at development time, long before the model reaches a particular deployment platform. Hence, mitigation relies on data-aware controls more than on platform-level configuration.

The ATLAS matrix (Adversarial Threat Landscape for Artificial-Intelligence Systems) categorizes “Poison Training Data” as an attack that happens early in the AI development process. Source: https://atlas.mitre.org/

Among the controls recommended by OWASP, data validation stands out.

Data validation is the process of ensuring data integrity and quality through systematic checks. Standard quality controls (like hashing) can help set a secure baseline, but they are not enough for AI systems, as models could be compromised by more nuanced changes to the datasets. Techniques like statistical deviation analysis and pattern recognition must be implemented proactively during training to prevent irreversible model corruption. 

Pandera is an open-source Python library for data validation. It allows you to encode domain-specific rules into a schema to perform type and column validations or more complex statistical deviation checks.

The example Notebook shows how to trigger and mitigate a data poisoning attack on the UCI Heart Disease dataset, saving compute resources by preventing the pipeline from running the training step.

  1. We start by encoding some of the statistical properties of the dataset, a step that requires prior knowledge of the data, typically gained during an exploratory data analysis (EDA) phase:
import pandera as pa


class RawData(pa.DataFrameModel):
    age: int = pa.Field(in_range={"min_value": 0, "max_value": 200})
    sex: int = pa.Field(isin=[0, 1])
    cp: int = pa.Field(  # known chest pain types
        isin=[
            1,  # typical angina
            2,  # atypical angina
            3,  # non-anginal pain
            4,  # asymptomatic
        ]
    )
...
  2. Register custom checks to validate relationships between the features and the target:
class TrainingData(ParsedData):
    @pa.dataframe_check(error="Patients with heart disease should have higher average cholesterol")
    def validate_cholesterol(cls, df: pd.DataFrame) -> bool:
        healthy_chol = df[df.target == 0].chol.mean()
        disease_chol = df[df.target == 1].chol.mean()
        return disease_chol > healthy_chol
...
  3. Poison the training dataset via “label flipping”, a technique where an attacker relabels a subset of disease-positive cases as “healthy” by flipping 1s to 0s in the target column, leading to faulty predictions and degraded accuracy:
@union.task(container_image=custom_image)
def poison_training_data(
    training_set: DataFrame[ParsedData],
    poison_fraction: float,  # fraction of the input data to poison
    random_state: int,
) -> DataFrame[TrainingData]:
    if poison_fraction <= 0:
        return training_set
    print("POISONING DATA")
    poisoned = training_set.copy()
    n_poison = int(len(poisoned) * poison_fraction)
    poisoned_indices = poisoned.sample(n=n_poison, random_state=random_state).index
    # flip the sampled labels (1 - x turns 1s into 0s), so patients with
    # disease end up labeled as healthy
    poisoned.loc[poisoned_indices, 'target'] = 1 - poisoned.loc[poisoned_indices, 'target']
    return poisoned
  4. Running the workflow with a 20% poisoning fraction fails the Pandera schema checks:
# poison 20% of the rows using a fixed random seed
poison_training_data(parsed_data, 0.2, 101)

The label flipping attack changes the statistical relationship between a feature (cholesterol level) and the target, so Pandera stops the pipeline before the training phase runs.
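The enforcement point is the type annotation itself: because the training task declares DataFrame[TrainingData] as its input, the schema (including the cholesterol check) is validated before any model fitting starts. A minimal sketch of such a task, assuming the flytekitplugins-pandera type plugin is installed and using illustrative names, could look like this:

import pandas as pd
import union
from pandera.typing import DataFrame
from sklearn.ensemble import RandomForestClassifier


@union.task(container_image=custom_image)
def train_model(training_set: DataFrame[TrainingData]) -> float:
    # if the poisoned dataframe violates the TrainingData schema
    # (e.g., the cholesterol relationship check), validation fails here
    # and no compute is spent on model fitting
    X = training_set.drop(columns=["target"])
    y = training_set["target"]
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X, y)
    return float(model.score(X, y))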

Building a Safer AI Supply Chain

A multi-layered approach is necessary to protect against attacks that seek to alter, tamper with, or manipulate the ML development lifecycle at different stages.

Union implements a defense-in-depth approach with native controls at multiple levels:

  • Container Isolation: At the heart of Union’s security model is its container-based architecture. Every node in a workflow's directed acyclic graph (DAG) runs in an isolated container with restricted permissions. This design choice provides natural security boundaries: if one container is compromised, the blast radius is limited to that specific step in the workflow.
  • Immutable Data Storage: Union uses blob storage (like S3, GCS, or Azure Blob Storage) with immutable data handling. This is particularly important for ML pipelines because it prevents attackers from overwriting existing model files or training data. Even if an attacker gains access to write new files, they cannot modify existing ones, helping maintain the integrity of your ML artifacts.
  • Strong Typing: Union implements a type system that handles the serialization and deserialization of objects between workflow steps. While it doesn't completely eliminate the use of pickle files (since many data scientists rely on them), it uses secure alternatives by default. The type system can be extended with custom types that implement features like model file hashing verification.
  • Workflow Approval Gates: Union provides built-in support for human-in-the-loop verification through approval nodes. This is particularly valuable for LLM workflows where you want human verification of generated code before execution. The approval interface shows the content that will be executed and requires explicit human approval before proceeding.
Union provides a gate node type that lets you insert approvals in your workflow
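For reference, a minimal sketch of an approval gate, using flytekit's approve gate node with illustrative task and workflow names, could look like this:

from datetime import timedelta

import union
from flytekit import approve


@union.task
def generate_sql(prompt: str) -> str:
    # placeholder for an LLM call that turns the prompt into a SQL statement
    return f"SELECT * FROM patients -- generated for: {prompt}"


@union.workflow
def reviewed_generation(prompt: str) -> str:
    generated = generate_sql(prompt=prompt)
    # the workflow pauses here until a human reviews and approves the content in the UI
    return approve(generated, "review-generated-sql", timeout=timedelta(hours=2))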
  • Secrets management: Secrets are part of Union’s Task API so you can request them directly from the Task configuration:
import union

@union.task(secret_requests=[union.Secret(key="my_secret")])
def t1():
    secret_value = union.current_context().secrets.get(key="my_secret")
    # do something with the secret. For example, communication with an external API.
    ...

Union comes with a native Secrets manager that enables you to create and request Secrets. You can also consume Secrets from environment variables, files, or from third-party Secrets managers.

  • Configurable Security Policies: The platform allows fine-grained control over security policies at multiple levels:
    • Container-level policies for resource access
    • Network policies to control external communications
    • Storage access policies to manage data access
    • Authentication and authorization for workflow execution
  • Audit Trail and Monitoring: The Union platform maintains detailed logs of workflow executions, making it easier to trace and audit what happened in each step of your ML pipeline. This is crucial for security incident investigation and compliance requirements.
  • Compliance: The Union platform is SOC 2 Type I and Type II compliant. This certification verifies that the platform implements properly designed controls to securely handle customer data and ensures that those controls effectively protect the confidentiality, integrity, and availability of sensitive information.

Conclusion

While these features provide a strong security foundation, they work best as part of a comprehensive security strategy. You still need to write secure code, manage secrets properly, and follow the security best practices that apply to your specific use case.

<div class="button-group is-center"><a class="button" href="https://www.union.ai/consultation">Book a free consultation with an AI engineer from Union to explore how to adopt security best practices in your AI pipelines.</a></div>
