Creating an endpoint on AWS SageMaker with Pulumi


In the previous post, I mentioned Pulumi, an emerging open-source infrastructure-as-code (IaC) tool. To understand Pulumi’s functionality better, I used it to create a real-time endpoint that serves a machine learning (ML) model on AWS SageMaker, and now I would like to walk you through the steps involved.

In this blog post, we will cover everything from setting up the necessary infrastructure to provisioning an endpoint that follows industry best practices. By the end of this post, you will have a good understanding of how to use Pulumi and SageMaker together to manage your machine learning models like a pro. So, let’s dive in!

Prerequisites:

  • An active AWS account with developer permissions
  • A new Pulumi project with your AWS configuration
  • An ML model created on AWS SageMaker

To create an endpoint, three resources are required: an S3 bucket, an endpoint configuration, and the endpoint itself. In addition, a CloudWatch log group is good to have. Each resource is explained below.

S3 Bucket

This bucket stores the data associated with your endpoint, e.g. input data from users and output predictions.

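import pulumi_aws as aws

# Private bucket that will hold the endpoint's captured data.
# Note: S3 bucket names are globally unique, so pick your own name here.
s3_bucket = aws.s3.Bucket(
    resource_name="endpoint-bucket",
    bucket="endpoint-bucket",
    acl="private",
)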

Endpoint Configuration

It is highly recommended to enable data capture, which records the requests and responses of the endpoint so that they can be used for training, debugging, and monitoring the model. Amazon SageMaker Model Monitor automatically parses this captured data and compares its metrics with a baseline that you create for the model, which is useful for detecting model and data drift. For more information, refer to this video.

# Placeholder: replace with the name of your existing SageMaker model.
# It is reused below as the name of the endpoint configuration and the endpoint.
model_name = "my-model"

# Build the capture destination from the S3 bucket created previously.
s3_uri = s3_bucket.bucket.apply(lambda bucket_name: f"s3://{bucket_name}/endpoint-data-capture-logs/")

endpoint_configuration = aws.sagemaker.EndpointConfiguration(
    resource_name=model_name,
    name=model_name,
    data_capture_config=aws.sagemaker.EndpointConfigurationDataCaptureConfigArgs(
        destination_s3_uri=s3_uri,
        initial_sampling_percentage=100,  # A lower value is recommended for endpoints with high traffic.
        enable_capture=True,
        # Capture both the requests sent to the model and its responses.
        capture_options=[
            aws.sagemaker.EndpointConfigurationDataCaptureConfigCaptureOptionArgs(capture_mode="Output"),
            aws.sagemaker.EndpointConfigurationDataCaptureConfigCaptureOptionArgs(capture_mode="Input"),
        ],
        capture_content_type_header=aws.sagemaker.EndpointConfigurationDataCaptureConfigCaptureContentTypeHeaderArgs(
            csv_content_types=["text/csv"], json_content_types=["application/json"]
        ),
    ),
    production_variants=[
        aws.sagemaker.EndpointConfigurationProductionVariantArgs(
            variant_name="version-1",  # hyphen instead of underscore: SageMaker names accept letters, digits, and hyphens
            model_name=model_name,  # name of the existing SageMaker model
            initial_instance_count=1,
            instance_type="ml.m5.xlarge",
        )
    ],
)
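To give a feel for what data capture produces, here is a minimal sketch that parses one captured record. It assumes the capture files are JSON Lines with `captureData` and `eventMetadata` fields; the sample record, its values, and the helper name `parse_capture_line` are hypothetical, so compare them with a real file from your bucket before relying on this.

```python
import json

# Hypothetical sample line from a capture file. The field names reflect my
# understanding of the capture format and should be verified against a real file.
SAMPLE_LINE = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "5.1,3.5,1.4,0.2",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "application/json",
            "mode": "OUTPUT",
            "data": '{"prediction": 0}',
            "encoding": "JSON",
        },
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2023-01-01T00:00:00Z"},
    "eventVersion": "0",
})


def parse_capture_line(line: str) -> dict:
    """Extract the request payload, response payload, and timestamp from one JSON line."""
    record = json.loads(line)
    capture = record.get("captureData", {})
    return {
        "input": capture.get("endpointInput", {}).get("data"),
        "output": capture.get("endpointOutput", {}).get("data"),
        "time": record.get("eventMetadata", {}).get("inferenceTime"),
    }


parsed = parse_capture_line(SAMPLE_LINE)
print(parsed["input"], parsed["output"])
```

Missing fields come back as None instead of raising, which is convenient when only the input or only the output was captured.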

Endpoint

This resource is created by referring to the endpoint configuration created previously.

endpoint = aws.sagemaker.Endpoint(
    resource_name=model_name,
    name=model_name,
    # Referencing the configuration's output makes Pulumi create it before the endpoint.
    endpoint_config_name=endpoint_configuration.name,
)
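Since the same name is passed to several SageMaker resources, it can help to check it before running Pulumi. The sketch below assumes, based on my reading of the SageMaker API reference, that endpoint and endpoint configuration names may contain only letters, digits, and hyphens, must start and end with a letter or digit, and are at most 63 characters long. The helper `is_valid_sagemaker_name` is hypothetical, so please verify the rule against the current documentation.

```python
import re

# Assumed naming rule: alphanumeric at both ends, hyphens allowed in between.
_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
# Assumed maximum length for endpoint and endpoint configuration names.
_MAX_LENGTH = 63


def is_valid_sagemaker_name(name: str) -> bool:
    """Return True if the name satisfies the assumed SageMaker naming rule."""
    return len(name) <= _MAX_LENGTH and _NAME_PATTERN.match(name) is not None


for candidate in ("my-endpoint", "my_endpoint", "my-endpoint-"):
    print(candidate, is_valid_sagemaker_name(candidate))
```

A failed check here is much cheaper than a failed deployment, because Pulumi only reports the validation error once AWS rejects the request.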

CloudWatch Log Group

A log group records the warnings and error messages that the model container writes to stdout, which is helpful for debugging. Managing the log group in Pulumi also lets you set a retention period so that old logs are cleaned up automatically.

cloudwatch_logs = aws.cloudwatch.LogGroup(
    resource_name=f"/aws/sagemaker/Endpoints/{model_name}",
    name=f"/aws/sagemaker/Endpoints/{model_name}",
    retention_in_days=30,
)

After adding these Pulumi resources to your project, run pulumi up to create the endpoint on AWS.
