Optimizing Amazon SageMaker endpoints using serverless deployments and instance recommendations

March 22, 2022

Many customers have ML applications with intermittent or unpredictable traffic patterns. Selecting the compute instance with the best price performance for deploying machine learning (ML) models is a complicated, iterative process that can take weeks of experimentation. Rather than provisioning for peak capacity up front, which can leave capacity idle, or building complex workflows to shut down idle instances, you can now use Amazon SageMaker Serverless Inference and Amazon SageMaker Inference Recommender. In this session, learn how to select the serverless option when deploying your ML model and how Amazon SageMaker automatically provisions, scales, and turns off compute capacity based on the volume of inference requests. Learn how to use Amazon SageMaker Inference Recommender to load test your model and automatically select the compute instance type, instance count, container parameters, and model optimizations that maximize inference performance and minimize cost. The session dives deep into these new features, available in preview.
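To make the serverless option concrete, here is a minimal sketch of the request you might build for the SageMaker `CreateEndpointConfig` API, where a `ServerlessConfig` block takes the place of an instance type and instance count. It is not taken from the session itself: the helper name `serverless_endpoint_config`, the variant name, and the default values are assumptions chosen for illustration, and the concurrency check only rejects non-positive values because the upper limit depends on your account and Region.

```python
# Memory sizes (in MB) accepted by SageMaker serverless endpoints.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_endpoint_config(model_name, config_name,
                               memory_mb=2048, max_concurrency=5):
    """Build a CreateEndpointConfig request for a serverless variant.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    It only builds the request dictionary and makes no AWS call.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # assumed variant name
                "ModelName": model_name,
                # ServerlessConfig replaces InstanceType / InitialInstanceCount.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


request = serverless_endpoint_config("my-model", "my-serverless-config")
```

With AWS credentials configured and a model named `my-model` already registered (both assumed here), the request could then be passed to `boto3.client("sagemaker").create_endpoint_config(**request)`, followed by `create_endpoint` to bring the endpoint up.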

Speaker: Kapil Pendse, Principal Solutions Architect, AWS

