Amazon SageMaker

  • A fully managed service that allows data scientists and developers to easily build, train, and deploy machine learning models at scale.
  • Provides built-in algorithms that you can immediately use for model training.
  • Also supports custom algorithms through Docker containers.
  • Supports one-click model deployment.

Concepts

  • Hyperparameters
    • A set of variables that control how a model is trained.
    • You can think of them as “volume knobs” that you tune to reach your model’s objective.
  • Automatic Model Tuning
    • Finds the best version of a model by running multiple training jobs across the hyperparameter ranges that you specify.
  • Training
    • The process where you create a machine learning model.
  • Inference
    • The process of using the trained model to make predictions.
  • Local Mode
    • Allows you to create estimators and deploy them on your local machine for testing.
    • You must install the Amazon SageMaker Python SDK on your local environment to use local mode.
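
The relationship between hyperparameters and automatic model tuning can be sketched with a toy search. This is only an illustration: `validation_loss` is a hypothetical stand-in for a real training job, the ranges are made up, and plain random search is used here for simplicity rather than SageMaker's own tuning service.

```python
import random


def validation_loss(learning_rate: float, num_layers: int) -> float:
    """Hypothetical stand-in for a training job: returns a loss to minimize."""
    return (learning_rate - 0.1) ** 2 + 0.01 * abs(num_layers - 3)


def random_search(ranges: dict, max_jobs: int, seed: int = 0) -> tuple:
    """Try max_jobs random hyperparameter combinations inside the given ranges."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(max_jobs):
        # Each iteration plays the role of one training job.
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "num_layers": rng.randint(*ranges["num_layers"]),
        }
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Assumed search limits, chosen only for this example.
ranges = {"learning_rate": (0.001, 0.5), "num_layers": (1, 6)}
best_params, best_loss = random_search(ranges, max_jobs=20)
print(best_params, best_loss)
```

The tuner only explores values inside the ranges you give it, which is why choosing sensible limits matters as much as the number of jobs.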

Common Training Data Formats For Built-in Algorithms

  • CSV
  • Protobuf RecordIO
  • JSON
  • LibSVM
  • JPEG
  • PNG
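
As a rough illustration of two of the text formats above, the sketch below serializes the same rows as CSV and as LibSVM. It assumes the layout that the AWS documentation describes for many built-in algorithms (CSV with no header row and the label in the first column); the sample rows and function names are hypothetical.

```python
import csv
import io

# Hypothetical rows: (label, feature_1, feature_2, feature_3)
rows = [(1, 0.5, 0.0, 2.0), (0, 0.0, 1.5, 0.0)]


def to_csv(rows) -> str:
    """CSV layout assumed here: no header row, label in the first column."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_libsvm(rows) -> str:
    """LibSVM layout: 'label index:value ...', 1-based indices, zeros omitted."""
    lines = []
    for label, *features in rows:
        pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v != 0]
        lines.append(" ".join([str(label), *pairs]))
    return "\n".join(lines) + "\n"


csv_text = to_csv(rows)
libsvm_text = to_libsvm(rows)
print(csv_text)
print(libsvm_text)
```

LibSVM is a sparse format, so it tends to be more compact than CSV when most feature values are zero.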

Input modes for transferring training data

  • File mode
    • Downloads data into the SageMaker instance volume before model training commences.
    • Slower than Pipe mode.
    • Used for incremental training.
  • Pipe mode
    • Streams data directly from Amazon S3 into the training algorithm container.
    • There’s no need to procure large volumes to store large datasets.
    • Provides shorter startup and training times.
    • Provides higher I/O throughput.
    • Faster than File mode.
    • Built-in algorithms typically need training data in protobuf RecordIO format to use Pipe mode, although some also accept other formats such as CSV; check the documentation of the algorithm you use.
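
The input mode is chosen when the training job is defined. The sketch below builds one entry of the `InputDataConfig` list used by the `CreateTrainingJob` API, without calling AWS. The helper name, channel name, and bucket are hypothetical, and the helper deliberately accepts only the two modes discussed here.

```python
def build_channel(name: str, s3_uri: str, content_type: str, input_mode: str) -> dict:
    """Return one InputDataConfig channel for a CreateTrainingJob request."""
    if input_mode not in ("File", "Pipe"):
        raise ValueError(f"unsupported input mode in this sketch: {input_mode}")
    return {
        "ChannelName": name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": content_type,
        "InputMode": input_mode,
    }


# Hypothetical bucket and prefix; RecordIO data streamed with Pipe mode.
train_channel = build_channel(
    name="train",
    s3_uri="s3://example-bucket/train/",
    content_type="application/x-recordio-protobuf",
    input_mode="Pipe",
)
print(train_channel["InputMode"])
```

Switching the same channel to File mode only changes the `InputMode` value, but the training instance then needs enough volume space to hold the downloaded dataset.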

Two methods of deploying a model for inference

  • Amazon SageMaker Hosting Services
    • Provides a persistent HTTPS endpoint for getting predictions one at a time.
    • Suited for web applications that need sub-second latency response.
  • Amazon SageMaker Batch Transform
    • Doesn’t need a persistent endpoint.
    • Gets inferences for an entire dataset.
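
The difference between the two options shows up in the shape of the request. The sketch below only builds the keyword arguments for each call and does not contact AWS. All resource names, the S3 paths, the CSV content type, and the instance type are assumptions chosen for illustration.

```python
def realtime_request(endpoint_name: str, features: list) -> dict:
    """Arguments for a single-record prediction against a hosted endpoint."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "text/csv",
        "Body": ",".join(str(x) for x in features),
    }


def batch_request(job_name: str, model_name: str, input_s3: str, output_s3: str) -> dict:
    """Arguments for a batch transform job over a whole dataset in S3."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3}
            },
            "ContentType": "text/csv",
            "SplitType": "Line",
        },
        "TransformOutput": {"S3OutputPath": output_s3},
        "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    }


# Hypothetical names and locations.
single = realtime_request("example-endpoint", [0.5, 0.0, 2.0])
batch = batch_request(
    "example-batch-job",
    "example-model",
    "s3://example-bucket/batch-input/",
    "s3://example-bucket/batch-output/",
)
print(single["Body"])
print(batch["TransformOutput"]["S3OutputPath"])
```

With boto3 installed and credentials configured, these dictionaries could be passed as `boto3.client("sagemaker-runtime").invoke_endpoint(**single)` and `boto3.client("sagemaker").create_transform_job(**batch)`. Note that the batch request names a model and S3 locations rather than an endpoint.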

Optimization

  • Convert training data to protobuf RecordIO format so that training jobs can use Pipe mode.
  • Use Amazon FSx for Lustre to accelerate File mode training jobs.

Monitoring

  • You can publish SageMaker instance metrics to the CloudWatch dashboard to gain a unified view of its CPU utilization, memory utilization, and latency.
  • You can also send training metrics to the CloudWatch dashboard to monitor model performance in real time.
  • AWS CloudTrail helps you detect unauthorized SageMaker API calls.
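
For custom algorithms, training metrics reach CloudWatch through metric definitions: a name paired with a regular expression that SageMaker applies to the training logs. The sketch below approximates that matching locally. The metric names, regular expressions, and log line are hypothetical.

```python
import re

# Same Name/Regex shape as the MetricDefinitions of a training job.
metric_definitions = [
    {"Name": "train:loss", "Regex": r"train_loss=([0-9.]+)"},
    {"Name": "validation:accuracy", "Regex": r"val_acc=([0-9.]+)"},
]


def extract_metrics(log_line: str, definitions: list) -> dict:
    """Apply each regex to a log line and collect the captured values."""
    metrics = {}
    for definition in definitions:
        match = re.search(definition["Regex"], log_line)
        if match:
            metrics[definition["Name"]] = float(match.group(1))
    return metrics


# Hypothetical line printed by a training script.
log_line = "epoch=3 train_loss=0.412 val_acc=0.873"
metrics = extract_metrics(log_line, metric_definitions)
print(metrics)
```

Testing the expressions against sample log lines like this before starting a job helps avoid metrics that silently never appear on the dashboard.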

Pricing

  • The building, training, and deploying of ML models are billed by the second, with no minimum fees and no upfront commitments.
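
Per-second billing can be estimated with simple arithmetic. The hourly rate below is a made-up figure rather than an actual SageMaker price, and the helper name is hypothetical.

```python
def billed_cost(seconds: int, hourly_rate_usd: float, instance_count: int = 1) -> float:
    """Estimate the cost of a job that is billed by the second."""
    return round(seconds * instance_count * hourly_rate_usd / 3600, 4)


# Assumed example: a training job of 12 minutes 34 seconds on two instances
# at a hypothetical rate of 0.36 USD per instance-hour.
cost = billed_cost(seconds=754, hourly_rate_usd=0.36, instance_count=2)
print(cost)
```

Because there is no hourly rounding, a job that stops early is charged only for the seconds it actually ran.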

References:
https://aws.amazon.com/sagemaker/faqs/
https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html
https://aws.amazon.com/sagemaker/pricing/
