AWS Certified Machine Learning – Specialty Exam Study Guide
The AWS Certified Machine Learning – Specialty certification is intended for individuals who are responsible for developing data science or applied machine learning projects on the AWS Cloud. This specialty certification is quite different from any other AWS exam. If you have prior experience with other AWS certifications, you probably expect to be heavily tested on AWS services and how they can be architected into solutions for different business problems. That is not the case in the ML – Specialty certification. Aside from Amazon SageMaker, most of the questions that you’ll encounter are not about AWS services at all.
The exam covers a wide range of general machine learning concepts. You should have at least a high-level understanding of the different stages of a machine learning project: data collection, feature engineering, test-train splitting, choosing the correct algorithm for a specific use case, training, tuning, and deploying a model for inference. The exam also expects you to know the common issues that arise during model training (e.g., overfitting, unbalanced datasets, missing values in the dataset) and the methods to fix them (e.g., regularization/early stopping, oversampling/adding noise to data, data imputation).
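As a rough illustration of two of the steps mentioned above, the following sketch shows a randomized test-train split followed by random oversampling of the minority class. It uses only the Python standard library; the function names `shuffled_split` and `random_oversample` and the toy data are hypothetical, and oversampling is applied to the training portion only, on the assumption that the test set should keep the original class balance.

```python
import random
from collections import Counter


def shuffled_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle the indices, then hold out the first `test_fraction` as the test set."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy dataset: 16 negative examples and 4 positive examples.
rows = [[float(i)] for i in range(20)]
labels = [0] * 16 + [1] * 4

(train_rows, train_labels), (test_rows, test_labels) = shuffled_split(rows, labels)
balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)

print("train class counts:   ", Counter(train_labels))
print("balanced class counts:", Counter(balanced_labels))
```

In practice a library such as scikit-learn would usually handle the split; the point here is only the order of operations: shuffle, split, then rebalance the training data.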
Machine learning leans more on math concepts than on software engineering. Although not strictly required, a background in statistics or college math (linear algebra, differential calculus) will help you understand how an algorithm works behind the scenes. It would also be best to gain hands-on experience first by building simple models. This will allow you to learn quickly and get used to the jargon in machine learning.
We recommend checking out the following materials:
- Machine Learning Terminology and Process
- Machine Learning Algorithms
- Math for Machine Learning
- AWS Foundations: Machine Learning Basics
- AWS Machine Learning Lens
- Machine Learning Best Practices in Financial Services
- Neural Networks
- Introduction to Artificial Intelligence
- Amazon SageMaker
Other helpful materials
- AWS Machine Learning and AI Services Cheat Sheets
- Mike Chambers’ ML – Specialty Course
- Introduction to Machine Learning with Python
- Deep Learning with Python
AWS SERVICES TO FOCUS ON
- Data ingestion techniques (Batch and Stream processing)
- Data cleaning
- ETL Pipeline
- Building a data lake on Amazon S3
- Available data storages for training with Amazon SageMaker
- Amazon S3
- Amazon EFS
- Amazon FSx for Lustre
- Amazon EBS
- Amazon S3 lifecycle configuration
- Amazon S3 data storage options
Exploratory Data Analysis
- Data Cleaning
- Data labeling (for supervised models)
- Using RecordIO protobuf format to leverage SageMaker’s Pipe mode for training
- Data Visualization and Analysis
- Scatter plot
- Box plots
- Confusion matrix
- Feature Engineering
- Data imputation techniques for filling missing values
- Oversampling/Undersampling methods to fix an unbalanced dataset
- Dimensionality Reduction
- Principal Component Analysis (PCA)
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
- One-hot encoding
- Label encoding
- Test-train splitting with randomization
- Amazon SageMaker
- Amazon SageMaker Automatic Model Tuning
- Amazon SageMaker Python SDK
- Amazon Comprehend
- Amazon Rekognition
- Amazon Transcribe
- Amazon Polly
- Amazon Translate
- Amazon Lex
- AWS DeepLens
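Among the feature engineering topics listed above, label encoding and one-hot encoding are easy to confuse. The following is a minimal sketch of the difference in plain Python; the helper names `label_encode` and `one_hot_encode` and the `colors` example are hypothetical, and categories are sorted alphabetically only to make the mapping deterministic.

```python
def label_encode(values):
    """Map each category to an integer index (implies an ordering between categories)."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each category to a binary vector with a single 1 (no implied ordering)."""
    categories = sorted(set(values))
    vectors = [[1 if v == category else 0 for category in categories] for v in values]
    return vectors, categories


colors = ["red", "green", "blue", "green"]

encoded_labels, mapping = label_encode(colors)
encoded_vectors, categories = one_hot_encode(colors)

print(mapping)          # {'blue': 0, 'green': 1, 'red': 2}
print(encoded_labels)   # [2, 1, 0, 1]
print(categories)       # ['blue', 'green', 'red']
print(encoded_vectors)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Label encoding keeps a single column but introduces an artificial order between categories, while one-hot encoding avoids that at the cost of one column per category.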
Amazon SageMaker built-in algorithms
- Linear regression
- Logistic regression
- K-means clustering
- Principal component analysis (PCA)
- Factorization machines
- Neural topic modeling
- Latent Dirichlet allocation
- Time-series forecasting
- Object detection
- Image classification
- Semantic segmentation
- Automated hyperparameter tuning
- Supervised, Unsupervised models, Reinforcement learning
- Managed Spot Training
- Deep Learning
- Convolutional Neural Network (CNN), Recurrent Neural Networks (RNN)
- Weights and biases
- Activation functions
- Rectified Linear Unit (ReLU)
- Network layers (flatten layer, convolutional layer, pooling layer, output layer)
- Dropout regularization
- Model pruning
- Solving overfitting and underfitting problems
- Training SageMaker models in local mode
- Early Stopping
- Metrics for confusion matrix (true positives, false positives, false negatives, true negatives)
- Model evaluation
- ROC / AUC
- F1 Score
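The confusion matrix counts listed above feed directly into the usual evaluation metrics. As a minimal sketch, assuming a binary classifier and hypothetical counts, accuracy, precision, recall, and the F1 score can be computed as follows (`classification_metrics` is an illustrative name, not a library function):

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy is 0.700, precision is 0.800, recall is about 0.667, and F1 is about 0.727. F1 sits below the arithmetic mean of precision and recall because the harmonic mean penalizes the weaker of the two.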
Machine Learning Implementation and Operations
- Amazon Elastic Inference
- Amazon SageMaker Inference Pipeline
- Amazon SageMaker Neo
- Amazon Augmented AI (A2I)
- Amazon CloudWatch
- AWS CloudTrail
- Real-time and batch inference
- Monitoring model metrics using CloudWatch
- Monitoring SageMaker API logs using CloudTrail
- Using Amazon Augmented AI (A2I) to involve human reviewers in a machine learning workflow
- Multi-model endpoints
- Encrypting data with AWS KMS
- Lifecycle configuration script
- Optimizing a model for edge devices using SageMaker Neo
Validate Your Knowledge
For high-quality practice exams, you can use our AWS Certified Machine Learning Specialty Practice Exams. These practice tests will help you boost your preparedness for the real exam. They contain multiple sets of questions that cover almost every area that you can expect from the real certification exam. We have also included detailed explanations and adequate reference links to help you understand why the option with the correct answer is better than the rest of the options. This is the value that you will get from our course. Practice exams are a great way to determine which areas you are weak in, and they will also highlight the important information that you might have missed during your review.
Sample Practice Test Questions:
A Machine Learning Specialist is training a regression model to predict house prices in different locations. The Specialist wants to test the quality of the test data by identifying whether the model is underestimating or overestimating the target price.
Which visualization technique should the Specialist use?
- Residual plots
- Confusion matrix
- Correlation matrix
- Root Mean Square Error (RMSE)
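To make the first option concrete, here is a minimal sketch of what a residual plot is built from. A residual is the actual value minus the predicted value, so its sign shows the direction of the error. The prices below are hypothetical, and the plotting step itself is omitted.

```python
# Hypothetical actual and predicted house prices, for illustration only.
actual = [300_000, 450_000, 250_000, 600_000]
predicted = [280_000, 470_000, 240_000, 550_000]

# Residual = actual - predicted.
# Positive residual: the model underestimated the price.
# Negative residual: the model overestimated the price.
residuals = [a - p for a, p in zip(actual, predicted)]

for a, p, r in zip(actual, predicted, residuals):
    direction = "underestimated" if r > 0 else "overestimated" if r < 0 else "exact"
    print(f"actual={a} predicted={p} residual={r:+d} ({direction})")

mean_residual = sum(residuals) / len(residuals)
print("mean residual:", mean_residual)
```

A residual plot charts these values against the predictions (or a feature). Points mostly above zero suggest systematic underestimation, and points mostly below zero suggest overestimation. An aggregate such as RMSE squares the errors and therefore loses the sign.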
A Machine Learning Specialist is preparing the dataset to be used for training a linear learner model in Amazon SageMaker. During exploratory data analysis, he has detected multiple feature columns that have missing values. The percentage of missing data across the whole training dataset is about 10%. The Specialist is worried that this might introduce bias into his model and lead to inaccurate results.
Which approach will MOST likely yield the best result in reducing the bias caused by missing values?
- Drop the columns that include missing values because they only account for 10% of the training data.
- Use supervised learning methods to estimate the missing values for each feature.
- Compute the mean of non-missing values in the same row and use the result to replace missing values.
- Compute the mean of non-missing values in the same column and use the result to replace missing values.
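For reference, the column-mean approach described in the last option can be sketched as follows. This shows only the mechanics of that option under simple assumptions (missing values are represented as `None`, and `impute_column_means` is a hypothetical helper); it is not meant to indicate which option the question expects.

```python
def impute_column_means(rows):
    """Replace each missing value (None) with the mean of the non-missing values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [row[j] for row in rows if row[j] is not None]
        # If a column is entirely missing, there is no mean to use; leave it as None.
        means.append(sum(present) / len(present) if present else None)
    return [[means[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]


# Hypothetical toy dataset with two feature columns.
data = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, None],
]

imputed = impute_column_means(data)
print(imputed)  # [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
```

Mean imputation treats every column independently, so it ignores any relationship between features. Methods that estimate a missing value from the other features take those relationships into account.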
Machine learning plays a major role in almost all industries. It provides numerous business benefits such as forecasting sales, predicting medical diagnoses, and simplifying time-consuming data entry tasks. With the proliferation of machine learning and AI applications, it’s not difficult to see how it will impact job demand in the market. The need for machine learning talent to build efficient and effective models at scale will likely continue growing for years to come. Pairing your skills with the AWS Certified Machine Learning – Specialty certification can make your resume stand out and boost your earning potential.
We hope that our guide has helped you achieve that goal, and we would love to hear how your exam went. We wish you the best of results.