Last updated on October 25, 2024
Here are 10 AWS Certified Machine Learning Specialty MLS-C01 practice exam questions to help you gauge your readiness for the actual exam.
Question 1
A trucking company wants to improve situational awareness for its operations team. Each truck has GPS devices installed to monitor their locations.
The company requires the data to be stored in Amazon Redshift for near real-time analytics, which will then be used to generate updated dashboard reports.
Which workflow offers the quickest processing time from ingestion to storage?
- Use Amazon Kinesis Data Streams to ingest the location data. Load the streaming data into the cluster using Amazon Redshift streaming ingestion.
- Use Amazon Managed Streaming for Apache Kafka (MSK) to ingest the location data. Use Amazon Redshift Spectrum to deliver the data in the cluster.
- Use Amazon Data Firehose to ingest the location data and set the Amazon Redshift cluster as the destination.
- Use Amazon Data Firehose to ingest the location data. Load the streaming data into the cluster using Amazon Redshift Streaming ingestion.
Question 2
A Machine Learning Specialist is training an XGBoost-based model for detecting fraudulent transactions using Amazon SageMaker. The training data contains 5,000 fraudulent behaviors and 500,000 non-fraudulent behaviors. The model reaches an accuracy of 99.5% during training.
When tested on the validation dataset, the model shows an accuracy of 99.1% but delivers a high false-negative rate of 87.7%. The Specialist needs to bring down the number of false-negative predictions for the model to be acceptable in production.
Which combination of actions must be taken to meet the requirement? (Select TWO.)
- Increase the model complexity by specifying a larger value for the `max_depth` hyperparameter.
- Increase the value of the `rate_drop` hyperparameter to reduce the overfitting of the model.
- Adjust the balance of positive and negative weights by configuring the `scale_pos_weight` hyperparameter.
- Alter the value of the `eval_metric` hyperparameter to MAP (Mean Average Precision).
- Alter the value of the `eval_metric` hyperparameter to Area Under the Curve (AUC).
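For imbalances like the one in Question 2, XGBoost's `scale_pos_weight` hyperparameter is commonly set to the ratio of negative to positive examples. A minimal sketch using the counts from the question (the surrounding parameter dict is illustrative, not a complete training configuration):

```python
# Rule of thumb for XGBoost on an imbalanced dataset:
# scale_pos_weight = count(negative instances) / count(positive instances)
negatives = 500_000  # non-fraudulent transactions
positives = 5_000    # fraudulent transactions

scale_pos_weight = negatives / positives
print(scale_pos_weight)  # 100.0

# The value would then be passed alongside the other hyperparameters, e.g.
# params = {"max_depth": 6, "eval_metric": "auc",
#           "scale_pos_weight": scale_pos_weight}
```

Weighting the minority class this way pushes the model to penalize missed fraud cases, which is what drives down the false-negative rate.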
Question 3
A manufacturing company wants to aggregate data in Amazon S3 and analyze it using Amazon Athena. The company needs a solution that can both ingest and transform streaming data into Apache Parquet format.
Which AWS Service meets the requirements?
- Amazon Kinesis Data Streams
- AWS Batch
- Amazon Data Firehose
- AWS Database Migration Service
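Amazon Data Firehose performs this transformation through record format conversion, which converts incoming JSON to Apache Parquet using a schema held in the AWS Glue Data Catalog. A sketch of the relevant configuration shape, expressed as the Python dict that would be passed to boto3's `create_delivery_stream`; all ARNs, database, and table names are hypothetical placeholders:

```python
# Sketch of the Firehose setting that converts incoming JSON records to
# Apache Parquet before delivery to S3 (record format conversion).
# ARNs, database, and table names are hypothetical placeholders.
conversion_config = {
    "ExtendedS3DestinationConfiguration": {
        "BucketARN": "arn:aws:s3:::example-analytics-bucket",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "InputFormatConfiguration": {
                "Deserializer": {"OpenXJsonSerDe": {}}
            },
            "OutputFormatConfiguration": {
                "Serializer": {"ParquetSerDe": {}}
            },
            "SchemaConfiguration": {
                "DatabaseName": "example_db",    # Glue Data Catalog database
                "TableName": "sensor_readings",  # table holding the target schema
                "RoleARN": "arn:aws:iam::123456789012:role/firehose-glue-role",
            },
        },
    }
}
```

Once the stream delivers Parquet files to S3, Amazon Athena can query them directly.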
Question 4
A Data Scientist uses an Amazon Data Firehose stream to ingest data records produced from an on-premises application. These records are compressed using GZIP compression. The Scientist wants to perform SQL queries against the data stream to gain real-time insights.
Which configuration will enable querying with the LEAST latency?
- Transform the data with the Amazon Kinesis Client Library and deliver the results to an Amazon OpenSearch cluster.
- Use a Kinesis Data Analytics application configured with AWS Lambda to transform the data.
- Use a streaming ETL job in AWS Glue to transform the data coming from the Firehose stream.
- Store the data records in an Amazon S3 bucket and use Amazon Athena to run queries.
Question 5
A financial company is receiving hundreds of credit card applications daily and is looking for ways to streamline its manual review process. The company’s machine learning (ML) specialist has been given a CSV dataset with a highly imbalanced class.
The specialist must train a prototype classifier that predicts whether to approve or reject an application. The company wants the model to be delivered as soon as possible.
How can the ML specialist meet the requirement with the LEAST operational overhead?
- Upload the dataset to an Amazon S3 bucket. Create an Amazon SageMaker Autopilot job and specify the bucket location as the source for the job. Choose the best version of the model.
- Upload the dataset to an Amazon S3 bucket. Use the built-in XGBoost algorithm in Amazon SageMaker to train the model. Run an automatic model tuning job with early stopping enabled. Select the best version of the model.
- Upload the dataset to an Amazon S3 bucket. Perform feature engineering on the data using Amazon SageMaker Data Wrangler. Train the model using the built-in XGBoost algorithm in Amazon SageMaker.
- Upload the dataset to an Amazon S3 bucket. Create an Amazon SageMaker Ground Truth labeling job. Select `Text Classification (Single Label)` as the task type. Add the company’s credit officers as workers.
Question 6
A Data Scientist launches an Amazon SageMaker notebook instance to develop a model for forecasting sales revenue. The scientist wants to load test the model to figure out the right instance size to deploy in production.
How can the scientist assess and visualize CPU utilization, GPU utilization, memory utilization, and latency as the load test runs?
- Create a CloudWatch dashboard to build a unified operational view of the metrics generated by the notebook instance.
- Create custom CloudWatch Logs and stream the data into an Amazon OpenSearch cluster. Visualize the logs with Kibana.
- Create a log stream in CloudWatch Logs and subscribe an Amazon Data Firehose stream to it to send the data into an Amazon OpenSearch cluster. Visualize the logs with Kibana.
- Export the generated log data to an Amazon S3 bucket. Use Amazon Athena and Amazon QuickSight to visualize the SageMaker logs.
Question 7
A Machine Learning Specialist has graphed the results of a K-means model fitted through a range of k-values. The Specialist needs to select the optimal k parameter.
Based on the graph, which k-value is the best choice?
- 4
- 9
- 3
- 6
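Questions like this are usually answered with the elbow method: pick the k where the within-cluster sum of squares (WCSS) stops dropping sharply. A small sketch that locates the elbow via the largest second difference of the curve; the WCSS values below are hypothetical, not taken from the question's graph:

```python
# Elbow method sketch: choose k at the point of maximum curvature,
# approximated here by the largest second difference of the WCSS curve.
# These WCSS values are made up for illustration.
k_values = [1, 2, 3, 4, 5, 6, 7, 8]
wcss = [1000, 620, 400, 150, 130, 118, 110, 105]

# second difference: drop into k minus drop out of k; the elbow maximizes it
second_diff = [
    (wcss[i - 1] - wcss[i]) - (wcss[i] - wcss[i + 1])
    for i in range(1, len(wcss) - 1)
]
elbow_index = second_diff.index(max(second_diff)) + 1  # +1 offsets the slice
print(k_values[elbow_index])  # 4
```

Past the elbow, adding clusters buys only marginal reduction in WCSS, so the elbow k is the usual choice.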
Question 8
A Machine Learning Specialist is migrating hundreds of thousands of records in CSV files into an Amazon S3 bucket. Each file has 150 columns and is about 1 MB in size. Most of the queries will span a minimum of 5 columns. The data must be transformed to minimize the query runtime.
Which transformation method will optimize query performance?
- Transform the files to XML data format.
- Transform the files to Apache Parquet data format.
- Transform the files to gzip-compressed CSV data format.
- Transform the files to JSON data format.
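The reason a columnar format helps in Question 8 is column pruning: a query touching 5 of 150 columns reads only those 5 columns, while a row-oriented format such as CSV must scan every row in full. A toy contrast of the two access patterns (pure Python; the table and its 4 columns are made up, standing in for the 150-column files):

```python
# Toy contrast between row-oriented and column-oriented layouts.
# Hypothetical 4-column table; the real files would have 150 columns.
rows = [
    {"id": 1, "lat": 47.6, "lon": -122.3, "speed": 61},
    {"id": 2, "lat": 47.7, "lon": -122.4, "speed": 55},
    {"id": 3, "lat": 47.5, "lon": -122.2, "speed": 72},
]

# Row layout (CSV-like): answering "average speed" still touches every field.
fields_scanned_row = sum(len(r) for r in rows)

# Columnar layout (Parquet-like): only the "speed" column is read.
columns = {k: [r[k] for r in rows] for k in rows[0]}
fields_scanned_col = len(columns["speed"])

print(fields_scanned_row, fields_scanned_col)  # 12 3
```

At 150 columns, that ratio (and Parquet's per-column compression and statistics) is what minimizes Athena's query runtime and scanned bytes.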
Question 9
A healthcare organization has a large repository of medical documents they want to categorize and manage efficiently. The specific topics are yet to be determined, but the company aims to utilize the terms within each document to assign it to a relevant medical category. To solve this problem, a Machine Learning specialist uses Amazon SageMaker to develop a model.
Which built-in algorithm in Amazon SageMaker would be the most suitable choice?
- BlazingText algorithm in Text Classification mode
- Latent Dirichlet Allocation (LDA) Algorithm
- Semantic Segmentation Algorithm
- CatBoost Algorithm
Question 10
A Business Process Outsourcing (BPO) company uses Amazon Polly to convert plain text documents to speech for its voice response system. After testing, some acronyms and business-specific terms are being pronounced incorrectly.
Which approach will fix this issue?
- Use a `viseme` Speech Mark.
- Use pronunciation lexicons.
- Convert the scripts into Speech Synthesis Markup Language (SSML) and use the `pronunciation` tag.
- Convert the scripts into Speech Synthesis Markup Language (SSML) and use the `emphasis` tag to guide the pronunciation.
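A pronunciation lexicon is a small W3C Pronunciation Lexicon Specification (PLS) XML document that maps a grapheme (such as an acronym) to an alias or phoneme. The sketch below builds one in Python and notes where the boto3 Polly calls would go; the lexicon name and entry are hypothetical, and the AWS calls are commented out so the sketch runs offline:

```python
# Build a W3C PLS lexicon that expands a hypothetical acronym
# before Amazon Polly synthesizes the speech.
lexicon = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>BPO</grapheme>
    <alias>Business Process Outsourcing</alias>
  </lexeme>
</lexicon>"""

# With AWS credentials configured, the lexicon would be registered and
# applied roughly like this (hypothetical lexicon name "bpoTerms"):
# import boto3
# polly = boto3.client("polly")
# polly.put_lexicon(Name="bpoTerms", Content=lexicon)
# polly.synthesize_speech(Text="Welcome to our BPO hotline.",
#                         LexiconNames=["bpoTerms"],
#                         OutputFormat="mp3", VoiceId="Joanna")
print("<alias>" in lexicon)  # True
```

Because the lexicon is applied at synthesis time, the same plain text documents can be reused without edits, unlike the SSML-based options.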
For more practice questions like these and to further prepare you for the actual AWS Certified Machine Learning Specialty MLS-C01 exam, we recommend that you take our top-notch AWS Certified Machine Learning Specialty Practice Exams, which have been regarded as the best in the market.
Also, check out our AWS Certified Machine Learning Specialty MLS-C01 exam study guide here.