
AWS Certified Data Analytics – Specialty Exam Study Path

The AWS Certified Data Analytics – Specialty exam is intended for people who have experience in designing, building, securing, and maintaining analytics solutions on AWS. The exam will test your technical skills on how different AWS analytics services integrate with each other. You also need to know how they fit in the data lifecycle of collection, storage, processing, and visualization.

This specialty certification exam is on par with the other AWS Professional level tests so you need to allocate ample time for your preparation. With the help of the official exam study guide, you can determine the areas that you need to focus on. It will show you the specific knowledge areas and domains that you must review to pass the exam.

Study Materials

Before taking the actual exam, we recommend checking out these study materials for the AWS Certified Data Analytics – Specialty exam. These resources will help you understand the concepts and strategies that you will need to pass the exam.

  1. Free Exam Readiness: AWS Certified Data Analytics – Specialty – this is an interactive course with responsive image maps, accordions, sample problem sets, section-based quizzes, and a practice test at the end.
  2. AWS FAQs – these give you a quick overview of each service. Here you will find answers to commonly asked questions, common use cases, and comparisons of various AWS services.
  3. Tutorials Dojo’s AWS Cheat Sheets – these can help you digest the lengthy concepts found in the AWS FAQs. The cheat sheets are presented in bullet-point format so you can absorb the information easily, and the linked page summarizes all the analytics services of AWS.
  4. AWS Knowledge Center – you can use this website to find and understand the most frequent questions and requests AWS receives from its customers.
  5. AWS Documentation and Whitepapers – these detailed documents will help you expand your knowledge of various AWS services. Focus on the whitepapers that cover analytics and big data on AWS.
  6. Tutorials Dojo’s AWS Certified Data Analytics Specialty Practice Exams (coming soon!) – this comprehensive reviewer, with complete and detailed explanations, will help you pass your AWS Data Analytics exam on your first try. The Tutorials Dojo practice exams are well-regarded as among the best AWS practice test reviewers in the market.

AWS Services to Focus On

The AWS Certified Data Analytics Specialty has five domains: Collection, Storage and Data Management, Processing, Analysis and Visualization, and Security. To comprehend the different scenarios in the exam, you should have a thorough understanding of the following services:

  1. Amazon Athena – learn how you can analyze the data in the S3 bucket and how you can configure and optimize Athena’s performance.
  2. Amazon CloudSearch – know the use case and features of the service.
  3. Amazon Elasticsearch – learn how you can integrate Elasticsearch and Kibana in different AWS services.
  4. Amazon EMR – understand the security, hardware, and software configurations of the EMR cluster and how you can use AWS Glue Data Catalog for table metadata.
  5. Amazon Kinesis – know the use case of each Kinesis service (Data Streams, Data Firehose, and Data Analytics) and how they differ from each other.
  6. Amazon QuickSight – learn how you can integrate QuickSight into your solution, how you can publish dashboards, reports, analytics, and how you can refresh your datasets.
  7. Amazon Redshift – understand the different SQL commands, the use case of Redshift cluster, Redshift Spectrum, and how you can analyze the data in the data warehouse.
  8. AWS Data Pipeline – learn the concepts and components of the pipeline.
  9. AWS Glue – understand the concepts of the data catalog, crawlers, workflows, triggers, jobs, job bookmarks, and job metrics.

You must know how these services interact with each other to develop a complete data analytics solution on AWS. Also, prepare to see various Apache technologies, such as Apache Parquet, ORC, Avro, Oozie, Sqoop, HBase, and many more.

Common Exam Scenarios

Each scenario below is paired with the solution you would be expected to choose:

Collection

  • Scenario: A near-real-time solution is needed that collects only the non-confidential portion of sensitive streaming data and stores it in durable storage.
    Solution: Use Amazon Kinesis Data Firehose to ingest the streaming data and enable record transformation with AWS Lambda to exclude the sensitive data. Store the processed data in Amazon S3.

  • Scenario: Large files are compressed into a single GZIP file and uploaded to an S3 bucket. You have to speed up the COPY process that loads the data into Amazon Redshift.
    Solution: Split the GZIP file into smaller files and make sure that their number is a multiple of the number of slices in the Redshift cluster.

  • Scenario: An Amazon EMR cluster needs a centralized metadata layer that will expose data in Amazon S3 as tables.
    Solution: Use the AWS Glue Data Catalog as the cluster’s metastore.

  • Scenario: Write requests to an Amazon Kinesis data stream are being throttled.
    Solution: Increase the number of shards using the UpdateShardCount API, and use random partition keys so that records are distributed evenly across the shards.

  • Scenario: A company needs a cost-effective solution for detecting anomalous data coming from an Amazon Kinesis data stream.
    Solution: Create a Kinesis Data Analytics application and use the RANDOM_CUT_FOREST function for anomaly detection.
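One of the fixes above recommends random partition keys for a throttled stream. The reason can be simulated locally: Kinesis maps the MD5 hash of each record’s partition key onto a 128-bit hash-key space, which an evenly split stream divides equally among its shards. In this plain-Python sketch (the shard count is chosen arbitrarily), a single hot key pins every record to one shard while random keys spread the load:

```python
import hashlib
import uuid
from collections import Counter

NUM_SHARDS = 4          # assume an evenly split 4-shard stream
HASH_SPACE = 2 ** 128   # Kinesis partition-key hash-key space

def shard_for(partition_key: str) -> int:
    """Map MD5(partition_key) onto the 128-bit hash-key space, which an
    evenly split stream divides equally among its shards."""
    h = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")
    return h * NUM_SHARDS // HASH_SPACE

# A single hot key drives every record to the same shard (throttling risk)...
hot = Counter(shard_for("device-42") for _ in range(1000))
# ...while random keys spread the records across all shards.
spread = Counter(shard_for(uuid.uuid4().hex) for _ in range(1000))

print(len(hot), len(spread))  # shards hit by each strategy
```

With 1,000 records, the hot key lands on exactly one shard while the random keys touch all four, which is why randomizing the key (when per-key ordering is not needed) relieves per-shard write limits.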

Storage and Data Management

  • Scenario: A company wants a cost-effective solution that will enable it to query a subset of data from a CSV file stored in Amazon S3.
    Solution: Use Amazon S3 Select.

  • Scenario: You need to populate a data catalog using data stored in Amazon S3, Amazon RDS, and Amazon DynamoDB.
    Solution: Use an AWS Glue crawler on a schedule.

  • Scenario: A Data Analyst used the COPY command to migrate CSV files into a Redshift cluster. However, no data was imported and no errors were reported after the process finished. What could explain this?
    Solution: The CSV files use carriage returns as line terminators, or the IGNOREHEADER parameter was included in the COPY command.

  • Scenario: What is a cost-effective way to save Redshift query results to external storage?
    Solution: Use the Amazon Redshift UNLOAD command.

  • Scenario: A company uses Amazon S3 Standard-IA and Amazon S3 Glacier as its data storage. Some data cannot be accessed with Amazon Athena queries. What best explains this?
    Solution: Amazon Athena is trying to access data stored in Amazon S3 Glacier, which Athena cannot query directly.
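The carriage-return pitfall above is easy to screen for before running COPY. The helper below is a hypothetical pre-flight check, not part of any AWS tooling; it flags files that use CR-only line endings, the kind of file COPY can finish loading without errors yet import zero rows from:

```python
def uses_cr_only(sample: bytes) -> bool:
    """Flag CR-only line terminators in a file sample: the file contains
    carriage returns but no line feeds at all."""
    return b"\r" in sample and b"\n" not in sample

# Hypothetical samples: a classic Mac-style export vs. a Unix-style one.
mac_style_csv = b"id,name\r1,Alice\r2,Bob\r"
unix_style_csv = b"id,name\n1,Alice\n2,Bob\n"

print(uses_cr_only(mac_style_csv))   # CR-only: convert before COPY
print(uses_cr_only(unix_style_csv))  # LF terminators: loads normally
```

Running a check like this on the first few kilobytes of each file before issuing COPY catches the silent-zero-rows case early.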

Processing

  • Scenario: A company uses an Amazon EMR cluster to process 10 batch jobs every day, each taking about 20 minutes to complete. A solution to lower the cost of the EMR cluster must be implemented.
    Solution: Use transient Amazon EMR clusters that terminate once the jobs are done.

  • Scenario: An application built with the Amazon Kinesis Client Library (KCL) uses a DynamoDB table with provisioned write capacity. The application’s latency increases during peak times, and this must be resolved immediately.
    Solution: Increase the write throughput of the DynamoDB table.

  • Scenario: Thousands of files are being loaded into a central fact table hosted on Amazon Redshift. You need to optimize cluster resource utilization when loading data into the fact table.
    Solution: Use a single COPY command to load the data.

  • Scenario: A Lambda function processes data from a Kinesis data stream and delivers the results to Amazon ES. During peak hours, the processing slows down.
    Solution: Use multiple Lambda functions to process the data concurrently.

  • Scenario: A Data Analyst needs to join data stored in Amazon Redshift with data stored in Amazon S3, using a serverless approach that reduces the workload of the Redshift cluster.
    Solution: Create an external table for the S3 data using Amazon Redshift Spectrum and run the join operations with Redshift SQL queries.
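The fix of using multiple Lambda functions is plain fan-out: total processing time drops roughly by the parallelism factor. The local sketch below uses a thread pool and a sleep as a stand-in workload; it illustrates the concurrency effect only and is not actual Lambda code:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process(record: str) -> str:
    """Stand-in for per-record work (e.g., transform and index a record)."""
    time.sleep(0.01)
    return record.upper()

records = [f"event-{i}" for i in range(100)]

start = time.perf_counter()
# Serially, 100 records x 0.01 s would take about 1 second; fanning the
# work out across 10 workers cuts that to roughly 0.1 second.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(process, records))
elapsed = time.perf_counter() - start

print(len(results), round(elapsed, 2))
```

In Kinesis terms, the same effect is achieved by raising the Lambda parallelization per shard or adding consumers, so each batch of records waits less before being processed.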

Analysis and Visualization

  • Scenario: A company requires an out-of-the-box solution for visualizing complex real-world scenarios and forecasting trends.
    Solution: Use ML-powered forecasting in Amazon QuickSight.

  • Scenario: A Data Analyst needs to use Amazon QuickSight to create daily reports based on a dataset stored in Amazon S3.
    Solution: Create a daily scheduled refresh for the dataset.

  • Scenario: A company encountered an “import into SPICE” error after using Amazon QuickSight to query a new Amazon Athena table that is associated with a new S3 bucket.
    Solution: Configure the correct permissions for the new S3 bucket from the QuickSight console.

  • Scenario: A company needs a cost-effective solution for ad-hoc analyses and data visualizations.
    Solution: Use Amazon Athena with Amazon QuickSight.

  • Scenario: A company needs to visualize and analyze web logs in near-real time.
    Solution: Use Amazon Kinesis Data Firehose to stream the logs into Amazon Elasticsearch, and visualize them using Kibana.

Security

  • Scenario: Root device volume encryption must be enabled on all nodes of an EMR cluster, and AWS CloudFormation is required for creating the new resources.
    Solution: Create a custom AMI with an encrypted root device volume and reference its AMI ID in the CustomAmiId property of the CloudFormation template.

  • Scenario: A solution is needed to encrypt data stored in the EBS volumes attached to an EMR cluster.
    Solution: Use Linux Unified Key Setup (LUKS).

  • Scenario: A company is having trouble accessing data in a Redshift cluster from Amazon QuickSight.
    Solution: Create a new inbound rule in the cluster’s security group that allows access from the IP address range that Amazon QuickSight uses.

  • Scenario: A company wants to prevent users from creating EMR clusters that are accessible from the public Internet.
    Solution: Enable the “block public access” setting in the Amazon EMR console.

  • Scenario: A company wants the data in a Kinesis data stream to be encrypted, and it wants to manage the key rotation itself.
    Solution: Specify a customer-managed Customer Master Key (CMK) when enabling server-side encryption for the stream.
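The CustomAmiId scenario above can be sketched as a CloudFormation template expressed as a Python dict so it can be dumped to JSON. Every identifier below, from the AMI ID to the role names and instance types, is a placeholder, and the template is trimmed to the properties relevant to the illustration:

```python
import json

# Hedged sketch: a minimal AWS::EMR::Cluster that references a custom AMI
# whose root device volume was encrypted before the AMI was created.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "EncryptedRootEmr": {
            "Type": "AWS::EMR::Cluster",
            "Properties": {
                "Name": "encrypted-root-cluster",
                "ReleaseLabel": "emr-5.30.0",
                # The key property: cluster nodes launch from this AMI,
                # so their root device volumes are encrypted.
                "CustomAmiId": "ami-0123456789abcdef0",
                "JobFlowRole": "EMR_EC2_DefaultRole",
                "ServiceRole": "EMR_DefaultRole",
                "Instances": {
                    "MasterInstanceGroup": {"InstanceCount": 1, "InstanceType": "m5.xlarge"},
                    "CoreInstanceGroup": {"InstanceCount": 2, "InstanceType": "m5.xlarge"},
                },
            },
        }
    },
}

print(json.dumps(template, indent=2)[:80])  # preview the rendered JSON
```

Note that CustomAmiId is only honored on sufficiently recent EMR release labels, so the release and AMI must be chosen together.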

 

Validate Your Knowledge

After you’ve reviewed the materials above, the next resource that you should check is the FREE AWS sample questions for AWS Data Analytics Specialty. Although this sample exam is easier than the real exam, it is still a helpful resource for your review. Be sure to check the sample questionnaire often, since AWS may upload a new version of it.

For high-quality practice exams, you can use our AWS Certified Data Analytics Specialty Practice Exams. These practice tests will help you boost your preparedness for the real exam. They contain multiple sets of questions that cover almost every area that you can expect from the real certification exam. We have also included detailed explanations and adequate reference links to help you understand why the correct option is better than the rest. This is the value that you will get from our course. Practice exams are a great way to determine which areas you are weak in, and they will also highlight the important information that you might have missed during your review.

AWS Certified Data Analytics Specialty

Sample Practice Test Questions:

Question 1

A company provides insights into user behaviors of its social media platform using Amazon Athena. The Data Analysts from different teams run ad-hoc queries on the data stored in Amazon S3 buckets. However, some data contains sensitive information that must adhere to certain security policies. The query history and execution must be separated among different users and teams for compliance purposes.

Which of the following should be implemented to meet the above requirements?

  1. Set up an S3 bucket for each team and assign bucket policies that grant appropriate permissions to individual IAM users. Enable S3 server access logging on the buckets to store historical queries in another S3 bucket.
  2. Set up an Athena workgroup for each team and apply tags to each workgroup. Using these tags, grant appropriate permissions to the workgroup with IAM policies. Have the members use their assigned Athena workgroup.
  3. Set up an IAM group for each team, create an appropriate IAM policy to use Athena, and attach it to the IAM group. Add the individual users to the IAM group and use this permission to query on Athena.
  4. Set up an IAM group for each team, grant specific Athena permissions to each IAM group. Create an AWS Glue Data Catalog resource policy for each IAM group to record the Athena queries.

Correct Answer: 2

AWS recommends using Athena workgroups to isolate queries for teams, applications, or different workloads. For example, you may create separate workgroups for two different teams in your organization. You can also separate workloads. For example, you can create two independent workgroups, one for automated scheduled applications, such as report generation, and another for ad-hoc usage by analysts. You can switch between workgroups.

With Athena Workgroups, you can:

  • Isolate users, teams, applications, or workloads into groups.
  • Enforce cost constraints.
  • Track query-related metrics for all workgroup queries in CloudWatch.

Setting up workgroups involves creating them and establishing permissions for their usage. Each workgroup that you create shows saved queries and query history only for queries that ran in it, and not for all queries in the account. This separates your queries from other queries within an account and makes it more efficient for you to locate your own saved queries and queries in history.
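The tag-based permissions in the correct answer can be sketched as an IAM policy. The tag key `team` and its value below are illustrative assumptions, paired with the usual Athena query-execution actions:

```python
import json

# Hypothetical IAM policy: allow running queries only in Athena workgroups
# tagged team=analytics (the tag key and value are chosen for illustration).
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "athena:StartQueryExecution",
            "athena:GetQueryExecution",
            "athena:GetQueryResults",
        ],
        "Resource": "arn:aws:athena:*:*:workgroup/*",
        "Condition": {
            # Matches only workgroups carrying the team=analytics tag.
            "StringEquals": {"aws:ResourceTag/team": "analytics"}
        },
    }],
}

print(json.dumps(policy, indent=2))
```

Attaching a policy like this to each team’s IAM principals confines their queries (and therefore their query history) to their own workgroup.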

Hence, the correct answer is: Set up an Athena workgroup for each team and apply tags to each workgroup. Using these tags, grant appropriate permissions to the workgroup with IAM policies. Have the members use their assigned Athena workgroup.

The option that says: Set up an S3 bucket for each team and assign bucket policies that grant appropriate permissions to individual IAM users. Enable S3 server access logging on the buckets to store historical queries in another S3 bucket is incorrect. Although this is technically possible, this type of setup is not recommended. The S3 access logs will record all user activity on the bucket, not just Athena queries. Additionally, you will have to configure each bucket for permissions for each IAM group.

The option that says: Set up an IAM group for each team, create an appropriate IAM policy to use Athena, and attach it to the IAM group. Add the individual users to the IAM group and use this permission to query on Athena is incorrect because using IAM groups is not a suitable solution for isolating queries and tracking query history for teams, applications, or different workloads in Athena. A better solution is to use Athena Workgroups.

The option that says: Set up an IAM group for each team, grant specific Athena permissions to each IAM group. Create an AWS Glue Data Catalog resource policy for each IAM group to record the Athena queries is incorrect because an AWS Glue resource policy can only be used to manage permissions for Data Catalog resources, not Athena.

References:
https://aws.amazon.com/about-aws/whats-new/2019/02/athena_workgroups/
https://docs.aws.amazon.com/athena/latest/ug/workgroups.html
https://docs.aws.amazon.com/athena/latest/ug/user-created-workgroups.html

Check out this Amazon Athena Cheat Sheet:
https://tutorialsdojo.com/amazon-athena/

Question 2

A digital marketing company uses Amazon DynamoDB and highly available Amazon EC2 instances for one of its solutions. Its application logs are pushed to Amazon CloudWatch Logs. The team of data analysts wants to enrich these logs with data from DynamoDB in near-real time and use the output for further study.

Which among these steps will enable collection and enrichment based on the requirements stated above?

  1. Export the EC2 application logs to Amazon S3 on an hourly basis using AWS CLI. Use AWS Glue crawlers to catalog the logs. Configure an AWS Glue connection to the DynamoDB table and an AWS Glue ETL job to enrich the data. Store the enriched data in an Amazon S3 bucket.
  2. Write an AWS Lambda function that will enrich the data in the DynamoDB table. Create an Amazon Kinesis Data Firehose delivery stream, configure it to subscribe to Amazon CloudWatch Logs, and set an Amazon S3 bucket as its destination. Create a CloudWatch Logs subscription that sends log events to your delivery stream.
  3. Write an AWS Lambda function that will export the EC2 application logs to Amazon S3 on an hourly basis. Use Apache Spark SQL on Amazon EMR to read the logs from Amazon S3 and enrich the records with the data from DynamoDB. Store the enriched data in an Amazon S3 bucket.
  4. Install Amazon Kinesis Agent on the EC2 instance. Configure the application to write the logs in a local filesystem and configure Amazon Kinesis Agent to send the data to Amazon Kinesis Data Streams. Configure a Kinesis Data Analytics SQL application with the Kinesis data stream as the source and enrich it with data from the DynamoDB table. Store the enriched output stream in an Amazon S3 bucket using Amazon Kinesis Data Firehose.

Correct Answer: 2

Amazon Kinesis Data Firehose captures, transforms, and loads streaming data from sources such as a Kinesis data stream, the Kinesis Agent, or Amazon CloudWatch Logs into downstream services such as Kinesis Data Analytics or Amazon S3. You can write Lambda functions to request additional, customized processing of the data before it is sent downstream. AWS Lambda can perform data enrichment, such as looking up data from a DynamoDB table, and then produce the enriched data onto another stream. Lambda is also commonly used to preprocess data so that downstream analytics applications can handle more complicated data formats.

There are blueprints that you can use to create a Lambda function for data transformation. It includes a blueprint that reads CloudWatch Logs. Data sent from CloudWatch Logs to Amazon Kinesis Data Firehose is already compressed with gzip level 6 compression, so you do not need to use compression within your Kinesis Data Firehose delivery stream.
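Since the subscription payload arrives base64-encoded and gzip-compressed, the first job of a transformation Lambda is to reverse both encodings. Below is a minimal, locally runnable sketch; the sample payload shape is an assumption that mimics a subscription-filter event:

```python
import base64
import gzip
import json

def decode_cwl_record(data_b64: str) -> dict:
    """Reverse the two encodings CloudWatch Logs applies to subscription
    data: base64 on the outside, gzip on the inside, JSON at the core."""
    return json.loads(gzip.decompress(base64.b64decode(data_b64)))

# Hypothetical record mimicking the subscription-filter payload shape.
payload = {
    "logGroup": "/app/web",
    "logEvents": [{"id": "1", "timestamp": 1700000000000, "message": "GET /index"}],
}
record = base64.b64encode(gzip.compress(json.dumps(payload).encode())).decode()

decoded = decode_cwl_record(record)
print(decoded["logGroup"], len(decoded["logEvents"]))
```

After this decoding step, the function can look up matching items in DynamoDB, attach them to each log event, and return the enriched records to the delivery stream.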

Thus, the correct answer is: Write an AWS Lambda function that will enrich the data in the DynamoDB table. Create an Amazon Kinesis Data Firehose delivery stream, configure it to subscribe to Amazon CloudWatch Logs, and set an Amazon S3 bucket as its destination. Create a CloudWatch Logs subscription that sends log events to your delivery stream.

The option that says: Export the EC2 application logs to Amazon S3 on an hourly basis using AWS CLI. Use AWS Glue crawlers to catalog the logs. Configure an AWS Glue connection to the DynamoDB table and an AWS Glue ETL job to enrich the data. Store the enriched data in an Amazon S3 bucket is incorrect. It does not fulfill the near real-time analysis requirement since the data is only exported on an hourly basis.

The option that says: Write an AWS Lambda function that will export the EC2 application logs to Amazon S3 on an hourly basis. Use Apache Spark SQL on Amazon EMR to read the logs from Amazon S3 and enrich the records with the data from DynamoDB. Store the enriched data in an Amazon S3 bucket is incorrect. This does not fulfill the near real-time analysis requirement. For cost-saving matters, it is more strategic to avoid using Amazon EMR as it entails additional costs to run its underlying EC2 instances.

The option that says: Install Amazon Kinesis Agent on the EC2 instance. Configure the application to write the logs in a local filesystem and configure Amazon Kinesis Agent to send the data to Amazon Kinesis Data Streams. Configure a Kinesis Data Analytics SQL application with the Kinesis data stream as the source and enrich it with data from the DynamoDB table. Store the enriched output stream in an Amazon S3 bucket using Amazon Kinesis Data Firehose is incorrect. Installing the Kinesis Agent on the EC2 instance is unwarranted since the existing CloudWatch Logs integration can already deliver the logs. Creating a Kinesis Data Analytics SQL application is also unnecessary and quite costly.

References:
https://docs.aws.amazon.com/firehose/latest/dev/writing-with-cloudwatch-logs.html
https://aws.amazon.com/blogs/big-data/joining-and-enriching-streaming-data-on-amazon-kinesis/
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html

Check out this Amazon Kinesis Cheat Sheet:
https://tutorialsdojo.com/amazon-kinesis

Click here for more AWS Certified Data Analytics Specialty practice exam questions.

Check out our other AWS practice test courses here: Tutorials Dojo AWS Practice Tests

Final Remarks

To understand a service at a higher level, we recommend that you get hands-on experience. A lot of questions in the exam try to validate whether you’ve seen a particular error or issue during your practice. To prepare yourself for the actual exam, you can use the AWS Free Tier account to simulate different scenarios. With the combination of theoretical and practical knowledge, you can pass the test with flying colors.

We hope that our guide has helped you achieve your goal, and we would love to hear back from you after your exam. Remember that the most important thing before the day of your exam is to get some well-deserved rest. Good luck, and we wish you all the best.
