AWS Analytics Services

Building Data Pipelines with No-Code ETL Using AWS Glue Studio

2024-04-03T00:33:32+00:00

Introduction: Welcome to the dynamic world of AWS Data Engineering! This beginner-friendly guide introduces you to the essentials of data staging and transformation within the AWS ecosystem without requiring you to write code. By exploring the foundational use of Amazon S3 and AWS Glue, this guide provides a practical starting point for understanding how data is handled and processed on AWS. Whether you're aiming for certification or looking to apply these skills in practical scenarios, this guide lays the groundwork for your future in data engineering.

Preparation: Navigating Through Sample Datasets. In this article, we'll work with 3 main datasets for a fictional [...]

AWS Glue

2024-03-27T07:36:42+00:00

AWS Glue Cheat Sheet: A fully managed service to extract, transform, and load (ETL) your data for analytics. Discover and search across different AWS data sets without moving your data. AWS Glue consists of a central metadata repository, an ETL engine, and a flexible scheduler.

Use Cases: Run queries against an Amazon S3 data lake. You can use AWS Glue to make your data available for analytics without moving your data. Analyze [...]

Kinesis Scaling, Resharding and Parallel Processing

2023-03-20T03:29:22+00:00

Kinesis Resharding enables you to increase or decrease the number of shards in a stream in order to adapt to changes in the rate of data flowing through the stream. Resharding is always pairwise. You cannot split into more than two shards in a single operation, and you cannot merge more than two shards in a single operation. The Kinesis Client Library (KCL) tracks the shards in the stream using an Amazon DynamoDB table, and adapts to changes in the number of shards that result from resharding. When new shards are created as a result of resharding, the KCL discovers [...]
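The pairwise rule described above can be sketched in a few lines. This is only an illustration, assuming each shard is modeled as an inclusive (start, end) range over the 128-bit hash key space; the helper names `split_range` and `merge_ranges` are hypothetical and do not call the Kinesis API.

```python
# Hypothetical model of pairwise resharding; not part of any AWS SDK.
# Assumption: a shard is an inclusive (start, end) range of 128-bit hash keys.
MAX_HASH_KEY = 2**128 - 1


def split_range(start: int, end: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """Split one shard range into exactly two child ranges at the midpoint."""
    if end <= start:
        raise ValueError("range must contain at least two hash keys")
    # The second child starts at the midpoint of the parent range.
    new_start = start + (end - start + 1) // 2
    return (start, new_start - 1), (new_start, end)


def merge_ranges(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Merge exactly two shard ranges; they must be adjacent."""
    low, high = sorted([a, b])
    if low[1] + 1 != high[0]:
        raise ValueError("only adjacent shards can be merged")
    return (low[0], high[1])


if __name__ == "__main__":
    left, right = split_range(0, MAX_HASH_KEY)
    print(left, right)
    print(merge_ranges(left, right) == (0, MAX_HASH_KEY))
```

Growing a stream from one shard to four under this model takes three split operations, since each call produces only two children.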

Amazon QuickSight

2023-06-23T08:06:48+00:00

Amazon QuickSight Cheat Sheet: Amazon QuickSight is a cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from your data, anytime, on any device.

Features: Provides ML Insights for discovering hidden trends and outliers, identifying key business drivers, and performing powerful what-if analysis and forecasting. Has a wide library of visualizations, charts, and tables. You can add interactive features like drill-downs and filters, and perform automatic data refreshes to build interactive dashboards. Allows you to schedule [...]

Amazon OpenSearch Service (formerly Amazon Elasticsearch Service)

2023-06-12T06:48:45+00:00

Amazon OpenSearch Service Cheat Sheet: Amazon OpenSearch lets you search, analyze, and visualize your data in real time. This service manages the capacity, scaling, patching, and administration of your Elasticsearch clusters for you, while still giving you direct access to the Elasticsearch APIs. The service offers open-source Elasticsearch APIs, managed Kibana, and integrations with Logstash and other AWS services. This combination is often referred to as the ELK Stack.

Amazon OpenSearch Concepts: An Amazon OpenSearch domain is synonymous with an Elasticsearch cluster. Domains are clusters with the settings, instance types, instance counts, and storage resources that you specify. You can create multiple [...]

Amazon Kinesis

2023-07-29T06:16:04+00:00

Amazon Kinesis Cheat Sheet: Makes it easy to collect, process, and analyze real-time, streaming data. Kinesis can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications.

Kinesis Video Streams: A fully managed AWS service that you can use to stream live video from devices to the AWS Cloud, or build applications for real-time video processing or batch-oriented video [...]

Amazon EMR

2023-06-23T08:02:25+00:00

Amazon EMR Cheat Sheet: A managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data. You can process data for analytics purposes and business intelligence workloads using EMR together with Apache Hive and Apache Pig. You can use EMR to transform and move large amounts of data into and out of [...]

AWS Data Pipeline

2023-06-23T08:08:48+00:00

AWS Data Pipeline Cheat Sheet: A web service for scheduling regular data movement and data processing activities in the AWS cloud. Data Pipeline integrates with on-premises and cloud-based storage systems. A managed ETL (Extract-Transform-Load) service. Native integration with S3, DynamoDB, RDS, EMR, EC2, and Redshift.

Features: You can quickly and easily provision pipelines that remove the development and maintenance effort required to manage your daily data operations, letting you focus on generating insights from that data. [...]

Amazon CloudSearch

2023-06-23T07:59:15+00:00

Amazon CloudSearch Cheat Sheet: A fully managed service in the AWS Cloud that makes it easy to set up, manage, and scale a search solution for your website or application.

Features: You can use CloudSearch to index and search both structured data and plain text. Supported features include full text search with language-specific text processing, Boolean search, prefix searches, range searches, term boosting, faceting, highlighting, and autocomplete suggestions. You can get search results in JSON or XML, sort and filter results based on field values, and sort results alphabetically, numerically, [...]

Amazon Athena

2024-02-14T12:52:39+00:00

Amazon Athena Cheat Sheet: An interactive query service that makes it easy to analyze data directly in Amazon S3 and other data sources using SQL.

Features: Athena is serverless. Has a built-in query editor. Uses Presto, an open source, distributed SQL query engine optimized for low latency, ad hoc analysis of data. Athena supports a wide variety of data formats such as CSV, JSON, ORC, Avro, or Parquet. Athena automatically executes queries in [...]
