Last updated on August 15, 2025
Amazon DataZone Cheat Sheet
- Amazon DataZone is a fully managed data management service by AWS.
- Lets organizations catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources.
- Enables organizations to implement a data mesh architecture, promoting decentralized data ownership and self-service analytics.
- Integrates with AWS services such as Amazon Redshift, Amazon Athena, AWS Glue, and AWS Lake Formation.
Features
- Business Data Catalog: Organizes data assets within the business context, making them easily discoverable.
- Data Products: Groups related data assets into cohesive units for specific business use cases, simplifying access and management.
- Automated Metadata Generation: Utilizes large language models (LLMs) to auto-generate business names and descriptions for data assets.
- Faceted Search: Allows users to search and filter data assets using business terms and metadata.
- Data Lineage: Provides visibility into the data’s origin, transformations, and consumption.
- Governed Data Sharing: Ensures secure and compliant data access across organizational boundaries.
- Fine-Grained Access Controls: Implements row and column-level filters to restrict data access based on user roles.
- Integration with BI Tools: Supports connections with tools like Tableau and Power BI for data visualization.
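To make the Faceted Search idea concrete, here is a minimal, self-contained sketch of searching a small in-memory catalog by free text and then narrowing by facets. It is a conceptual illustration only: the `CATALOG` records, the field names (`domain`, `asset_type`, `terms`), and both helper functions are hypothetical and do not reflect the Amazon DataZone API or its metadata schema.

```python
from collections import Counter

# Hypothetical catalog entries; field names are illustrative, not DataZone's schema.
CATALOG = [
    {"name": "Customer Orders", "domain": "Sales",
     "asset_type": "Redshift table", "terms": {"orders", "revenue"}},
    {"name": "Web Clickstream", "domain": "Marketing",
     "asset_type": "Glue table", "terms": {"sessions", "traffic"}},
    {"name": "Order Returns", "domain": "Sales",
     "asset_type": "Glue table", "terms": {"orders", "refunds"}},
]


def faceted_search(catalog, text="", **facets):
    """Return assets whose name or business terms contain `text`
    and whose metadata equals every requested facet value."""
    needle = text.lower()
    results = []
    for asset in catalog:
        matches_text = (
            not needle
            or needle in asset["name"].lower()
            or any(needle in term for term in asset["terms"])
        )
        matches_facets = all(asset.get(key) == value for key, value in facets.items())
        if matches_text and matches_facets:
            results.append(asset)
    return results


def facet_counts(assets, facet):
    """Count how many assets fall under each value of one facet."""
    return Counter(asset[facet] for asset in assets)


if __name__ == "__main__":
    hits = faceted_search(CATALOG, "order")
    print([asset["name"] for asset in hits])
    print(facet_counts(hits, "asset_type"))
    print([a["name"] for a in faceted_search(CATALOG, "order", asset_type="Glue table")])
```

The facet counts are what a search interface would typically show next to each filter option, so users can see how many assets remain before they narrow the results.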
Use Cases
- Enables users to find and access relevant data assets quickly.
- Empowers business users to analyze data without heavy reliance on IT.
- Ensures compliance and security through controlled data access and auditing.
- Facilitates cross-team collaboration by sharing data assets and insights.
- Supports decentralized data architecture, promoting domain-specific data ownership.
Security
- IAM Integration: Utilizes AWS Identity and Access Management for user authentication and authorization.
- Lake Formation Integration: Leverages AWS Lake Formation for fine-grained data access controls.
- Audit Trails: Provides logging and monitoring capabilities for data access and usage.
- Compliance: Aligns with AWS’s compliance programs to meet regulatory requirements.
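The fine-grained access controls mentioned above combine two restrictions: a row filter (which records a role may see) and a column filter (which fields of those records are returned). The sketch below shows that idea in plain Python. It is not how Lake Formation or Amazon DataZone enforce filters internally; the role names, the `ROLE_FILTERS` structure, and the sample `ORDERS` data are all hypothetical.

```python
# Hypothetical sample data standing in for a governed table.
ORDERS = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0, "customer_email": "a@example.com"},
    {"order_id": 2, "region": "APAC", "amount": 80.0, "customer_email": "b@example.com"},
]

# Hypothetical per-role policies: allowed columns plus a row predicate.
ROLE_FILTERS = {
    "emea_analyst": {
        "columns": ["order_id", "region", "amount"],   # hides customer_email
        "row_filter": lambda row: row["region"] == "EMEA",
    },
    "auditor": {
        "columns": ["order_id", "region", "amount", "customer_email"],
        "row_filter": lambda row: True,                 # sees every row
    },
}


def read_as(role, rows):
    """Return only the rows and columns the given role is allowed to see."""
    policy = ROLE_FILTERS[role]
    return [
        {column: row[column] for column in policy["columns"]}
        for row in rows
        if policy["row_filter"](row)
    ]


if __name__ == "__main__":
    print(read_as("emea_analyst", ORDERS))
    print(read_as("auditor", ORDERS))
```

In a real deployment the equivalent rules would be defined in the governing service rather than in application code, so that every query engine reading the table sees the same restricted view.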
Pricing
- Pay-As-You-Go Model: Charges based on resource usage without upfront fees or long-term commitments.
- Free Tier: Offers 20 MB of metadata storage, 4,000 API requests, and 0.2 compute units per month at no cost.
- Pricing Dimensions:
  - Requests: $10 per 100,000 requests.
  - Metadata Storage: $0.4 per GB.
  - Compute: $1.776 per compute unit.
  - Recommendations: $0.015 per 1,000 input tokens and $0.075 per 1,000 output tokens.
- Additional Costs: Users may incur charges for services like Amazon Athena, Amazon Redshift, and Amazon S3 when accessing data through Amazon DataZone.
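The pricing dimensions above can be combined into a rough monthly estimate. The following sketch uses only the rates and free-tier figures listed in this cheat sheet, which may be out of date. It also makes several assumptions that the source does not state: that free-tier allowances are subtracted from usage before billing, that 1 GB equals 1,024 MB, that the storage rate applies per month, and that recommendation tokens have no free allowance. The function name and its parameters are illustrative.

```python
# Rates and free-tier allowances copied from the Pricing list above.
# They may be out of date; confirm against the official pricing page.
RATE_PER_REQUEST = 10.0 / 100_000
RATE_PER_GB_STORAGE = 0.4
RATE_PER_COMPUTE_UNIT = 1.776
RATE_PER_INPUT_TOKEN = 0.015 / 1_000
RATE_PER_OUTPUT_TOKEN = 0.075 / 1_000

FREE_REQUESTS = 4_000
FREE_STORAGE_GB = 20 / 1024   # assumption: 1 GB = 1,024 MB
FREE_COMPUTE_UNITS = 0.2


def estimate_monthly_cost(requests, storage_gb, compute_units,
                          input_tokens=0, output_tokens=0):
    """Return a per-dimension cost breakdown in USD for one month."""
    # Assumption: the free tier is subtracted from usage before billing.
    breakdown = {
        "requests": max(0, requests - FREE_REQUESTS) * RATE_PER_REQUEST,
        "storage": max(0.0, storage_gb - FREE_STORAGE_GB) * RATE_PER_GB_STORAGE,
        "compute": max(0.0, compute_units - FREE_COMPUTE_UNITS) * RATE_PER_COMPUTE_UNIT,
        # Assumption: recommendation tokens have no free allowance.
        "recommendations": (input_tokens * RATE_PER_INPUT_TOKEN
                            + output_tokens * RATE_PER_OUTPUT_TOKEN),
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    estimate = estimate_monthly_cost(requests=104_000, storage_gb=1.5,
                                     compute_units=1.2,
                                     input_tokens=10_000, output_tokens=2_000)
    for dimension, cost in estimate.items():
        print(f"{dimension}: ${cost:.2f}")
```

The estimate excludes the additional charges from services such as Amazon Athena, Amazon Redshift, and Amazon S3, which are billed separately.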
References:
https://aws.amazon.com/datazone/
https://docs.aws.amazon.com/datazone/latest/userguide/what-is-datazone.html