Last updated on November 30, 2025
Amazon Macie Cheat Sheet
- Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in Amazon S3.
- Primary Focus: S3 bucket security posture (encryption/public access) and sensitive data discovery (PII, PHI, credentials).
- Legacy Note: Macie Classic features (CloudTrail anomaly detection, user behavior analytics) have been removed. Use Amazon GuardDuty for threat detection and AWS CloudTrail Insights for anomaly detection.
Key Features
1. Automated Sensitive Data Discovery (New)
- Continuous Sampling: Once enabled, Macie automatically and continually samples objects from your S3 buckets to inspect them for sensitive data.
- Heatmap: Generates an interactive data map (heatmap) in the console, showing which buckets contain sensitive data (e.g., “High Sensitivity” due to credit card numbers found).
- Cost-Effective: Designed to provide broad visibility at a fraction of the cost of a full scan by inspecting only a representative sample of data.
2. Targeted Sensitive Data Discovery Jobs
- Deep Analysis: You can create specific Jobs to perform deep, full scans of specific buckets.
- Schedules: Jobs can be One-time (for compliance audits) or Scheduled (daily/weekly/monthly) to scan only new/modified objects.
- Scope: You can filter scans by object prefix (folder), tags, or file size.
3. S3 Bucket Inventory & Assessment
- Security Posture: Automatically evaluates all S3 buckets for security configurations.
- Policy Findings: Alerts you to unencrypted buckets, publicly accessible buckets, or buckets shared with external AWS accounts.
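As a rough illustration of the targeted jobs described in item 2, the sketch below builds a request body for a discovery job scoped to one bucket and one object prefix. The field names follow the shape of the Macie `CreateClassificationJob` API as exposed by the boto3 `macie2` client, to the best of our understanding; check them against the current API reference before use. The helper name `build_job_request`, the account ID, bucket, and prefix are all hypothetical, and nothing here calls AWS.

```python
import json


def build_job_request(name, account_id, bucket, prefix, weekly_day=None):
    """Build a request dict for a Macie discovery job (hypothetical helper).

    If weekly_day is given (e.g. "MONDAY"), the job is scheduled weekly;
    otherwise it is a one-time job.
    """
    request = {
        "name": name,
        "jobType": "SCHEDULED" if weekly_day else "ONE_TIME",
        "s3JobDefinition": {
            # Which account and bucket to scan.
            "bucketDefinitions": [{"accountId": account_id, "buckets": [bucket]}],
            # Limit the scan to object keys that start with the given prefix.
            "scoping": {
                "includes": {
                    "and": [
                        {
                            "simpleScopeTerm": {
                                "key": "OBJECT_KEY",
                                "comparator": "STARTS_WITH",
                                "values": [prefix],
                            }
                        }
                    ]
                }
            },
        },
    }
    if weekly_day:
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": weekly_day}}
    return request


# Placeholder values for illustration only.
one_time = build_job_request("audit-2025", "111122223333", "example-bucket", "exports/")
weekly = build_job_request("weekly-scan", "111122223333", "example-bucket", "exports/", "MONDAY")
print(json.dumps(weekly, indent=2))
```

With boto3 installed and credentials configured, a dict like this could be passed as keyword arguments to the `macie2` client's `create_classification_job` call; that step is left out so the sketch runs without an AWS account.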
Concepts
Data Identifiers
Macie uses “Identifiers” to recognize sensitive data. The old “Themes/Regex/SVM” terminology is largely retired in favor of:
- Managed Data Identifiers: A built-in library of patterns for PII (names, addresses), Financials (credit cards, bank accounts), and Credentials (AWS Secret Keys, Private Keys).
  - Includes: “Strict” vs. “High Confidence” variations to control false positives.
- Custom Data Identifiers: You define your own proprietary patterns using Regular Expressions (Regex), e.g., specific Employee IDs: EMP-[0-9]{5}.
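To get a feel for what the Employee ID pattern above would catch, it can be tried locally before it is registered as a custom data identifier. The sketch below uses Python's `re` module, which is not Macie's regex engine, so treat it as a sanity check only. The word boundaries (`\b`) are an added assumption to avoid matching inside longer tokens, and the function name and sample data are hypothetical.

```python
import re

# Same pattern as the Employee ID example; the \b anchors are an assumption
# added here so that "EMP-678901" (six digits) is not reported as a match.
EMPLOYEE_ID = re.compile(r"\bEMP-[0-9]{5}\b")


def find_employee_ids(text):
    """Return (line, column, match) tuples, both 1-based, for each hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for m in EMPLOYEE_ID.finditer(line):
            hits.append((line_no, m.start() + 1, m.group()))
    return hits


sample = "name,id\nAna,EMP-12345\nBo,EMP-123\nCy,EMP-678901\n"
hits = find_employee_ids(sample)
print(hits)  # [(2, 5, 'EMP-12345')]
```

Only the five-digit ID on the second line is reported; the too-short and too-long IDs are skipped.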
Allow Lists
- Exceptions: Defines specific text or patterns that Macie should ignore (e.g., “Sample Data” or public reference numbers that look like PII but aren’t).
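The idea behind an allow list can be sketched as a filter applied after detection: anything a pattern matches is dropped if it appears in a set of known-harmless values. This is only a conceptual illustration, not how Macie implements allow lists internally. The card-like pattern, the allowed value, and the function name are hypothetical.

```python
import re

# Hypothetical detection pattern: four groups of four digits.
CARD_LIKE = re.compile(r"\b[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}\b")

# Hypothetical allow list: a well-known test number that should be ignored.
ALLOWED_TEXT = {"4111-1111-1111-1111"}


def detect(text, allowed=ALLOWED_TEXT):
    """Return pattern matches that are not on the allow list."""
    return [m.group() for m in CARD_LIKE.finditer(text) if m.group() not in allowed]


sample = "test card 4111-1111-1111-1111, customer card 1234-5678-9012-3456"
reported = detect(sample)
print(reported)  # ['1234-5678-9012-3456']
```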
Findings
Macie generates two types of findings:
- Policy Findings: Issues with bucket security (e.g., Policy:IAMUser/S3BucketPublic).
- Sensitive Data Findings: Specific data detected inside an object (e.g., SensitiveData:S3Object/Personal).
  - Details: Includes the location of the data (line number, column) within the file.
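Because the two finding categories are visible in the finding type string itself, automation that receives findings can branch on the prefix. The sketch below splits a type string such as the two examples above on `:` and `/`. The function names and the dictionary keys are hypothetical labels chosen for illustration.

```python
def parse_finding_type(finding_type):
    """Split 'Category:Resource/Detail' into its three parts."""
    category, _, rest = finding_type.partition(":")
    resource, _, detail = rest.partition("/")
    return {"category": category, "resource": resource, "detail": detail}


def is_policy_finding(finding_type):
    """True for bucket-level policy findings, False otherwise."""
    return parse_finding_type(finding_type)["category"] == "Policy"


policy = parse_finding_type("Policy:IAMUser/S3BucketPublic")
sensitive = parse_finding_type("SensitiveData:S3Object/Personal")
print(policy)
print(sensitive)
```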
Supported Data Sources
- Amazon S3 Only: Macie scans S3 objects. It supports various file formats, including:
  - Text/Code: .txt, .csv, .json, .xml, .html, source code (Java, Python, etc.).
  - Documents: MS Office (.docx, .xlsx), PDF.
  - Archives: .zip, .tar, .gz (Macie extracts and scans the contents).
  - Big Data: Apache Parquet, Avro.
Management & Integration
- Multi-Account: Integrated with AWS Organizations. A Delegated Administrator account can manage Macie for all member accounts, viewing findings centrally.
- Findings Export: Findings are sent to Amazon EventBridge (for automation) and AWS Security Hub (for centralized posture management).
- Finding Retention: Findings are stored in Macie for 90 days. For long-term retention, you must export them to an S3 bucket.
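For the EventBridge integration, a rule needs an event pattern that selects Macie findings. The sketch below shows a minimal pattern using the `aws.macie` source and the `Macie Finding` detail type, plus a deliberately simplified local matcher to show what the pattern selects. Real EventBridge matching supports far more (nested fields, prefix and numeric operators). The matcher function and the sample event are hypothetical.

```python
import json

# Minimal event pattern selecting every Macie finding event.
MACIE_FINDING_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}


def matches_top_level(pattern, event):
    """Simplified check: each pattern key must list the event's value.

    Only top-level string fields are compared; this is not a full
    reimplementation of EventBridge pattern matching.
    """
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# Hypothetical, heavily trimmed event for illustration.
sample_event = {
    "source": "aws.macie",
    "detail-type": "Macie Finding",
    "detail": {"type": "SensitiveData:S3Object/Personal"},
}

print(json.dumps(MACIE_FINDING_PATTERN, indent=2))
print(matches_top_level(MACIE_FINDING_PATTERN, sample_event))
```

The JSON printed by the first `print` is the kind of document that would be supplied as the event pattern when creating the rule.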
Use Cases
- Regulatory Compliance: Discovering PII (GDPR/CCPA) or PHI (HIPAA) in S3 Data Lakes to ensure it is encrypted or removed.
- Data Migration Validation: Scanning data before migrating it to a lower-security environment.
- Credential Monitoring: Detecting accidental uploads of AWS Secret Keys or private certificates to S3 buckets.
Amazon Macie Pricing
- Cost Estimation: The console provides a usage estimator to predict job costs before you run them. Additional monthly fees will be incurred if you choose the optional Extended Data Retention feature.
- Macie pricing has three dimensions (the old Macie Classic pricing based on CloudTrail events no longer applies):
| Dimension | Cost (us-east-1) | Notes |
|---|---|---|
| 1. S3 Bucket Assessment | $0.10 per bucket / month | Evaluates encryption/public status. First 30 days free. |
| 2. Automated Discovery | $0.01 per 100k objects / month | Charges for the monitoring/sampling logic. |
| 3. Sensitive Data Discovery | $1.00 per GB processed | Charged for the actual bytes scanned (via Automated Discovery or Jobs). Volume discounts apply (drops to $0.50/GB after the first 50 TB in a month). |
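As a worked example of how the three dimensions combine, the sketch below estimates a monthly bill from the rates in the table. It makes several simplifying assumptions: only the two per-GB tiers mentioned in the table are modeled, 50 TB is treated as 50,000 GB, the free trial is ignored, and the object monitoring fee is always charged (as if automated discovery were enabled). The function name is hypothetical, and actual charges should be checked against the Macie pricing page.

```python
from decimal import Decimal

# Rates taken from the table above (us-east-1).
BUCKET_FEE = Decimal("0.10")               # per bucket per month
MONITORING_FEE_PER_100K = Decimal("0.01")  # per 100,000 objects per month
TIER1_RATE = Decimal("1.00")               # per GB, first tier
TIER2_RATE = Decimal("0.50")               # per GB, beyond the first tier
TIER1_LIMIT_GB = Decimal("50000")          # assumption: 50 TB = 50,000 GB


def estimate_monthly_cost(buckets, objects, gb_inspected):
    """Estimate one month of Macie charges in USD.

    Pass whole numbers (or strings) to avoid float rounding surprises.
    """
    gb = Decimal(gb_inspected)
    tier1_gb = min(gb, TIER1_LIMIT_GB)
    tier2_gb = max(gb - TIER1_LIMIT_GB, Decimal(0))
    bucket_cost = BUCKET_FEE * buckets
    monitoring_cost = MONITORING_FEE_PER_100K * Decimal(objects) / Decimal(100000)
    inspection_cost = TIER1_RATE * tier1_gb + TIER2_RATE * tier2_gb
    return bucket_cost + monitoring_cost + inspection_cost


# 20 buckets, 5 million objects, 200 GB inspected:
# 20 * 0.10 + 50 * 0.01 + 200 * 1.00 = 2.00 + 0.50 + 200.00
small = estimate_monthly_cost(20, 5_000_000, 200)
print(small)
```

For this input the three components are $2.00 for bucket assessment, $0.50 for object monitoring, and $200.00 for inspection, so the per-GB inspection charge dominates the total.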
Amazon Macie Cheat Sheet References:
https://aws.amazon.com/macie/
https://docs.aws.amazon.com/macie/latest/userguide/what-is-macie.html
https://aws.amazon.com/macie/faq/
https://www.youtube.com/watch?v=LCjX2rsQ2wA












