
Amazon Bedrock

Last updated on December 23, 2025

Amazon Bedrock Cheat Sheet

  • Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies via a single API.
  • It enables you to build and expand applications powered by generative AI that generate text, images, audio, and synthetic data in response to specific prompts.

 

Key Features

  • Unified API:
    • Easily experiment with, compare, and evaluate different models for your specific use case using a single API.
  • Data Privacy:
    • Your content is not used to improve the base models and is encrypted in transit and at rest.
  • Agents:
    • You can build agents that execute tasks using your enterprise systems, APIs, and data sources. Agents can reason, plan, and take actions across multiple steps.
  • Serverless:
    • Since Amazon Bedrock is serverless, you don’t have to manage, scale, or provision infrastructure.
  • Integration:
    • Securely incorporate and implement generative AI features into your applications using AWS services such as Lambda, S3, IAM, CloudWatch, and VPC.
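
As a rough sketch of the unified API, the same Converse request body can target any supported model by swapping only the model ID. The model IDs, region, and parameter values below are illustrative assumptions, not an official sample:

```python
# Sketch: calling different providers' models through Bedrock's single
# Converse API. Only the model ID changes between calls.

def build_converse_request(prompt: str) -> dict:
    """Build a model-agnostic Converse request body."""
    return {
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256, "temperature": 0.5},
    }

def ask_model(model_id: str, prompt: str, region: str = "us-east-1") -> str:
    """Send the same request to any supported model (requires AWS credentials)."""
    import boto3  # local import so the sketch loads without boto3 installed
    client = boto3.client("bedrock-runtime", region_name=region)
    resp = client.converse(modelId=model_id, **build_converse_request(prompt))
    return resp["output"]["message"]["content"][0]["text"]

# The identical request body works across providers, e.g.:
# ask_model("anthropic.claude-3-5-sonnet-20240620-v1:0", "Hello")
# ask_model("meta.llama3-1-70b-instruct-v1:0", "Hello")
```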

 

Model Choice Capability

Unlock your freedom to innovate with flexible AI by choosing the right model for performance, latency, and cost.

  • Access

    • Access leading models on demand with no infrastructure hassle.
  • Discover

    • Model Catalog:
      • Amazon Bedrock offers access to over 100 foundation models from both leading and emerging AI providers. You can choose either a serverless or Marketplace model tailored to your specific objectives.
    • API Keys:
      • A simplified authentication method for developers to make API requests without configuring complex IAM roles.
        • Short-term API keys: These keys expire automatically when your console session ends (typically after 12 hours). We recommend that you use short-term keys for setups that require a higher level of security.
        • Long-term API keys: These keys can be set to last longer than 12 hours. We strongly recommend using long-term keys only for exploratory purposes to avoid security risks in production.
  • Import

    • Protect your competitive advantage. Bring your proprietary, customized models to Amazon Bedrock to run alongside existing, out-of-the-box FMs through a single, serverless, unified API.
    • Access your imported models on demand without the need to provision or manage underlying infrastructure.
    • Accelerate development by integrating your supported custom models with native Bedrock tools such as Knowledge Bases, Guardrails, and Agents.
    • Maintain full control over model versioning and deployment while leveraging existing AI investments with enterprise-grade security and scalability.
  • Evaluate

    • Model Evaluation
      • How it works:
        • Create and review model evaluation jobs.
      • Automatic: The automatic approach offers two options for evaluation:
        • Programmatic: Evaluate performance using only the model and metrics you select.
        • LLM as a judge: A pre-trained model evaluates your model’s responses using metrics you’ve selected.
      • Human:
        • Bring your own work team: Evaluate responses from up to 2 models using your own work team. You can define evaluation metrics specific to your job.
    • RAG Evaluation
      • How it works:
        • Evaluate the accuracy and relevance of your Knowledge Bases (Retrieval Augmented Generation applications).
      • Metrics: Automatically scores your application on key RAG pillars using an “LLM as a judge”.
      • Datasets: Requires a test dataset containing prompts and (optionally) ground truth responses.
  • Featured Model Providers:

    • Amazon
    • Anthropic
    • DeepSeek
    • Mistral AI
    • Meta
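
The model catalog above can also be browsed programmatically with the ListFoundationModels control-plane operation; a minimal sketch (the region and the mock entries in `sample` are assumptions):

```python
# Sketch: discovering models via ListFoundationModels, then filtering locally.

def pick_model_ids(model_summaries: list, provider: str) -> list:
    """Filter a ListFoundationModels response by provider name."""
    return [
        m["modelId"]
        for m in model_summaries
        if m.get("providerName") == provider
    ]

def list_on_demand_text_models(region: str = "us-east-1") -> list:
    """Fetch serverless (on-demand) text models (requires AWS credentials)."""
    import boto3  # local import so the sketch loads without boto3 installed
    bedrock = boto3.client("bedrock", region_name=region)
    resp = bedrock.list_foundation_models(
        byOutputModality="TEXT", byInferenceType="ON_DEMAND"
    )
    return resp["modelSummaries"]

# Local filtering over a mock response:
sample = [
    {"modelId": "anthropic.claude-3-haiku-20240307-v1:0", "providerName": "Anthropic"},
    {"modelId": "amazon.titan-text-express-v1", "providerName": "Amazon"},
]
print(pick_model_ids(sample, "Anthropic"))
```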

 

Test

  • Chat / Text Playground:

    • Interactive environment to test prompts and model responses.
  • Tokenizer:

    • Inference charges are based on the number of tokens in your input and output. The tokenizer lets you estimate token counts in your input and adjust them as needed to manage costs and stay within token quotas.
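
The console tokenizer gives exact, model-specific counts; for a rough pre-flight estimate in code, a common rule of thumb is ~4 characters per token for English text. That ratio, and the per-1K prices in the example, are assumptions for illustration only:

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. Real tokenizers are model-specific; ~4
    characters/token is only a common English-text rule of thumb."""
    return max(1, math.ceil(len(text) / chars_per_token))

def estimate_request_cost(prompt: str, max_output_tokens: int,
                          in_price_per_1k: float,
                          out_price_per_1k: float) -> float:
    """Upper-bound cost estimate using illustrative per-1K-token prices."""
    in_tokens = estimate_tokens(prompt)
    return (in_tokens * in_price_per_1k
            + max_output_tokens * out_price_per_1k) / 1000

# 400 characters -> roughly 100 input tokens
print(estimate_tokens("x" * 400))
```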

 

Inference

  • Cross-region Inference:

    • Cross-region inference improves throughput and resiliency by routing requests across multiple AWS Regions during peak times, using inference profiles.
    • How it works:
      • Select a system-defined inference profile from the provided table to view supported models and the Regions where requests can be routed.
      • For cross-region inference, specify the profile ID or ARN when running model inference or using an Amazon Bedrock resource. Optionally, create an application-specific inference profile to monitor costs and usage.
  • Batch Inference:

    • Batch inference enables you to process multiple requests asynchronously, making it ideal for large datasets. Results are saved as JSON output files in the S3 location you specify.
    • How it works:
      • Prepare your input data in the required format and choose the appropriate model to create a batch inference job.
      • After processing, retrieve the batch inference results from the designated S3 bucket and integrate them into your workflow or application.
  • Provisioned Throughput:

    • Allows you to reserve dedicated, consistent inference capacity for deploying your models on Amazon Bedrock. This feature is essential for high-volume, production workloads that require predictable performance and low latency.
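
Two of the mechanics above can be sketched in code: system-defined cross-region inference profile IDs prepend a geography prefix (e.g. `us.`) to the base model ID, and batch inference input files are JSONL records with a `recordId` and a `modelInput` in the target model's native request format. The Anthropic-style body below is an illustrative assumption:

```python
import json

def to_inference_profile_id(model_id: str, geo_prefix: str = "us") -> str:
    """System-defined cross-region profile IDs prepend a geography
    prefix (e.g. 'us', 'eu') to the base model ID."""
    return f"{geo_prefix}.{model_id}"

def batch_record(record_id: str, prompt: str, max_tokens: int = 256) -> str:
    """One JSONL line for a batch inference input file. The modelInput
    body must match the chosen model's native request format; the
    Anthropic-style body here is illustrative."""
    return json.dumps({
        "recordId": record_id,
        "modelInput": {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

print(to_inference_profile_id("anthropic.claude-3-haiku-20240307-v1:0"))
```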

 

Optimize Cost Capability

Tools designed to control spend and improve efficiency for generative AI workloads.

  • Prompt Router Models:

    • Enables Intelligent Prompt Routing, where user prompts are automatically sent to the most cost-efficient and performant model within a chosen family. This functionality optimizes for cost and performance, eliminating the need to write and maintain complex manual routing logic.
    • Prompt routers can dynamically select between supported models (such as Claude 3.5 Sonnet versus Claude 3 Haiku, or Llama 3.1 8B versus Llama 3.1 70B) based on pricing, capacity, and latency.
    • How it Works:
      • Experiment with default routers: Use built-in prompt routers to test routing between different models in a family. Default routers are pre-configured for popular model pairings, allowing you to benchmark and compare responses or costs.
      • Configure custom routing: Set up prompt routers to include only the models you want, giving you full control over which models are used in your applications. Routing rules can be customized based on criteria such as price, latency, or specific use case requirements.
      • Production-ready integration: Integrate prompt routers into your generative AI applications using Bedrock’s Invoke and Converse API operations. This streamlines deployment and saves time on testing and orchestration.
  • Marketplace Model Deployments:

    • Lets you discover, subscribe to, and deploy over 100 emerging and specialized foundation models from a broad range of providers, including both large-scale and niche AI developers.
    • Models available via the Marketplace are deployed on fully managed Bedrock endpoints, just like standard Bedrock models, and can be integrated into your workflows with minimal configuration.
    • How it works:
      • Discover: Browse the model catalog to find models that suit your application’s needs, including specialized models for tasks like document understanding, code generation, and image processing.
      • Subscribe and deploy: Subscribe to a model via the Marketplace. After subscribing, you can deploy one or more endpoints for scalable, production-ready inference.
      • Manage: Monitor endpoint usage, adjust capacity as needed, and leverage Bedrock’s integration with Amazon SageMaker by registering existing SageMaker endpoints, allowing for unified management of your AI infrastructure within Bedrock.
  • Prompt Caching:

    • Saves and reuses large prefixes of prompt context (like documents or extensive instructions) across multiple requests.
    • Significantly reduces latency and cost (up to 90% cheaper for cached tokens) for repetitive tasks.
  • Prompt Management:

    • A centralized library to create, save, version, and share prompts.
    • Allows you to switch between model versions or providers without rewriting application code.
  • Model Distillation:

    • Allows you to create smaller, faster, and more cost-efficient “Student” models by transferring knowledge from larger, high-performing “Teacher” models.
    • Distilled models can be up to 500% faster and 75% less expensive than their original counterparts, while maintaining use-case-specific accuracy nearly identical to that of advanced models, typically with less than 2% loss in scenarios like Retrieval-Augmented Generation (RAG).
    • Enables you to deploy models that closely match the performance of top-tier foundation models but with significantly reduced computational resources, cost, and latency, making them ideal for production environments that demand efficiency without sacrificing quality.
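
The prompt caching savings above can be made concrete with a little arithmetic. The 90% discount on cached tokens mirrors the "up to 90% cheaper" figure in this section; the per-1K price and token counts are hypothetical:

```python
def prompt_cost(prefix_tokens: int, suffix_tokens: int,
                price_per_1k: float, cache_hit: bool = False,
                cache_discount: float = 0.90) -> float:
    """Input-token cost with an optional cached prefix. A 90% discount
    on cached tokens is assumed here; actual rates vary by model."""
    prefix_rate = price_per_1k * (1 - cache_discount) if cache_hit else price_per_1k
    return (prefix_tokens * prefix_rate + suffix_tokens * price_per_1k) / 1000

# A 10K-token document prefix reused across many requests:
cold = prompt_cost(10_000, 200, 0.003)                   # first request
warm = prompt_cost(10_000, 200, 0.003, cache_hit=True)   # cache hits
print(f"cold={cold:.4f} warm={warm:.4f}")
```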

 

Customization Capability

Make AI uniquely yours by adapting generic models to your specific business data.

  • Knowledge Bases

    • Fully managed Retrieval Augmented Generation (RAG) workflow.
    • Connects FMs to your data sources (S3, Confluence, Salesforce, etc.) to provide accurate, up-to-date responses based on your proprietary data.
    • Automatically handles vector database setup (embedding and indexing).
  • Supervised Fine-Tuning

    • Adapt Amazon Titan and other supported models using your own datasets (prompt-response pairs) in a secure, managed environment.
    • Ideal for teaching the model a specific voice, format, or industry-specific terminology.
  • Reinforcement Fine-Tuning

    • Further refine models using feedback loops (Reward Models) rather than relying solely on static data.
    • Aligns the model with complex human preferences or specific business goals that are hard to define with simple examples.
  • Continued Pre-training

    • Train models on vast amounts of unlabeled data (e.g., domain-specific text) to familiarize them with new terminology or concepts before fine-tuning for specific tasks.
  • Data Automation

    • Automates the transformation of unstructured data (files, images, videos) into structured formats suitable for RAG.
    • Uses generative AI blueprints to extract and organize data before indexing.
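
A Knowledge Base can be queried end-to-end with the RetrieveAndGenerate API. A minimal sketch, where the knowledge base ID and model ARN are placeholders you supply:

```python
# Sketch: querying a Bedrock Knowledge Base (managed RAG) via boto3.

def build_rag_config(kb_id: str, model_arn: str) -> dict:
    """Configuration naming the Knowledge Base to retrieve from and the
    model that generates the grounded answer."""
    return {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": kb_id,
            "modelArn": model_arn,
        },
    }

def ask_knowledge_base(kb_id: str, model_arn: str, question: str,
                       region: str = "us-east-1") -> str:
    """Run a RAG query (requires AWS credentials and a deployed KB)."""
    import boto3  # local import so the sketch loads without boto3 installed
    client = boto3.client("bedrock-agent-runtime", region_name=region)
    resp = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration=build_rag_config(kb_id, model_arn),
    )
    return resp["output"]["text"]
```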

 

Agent Development Capability

  • Amazon Bedrock Agents

    • Fully managed agents that can execute multi-step tasks by breaking down prompts into a sequence of actions.
    • Action Groups:
      • Define actions the agent can take, which are mapped to your enterprise APIs.
    • Reasoning:
      • Agents can reason, plan, and take actions across multiple steps (e.g., “Book a flight,” then “Book a hotel”).
    • Multi-Agent Orchestration:
      • Agents can invoke other specialized agents to execute complex workflows.
    • For a deep dive, see the Amazon Bedrock AgentCore Cheat Sheet.
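
Invoking an agent can be sketched as follows. InvokeAgent streams its reply as chunk events that must be reassembled; the agent ID and alias are placeholders:

```python
# Sketch: invoking a Bedrock Agent and assembling its streamed reply.

def assemble_completion(events) -> str:
    """Join the 'chunk' events of an InvokeAgent response stream."""
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)

def invoke_agent(agent_id: str, alias_id: str, session_id: str,
                 text: str, region: str = "us-east-1") -> str:
    """Call an agent (requires AWS credentials and a deployed agent)."""
    import boto3  # local import so the sketch loads without boto3 installed
    client = boto3.client("bedrock-agent-runtime", region_name=region)
    resp = client.invoke_agent(
        agentId=agent_id, agentAliasId=alias_id,
        sessionId=session_id, inputText=text,
    )
    return assemble_completion(resp["completion"])
```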

 

Safety and Guardrails Capability

  • Data protection

    • PII Redaction:
      • Detect and redact Personally Identifiable Information (PII) in model responses to ensure privacy compliance.
    • Encryption:
      • Data is encrypted in transit and at rest. Bedrock is HIPAA eligible and GDPR compliant.
  • Responsible AI

    • Content Filters:
      • Configure thresholds to filter harmful content across categories like hate, violence, and sexual content.
    • Denied Topics:
      • Block the model from responding to specific topics (e.g., “financial advice”).
    • Word Filters:
      • Block specific custom words or phrases (e.g., competitor names).
  • Hallucination controls:

    • Contextual Grounding:
      • Detects and filters hallucinations by verifying if the response is actually based on the source data (RAG) or user prompt.
  • Governance:

    • Logging and metrics:
      • Monitor evaluation results and model invocations over time for auditing and compliance.
    • Model versioning:
      • Controlled deployment, rollback, and testing of model versions.
    • Watermarking:
      • All images generated by Amazon Titan Image Generator include an invisible watermark to identify them as AI-generated.
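
Guardrails can also be evaluated standalone, without a model call, via the ApplyGuardrail API. A minimal sketch, where the guardrail ID and version are placeholders:

```python
# Sketch: checking text against a configured guardrail with ApplyGuardrail.

def guardrail_content(text: str) -> list:
    """Content payload shape expected by ApplyGuardrail."""
    return [{"text": {"text": text}}]

def check_with_guardrail(guardrail_id: str, version: str, text: str,
                         region: str = "us-east-1") -> bool:
    """Return True if the guardrail intervened (requires AWS credentials)."""
    import boto3  # local import so the sketch loads without boto3 installed
    client = boto3.client("bedrock-runtime", region_name=region)
    resp = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=version,
        source="INPUT",                 # or "OUTPUT" for model responses
        content=guardrail_content(text),
    )
    return resp["action"] == "GUARDRAIL_INTERVENED"
```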

 

Multi-Agent Orchestration

  • Agents can invoke other agents to execute multi-step workflows across systems.
  • Supports task decomposition for complex workflows.
  • Integrates with AgentCore for reasoning, tool usage, and execution monitoring.
  • Maintains context and memory across chained agent executions.
  • Supports parallel execution of tasks when possible to improve efficiency.
  • Provides error handling and fallback mechanisms for multi-agent workflows.

 

Pricing

  • On-Demand:
    • Pay only for what you use. Charges are based on the number of input and output tokens processed for text models, per-token pricing for embedding models, and per-image pricing for image generation models. Pricing is determined by the region where the request is processed.
  • Batch Inference:
    • Allows you to submit multiple prompts in bulk, with asynchronous processing and results stored in Amazon S3.
    • Offers a lower cost per request than On-Demand pricing for supported models, making it suitable for large-scale workloads.
  • Provisioned Throughput:
    • Designed for high-volume, consistent inference workloads with guaranteed capacity.
    • You can purchase model units with 1-month or 6-month commitment terms.
    • Charges are based on provisioned throughput rather than individual requests.
      • Note: Customized models are accessed through this pricing model.
  • Model Customization:
    • Training: Pricing is based on the number of tokens processed during training (including epochs).
    • Storage: Charged monthly for storing the customized model copy.
    • Inference: Inference for customized models is billed under Provisioned Throughput.
  • Model Evaluation:
    • Automatic: Billed at standard inference rates with no additional charge for algorithmic scoring.
    • Human: Billed per completed human task, with pricing determined by evaluation scope and requirements.
  • Guardrails:
    • Charged per 1,000 text characters or per image evaluated.
    • Configuring guardrails is free; you pay only for the processing of requests against them.
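
The on-demand text-model billing above reduces to simple arithmetic: input and output tokens are billed at separate per-1K rates. The rates in the example are illustrative, not actual Bedrock prices:

```python
def on_demand_cost(input_tokens: int, output_tokens: int,
                   in_price_per_1k: float, out_price_per_1k: float) -> float:
    """On-demand text-model charge: input and output tokens are billed
    at separate per-1K-token rates (rates here are illustrative)."""
    return (input_tokens * in_price_per_1k
            + output_tokens * out_price_per_1k) / 1000

# e.g. 2,000 input + 500 output tokens at $0.003 / $0.015 per 1K tokens:
print(round(on_demand_cost(2_000, 500, 0.003, 0.015), 4))  # 0.0135
```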

 

Amazon Bedrock Cheat Sheet References:

https://aws.amazon.com/bedrock/

https://aws.amazon.com/bedrock/pricing/

https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html


Written by: Nestor Mayagma Jr.

Nestor is a cloud engineer and content creator at Tutorials Dojo. He's been an active AWS Community Builder since 2022, with a growing interest in multi-cloud technologies across AWS, Azure, and Google Cloud. In his leisure time, he indulges in playing FPS games.
