
Amazon Bedrock Knowledge Bases Cheat Sheet



  • A fully managed Retrieval-Augmented Generation (RAG) service that securely connects foundation models to your company’s private data sources. It automates the entire pipeline, from ingestion and indexing to retrieval and source attribution, enabling accurate, contextual, and verifiable AI responses without building custom data pipelines.

 

Amazon Bedrock Knowledge Bases Features

  • Fully Managed RAG Pipeline
    • Automates the end-to-end workflow from data ingestion to indexed storage. Handles parsing of complex documents (text, tables, images), intelligent chunking, vector embedding generation, and storage in your chosen vector database.
  • Broad Data Source & Storage Support
    • Data Sources: Amazon S3, Confluence, Salesforce, SharePoint, web crawlers. Supports programmatic ingestion via API for custom sources.

    • Vector Stores: Managed Amazon OpenSearch Serverless (easiest), or bring your own: Amazon Aurora, Pinecone, Redis Enterprise Cloud, or MongoDB. Can also connect to an Amazon Kendra index for hybrid (keyword + semantic) search.

    • Structured Data: Connects to data warehouses/lakes (e.g., Amazon Redshift). Uses natural language-to-SQL to query data in place without movement or duplication.

  • Advanced Customization for Accuracy
    • Multimodal Parsing: Configure to use Bedrock Data Automation or a foundation model to parse image-rich documents (charts, diagrams, tables).

    • Flexible Chunking: Choose from semantic, hierarchical, or fixed-size strategies. Inject custom logic via AWS Lambda or use components from LangChain/LlamaIndex.

    • Vector Flexibility: Supports standard floating-point (float32) vectors for high precision and binary vectors (1 bit/dimension) for storage efficiency with supported models/stores.

    • Enhanced Retrieval: Optionally uses re-ranking models to improve the final relevance order of retrieved text chunks before response generation.

    • Source Attribution: Automatically provides citations for all retrieved text and visual content, ensuring transparency and auditability.
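To put the float32 vs. binary vector trade-off in concrete terms, here is a quick back-of-envelope sketch. The dimension count is illustrative (Titan Text Embeddings V2, for example, defaults to 1,024 dimensions); actual storage overhead depends on the vector store.

```python
# Storage comparison: float32 vs. binary vectors for one embedding.
DIMENSIONS = 1024                 # illustrative embedding dimension

FLOAT32_BYTES = DIMENSIONS * 4    # 32 bits (4 bytes) per dimension
BINARY_BYTES = DIMENSIONS // 8    # 1 bit per dimension

print(f"float32 vector: {FLOAT32_BYTES} bytes")                  # 4096
print(f"binary vector:  {BINARY_BYTES} bytes")                   # 128
print(f"reduction:      {FLOAT32_BYTES // BINARY_BYTES}x")       # 32x
```

The 32x reduction is why binary vectors are attractive for very large corpora, at the cost of some retrieval precision.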

 

Amazon Bedrock Knowledge Bases Use Cases

  • Enterprise Internal Knowledge Q&A
    • Deploy a chatbot that provides accurate, sourced answers from internal documents, manuals, and wikis, drastically reducing information search time.
  • Customer Support with Proprietary Data
    • Power support agents or self-service portals with an AI grounded in product documentation, release notes, and guides, ensuring consistent, correct responses.
  • Business Intelligence via Natural Language
    • Enable analysts to query structured databases using plain English (e.g., “Show me Q3 sales trends for Product X”) and get answers generated from live data.

 

Amazon Bedrock Knowledge Bases Implementation

Core Architecture: The Two-Phase RAG Pipeline

  1. Ingestion & Indexing (Pre-processing):
    [Your Data Source] → (Parse & Chunk) → [Text Chunks] → (Generate Embeddings) → [Vector Embeddings] → (Store) → [Your Vector Index]

  2. Runtime Query & Generation:
    [User Query] → (Convert to Embedding) → [Query Vector] → (Semantic Search) → [Relevant Text Chunks] → (Augment Prompt) → [Foundation Model] → [Final Response + Citations]

Setup Checklist

  1. Select & Prepare Data Source: Identify the S3 bucket, SaaS app, or database. Ensure IAM permissions (e.g., the AmazonBedrockFullAccess managed policy plus source-specific policies) are configured.

  2. Choose Embedding Model: Select from Bedrock models (e.g., Titan Embeddings G1 - Text). Decide on float32 (default, high precision) or binary (storage-efficient) vectors.

  3. Configure Vector Store: Provision a managed OpenSearch Serverless collection or provide connection details for your existing store (e.g., Pinecone API key).

  4. Define Processing: Configure parsing for complex documents (e.g., enable Bedrock Data Automation for tables). Select a chunking strategy.

  5. Sync Data: Run the initial ingestion job to populate the knowledge base. Monitor in the console.

  6. Integrate: Use the Retrieve or RetrieveAndGenerate API in your application or connect it to an Amazon Bedrock Agent.
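Steps 2–3 of the checklist can be sketched as a CreateKnowledgeBase request built with Python. All ARNs, IDs, and names below are placeholders, and the exact request shape should be verified against the current bedrock-agent API reference before use.

```python
# Sketch: assemble a CreateKnowledgeBase request for a managed
# OpenSearch Serverless vector store. All ARNs/IDs are placeholders.
ROLE_ARN = "arn:aws:iam::123456789012:role/BedrockKBRole"
EMBEDDING_MODEL_ARN = (
    "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0"
)
COLLECTION_ARN = "arn:aws:aoss:us-east-1:123456789012:collection/abc123"

def build_kb_request(name: str) -> dict:
    """Build the request body: embedding model choice (step 2) and
    vector store configuration (step 3)."""
    return {
        "name": name,
        "roleArn": ROLE_ARN,
        "knowledgeBaseConfiguration": {
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": EMBEDDING_MODEL_ARN,
            },
        },
        "storageConfiguration": {
            "type": "OPENSEARCH_SERVERLESS",
            "opensearchServerlessConfiguration": {
                "collectionArn": COLLECTION_ARN,
                "vectorIndexName": "kb-index",
                "fieldMapping": {
                    "vectorField": "embedding",
                    "textField": "text",
                    "metadataField": "metadata",
                },
            },
        },
    }

# To create it for real (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent")
#   response = client.create_knowledge_base(**build_kb_request("internal-docs-kb"))
```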

Key APIs for Integration

  • Retrieve: Fetches relevant text chunks and sources. Use this if you want to customize the prompt or post-processing.

  • RetrieveAndGenerate: Combines retrieval and generation in one call. Returns a final AI response based on retrieved context.
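A minimal sketch of both calls using boto3's bedrock-agent-runtime client. The knowledge base ID and model ARN are placeholders, and the imports are kept inside the functions so the sketch can be read without AWS credentials configured.

```python
def retrieve_chunks(kb_id: str, query: str, top_k: int = 5) -> list:
    """Retrieve: fetch raw text chunks plus source locations, so you
    can build your own prompt and post-processing."""
    import boto3
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [
        (r["content"]["text"], r["location"])
        for r in response["retrievalResults"]
    ]

def ask(kb_id: str, model_arn: str, query: str) -> str:
    """RetrieveAndGenerate: one call that returns a final, cited answer."""
    import boto3
    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        input={"text": query},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,   # placeholder, e.g. "ABCDEFGHIJ"
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]
```

Use `retrieve_chunks` when you need custom prompting; use `ask` for the simplest path to a grounded answer.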

 

Amazon Bedrock Knowledge Bases Security

  • Data Encryption and IAM Access Control
    • All data is encrypted at rest and in transit. Use granular IAM policies to control access to the Knowledge Base, data sources (e.g., S3 GetObject permissions), and vector stores.
  • Private Network Connectivity
    • Deploy your vector store (e.g., OpenSearch) within a VPC. Configure the Knowledge Base to access data sources and stores via VPC Endpoints (AWS PrivateLink) to keep all traffic within the AWS network.
  • Source Attribution for Auditability
    • The built-in citation system creates an audit trail, allowing verification of every piece of information in a response against the original source document.

 

Amazon Bedrock Knowledge Bases Best Practices

  • Start Simple with Managed Services
    • For prototypes, use the console to create a managed OpenSearch Serverless vector store. Let AWS handle provisioning, scaling, and maintenance.
  • Optimize Chunking Strategically
    • Semantic Chunking: Best for general Q&A on coherent paragraphs.

    • Hierarchical Chunking: Ideal for long documents where preserving context across sections is key.

    • Test Iteratively: Experiment with chunk sizes and overlap; small changes can significantly impact retrieval quality.

  • Choose the Right Retrieval Engine
    • For general semantic similarity, a vector store is sufficient.

    • For content needing high-precision keyword matching (e.g., product codes, error IDs), connect to Amazon Kendra.

  • Design for Transparency
    • Always display source citations in your application UI. This builds user trust and allows subject-matter experts to validate answers quickly.
  • Implement a Data Freshness Strategy
    • Use the StartIngestionJob API or console to sync changes. For dynamic data, trigger syncs via Amazon EventBridge on a schedule or in response to source change events.
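The freshness strategy above can be sketched as a Lambda handler triggered by an EventBridge schedule. The knowledge base and data source IDs are hypothetical placeholders supplied via environment variables.

```python
import os

def handler(event, context):
    """EventBridge-triggered Lambda: re-sync a knowledge base data source.

    KB_ID / DS_ID are placeholder environment variables. The function
    kicks off StartIngestionJob and returns the new job's ID.
    """
    import boto3
    client = boto3.client("bedrock-agent")
    job = client.start_ingestion_job(
        knowledgeBaseId=os.environ.get("KB_ID", "ABCDEFGHIJ"),
        dataSourceId=os.environ.get("DS_ID", "DS12345678"),
        description="Scheduled sync via EventBridge",
    )
    return {"ingestionJobId": job["ingestionJob"]["ingestionJobId"]}
```

Point an EventBridge rule (cron schedule or a source-change event pattern) at this function to keep the index in step with the underlying documents.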

 

Amazon Bedrock Knowledge Bases Pricing

Knowledge Bases uses a consumption-based pricing model with no upfront costs. You are charged for the underlying resources used. For the most current pricing, always check the official AWS Bedrock Pricing page.

 

The table below summarizes what you pay for per component (example region: US East Ohio).

  • Data Ingestion
    • Embedding tokens (per 1K tokens) and parsing (per page, if enabled). You pay for the embedding model to process text into vectors, not for raw storage.
  • Vector Storage
    • Storage and compute of the chosen vector store. For the managed Amazon OpenSearch Serverless option, you pay for OCU usage; for bring-your-own stores (e.g., Pinecone), you pay their standard fees.
  • Data Retrieval
    • Vector store compute to run the semantic search query. There is no simple “per KB” charge; you pay for the database resources (e.g., OCUs) consumed during the search operation.
  • Structured Data Retrieval (SQL Generation)
    • $2.00 per 1,000 queries. Charged per request to the GenerateQuery API, which converts a natural-language question into a SQL command for databases like Amazon Redshift.
  • Rerank Models
    • Per query. For example, Amazon-rerank-v1.0 costs $1.00 per 1,000 queries. Used to improve the relevance of retrieved chunks; a query covers up to 100 document chunks (max 512 tokens each), and exceeding these limits counts as multiple queries.
  • Foundation Model Inference
    • Per-token charges of the selected model. When using the RetrieveAndGenerate API, standard Bedrock model pricing (e.g., for Claude 3.5 Sonnet) applies to input and output tokens.

Important Note: RetrieveAndGenerate combines retrieval and generation charges in a single call. If you need more control over costs or prompt engineering, use the Retrieve API with a separate, optimized model call.
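Using the sample rates from the table, here is a quick back-of-envelope estimate for the query-time add-ons (SQL generation and reranking only; embedding, storage, and foundation model tokens are billed separately). The monthly volumes are made up for illustration.

```python
# Back-of-envelope monthly estimate using the sample rates above.
# Rates are illustrative (US East Ohio); check the AWS pricing page.
SQL_RATE = 2.00 / 1000     # GenerateQuery, per query
RERANK_RATE = 1.00 / 1000  # Amazon-rerank-v1.0, per query

monthly_sql_queries = 50_000       # hypothetical workload
monthly_rerank_queries = 200_000   # hypothetical workload

sql_cost = monthly_sql_queries * SQL_RATE           # $100.00
rerank_cost = monthly_rerank_queries * RERANK_RATE  # $200.00
print(f"SQL generation: ${sql_cost:.2f}")
print(f"Reranking:      ${rerank_cost:.2f}")
print(f"Total add-ons:  ${sql_cost + rerank_cost:.2f}")  # $300.00
```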

 

Amazon Bedrock Knowledge Bases Cheat Sheet References:

https://aws.amazon.com/bedrock/knowledge-bases/
https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html

 


Written by: Joshua Emmanuel Santiago

Joshua is a college student at Mapúa University pursuing a BS in IT and serves as an intern at Tutorials Dojo.
