RAG Archives

AWS Data and AI Journey: Applying Generative AI Across the Enterprise

April Joy Deang | 2026-05-25T12:49:16+00:00

Stage 4 of the AWS Data and AI Journey: Applying Generative AI Across the Enterprise. Applying generative AI across the enterprise is no longer just an experiment; it's a strategic priority for organizations ready to turn their data into real business intelligence. This is where generative AI enters the picture. With trusted, connected, and governed data in place, organizations can confidently apply large language models, retrieval systems, and AI agents to real business problems. Generative AI shifts data from being a record of what happened into an active driver of decisions, automation, and customer experience. Stage 4 of the AWS [...]

How I Built My First RAG API with FastAPI, Free & Local

Ashley Nicole Santos | 2026-02-12T17:21:42+00:00

I've always been curious about how AI-powered tools actually work behind the scenes. How does ChatGPT know when to search the web? How do enterprise chatbots answer questions about company documents they've never "seen" before? The answer is RAG, and building one myself turned out to be more accessible than I expected. This article documents my experience and serves as a hands-on tutorial that walks you through creating your very first AI API. I'm sharing the context they don't teach you, the "why" behind each tool, and adjustments. What makes this guide different: Beginner-friendly explanations of every buzzword and tool [...]

Amazon Nova: Engineering the Future of Agentic AI

Dearah Mae Barsolasco | 2026-02-03T13:45:47+00:00

The generative AI (GenAI) revolution has fundamentally changed how organizations extract value from data. Large language models (LLMs) excel at understanding and generating human-like text, but their true enterprise value emerges only when they can access proprietary data and take real-world action. While vector databases and retrieval-augmented generation (RAG) gave LLMs memory, Amazon Nova provides execution and specialization. In this article, we break down the Amazon Nova model family, with a deep focus on Nova Act and Nova Forge, and explain how they enable a shift from passive chatbots to autonomous, enterprise-grade AI agents. What Is the Amazon Nova Model [...]

AWS Vector Databases Explained: Semantic Search and RAG Systems

Dearah Mae Barsolasco | 2025-12-21T03:02:32+00:00

The generative AI (GenAI) revolution has transformed how organizations extract value from their data. While large language models (LLMs) demonstrate remarkable capabilities in understanding and generating human-like text, their true enterprise potential is unlocked only when they can access proprietary, domain-specific information. This necessity has propelled vector databases from a specialized niche into an essential pillar of modern AI infrastructure. But First, What Are Vector Databases? A vector database, as its name suggests, is a type of database designed to store, index, and efficiently search vector embeddings. These vectors are high-dimensional points that represent meaning. At its core, a vector [...]

Amazon Bedrock + Promptfoo: Rethinking LLM Evaluation Methods

Ashley Nicole Santos | 2026-02-08T14:26:06+00:00

I discovered something embarrassing about my LLM development workflow last month. After spending hours crafting what I thought was the perfect prompt for a customer service chatbot on Amazon Bedrock, I deployed it and called it done. My validation process? I asked it five questions, nodded approvingly at the responses, and moved on. Sound familiar? This "vibe-based prompting" approach worked fine until the chatbot confidently told a user that our fictional company offers "24/7 phone support," a feature that never existed. The model hallucinated, and I had no automated way to catch it. That experience sent me down a rabbit [...]

Zero-Infrastructure Vector Search with Amazon S3 Vectors

Rafael Miguel | 2025-08-22T14:27:33+00:00

The world of generative AI is evolving at a rapid pace, and one of its most powerful and practical applications is Retrieval-Augmented Generation (RAG). RAG enhances Large Language Models (LLMs) by giving them access to external, up-to-date knowledge bases. This allows them to generate more accurate and context-aware responses. Traditionally, building a RAG system required setting up and managing a separate vector database, which adds complexity, cost, and a new layer of infrastructure to maintain. However, with the introduction of Amazon S3 Vector Buckets, a new paradigm has emerged: zero-infrastructure vector search. What is Zero-Infrastructure Vector Search? Amazon S3 [...]

What is Retrieval Augmented Generation (RAG) in Machine Learning?

Nikee Tomas | 2025-06-30T03:46:57+00:00

Retrieval-Augmented Generation (RAG) Cheat Sheet: Retrieval-Augmented Generation (RAG) is a method that enhances the outputs of large language models (LLMs) by incorporating information from external, authoritative knowledge sources. Instead of relying solely on pre-trained data, RAG retrieves relevant content at inference time to ground its responses. LLMs are trained on massive datasets and use billions of parameters to perform tasks like question answering, language translation, and text completion. RAG extends LLM capabilities to domain-specific or private organizational knowledge without requiring model retraining. It provides a cost-efficient way to improve the relevance, accuracy, and utility of LLM outputs in dynamic or [...]

How Content Chunking Works in Amazon Bedrock Knowledge Bases: How AI Really Reads Your Documents

Ashley Nicole Santos | 2026-02-02T20:15:27+00:00

Modern generative AI systems often appear to “read” entire documents instantly, returning precise answers from long PDFs or dense technical manuals. In reality, large language models do not consume documents holistically. Instead, they rely on carefully prepared context that is retrieved and supplied at query time. One of the most critical and often misunderstood mechanisms behind this process is content chunking. At its core, content chunking determines how raw documents such as PDFs, webpages, or text files are transformed into smaller, meaningful units that can be indexed, embedded, and retrieved efficiently. Understanding how chunking works and how to configure it [...]

Retrieval-Augmented Generation (RAG) for Foundation Model Customization

Nestor Mayagma Jr. | 2024-12-02T06:01:45+00:00

Artificial Intelligence (AI) has rapidly advanced, pushing the limits of what machines can accomplish. However, one significant challenge remains: ensuring that AI responses are accurate, contextually relevant, and up to date. This is where Retrieval-Augmented Generation (RAG) comes in: a cutting-edge approach that integrates the capabilities of data retrieval with advanced AI generation techniques. In this blog, we will explore the details of RAG, discussing its benefits, applications, and how to implement it using AWS. Understanding Retrieval-Augmented Generation (RAG) RAG (Retrieval-Augmented Generation) incorporates real-time data retrieval into the generative process. Unlike traditional models that depend solely on pre-trained data, RAG [...]
