Amazon Bedrock + Promptfoo: Rethinking LLM Evaluation Methods
Ashley Nicole Santos, 2026-02-08T14:26:06+00:00

I discovered something embarrassing about my LLM development workflow last month. After spending hours crafting what I thought was the perfect prompt for a customer service chatbot on Amazon Bedrock, I deployed it and called it done. My validation process? I asked it five questions, nodded approvingly at the responses, and moved on. Sound familiar? This "vibe-based prompting" approach worked fine until the chatbot confidently told a user that our fictional company offers "24/7 phone support," a feature that never existed. The model hallucinated, and I had no automated way to catch it. That experience sent me down a rabbit [...]
