# Overview

## Summary
These tutorials provide complete, working examples that you can adapt for your own use cases. Each tutorial includes:
- Full working code
- Explanation of key concepts
- Common pitfalls and how to avoid them
- Extensions and variations
## Available Tutorials

- **🔍 RAG Evaluation**: Test retrieval quality, groundedness, and answer relevance in RAG systems
- **🤖 Testing Agents**: Validate multi-step agent workflows with tool usage and reasoning
- **💬 Chatbot Testing**: Test conversational flows, context handling, and response quality
- **🛡️ Content Moderation**: Implement safety checks and content filtering
## What You’ll Learn

Through these tutorials, you’ll learn how to:
- Design effective test suites for different AI application types
- Combine built-in and custom checks for comprehensive validation
- Handle both single-turn and multi-turn scenarios
- Use LLM-as-a-judge for nuanced evaluation
- Track metrics and analyze test results
- Integrate checks into CI/CD pipelines
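To give a feel for what combining a built-in-style check with an LLM-as-a-judge check can look like, here is a minimal sketch. All names (`CheckResult`, `contains_keywords`, `llm_as_judge`) are illustrative, not this library's API, and `judge_llm` is a stub standing in for a real LLM call:

```python
# Hypothetical sketch: pairing a deterministic check with an LLM-as-a-judge
# check. None of these names come from the library; `judge_llm` is a stub
# you would replace with your provider's client.
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""

def contains_keywords(output: str, keywords: list[str]) -> CheckResult:
    """Deterministic check: the answer must mention every required keyword."""
    missing = [k for k in keywords if k.lower() not in output.lower()]
    return CheckResult("contains_keywords", not missing, f"missing: {missing}")

def judge_llm(prompt: str) -> str:
    """Stub for an LLM API call; always approves in this sketch."""
    return "PASS"

def llm_as_judge(output: str, rubric: str) -> CheckResult:
    """LLM-as-a-judge: ask a model to grade the output against a rubric."""
    verdict = judge_llm(f"Rubric: {rubric}\nAnswer: {output}\nReply PASS or FAIL.")
    return CheckResult("llm_as_judge", verdict.strip().upper() == "PASS", verdict)

def run_checks(output: str) -> list[CheckResult]:
    # A test suite typically mixes cheap deterministic checks with
    # judge-based checks for qualities that are hard to pin down in code.
    return [
        contains_keywords(output, ["refund", "30 days"]),
        llm_as_judge(output, "The answer is polite and factually grounded."),
    ]

results = run_checks("Refunds are available within 30 days of purchase.")
print(all(r.passed for r in results))  # True with the stub judge
```

The deterministic check catches hard requirements cheaply; the judge handles nuance such as tone or groundedness, which the tutorials cover in depth.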
## Prerequisites

Before starting these tutorials, you should:
- Have completed the Install & Configure guide
- Be familiar with the Core Concepts
- Have basic Python and async/await knowledge
- Have access to an LLM API (OpenAI, Anthropic, or compatible)
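If async/await is new to you, this plain-`asyncio` refresher (independent of any one library) shows the pattern the tutorials rely on: running several slow LLM calls concurrently. `fake_llm_call` is a stand-in for a real provider request:

```python
# Minimal async/await refresher with the standard library only.
# `fake_llm_call` simulates a slow network request to an LLM API.
import asyncio

async def fake_llm_call(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"

async def main() -> list[str]:
    prompts = ["q1", "q2", "q3"]
    # gather() awaits all calls concurrently instead of one after another
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))

responses = asyncio.run(main())
print(len(responses))  # 3
```

Concurrency matters here because evaluation suites often issue many LLM calls; awaiting them together keeps total runtime close to the slowest single call.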
## Getting Help

If you run into issues with these tutorials:
- Check the Core Concepts for concept clarifications
- Review the Custom Checks guide for check creation patterns
- Look at the API reference for detailed documentation
- Open an issue on GitHub if you find bugs or have suggestions