What are Giskard Checks?
Giskard Checks is a lightweight Python library for testing and evaluating non-deterministic applications such as LLM-based systems.
Introduction
Giskard Checks provides a flexible framework for testing AI applications, including RAG systems, agents, and summarization models. Whether you’re building chatbots, question-answering systems, or complex multi-step workflows, Giskard Checks helps you verify quality and reliability.
Key Features
- Built-in Check Library: Ready-to-use checks including LLM-as-a-judge evaluations, string matching, equality assertions, and more
- Flexible Testing Framework: Support for both single-turn and multi-turn scenarios with stateful trace management
- Type-Safe & Modern: Built on Pydantic for full type safety and validation
- Async-First: Native async/await support for efficient concurrent testing
- Highly Customizable: Easy extension points for custom checks and interaction patterns
- Serializable Results: Immutable, JSON-serializable results for easy storage and analysis
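To make the ideas of async checks and immutable, JSON-serializable results concrete, here is a minimal sketch using only the Python standard library. It does not use the Giskard Checks API: `CheckResult`, `contains_check`, `run_checks`, and `to_json` are hypothetical names chosen for illustration, and a frozen dataclass stands in for the Pydantic models the library is built on. See the API Reference for the actual classes and signatures.

```python
import asyncio
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical immutable result record (not the Giskard Checks class)."""

    name: str
    passed: bool
    details: str


async def contains_check(name: str, output: str, expected: str) -> CheckResult:
    """Hypothetical string-matching check: passes if `expected` occurs in `output`."""
    # Stand-in for awaiting a model or an LLM judge; yields control to the event loop.
    await asyncio.sleep(0)
    passed = expected.lower() in output.lower()
    return CheckResult(name=name, passed=passed, details=f"expected substring {expected!r}")


async def run_checks(output: str) -> list[CheckResult]:
    """Run independent checks concurrently; results keep the order of the calls."""
    results = await asyncio.gather(
        contains_check("mentions_paris", output, "Paris"),
        contains_check("mentions_france", output, "France"),
    )
    return list(results)


def to_json(results: list[CheckResult]) -> str:
    """Serialize results to a JSON string for storage or later analysis."""
    return json.dumps([asdict(r) for r in results])


if __name__ == "__main__":
    results = asyncio.run(run_checks("The capital of France is Paris."))
    print(to_json(results))
```

Because the record is frozen, a stored result cannot be modified after the check has run, which is what makes saved results safe to compare across runs.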
Quick Links
- 🚀 Quickstart: Installation, configuration, and your first test
- 📚 AI Testing Guide: Learn core concepts, single-turn and multi-turn testing
- 💡 Tutorials: Practical examples for RAG, agents, and more
- 🔧 API Reference: Complete API documentation
Use Cases
Giskard Checks is designed for:
- RAG Evaluation: Test groundedness, relevance, and context usage in retrieval-augmented generation systems
- Agent Testing: Validate multi-step agent workflows with tool calls and complex reasoning
- Quality Assurance: Ensure consistent output quality across model updates and deployments
- LLM Guardrails: Implement safety checks, content moderation, and compliance validation
- Regression Testing: Track model behavior changes over time with reproducible test suites