What is Giskard Checks?

Giskard Checks is a lightweight Python library for testing and evaluating non-deterministic applications such as LLM-based systems.

It provides a framework for testing AI applications such as RAG systems, agents, and summarization models. Whether you’re building chatbots, question-answering systems, or complex multi-step workflows, Giskard Checks helps you verify output quality and reliability.

  • Built-in Check Library: Ready-to-use checks including LLM-as-a-judge evaluations, string matching, equality assertions, and more
  • Flexible Testing Framework: Support for both single-turn and multi-turn scenarios with stateful trace management
  • Type-Safe & Modern: Built on Pydantic for full type safety and validation
  • Async-First: Native async/await support for efficient concurrent testing
  • Highly Customizable: Easy extension points for custom checks and interaction patterns
  • Serializable Results: Immutable, JSON-serializable results for easy storage and analysis
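To make the ideas in this list concrete, here is a minimal sketch in plain Python of what an async check with an immutable, JSON-serializable result can look like. It deliberately does not use the Giskard Checks API: `CheckResult`, `contains_check`, `max_length_check`, and `run_checks` are hypothetical names invented for illustration, and standard-library dataclasses stand in for the Pydantic models the library is built on.

```python
import asyncio
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical immutable result record (not a Giskard Checks class)."""

    name: str
    passed: bool
    details: str


async def contains_check(output: str, expected: str) -> CheckResult:
    """String-matching check: passes if `expected` occurs in `output` (case-insensitive)."""
    passed = expected.lower() in output.lower()
    return CheckResult("contains", passed, f"looked for {expected!r}")


async def max_length_check(output: str, limit: int) -> CheckResult:
    """Length check: passes if `output` has at most `limit` characters."""
    passed = len(output) <= limit
    return CheckResult("max_length", passed, f"{len(output)} chars, limit {limit}")


async def run_checks(output: str) -> list[CheckResult]:
    # Schedule both checks concurrently; gather preserves argument order.
    return list(
        await asyncio.gather(
            contains_check(output, "Paris"),
            max_length_check(output, 80),
        )
    )


# Evaluate one (hard-coded) model output and serialize the results to JSON.
results = asyncio.run(run_checks("The capital of France is Paris."))
report = json.dumps([asdict(r) for r in results], indent=2)
print(report)
```

Freezing the result type means a stored report cannot be altered after the run, which is the property that makes results safe to archive and compare later.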

Giskard Checks is designed for:

  • RAG Evaluation: Test groundedness, relevance, and context usage in retrieval-augmented generation systems
  • Agent Testing: Validate multi-step agent workflows with tool calls and complex reasoning
  • Quality Assurance: Ensure consistent output quality across model updates and deployments
  • LLM Guardrails: Implement safety checks, content moderation, and compliance validation
  • Regression Testing: Track model behavior changes over time with reproducible test suites
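As an illustration of the agent-testing use case, the sketch below records a multi-turn conversation as a trace and asserts that the agent called the expected tools in the expected order. This is plain Python under stated assumptions, not the Giskard Checks API: `Interaction`, `tools_in_order`, and the tool names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One turn of a hypothetical trace: user input, agent reply, tools called."""

    user: str
    agent: str
    tool_calls: tuple[str, ...] = ()


def tools_in_order(trace: list[Interaction], expected: list[str]) -> bool:
    """Return True if `expected` appears as an ordered subsequence of all tool calls."""
    called = iter(t for turn in trace for t in turn.tool_calls)
    # `tool in called` advances the iterator, so each match must come after the last.
    return all(tool in called for tool in expected)


# A two-turn trace with made-up tool names.
trace = [
    Interaction(
        user="Book a table for two tonight",
        agent="I found three restaurants nearby.",
        tool_calls=("search_restaurants",),
    ),
    Interaction(
        user="The Italian one, please",
        agent="Your table is booked.",
        tool_calls=("check_availability", "create_booking"),
    ),
]

ok = tools_in_order(trace, ["search_restaurants", "create_booking"])
wrong_order = tools_in_order(trace, ["create_booking", "search_restaurants"])
print(ok, wrong_order)
```

Keeping the whole trace, instead of only the final answer, is what lets a test assert on intermediate steps such as tool calls.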