
# Core API

Base classes and fundamental types for building checks and scenarios.


## `Check`

Base class for all checks. Subclass and register to create custom validation logic.

Module: giskard.checks.core.check

The `Check` class provides the foundation for all validation logic in Giskard. Create custom checks by subclassing `Check` and registering them under a unique kind identifier with the `@Check.register("kind")` decorator.

| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| `name` | `str \| None` | `None` | Optional check name for reporting |
| `description` | `str \| None` | `None` | Human-readable description of what the check validates |

### `run`

Execute the check logic against the provided trace.

Parameters:

  • trace (Trace): The trace containing interaction history. Access the current interaction via trace.last or trace.interactions[-1]

Returns:

  • CheckResult: Success, failure, error, or skip result
```python
from giskard.checks import Check, CheckResult, Trace

@Check.register("my_custom_check")
class MyCustomCheck(Check):
    threshold: float = 0.8

    async def run(self, trace: Trace) -> CheckResult:
        # Your validation logic
        score = self._calculate_score(trace)
        if score >= self.threshold:
            return CheckResult.success(
                message=f"Score {score} meets threshold",
                metrics={"score": score},
            )
        else:
            return CheckResult.failure(
                message=f"Score {score} below threshold {self.threshold}",
                metrics={"score": score},
            )

    def _calculate_score(self, trace: Trace) -> float:
        # Custom scoring logic
        return 0.85
```

## `CheckResult`

Immutable result produced by running a check.

Module: giskard.checks.core.result

CheckResult encapsulates the outcome of a check execution, including the status (pass/fail/error/skip), optional message, metrics, and additional details.

| Attribute | Type | Description |
| --- | --- | --- |
| `status` | `CheckStatus` | Outcome status (`PASS`, `FAIL`, `ERROR`, `SKIP`) |
| `message` | `str \| None` | Optional short message to surface to users |
| `metrics` | `list[Metric]` | List of auxiliary metrics captured by the check |
| `details` | `dict[str, Any]` | Arbitrary structured payload with additional context |

Use the static factory methods to create results:

```python
from giskard.checks import CheckResult

# Success - check passed
result = CheckResult.success(
    message="All validations passed",
    details={"score": 0.95, "threshold": 0.8},
)

# Failure - check did not pass
result = CheckResult.failure(
    message="Score below threshold",
    details={"score": 0.65, "threshold": 0.8, "reason": "low confidence"},
)

# Error - unexpected exception or condition
result = CheckResult.error(
    message="Failed to connect to API",
    details={"error": "Connection timeout"},
)

# Skip - precondition not met
result = CheckResult.skip(
    message="Skipped: No outputs available",
    details={"reason": "empty_response"},
)
```
| Property | Type | Description |
| --- | --- | --- |
| `passed` | `bool` | `True` if status is `PASS` |
| `failed` | `bool` | `True` if status is `FAIL` |
| `errored` | `bool` | `True` if status is `ERROR` |
| `skipped` | `bool` | `True` if status is `SKIP` |
```python
result = CheckResult.success(message="Check passed")
if result.passed:
    print("✓ Check succeeded")
```

## `CheckStatus`

Enumeration of possible check execution outcomes.

Module: giskard.checks.core.result

| Status | Description |
| --- | --- |
| `PASS` | Check validation succeeded |
| `FAIL` | Check validation failed |
| `ERROR` | Unexpected error during check execution |
| `SKIP` | Check was skipped (e.g., precondition not met) |
```python
from giskard.checks import CheckStatus

# Use in conditional logic
if result.status == CheckStatus.PASS:
    print("Success!")
elif result.status == CheckStatus.FAIL:
    print("Validation failed")
```

## `Interaction`

A single exchange of inputs and outputs.

Module: giskard.checks.core.trace

An interaction represents one exchange in a conversation or workflow, capturing the inputs provided, the outputs produced, and optional metadata.

| Attribute | Type | Description |
| --- | --- | --- |
| `inputs` | `InputType` | Input values for this interaction (e.g., user message, API request) |
| `outputs` | `OutputType` | Output values produced in response (e.g., assistant reply, API response) |
| `metadata` | `dict[str, Any]` | Optional metadata (timing, tool calls, intermediate states, etc.) |
```python
from giskard.checks import Interaction

# Simple text interaction
interaction = Interaction(
    inputs="What is the capital of France?",
    outputs="The capital of France is Paris.",
    metadata={"model": "gpt-4", "tokens": 15},
)

# Structured interaction
interaction = Interaction(
    inputs={"query": "weather", "location": "Paris"},
    outputs={"temperature": 20, "conditions": "sunny"},
    metadata={"api": "weather_service", "latency_ms": 120},
)
```

## `Trace`

Immutable history of all interactions in a scenario.

Module: giskard.checks.core.trace

A trace accumulates all interactions that have occurred during scenario execution. It is passed to checks for validation and to interaction specs for generating subsequent interactions.

The trace is immutable (frozen), ensuring that checks and specs cannot accidentally modify the history.

| Attribute | Type | Description |
| --- | --- | --- |
| `interactions` | `list[Interaction]` | Ordered list of all interactions; most recent at `[-1]` |
| `last` | `Interaction \| None` | Computed property returning the last interaction, or `None` if empty |
```python
from giskard.checks import Check, CheckResult, Trace, scenario

# Create a scenario with multiple interactions
test_scenario = (
    scenario("example_trace")
    .interact(inputs="Hello", outputs="Hi!")
    .interact(inputs="How are you?", outputs="I'm well!")
)

# Access the trace in checks
@Check.register("trace_check")
class TraceCheck(Check):
    async def run(self, trace: Trace) -> CheckResult:
        # Access the last interaction
        last_interaction = trace.last
        # Access all interactions
        all_interactions = trace.interactions
        # Count interactions
        count = len(trace.interactions)
        return CheckResult.success(
            message=f"Processed {count} interactions"
        )
```
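The frozen-trace behavior can be illustrated with a stand-in built on frozen dataclasses, one common way to implement immutability in Python (Giskard's internals may differ):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Interaction:
    inputs: object
    outputs: object

@dataclass(frozen=True)
class Trace:
    # A tuple keeps the history itself immutable, not just the attribute
    interactions: tuple = ()

    @property
    def last(self):
        """Return the most recent interaction, or None if the trace is empty."""
        return self.interactions[-1] if self.interactions else None

trace = Trace((Interaction("Hello", "Hi!"),))
print(trace.last.outputs)  # Hi!

try:
    trace.interactions = ()  # attempt to mutate the frozen trace
except FrozenInstanceError:
    print("trace is immutable")
```

Any attempt to reassign an attribute raises `FrozenInstanceError`, which is how checks and specs are prevented from accidentally rewriting history.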

## `InteractionSpec`

Declarative specification for generating interactions.

Module: giskard.checks.core.interaction

InteractionSpec defines how to generate interactions in a scenario. It can use static values or callables that compute values based on the current trace.

```python
from giskard.checks import scenario

# Static values
test_case = (
    scenario("static_example")
    .interact(
        inputs="test input",
        outputs="test output",
    )
)

# Callable outputs - dynamic generation
test_case = (
    scenario("dynamic_example")
    .interact(
        inputs="test query",
        outputs=lambda inputs: my_model(inputs),
    )
)

# Access trace context
test_case = (
    scenario("context_example")
    .interact(
        inputs=lambda trace: f"Previous: {trace.last.outputs if trace.last else 'None'}",
        outputs=lambda inputs: generate_response(inputs),
    )
)
```
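The static-vs-callable resolution shown above boils down to a small dispatch: if the supplied value is callable, invoke it with the available context; otherwise use it as-is. A minimal sketch of that logic (illustrative only; the actual `InteractionSpec` resolution may differ):

```python
from typing import Any

def resolve(value: Any, *context: Any) -> Any:
    """Return `value` directly, or call it with the given context if callable."""
    return value(*context) if callable(value) else value

# A static value passes through unchanged
print(resolve("test input"))  # test input

# A callable is invoked with the context (here, the inputs)
print(resolve(lambda inputs: inputs.upper(), "hello"))  # HELLO
```

This is why the same `interact()` keyword can accept either a literal string or a lambda taking the trace or inputs.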

## `Scenario`

Ordered sequence of interaction specs and checks with a shared trace.

Module: giskard.checks.core.scenario

A Scenario represents a multi-step test workflow combining interactions and checks. Each interaction is executed in order, building up the trace, and checks can be inserted at any point to validate the state.

```python
from giskard.checks import scenario, from_fn

# Create a multi-step scenario
test_scenario = (
    scenario("multi_step_flow")
    # First interaction
    .interact(inputs="Hello", outputs="Hi there!")
    # Check after first interaction
    .check(from_fn(lambda trace: "Hi" in trace.last.outputs, name="greeting_check"))
    # Second interaction
    .interact(inputs="What's the weather?", outputs="It's sunny!")
    # Final validation
    .check(from_fn(lambda trace: len(trace.interactions) == 2, name="count_check"))
)

# Run the scenario
result = await test_scenario.run()
print(f"Status: {result.status}")
print(f"Interactions: {len(result.trace.interactions)}")
```
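The execution model, where interactions extend the trace and each check sees the trace as of its insertion point, can be sketched in plain Python. This is an illustration of the flow only, not Giskard's runner:

```python
import asyncio

async def run_steps(steps):
    """Execute ('interact', inputs, outputs) and ('check', fn) steps in order."""
    trace = []    # simplified trace: a list of (inputs, outputs) pairs
    results = []
    for step in steps:
        if step[0] == "interact":
            _, inputs, outputs = step
            trace.append((inputs, outputs))  # interactions build up the trace
        else:
            _, check_fn = step
            results.append(check_fn(trace))  # a check sees the trace so far
    return trace, results

steps = [
    ("interact", "Hello", "Hi there!"),
    ("check", lambda t: "Hi" in t[-1][1]),  # runs after the first interaction
    ("interact", "What's the weather?", "It's sunny!"),
    ("check", lambda t: len(t) == 2),       # runs after both interactions
]
trace, results = asyncio.run(run_steps(steps))
print(results)  # [True, True]
```

The ordering matters: moving the second check before the second interaction would make it fail, because it would only see one entry in the trace.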

### `interact`

Add an interaction spec to the scenario.

Parameters:

  • inputs: Static value or callable (trace) -> value
  • outputs: Static value or callable (inputs) -> value or (trace, inputs) -> value
  • metadata: Optional metadata dict

Returns: The scenario (for chaining)

### `check`

Add a check to the scenario.

Parameters:

  • check: A Check instance to validate the trace at this point

Returns: The scenario (for chaining)

### `run`

Execute the scenario and return results.

Returns: ScenarioResult with status, trace, and check results


## `TestCase`

Container for running multiple test scenarios.

Module: giskard.checks.core.testcase

A TestCase groups multiple scenarios for batch execution. Each scenario runs independently with its own trace.

```python
from giskard.checks import scenario, from_fn

# Create individual scenarios (recommended approach)
test_scenario = (
    scenario("my_test")
    .interact(inputs="test", outputs="result")
    .check(from_fn(lambda trace: True, name="check1"))
    .check(from_fn(lambda trace: True, name="check2"))
)

# Run the scenario
result = await test_scenario.run()
```
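Since each scenario runs independently with its own trace, a batch of scenarios can also be driven directly with `asyncio.gather` over each scenario's `run()` coroutine. The sketch below uses stand-in coroutines rather than real scenarios, so the batching pattern itself is testable in isolation:

```python
import asyncio

async def run_scenario(name: str) -> dict:
    """Stand-in for `scenario(...).run()`; each call has its own state."""
    await asyncio.sleep(0)  # yield control, as a real async run would
    return {"name": name, "status": "pass"}

async def run_all():
    # Each scenario runs independently, mirroring batch execution semantics
    return await asyncio.gather(
        run_scenario("login_flow"),
        run_scenario("checkout_flow"),
    )

results = asyncio.run(run_all())
print([r["name"] for r in results])  # ['login_flow', 'checkout_flow']
```

`gather` preserves the order of its arguments, so results line up with the scenarios that produced them even when execution interleaves.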

Base classes for extracting values from traces.

Module: giskard.checks.core.extraction

## `Extractor`

Base class for extracting values from traces.

```python
from typing import Any

from giskard.checks import Trace
from giskard.checks.core.extraction import Extractor

class CustomExtractor(Extractor):
    def extract(self, trace: Trace) -> Any:
        # Custom extraction logic
        return trace.last.outputs if trace.last else None
```

## `JsonPathExtractor`

Extract values using JSONPath expressions.

Module: giskard.checks.core.extraction

```python
from giskard.checks import JsonPathExtractor

# Extract from the trace using JSONPath
extractor = JsonPathExtractor(key="trace.last.outputs.answer")
value = extractor.extract(trace)

# Extract nested values
extractor = JsonPathExtractor(key="trace.interactions[0].metadata.model")
model_name = extractor.extract(trace)
```

Common JSONPath Patterns:

| Pattern | Description |
| --- | --- |
| `trace.last.inputs` | Inputs of the last interaction |
| `trace.last.outputs` | Outputs of the last interaction |
| `trace.last.metadata.key` | Metadata value from the last interaction |
| `trace.interactions[0]` | First interaction |
| `trace.interactions[-1]` | Last interaction (same as `trace.last`) |
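A dotted-path lookup like the patterns above can be sketched for plain dicts: split the path on dots, handle an optional `[index]` suffix per segment, and descend step by step. This is illustrative only; `JsonPathExtractor`'s real resolution rules may differ:

```python
import re
from typing import Any

def extract_path(root: Any, path: str) -> Any:
    """Resolve a dotted path with optional [index] segments, e.g. 'a.b[0].c'."""
    value = root
    for part in path.split("."):
        match = re.fullmatch(r"(\w+)(?:\[(-?\d+)\])?", part)
        name, index = match.group(1), match.group(2)
        # Dict access for mappings, attribute access for objects
        value = value[name] if isinstance(value, dict) else getattr(value, name)
        if index is not None:
            value = value[int(index)]  # apply the [n] subscript, if present
    return value

data = {"last": {"outputs": {"answer": "Paris"}}, "interactions": [{"id": 1}]}
print(extract_path(data, "last.outputs.answer"))  # Paris
print(extract_path(data, "interactions[0].id"))   # 1
```

Negative indices like `[-1]` work too, since the subscript is passed straight through to Python's list indexing.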

## Configuration

Global configuration functions for the checks system.

Module: giskard.checks

### `set_default_generator`

Set the default LLM generator for LLM-based checks.

```python
from giskard.agents.generators import Generator
from giskard.checks import set_default_generator

# Configure the default generator
set_default_generator(Generator(model="openai/gpt-4"))

# LLM-based checks now use this generator by default
from giskard.checks import Groundedness

check = Groundedness()  # Uses the default generator
```

Parameters:

  • generator (Generator): The generator instance to use as default

### `get_default_generator`

Get the currently configured default generator.

```python
from giskard.checks import get_default_generator

generator = get_default_generator()
print(f"Using generator: {generator.model}")
```

Returns:

  • Generator: The current default generator instance
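This pair of functions follows the module-level default pattern: a single module global holds the default, a setter replaces it, and a getter reads it. A stand-in sketch (not Giskard's code; `FakeGenerator` and the error-on-unset behavior are assumptions of this sketch):

```python
_default_generator = None  # module-level slot holding the process-wide default

def set_default_generator(generator) -> None:
    """Install a process-wide default used when no generator is passed explicitly."""
    global _default_generator
    _default_generator = generator

def get_default_generator():
    """Return the current default (raising when unset is an assumption here)."""
    if _default_generator is None:
        raise RuntimeError("No default generator configured")
    return _default_generator

class FakeGenerator:  # hypothetical stand-in for giskard.agents.generators.Generator
    model = "openai/gpt-4"

set_default_generator(FakeGenerator())
print(get_default_generator().model)  # openai/gpt-4
```

Because the default lives at module scope, it applies process-wide: every check constructed after `set_default_generator` runs picks it up.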

## `ScenarioRunner`

Low-level runner for executing scenarios.

Module: giskard.checks.core.scenario

`ScenarioRunner` is the execution engine behind scenarios. Most users should prefer the higher-level `scenario().run()` API rather than using `ScenarioRunner` directly.