Write agent tests

Monocle provides two testing APIs for validating agent behavior: the MonocleValidator framework and the Fluent API.

The test scenario

The test sends this prompt:

"Book a flight from SFO to BOM next week and a Marriott hotel in Mumbai on 10/15/25"

Expected behavior:

  1. adk_book_flight should NOT be called — the date "next week" is vague
  2. adk_book_hotel should be called — the hotel date is specific
  3. adk_hotel_booking_agent should be invoked
  4. The final response should NOT say "flight booked"

Fluent API tests

The Fluent API provides a chainable assertion interface:

import pytest

# root_agent (the Google ADK agent under test) and PROMPT (the scenario
# prompt shown above) come from the sample booking application.

@pytest.mark.asyncio
async def test_flight_booking_fluent(monocle_trace_asserter):
    await monocle_trace_asserter.run_agent_async(
        root_agent, "google_adk", PROMPT
    )

    # The flight tool should NOT be called (the date "next week" is vague)
    monocle_trace_asserter.does_not_call_tool("adk_book_flight")

    # The hotel tool SHOULD be called by the hotel booking agent
    monocle_trace_asserter.called_tool("adk_book_hotel", "adk_hotel_booking_agent")

    # The hotel agent SHOULD be invoked
    monocle_trace_asserter.called_agent("adk_hotel_booking_agent")

    # The final output should NOT claim the flight was booked
    monocle_trace_asserter.does_not_contain_output(
        "A flight from SFO to BOM has been booked"
    )

Adding quality checks

The Fluent API also supports Okahu Cloud quality assessments:

    # Run hallucination check on the full trace
    monocle_trace_asserter.with_evaluation("okahu") \
        .check_eval("hallucination", "no_hallucination")

This sends the trace to Okahu and runs a hallucination assessment. The test fails if the agent's output is classified as hallucinated.
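
Putting it together, a complete test that chains a quality check onto a structural assertion might look like the sketch below. It assumes the same monocle_trace_asserter fixture, root_agent, and PROMPT as above, and that Okahu Cloud credentials are already configured:

@pytest.mark.asyncio
async def test_flight_booking_with_eval(monocle_trace_asserter):
    await monocle_trace_asserter.run_agent_async(
        root_agent, "google_adk", PROMPT
    )

    # Structural check: the vague flight date must not trigger a booking
    monocle_trace_asserter.does_not_call_tool("adk_book_flight")

    # Quality check: send the trace to Okahu and assert no hallucination
    monocle_trace_asserter.with_evaluation("okahu") \
        .check_eval("hallucination", "no_hallucination")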

Available assertion methods

Method                             Description
called_tool(name, agent)           Assert the named tool was called by the named agent
does_not_call_tool(name)           Assert the tool was NOT called
called_agent(name)                 Assert the agent was invoked
contains_output(text)              Assert the final output contains the text
does_not_contain_output(text)      Assert the final output does NOT contain the text
with_evaluation(provider)          Start a chain of Okahu quality assessments
.check_eval(metric, expected)      Assert the result of a quality check
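
For example, a positive output check pairs contains_output with called_agent. A brief sketch; the expected substring "Marriott" is taken from the scenario prompt and is purely illustrative:

    # Assert the hotel agent ran and its booking surfaced in the output
    monocle_trace_asserter.called_agent("adk_hotel_booking_agent")
    monocle_trace_asserter.contains_output("Marriott")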

Running tests locally

pip install -r tests/requirements.txt
pytest tests/test_adk_flight_booking_fluent.py -v
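
To run a single test instead of the whole file, pass its pytest node ID:

pytest tests/test_adk_flight_booking_fluent.py::test_flight_booking_fluent -v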

Test results

Test results show pass/fail status for each assertion:

[Screenshot: passing and failing test results]

Negative tests are critical

Notice that half the assertions are negative — verifying the agent did NOT do something. For AI agents, testing what shouldn't happen is as important as testing what should.