# Write agent tests
Monocle provides two testing APIs for validating agent behavior: the MonocleValidator framework and the Fluent API.
## The test scenario
The test sends this prompt:
"Book a flight from SFO to BOM next week and a Marriott hotel in Mumbai on 10/15/25"
Expected behavior:
- `adk_book_flight` should NOT be called — the date "next week" is vague
- `adk_book_hotel` should be called — the hotel date is specific
- `adk_hotel_booking_agent` should be invoked
- The final response should NOT say "flight booked"
## Fluent API tests
The Fluent API provides a chainable assertion interface:
```python
@pytest.mark.asyncio
async def test_flight_booking_fluent(monocle_trace_asserter):
    await monocle_trace_asserter.run_agent_async(
        root_agent, "google_adk", PROMPT
    )

    # Tool should NOT be called (vague date)
    monocle_trace_asserter.does_not_call_tool("adk_book_flight")

    # Tool SHOULD be called
    monocle_trace_asserter.called_tool("adk_book_hotel", "adk_hotel_booking_agent")

    # Agent SHOULD be invoked
    monocle_trace_asserter.called_agent("adk_hotel_booking_agent")

    # Output should NOT contain this text
    monocle_trace_asserter.does_not_contain_output(
        "A flight from SFO to BOM has been booked"
    )
```
## Adding quality checks
The Fluent API also supports Okahu Cloud quality assessments:
```python
# Run hallucination check on the full trace
monocle_trace_asserter.with_evaluation("okahu") \
    .check_eval("hallucination", "no_hallucination")
```
This sends the trace to Okahu and runs a hallucination assessment. The test fails if the agent's output is classified as hallucinated.
## Available assertion methods
| Method | Description |
|---|---|
| `called_tool(name, agent)` | Assert a tool was called by a specific agent |
| `does_not_call_tool(name)` | Assert a tool was NOT called |
| `called_agent(name)` | Assert an agent was invoked |
| `contains_output(text)` | Assert the final output contains the given text |
| `does_not_contain_output(text)` | Assert the final output does NOT contain the given text |
| `with_evaluation(provider)` | Chain Okahu quality assessments |
| `check_eval(metric, expected)` | Assert a quality check result |
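To make the chainable style concrete, here is a minimal, self-contained sketch of how a fluent trace asserter of this shape could work. The class and field names (`FakeTraceAsserter`, `tool_calls`, and so on) are illustrative stand-ins, not Monocle's actual implementation; each assertion returns `self`, which is what makes chaining possible.

```python
from dataclasses import dataclass, field

@dataclass
class FakeTraceAsserter:
    """Illustrative stand-in for a fluent trace asserter (not Monocle's real class)."""
    tool_calls: list = field(default_factory=list)   # (tool_name, agent_name) pairs
    agent_calls: list = field(default_factory=list)  # names of invoked agents
    final_output: str = ""

    def called_tool(self, name, agent):
        assert (name, agent) in self.tool_calls, f"{name} was not called by {agent}"
        return self  # returning self enables chaining

    def does_not_call_tool(self, name):
        assert all(tool != name for tool, _ in self.tool_calls), f"{name} was called"
        return self

    def called_agent(self, name):
        assert name in self.agent_calls, f"{name} was not invoked"
        return self

    def does_not_contain_output(self, text):
        assert text not in self.final_output, f"output unexpectedly contains: {text}"
        return self

# A trace matching the scenario above: hotel booked, no flight booked.
trace = FakeTraceAsserter(
    tool_calls=[("adk_book_hotel", "adk_hotel_booking_agent")],
    agent_calls=["adk_hotel_booking_agent"],
    final_output="Your Marriott hotel in Mumbai is booked for 10/15/25.",
)

trace.does_not_call_tool("adk_book_flight") \
     .called_tool("adk_book_hotel", "adk_hotel_booking_agent") \
     .called_agent("adk_hotel_booking_agent") \
     .does_not_contain_output("A flight from SFO to BOM has been booked")
```

Because every method returns the asserter, the four checks can be written as one chained expression or as separate statements, whichever reads better in a given test.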
## Running tests locally
Run the suite with pytest as usual; the results show pass/fail status for each assertion.
**Negative tests are critical.** Notice that half the assertions are negative — they verify the agent did NOT do something. For AI agents, testing what shouldn't happen is as important as testing what should.
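As a sketch of why this matters, consider a hypothetical regression in which the agent starts booking the vaguely dated flight anyway. The positive check on the hotel booking still passes; only the negative assertion flags the bug. The tool names below mirror the scenario above, but the runs are invented for illustration.

```python
def assert_no_flight_booked(tool_calls):
    """Negative assertion: the vaguely dated flight must not be booked."""
    assert "adk_book_flight" not in tool_calls

# Correct run: only the hotel tool fires -- both checks pass.
good_run = ["adk_book_hotel"]
assert "adk_book_hotel" in good_run
assert_no_flight_booked(good_run)

# Regressed run: the agent also books the flight. The positive check
# still passes, so only the negative assertion catches the bug.
bad_run = ["adk_book_flight", "adk_book_hotel"]
assert "adk_book_hotel" in bad_run        # positive check: still green
try:
    assert_no_flight_booked(bad_run)      # negative check: fails as intended
except AssertionError:
    print("regression caught by negative assertion")
```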

