# Write agent tests
Monocle provides two testing APIs for validating agent behavior: the MonocleValidator framework and the Fluent API.
## The test scenario
The test sends this prompt:
"Book a flight from SFO to BOM next week and a Marriott hotel in Mumbai on 10/15/25"
Expected behavior:
- `adk_book_flight` should NOT be called — the date "next week" is vague
- `adk_book_hotel` should be called — the hotel date is specific
- `adk_hotel_booking_agent` should be invoked
- The final response should NOT say "flight booked"
## Fluent API tests
The Fluent API provides a chainable assertion interface:
```python
@pytest.mark.asyncio
async def test_flight_booking_fluent(monocle_trace_asserter):
    await monocle_trace_asserter.run_agent_async(
        root_agent, "google_adk", PROMPT
    )

    # Tool should NOT be called (vague date)
    monocle_trace_asserter.does_not_call_tool("adk_book_flight")

    # Tool SHOULD be called
    monocle_trace_asserter.called_tool("adk_book_hotel", "adk_hotel_booking_agent")

    # Agent SHOULD be invoked
    monocle_trace_asserter.called_agent("adk_hotel_booking_agent")

    # Output should NOT contain this text
    monocle_trace_asserter.does_not_contain_output(
        "A flight from SFO to BOM has been booked"
    )
```
## Adding quality checks
The Fluent API also supports Okahu Cloud quality assessments:
```python
# Run hallucination check on the full trace
monocle_trace_asserter.with_evaluation("okahu") \
    .check_eval("hallucination", "no_hallucination")
```
This sends the trace to Okahu and runs a hallucination assessment. The test fails if the agent's output is classified as hallucinated.
## Available assertion methods
| Method | Description |
|---|---|
| `called_tool(name, agent)` | Assert a tool was called by a specific agent |
| `does_not_call_tool(name)` | Assert a tool was NOT called |
| `called_agent(name)` | Assert an agent was invoked |
| `contains_output(text)` | Assert the final output contains the given text |
| `does_not_contain_output(text)` | Assert the final output does NOT contain the given text |
| `with_evaluation(provider)` | Chain Okahu quality assessments |
| `check_eval(metric, expected)` | Assert a quality check result |
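To make the chainable style concrete, here is a minimal, self-contained sketch of how a fluent trace asserter of this shape could work. The class and field names (`FakeTraceAsserter`, `tool_calls`, and so on) are illustrative stand-ins, not Monocle's actual implementation; each assertion returns `self`, which is what makes chaining possible.

```python
from dataclasses import dataclass, field

@dataclass
class FakeTraceAsserter:
    """Illustrative stand-in for a fluent trace asserter (not Monocle's real class)."""
    tool_calls: list = field(default_factory=list)   # (tool_name, agent_name) pairs
    agent_calls: list = field(default_factory=list)  # names of invoked agents
    final_output: str = ""

    def called_tool(self, name, agent):
        assert (name, agent) in self.tool_calls, f"{name} was not called by {agent}"
        return self  # returning self enables chaining

    def does_not_call_tool(self, name):
        assert all(tool != name for tool, _ in self.tool_calls), f"{name} was called"
        return self

    def called_agent(self, name):
        assert name in self.agent_calls, f"{name} was not invoked"
        return self

    def does_not_contain_output(self, text):
        assert text not in self.final_output, f"output unexpectedly contains: {text}"
        return self

# A trace matching the scenario above: hotel booked, no flight booked.
trace = FakeTraceAsserter(
    tool_calls=[("adk_book_hotel", "adk_hotel_booking_agent")],
    agent_calls=["adk_hotel_booking_agent"],
    final_output="Your Marriott hotel in Mumbai is booked for 10/15/25.",
)

trace.does_not_call_tool("adk_book_flight") \
     .called_tool("adk_book_hotel", "adk_hotel_booking_agent") \
     .called_agent("adk_hotel_booking_agent") \
     .does_not_contain_output("A flight from SFO to BOM has been booked")
```

Because every method returns the asserter, the four checks can be written as one chained expression or as separate statements, whichever reads better in a given test.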
## Running tests locally
Run the suite with pytest as usual; the results show pass/fail status for each assertion.
**Negative tests are critical.** Notice that half the assertions are negative — they verify the agent did NOT do something. For AI agents, testing what shouldn't happen is as important as testing what should.
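As a sketch of why this matters, consider a hypothetical regression in which the agent starts booking the vaguely dated flight anyway. The positive check on the hotel booking still passes; only the negative assertion flags the bug. The tool names below mirror the scenario above, but the runs are invented for illustration.

```python
def assert_no_flight_booked(tool_calls):
    """Negative assertion: the vaguely dated flight must not be booked."""
    assert "adk_book_flight" not in tool_calls

# Correct run: only the hotel tool fires -- both checks pass.
good_run = ["adk_book_hotel"]
assert "adk_book_hotel" in good_run
assert_no_flight_booked(good_run)

# Regressed run: the agent also books the flight. The positive check
# still passes, so only the negative assertion catches the bug.
bad_run = ["adk_book_flight", "adk_book_hotel"]
assert "adk_book_hotel" in bad_run        # positive check: still green
try:
    assert_no_flight_booked(bad_run)      # negative check: fails as intended
except AssertionError:
    print("regression caught by negative assertion")
```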

