Detect hallucinations before they ship
MCP server for Claude Code and Codex. Uses KL divergence to measure whether citations actually support claims.
Two MCP Tools
Choose the right tool for your use case.
detect_hallucination
Quick, automatic verification
- Auto-splits text into claims
- Extracts citations automatically
- Fast and convenient
result = detect_hallucination(
    answer="TLS 1.3 is enabled [S0]",
    spans=[{"sid": "S0", "text": "..."}],
)

audit_trace_budget
Lower-level, precise control
- You provide atomic claims
- Explicit cite IDs
- More reliable results
result = audit_trace_budget(
    steps=[{"idx": 0, "claim": "...",
            "cites": ["S0"], "confidence": 0.95}],
    spans=[{"sid": "S0", "text": "..."}],
)

Works where you work
Seamless integration with Claude Code, Codex, or your Python scripts.
# Register the MCP server with Claude Code
claude mcp add hallucination-detector \
  -e OPENAI_API_KEY=$OPENAI_API_KEY -- \
  python -m strawberry.mcp_server
# Then in any conversation, just ask:
# "Check my last response for hallucinations"Pre-built Codex Skills
Ready-to-use AI agents with hallucination gating built in.
Evidence-first debugging: reproduce → evidence → hypotheses → Strawberry-verified ROOT_CAUSE → plan → execute
$rca-fix-agent
Plan-driven proof repair/synthesis for LaTeX + formal backstop (Lean/Coq)
$proof-repair-agent
Focused, anti-repeat brute-force loop for stuck proof gaps with Attempt Ledger
$proof-attack-agent
Evidence-first hierarchical planning with local forensics + web lookup
$planning-agent
Strawberry-gated plan execution: accepts steps only when success is provably true
$execution-agent
Invoke any skill by name: @$hallucination-detector or @$rca-fix-agent

Math, not magic
Information-theoretic verification using KL divergence to measure evidence support.
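As a minimal sketch of the idea (illustrative only, not pythea's exact formulation): treat the model's belief about a claim as a probability distribution, estimated once with the cited span in context and once without it. The KL divergence between the two distributions measures how much work the citation actually does. The numbers below are made up for illustration.

import math

def kl_divergence(p, q):
    # D_KL(P || Q) = sum of p[i] * log(p[i] / q[i])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical belief distributions over answers to one claim,
# with the cited span in context vs. with it withheld.
p_with_citation = [0.90, 0.07, 0.03]
p_without_citation = [0.40, 0.35, 0.25]

# A large divergence means the evidence substantially shifts the
# model's belief, i.e. the citation is doing real support work.
print(kl_divergence(p_with_citation, p_without_citation))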
Get started in 60 seconds
Install the package and register the MCP server.
$ pip install pythea
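Then register the MCP server (the same command shown above):

$ claude mcp add hallucination-detector \
    -e OPENAI_API_KEY=$OPENAI_API_KEY -- \
    python -m strawberry.mcp_server

That's it: ask Claude Code to "check my last response for hallucinations", or call the tools from a Python script. A minimal script-side sketch follows; the import path is an assumption, so check the pythea docs for the exact module:

# Hypothetical import path, for illustration only.
from pythea import detect_hallucination

result = detect_hallucination(
    answer="TLS 1.3 is enabled [S0]",
    spans=[{"sid": "S0", "text": "ssl_protocols TLSv1.3;"}],  # illustrative span
)
print(result)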