Live runtime · no signup

Run a live Wauldo Agent.

Pick a preset, fire a prompt. Watch the state machine, see every claim verified, get a numeric support score.

91%

Median pass

+48pts

vs LangChain

5.7s

Avg latency

Hallucinations

1 · Preset

Specialists run 5 LLM states (3 of them in parallel). Verdict, support score and per-claim checks are inline.

2 · Mode

Factual Injection

3 · Prompt — or leave empty

Try:

Demo corpus loaded: France · BM25/dense retrieval · Wilson 95% CI · prompt injection defenses. Ask about these for grounded verification.

Hits POST /v1/tasks · quota-capped

Inside each run

Classify

Input and retrieved context are tagged data vs. instruction. Injection markers stripped before the LLM sees them.

Generate

Preset-gated workflow runs. Only the tools each state declares allowed_tools can fire.

Verify

Every claim checked against sources. Verdict, support score, and per-claim breakdown returned.

Ready to ship

Same engine in your stack.

This sandbox uses the exact POST /v1/tasks endpoint your API key hits. Same runtime, same verifier, same verdict surface.

Product page Ablation data

Free tier

300 req/month

SDKs

Py · TS · Rust

No lock-in

OpenAI-compat