How to catch your first hallucination?
3 steps. Under 3 minutes. Watch Guard block a wrong answer.
Python: pip install wauldo
TypeScript: npm install wauldo
cURL: no install needed.
from wauldo import HttpClient

client = HttpClient(
    base_url="https://smart-rag-api.p.rapidapi.com",
    api_key="YOUR_RAPIDAPI_KEY"
)

# Your LLM said this. Is it true?
result = client.guard(
    text="Returns are accepted within 60 days of purchase",
    source_context="Our return policy allows returns within 14 days."
)
print(result)  # verdict: rejected, reason: numerical_mismatch
import { HttpClient } from 'wauldo';

const client = new HttpClient({
  baseUrl: 'https://smart-rag-api.p.rapidapi.com',
  apiKey: 'YOUR_RAPIDAPI_KEY'
});

// Your LLM said this. Is it true?
const result = await client.guard(
  'Returns are accepted within 60 days of purchase',
  'Our return policy allows returns within 14 days.'
);
console.log(result); // verdict: rejected, reason: numerical_mismatch
curl -X POST https://smart-rag-api.p.rapidapi.com/v1/fact-check \
-H "X-RapidAPI-Key: YOUR_KEY" \
-H "X-RapidAPI-Host: smart-rag-api.p.rapidapi.com" \
-H "Content-Type: application/json" \
-d '{"text":"Returns are accepted within 60 days of purchase","source_context":"Our return policy allows returns within 14 days.","mode":"lexical"}'
{
  "verdict": "rejected",
  "action": "block",
  "reason": "numerical_mismatch",
  "confidence": 0.03,
  "supported": false
}
60 days vs 14 days. Guard caught it. Your user never sees the wrong answer.
Next: Upload your own documents and verify answers against them.
API Documentation
Wauldo is a RAG API that returns verified answers with source citations and confidence scores. OpenAI SDK compatible. Zero hallucinations.
https://api.wauldo.com
REST + SSE Streaming
RapidAPI Key or JWT
How does Wauldo authentication work?
Two authentication methods are supported:
Option 1 — RapidAPI (recommended)
Get your API key from RapidAPI and include it in every request:
// Headers
X-RapidAPI-Key: your_api_key
X-RapidAPI-Host: smart-rag-api.p.rapidapi.com
Option 2 — Direct API Key
For self-hosted deployments, use a long-lived API key:
# Pass your API key as a Bearer token
curl -X POST https://api.wauldo.com/v1/fact-check \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "60 days", "source_context": "14 days", "mode": "lexical"}'
How to get started with Wauldo quickly?
Upload a document and get a verified answer in 2 API calls:
Upload your document
curl -X POST https://api.wauldo.com/v1/upload \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"content": "Section 4.2: Late payments incur a 2% monthly fee...",
"filename": "contract.txt"
}'
Ask a question
curl -X POST https://api.wauldo.com/v1/query \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "What is the late payment fee?", "top_k": 5}'
Get a verified answer
{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    { "content": "Section 4.2: Late payments incur a 2% monthly fee...", "score": 0.92 }
  ],
  "audit": {
    "confidence": 0.92,
    "grounded": true,
    "model": "auto"
  }
}
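The same two-call flow can be sketched in Python with only the standard library. This is illustrative glue code, not an official SDK: the helper names `build_request` and `call` are ours, and `token` is a placeholder for your key.

```python
import json
import urllib.request

BASE = "https://api.wauldo.com"

def build_request(path: str, body: dict, token: str) -> urllib.request.Request:
    """Assemble an authenticated JSON POST for the Wauldo API."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def call(path: str, body: dict, token: str) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, body, token)) as resp:
        return json.loads(resp.read())

# 1. Upload the document:
#    call("/v1/upload", {"content": "Section 4.2: ...", "filename": "contract.txt"}, token)
# 2. Ask a question and read the verified answer plus audit:
#    result = call("/v1/query", {"query": "What is the late payment fee?", "top_k": 5}, token)
#    print(result["answer"], result["audit"]["confidence"])
```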
How does Wauldo's OpenAI SDK compatibility work?
Wauldo is a drop-in replacement for the OpenAI API. Just change the base_url — your existing code works as-is.
from openai import OpenAI

# Just swap the base_url — everything else is the same
client = OpenAI(
    base_url="https://api.wauldo.com/v1",
    api_key="your_jwt_token"
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)
for chunk in response:
    print(chunk.choices[0].delta.content, end="")
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.wauldo.com/v1',
  apiKey: 'your_jwt_token',
});

const stream = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
curl https://api.wauldo.com/v1/chat/completions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "auto",
"messages": [{"role": "user", "content": "Explain quantum computing"}],
"stream": true
}'
How to upload documents to Wauldo?
/v1/upload
Upload text content to be chunked, indexed, and available for queries.
Request Body
| Parameter | Type | Description |
|---|---|---|
| content required | string | Document text content (max 10MB) |
| filename optional | string | Filename for source tracking (e.g. report.txt) |
Response 200
{
  "status": "success",
  "chunks_count": 12,
  "source": "report.txt"
}
How to upload files to Wauldo?
/v1/upload/file
Upload a file directly using multipart form data.
Supported formats
curl -X POST https://api.wauldo.com/v1/upload/file \
-H "Authorization: Bearer $TOKEN" \
-F "file=@contract.txt"
Response 200
{
  "status": "success",
  "chunks_count": 24,
  "source": "contract.txt",
  "file_size": 15234
}
How to query Wauldo for verified answers?
/v1/query
Ask a question against your uploaded documents. Returns a verified answer with sources, confidence score, and full audit trail.
Request Body
| Parameter | Type | Description |
|---|---|---|
| query required | string | Your question |
| top_k optional | integer | Number of source chunks to retrieve (default: 5, max: 20) |
| stream optional | boolean | Enable SSE streaming — see Streaming guide |
| debug optional | boolean | Include retrieval funnel diagnostics — see Audit Trail |
| quality_mode optional | string | fast, balanced, or premium — see Quality Modes |
Response 200
{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    {
      "content": "Section 4.2: Late payments incur a 2% monthly fee...",
      "score": 0.92,
      "source": "contract.txt"
    }
  ],
  "audit": {
    "confidence": 0.92,
    "confidence_label": "high",
    "grounded": true,
    "retrieval_path": "BM25Reranked",
    "model": "auto",
    "latency_ms": 1420,
    "sources_used": 2,
    "sources_evaluated": 5
  }
}
How does Wauldo Chat Completions work?
/v1/chat/completions
OpenAI-compatible chat endpoint. Works with any OpenAI SDK. Supports streaming.
Request Body
| Parameter | Type | Description |
|---|---|---|
| messages required | array | Array of {"role": "user" \| "system" \| "assistant", "content": "..."} |
| model optional | string | Model name or "auto" (default: auto-selected) |
| stream optional | boolean | Enable SSE streaming (recommended for UX) |
| temperature optional | number | Sampling temperature, 0.0 to 2.0 (default: 0.7) |
| max_tokens optional | integer | Maximum tokens in the response |
How to list available models in Wauldo?
/v1/models
Returns available models. OpenAI SDK compatible.
curl https://api.wauldo.com/v1/models \
-H "Authorization: Bearer $TOKEN"
How to use Wauldo collections?
/v1/collections
List all document collections for the authenticated tenant.
/v1/collections/{name}
Delete a collection and all its chunks. Useful for re-uploading updated documents.
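A sketch of calling the delete endpoint from Python with the standard library. The helper names are ours; the collection name is URL-escaped defensively since it lands in the path.

```python
import urllib.parse
import urllib.request

BASE = "https://api.wauldo.com"

def collection_url(name: str) -> str:
    """Build the /v1/collections/{name} URL, escaping the collection name."""
    return f"{BASE}/v1/collections/{urllib.parse.quote(name, safe='')}"

def delete_collection(name: str, token: str) -> None:
    """Delete a collection and all its chunks (e.g. before re-uploading)."""
    req = urllib.request.Request(
        collection_url(name),
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
    urllib.request.urlopen(req).close()
```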
How does Wauldo fact-check work?
/v1/fact-check
Verify text claims against source context. Returns a structured verdict with actionable decisions per claim.
Request
curl -X POST https://api.wauldo.com/v1/fact-check \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{
"text": "Returns are accepted within 60 days. Gift cards never expire.",
"source_context": "Our policy allows returns within 14 days. Gift cards expire after 12 months.",
"mode": "lexical"
}'
Response
{
  "verdict": "rejected",
  "action": "block",
  "hallucination_rate": 1.0,
  "claims": [
    {
      "text": "Returns are accepted within 60 days.",
      "verdict": "rejected",
      "action": "block",
      "confidence": 0.3,
      "confidence_label": "very_low",
      "reason": "numerical_mismatch"
    },
    {
      "text": "Gift cards never expire.",
      "verdict": "rejected",
      "action": "block",
      "confidence": 0.2,
      "confidence_label": "very_low",
      "reason": "negation_conflict"
    }
  ]
}
Verification Modes
| Mode | Speed | Accuracy | Requires |
|---|---|---|---|
| lexical | <1ms | Good (catches numbers, negations) | Nothing |
| hybrid | ~50ms | Better (adds semantic similarity) | Embedding model |
| semantic | ~100ms | Best (full embeddings) | Embedding model |
Verdicts & Actions
| Verdict | Action | Meaning |
|---|---|---|
| verified | allow | Claim matches source (confidence ≥ 0.7) |
| weak | review | Partial match, needs human review (0.4–0.7) |
| rejected | block | Contradiction or no evidence (< 0.4) |
Rejection Reasons
| Reason | Example |
|---|---|
| numerical_mismatch | "60 days" vs source says "14 days" |
| negation_conflict | "never expire" vs source says "12 months" |
| insufficient_evidence | Claim topic not found in source |
| partial_match | Some overlap but not enough to verify |
Use Cases
- Customer support — verify agent responses before sending
- Compliance — check documents against policy
- Content moderation — detect false claims automatically
- AI pipelines — validate LLM outputs before downstream use
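For the AI-pipeline case, the verdict-to-action mapping can be turned into a small gate. A sketch (the function name is ours; the input is a /v1/fact-check response shaped like the example above):

```python
def gate_response(fact_check: dict) -> tuple[str, list[str]]:
    """Return the top-level action plus the text of every blocked claim,
    so callers can block, review, or pass through an LLM answer."""
    blocked = [
        claim["text"]
        for claim in fact_check.get("claims", [])
        if claim["action"] == "block"
    ]
    return fact_check["action"], blocked

# Using the sample response above:
sample = {
    "verdict": "rejected",
    "action": "block",
    "claims": [
        {"text": "Returns are accepted within 60 days.", "action": "block"},
        {"text": "Gift cards never expire.", "action": "block"},
    ],
}
action, blocked = gate_response(sample)
# action == "block"; both claim texts are in `blocked`
```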
How does Wauldo citation verify work?
/v1/verify
Verify that AI-generated text properly cites its sources. Detects uncited sentences, phantom citations (references to non-existent sources), and calculates citation coverage.
Request
curl -X POST https://api.wauldo.com/v1/verify \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_TOKEN" \
-d '{
"text": "Rust was released in 2010 [Source: rust_book]. It is fast [Source: fake_doc].",
"sources": [
{"name": "rust_book", "content": "Rust was first released in 2010 by Mozilla."}
],
"threshold": 0.5
}'
Response
{
  "citation_ratio": 1.0,
  "has_sufficient_citations": true,
  "sentence_count": 2,
  "citation_count": 2,
  "uncited_sentences": [],
  "citations": [
    {"citation": "[Source: rust_book]", "source_name": "rust_book", "is_valid": true},
    {"citation": "[Source: fake_doc]", "source_name": "fake_doc", "is_valid": false}
  ],
  "phantom_count": 1,
  "processing_time_ms": 0
}
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | AI-generated text to verify (max 64 KB) |
| sources | array | No | Source chunks to validate citations against |
| threshold | number | No | Min citation ratio (0.0–1.0, default 0.5) |
Citation Formats Detected
| Format | Example |
|---|---|
| Source tag | [Source: doc1], [Ref: paper], [See: manual] |
| Numeric | [1], [2], [42] |
| Parenthetical | (Source: report), (Ref: study) |
| Footnote | ^1, ^12 |
Use Cases
- RAG pipelines — ensure LLM responses cite retrieved chunks
- Academic/legal — verify all claims are sourced
- Hallucination detection — flag phantom citations referencing missing sources
- Quality gates — block responses below citation threshold
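A /v1/verify response can back such a quality gate directly: pass only when coverage clears the threshold and no phantom citations are present. An illustrative sketch (the function name is ours):

```python
def citation_gate(verify: dict, min_ratio: float = 0.5) -> tuple[bool, list[str]]:
    """Return (passes, phantom_citations) for a /v1/verify response.

    citation_ratio can be 1.0 even when a citation points at a missing
    source, so both checks are needed.
    """
    phantoms = [
        c["citation"] for c in verify.get("citations", []) if not c["is_valid"]
    ]
    passes = verify["citation_ratio"] >= min_ratio and not phantoms
    return passes, phantoms

# The sample response above has full coverage but one phantom citation,
# so the gate fails:
ok, phantoms = citation_gate({
    "citation_ratio": 1.0,
    "citations": [
        {"citation": "[Source: rust_book]", "source_name": "rust_book", "is_valid": True},
        {"citation": "[Source: fake_doc]", "source_name": "fake_doc", "is_valid": False},
    ],
})
# ok is False; phantoms == ["[Source: fake_doc]"]
```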
How to get insights from Wauldo?
/v1/insights
Get ROI metrics for your API key: token savings, estimated cost reduction, policy distribution, and validation latency. Track exactly how much value the pipeline delivers.
Request
curl https://api.wauldo.com/v1/insights \
-H "Authorization: Bearer YOUR_TOKEN"
Response
{
  "tig_key": "tig_abc123",
  "total_requests": 1842,
  "intelligence_requests": 1650,
  "fallback_requests": 192,
  "tokens": {
    "baseline_total": 2450000,
    "real_total": 1890000,
    "saved_total": 560000,
    "saved_percent_avg": 22.8
  },
  "cost": {
    "estimated_usd_saved": 1.12
  },
  "policy": {
    "reasoning_distribution": {"fast": 1200, "balanced": 380, "deep": 70},
    "rag_distribution": {"full": 900, "light": 550, "none": 200}
  },
  "validation": {
    "validation_distribution": {"strict": 500, "standard": 1150},
    "total_validation_tokens": 85000,
    "total_validation_latency_ms": 42000,
    "avg_validation_latency_ms": 25.5
  },
  "period": {
    "since": "2026-04-01T00:00:00Z",
    "until": "now"
  }
}
Response Fields
| Field | Description |
|---|---|
| tokens.saved_total | Total tokens saved by the intelligence pipeline |
| tokens.saved_percent_avg | Average savings percentage across all requests |
| cost.estimated_usd_saved | Estimated cost savings in USD |
| policy.reasoning_distribution | How many requests used each reasoning mode (fast/balanced/deep) |
| validation.avg_validation_latency_ms | Average validation latency in milliseconds |
Shareable Card
GET /v1/insights/share returns a standalone HTML page with your savings metrics, optimized for sharing on LinkedIn and Twitter (Open Graph tags included).
How to use Wauldo analytics?
/v1/analytics
Cache performance, token savings, cost tracking, and system prompt deduplication metrics. Monitor your API usage and optimization in real time.
Request
curl "https://api.wauldo.com/v1/analytics?minutes=60" \
-H "Authorization: Bearer YOUR_TOKEN"
Response
{
  "cache": {
    "total_requests": 500,
    "result_store_hits": 45,
    "semantic_cache_hits": 120,
    "cache_misses": 335,
    "cache_hit_rate": 0.33,
    "avg_latency_ms": 180.5,
    "p95_latency_ms": 450.0,
    "p99_latency_ms": 890.0
  },
  "tokens": {
    "total_baseline": 125000,
    "total_real": 98000,
    "total_saved": 27000,
    "avg_savings_percent": 21.6
  },
  "cost": {
    "total_cost_usd": 0.25,
    "estimated_cost_saved_usd": 0.054,
    "cost_per_hour_usd": 0.25
  },
  "dedup": {
    "unique_system_prompts": 3,
    "total_requests": 500,
    "total_tokens_saved": 15000
  },
  "uptime_secs": 86400
}
Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| minutes | integer | No | Time window in minutes for cost metrics (default: 60). Cache, token, and dedup stats are cumulative since server start. |
Traffic Monitoring
GET /v1/analytics/traffic returns per-tenant traffic stats: requests today, tokens used, success rate, average latency, and P95 latency. Useful for monitoring production workloads.
curl https://api.wauldo.com/v1/analytics/traffic \
-H "Authorization: Bearer YOUR_TOKEN"
Traffic Response
{
  "total_requests_today": 3200,
  "total_tokens_today": 1450000,
  "top_tenants": [
    {
      "tenant_id": "user_abc",
      "requests_today": 850,
      "tokens_used": 380000,
      "success_rate": 0.98,
      "avg_latency_ms": 210
    }
  ],
  "error_rate": 0.02,
  "avg_latency_ms": 180,
  "p95_latency_ms": 450,
  "uptime_secs": 86400
}
How to check Wauldo API health?
/health
Returns API health, RAG chunk count, Redis status, active provider, and uptime. No auth required.
{
  "status": "ok",
  "rag_chunks": 142,
  "redis": "connected",
  "provider": "openrouter",
  "uptime_seconds": 86400
}
SSE Streaming
When stream: true is set on /v1/query, the response is delivered as Server-Sent Events (SSE). This lets you show sources and stream the answer token-by-token for a great UX.
Event sequence
Example: consume the stream
import requests, json

resp = requests.post(
    "https://api.wauldo.com/v1/query",
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "What is the late fee?", "stream": True},
    stream=True
)
for line in resp.iter_lines():
    if not line:
        continue
    data = line.decode().removeprefix("data: ")
    if data == "[DONE]":
        break
    event = json.loads(data)
    if "sources" in event:
        print(f"Found {len(event['sources'])} sources")
    elif "token" in event:
        print(event["token"], end="")
    elif "audit" in event:
        print(f"\nConfidence: {event['audit']['confidence']}")
const resp = await fetch('https://api.wauldo.com/v1/query', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${token}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What is the late fee?', stream: true })
});
const reader = resp.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  for (const line of decoder.decode(value).split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6);
    if (data === '[DONE]') return;
    const event = JSON.parse(data);
    if (event.token) document.getElementById('answer').textContent += event.token;
  }
}
curl -N https://api.wauldo.com/v1/query \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "What is the late fee?", "stream": true}'
# Output:
# data: {"sources": [...]}
# data: {"token": "The"}
# data: {"token": " contract"}
# data: {"token": " specifies"}
# ...
# data: {"audit": {"confidence": 0.92, "grounded": true, ...}}
# data: [DONE]
Audit Trail
Every query response includes an audit object that makes the answer self-verifiable. Use it to build trust indicators in your UI, flag low-confidence answers, or debug retrieval issues.
Audit fields
Using audit in your app
# Show a trust badge based on confidence
audit = response["audit"]
if audit["grounded"] and audit["confidence_label"] == "high":
    show_badge("Verified", color="green")         # Safe to display
elif audit["confidence_label"] == "medium":
    show_badge("Likely correct", color="yellow")  # Show with caveat
else:
    show_badge("Low confidence", color="red")     # Warn the user

# Log for monitoring
log(model=audit["model"], latency=audit["latency_ms"], path=audit["retrieval_path"])
Quality Modes
Control the speed/quality tradeoff with the quality_mode parameter. If omitted, Wauldo auto-selects the best tier based on your query complexity and RAG confidence.
# Explicitly set quality mode
curl -X POST https://api.wauldo.com/v1/query \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "Analyze the financial implications",
"quality_mode": "premium"
}'
Retrieval Paths
Wauldo picks a retrieval strategy per query based on signal strength. The retrieval_path in the audit trail tells you which was used.
Fast keyword matching. Used when the query closely matches document terms. Fastest path (~10ms retrieval).
Example: "What is the late payment fee?" against a contract with those exact terms.
Lexical retrieval + neural reranking. Best balance of speed and accuracy. Catches semantic matches that keyword search might miss.
Example: "How much extra do I pay if I'm late?" — paraphrased query, similar meaning.
Full dense vector search with rank fusion. Most thorough but slowest path. Used when the query is conceptually related but uses different vocabulary.
Example: "financial penalties for overdue invoices" against a doc that says "late payment fee".
Use Cases
Wauldo works best when you need verified, source-cited answers from your own documents.
Legal & Compliance
Upload contracts, policies, or regulations. Ask about specific clauses, obligations, or deadlines. Every answer cites the exact section.
Q: "What is the termination notice period?"
A: "60 days written notice (Section 12.3)"
confidence: 0.95 | grounded: true
Knowledge Base / Support
Upload product docs, FAQs, or runbooks. Build a support bot that gives accurate answers instead of hallucinating.
Q: "How do I reset my password?"
A: "Go to Settings > Security > Reset..."
confidence: 0.88 | grounded: true
Financial Analysis
Upload earnings reports, balance sheets, or market research. Extract specific numbers with source verification.
Q: "What was Q3 revenue?"
A: "$4.2M, up 23% YoY (page 3)"
confidence: 0.91 | grounded: true
Technical Documentation
Upload API specs, architecture docs, or code. Get precise technical answers grounded in your actual documentation.
Q: "What's the max payload size?"
A: "10MB per request (API limits doc)"
confidence: 0.93 | grounded: true
How to install Wauldo Python SDK from PyPI?
pip install wauldo
from wauldo import HttpClient

client = HttpClient(base_url="https://api.wauldo.com", api_key="YOUR_API_KEY")

# Guard — catch hallucinations in 3 lines
result = client.guard(
    text="Returns accepted within 60 days.",
    source_context="Our policy: returns within 14 days.",
)
print(result.verdict)           # "rejected"
print(result.claims[0].reason)  # "numerical_mismatch"

# RAG — upload, ask, verify
client.rag_upload(content="Your document text...", filename="doc.txt")
result = client.rag_query("What are the key points?")
print(result.answer)
print(result.sources)
How to install Wauldo TypeScript SDK from npm?
npm install wauldo
import { HttpClient } from 'wauldo';

const client = new HttpClient({
  baseUrl: 'https://api.wauldo.com',
  apiKey: 'YOUR_API_KEY',
});

// Guard — catch hallucinations
const result = await client.guard(
  'Returns accepted within 60 days.',
  'Our policy: returns within 14 days.',
);
console.log(result.verdict);           // "rejected"
console.log(result.claims[0]?.reason); // "numerical_mismatch"

// RAG — upload, ask, verify
await client.ragUpload('Your document text...', 'doc.txt');
const answer = await client.ragQuery('What are the key points?');
console.log(answer.answer);
How to use Wauldo Rust SDK from crates.io?
cargo add wauldo
use wauldo::{HttpClient, ChatRequest, ChatMessage};

let client = HttpClient::with_key("https://api.wauldo.com", "YOUR_API_KEY")?;

// Guard — catch hallucinations
let result = client.guard(
    "Returns accepted within 60 days.",
    "Our policy: returns within 14 days.",
    None,
).await?;
println!("Verdict: {}", result.verdict); // "rejected"

// RAG — upload, ask, verify
client.rag_upload("Your document text...", None).await?;
let result = client.rag_query("What are the key points?", None).await?;
println!("{}", result.answer);
How to deploy agents with Wauldo?
Create custom AI agents that verify every response before delivery. Upload documents, configure behavior, run queries — every answer is fact-checked.
Quick Start — Create and run an agent in 60 seconds
# 1. Create an agent
curl -X POST https://api.wauldo.com/v1/agents \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{
"name": "support-bot",
"description": "Customer support agent",
"wauldo_toml": "[agent]\nname = \"support-bot\"\n\n[model]\nprovider = \"openrouter\"\nname = \"auto\"",
"agents_md": "# Support Bot\nAnswer questions based ONLY on uploaded documents."
}'
# Returns: { "id": "ag_abc123", "name": "support-bot", ... }
# 2. Upload a document
curl -X POST https://api.wauldo.com/v1/upload \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"content": "Returns accepted within 14 days...", "filename": "policy.txt"}'
# 3. Run the agent
curl -X POST https://api.wauldo.com/v1/agents/ag_abc123/runs \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"input": "What is the return policy?", "verification_mode": "balanced"}'
# Returns: { "task_id": "t_xyz", "status": "queued" }
# 4. Get the verified result
curl https://api.wauldo.com/v1/tasks/t_xyz \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant"
# Returns:
# {
# "result": "Returns are accepted within 14 days.",
# "verification": { "verdict": "SAFE", "trust_score": 1.0 }
# }
Agent Configuration (wauldo.toml)
wauldo.toml declares the agent's identity: which model to call, how strict verification should be, where the agent runs, and whether it remembers prior runs. It's the single file that controls behavior across all /v1/agents/:id/runs calls. AGENTS.md (optional) layers natural-language behavior instructions on top.
Minimal example
Two required sections, defaults applied for everything else.
[agent]
name = "support-bot"
[model]
provider = "openrouter"
name = "auto"
Full example (all sections)
[agent]
name = "support-bot"
description = "Handles customer support questions"
instructions = "./AGENTS.md" # behavior file path
skills = "./skills/" # skills directory
mcp = "./mcp.json" # MCP server config
[model]
provider = "openrouter" # openrouter | openai | anthropic | ollama
name = "auto" # "auto" for smart routing, or any provider model id
fallback = ["auto"] # tried in order if primary fails
temperature = 0.2
[sandbox]
type = "none" # none | docker | daytona | modal | runloop
[verification]
mode = "balanced" # strict | balanced | permissive
min_trust_score = 0.6 # 0.0 – 1.0; reject below this
[deploy]
target = "local" # local | fly | render | selfhost
region = "cdg"
[memory]
enabled = true
namespace = "support"
auto_write = true
Field reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| agent.name | string | required | — | [a-zA-Z0-9_-] only. Identifier shown in dashboards and logs. |
| agent.description | string | optional | "" | One-line summary of what the agent does. |
| agent.instructions | string | optional | "./AGENTS.md" | Path to the markdown file with behavior rules. |
| agent.skills | string | optional | "./skills/" | Directory of optional skill files. |
| agent.mcp | string | optional | "./mcp.json" | MCP server configuration file. |
| model.provider | string | required | — | One of openrouter, openai, anthropic, ollama. |
| model.name | string | required | — | Model identifier, or "auto" for cost-aware routing. |
| model.fallback | string[] | optional | [] | Models tried in order if the primary fails. |
| model.temperature | number | optional | null | Sampling temperature passed through to the provider. |
| sandbox.type | enum | optional | "none" | none \| docker \| daytona \| modal \| runloop. Where tool calls execute. |
| verification.mode | enum | optional | "balanced" | strict \| balanced \| permissive. Default for runs that don't override. |
| verification.min_trust_score | number | optional | 0.6 | In [0.0, 1.0]. Below this, results are flagged. |
| deploy.target | enum | optional | "local" | local \| fly \| render \| selfhost. |
| deploy.region | string | optional | null | Deploy-target-specific region code. |
| memory.enabled | bool | optional | false | When true, the agent reads/writes a memory namespace across runs. |
| memory.namespace | string | optional | "" | Logical bucket separating memory between agents. |
| memory.auto_write | bool | optional | false | When true, every successful run is auto-saved to memory. |
Pass the file's contents as the wauldo_toml string field on POST /v1/agents. Validation runs server-side: missing required fields or out-of-range values return 400.
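When the config is generated programmatically, the `wauldo_toml` string can be assembled in a few lines. A minimal Python sketch (the builder function is ours; it covers only the two required sections plus verification, and mirrors the documented `agent.name` character set):

```python
import re

def build_wauldo_toml(name: str, provider: str, model: str,
                      verification_mode: str = "balanced") -> str:
    """Assemble a minimal wauldo.toml string for POST /v1/agents."""
    if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
        raise ValueError("agent.name must match [a-zA-Z0-9_-]")
    if provider not in ("openrouter", "openai", "anthropic", "ollama"):
        raise ValueError(f"unknown provider: {provider}")
    return (
        f'[agent]\nname = "{name}"\n\n'
        f'[model]\nprovider = "{provider}"\nname = "{model}"\n\n'
        f'[verification]\nmode = "{verification_mode}"\n'
    )

toml = build_wauldo_toml("support-bot", "openrouter", "auto")
# `toml` is ready to send as the wauldo_toml field on POST /v1/agents
```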
AGENTS.md (optional)
# Support Bot
You are a customer support agent for Acme Corp.
## Rules
- Answer questions based ONLY on the uploaded documents.
- If the answer is not in your sources, say "I don't have this information."
- Never invent facts or numbers.
- Be concise and professional.
Available Presets
A preset is a built-in multi-state workflow that shapes how the agent reasons before answering. Pass the preset name as the preset field on POST /v1/agents (set at agent creation) or as {"preset": "..."} on POST /v1/agents/:id/runs to override per-run. If omitted, runs default to general_task. wauldo.toml + AGENTS.md control identity, model, and tone — the multi-state workflow comes from the preset.
| preset | Description | Typical use case | States |
|---|---|---|---|
| general_task | Single-state grounded Q&A. No side effects unless explicitly asked. | Default chat, support bots, simple lookups | 1 |
| planner_executor | Plan-then-execute. Decomposes the query into ordered steps with tool hints + dependencies, executes each step, then synthesises a cited answer. ReAct-style autonomous decomposition. | Multi-step research, anything that benefits from explicit planning before tool calls | 3 |
| rust_backend_architect | Senior Rust engineer. Analysis → Tradeoffs → Mitigations → Implementation → Validation. | Backend design review, architecture critique | 5 |
| rag_data_engineer | RAG pipeline expert. Audit, chunking strategy, embeddings, retrieval, eval plan. | RAG tuning, retrieval quality work | 5 |
| security_auditor | Threat modelling, vulnerability assessment, OWASP/CWE-tagged mitigations. | Security review of code or architecture | 5 |
| data_analyst | Data profiling, exploratory analysis, statistical modelling, executive summary. | KPI dashboards, business insights from data | 5 |
| growth_hacker | Distribution strategy, channel ROI, launch sequences, pricing positioning. | OSS / dev-tool go-to-market planning | 5 |
Invoke a preset via the API
curl -X POST https://api.wauldo.com/v1/agents/ag_abc123/runs \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{
"input": "Critique my Axum service auth layer",
"preset": "rust_backend_architect",
"verification_mode": "balanced"
}'
# Returns: { "task_id": "t_xyz", "status": "queued" }
# Stream state-by-state via GET /v1/tasks/t_xyz/stream
Custom workflows
Beyond the seven built-in presets, you can send your own workflow inline via the custom_preset field on POST /v1/agents. Format matches the built-in presets: workflow.states (allowed_tools, parallel_group, required_outputs), transitions, system, guardrails. Server enforces hard limits — max 50 states, ≤256 KB JSON, no transition cycles, every state's allowed_tools must reference a tool currently registered.
| Limit | Value | Why |
|---|---|---|
| max_states | 50 | Cap memory + linear-scan validation cost |
| max_size | 256 KB | Reject payload bombs at parse time |
| cycles | rejected | DFS check across transitions avoids infinite loops |
| unknown tools | rejected | Defense-in-depth — agent silently skips unknowns at runtime, but the API rejects upfront so you get a clear error |
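The cycle rejection above amounts to a depth-first search over the transition graph. An illustrative sketch (not the server's actual implementation) using the documented `from_state`/`to_state` fields:

```python
def has_cycle(transitions: list[dict]) -> bool:
    """Detect a cycle in a workflow's transition graph with a
    white/grey/black depth-first search."""
    graph: dict[str, list[str]] = {}
    for t in transitions:
        graph.setdefault(t["from_state"], []).append(t["to_state"])

    WHITE, GREY, BLACK = 0, 1, 2
    colour: dict[str, int] = {}

    def dfs(state: str) -> bool:
        colour[state] = GREY
        for nxt in graph.get(state, []):
            c = colour.get(nxt, WHITE)
            if c == GREY:          # back edge: re-entered an open state
                return True
            if c == WHITE and dfs(nxt):
                return True
        colour[state] = BLACK      # fully explored, no cycle through here
        return False

    return any(
        colour.get(state, WHITE) == WHITE and dfs(state)
        for state in list(graph)
    )
```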
Inline a custom workflow
curl -X POST https://api.wauldo.com/v1/agents \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{
"name": "my-triage-agent",
"wauldo_toml": "[agent]\nname=\"triage\"\n[model]\nprovider=\"openrouter\"\nname=\"auto\"",
"custom_preset": {
"version": "2.0",
"metadata": { "name": "MyTriage", "strict_mode": false },
"system": { "role": "Issue triage assistant" },
"workflow": {
"default_steps": ["Classify", "Answer"],
"states": {
"Classify": { "allowed_tools": ["wikipedia"] },
"Answer": { "allowed_tools": [] }
},
"transitions": [
{ "from_state": "Classify", "to_state": "Answer", "condition_trigger": "always" }
]
},
"output_formats": { "default": { "schema": { "schema_type": "object", "required": [], "properties": {} } } },
"guardrails": { "forbidden_behaviors": ["leak credentials"] }
}
}'
# custom_preset takes precedence over the built-in `preset` field if both are set.
Need an even larger graph or stricter limits raised? contact@wauldo.com — quotas can be tuned per tenant.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/agents | Create an agent. Body: {"name","description","wauldo_toml","agents_md"?,"preset"?} |
| GET | /v1/agents | List your agents |
| GET | /v1/agents/:id | Get agent details |
| PATCH | /v1/agents/:id | Update agent config. Body: partial fields |
| DELETE | /v1/agents/:id | Delete agent |
| POST | /v1/agents/:id/runs | Run agent. Body: {"input": "...", "verification_mode": "strict" \| "balanced" \| "off"} → returns {task_id, status} |
All authenticated endpoints require Authorization: Bearer YOUR_KEY and x-rapidapi-user: YOUR_TENANT_ID. Poll GET /v1/tasks/:id or stream GET /v1/tasks/:id/stream to retrieve the run result.
Verification Modes
Naming note — trust_score vs support_score. The API JSON returns the field as trust_score for backward compatibility with v0.x clients. The Python, TypeScript, and Rust SDKs expose it as support_score. Both refer to the same value: the 0–1 fraction of claims supported by the source documents you uploaded.
Every agent run goes through the verification pipeline. Control the strictness:
| Mode | Behavior |
|---|---|
| strict | Unverified answers are blocked. Safest. |
| balanced | Unverified answers marked as partial. Default. |
| permissive | All answers returned with support score. Most lenient. |
Verdict enum
Each completed task returns a verification.verdict. Pair it with verification.trust_score (0.0 – 1.0) and the optional verification.message for display.
| verdict | When | Recommended action |
|---|---|---|
| SAFE | trust_score ≥ 0.7, claims supported by uploaded sources | Deliver as-is |
| UNCERTAIN | 0.4 ≤ trust_score < 0.7 | Show with warning / human review |
| PARTIAL | Mix of supported + unsupported claims | Display scrubbed version; see stripped_claims |
| BLOCK | Hallucination detected OR prompt injection | Do not surface to users |
| CONFLICT | Contradictory numerical values in output | Review before delivery |
| UNVERIFIED | No source documents uploaded — or no claim found sufficient support in the sources | Upload docs via /v1/upload to enable real verification; treat as low-trust until then |
Note on UNVERIFIED: when verification_source = "prompt_only", the returned confidence and hallucination_rate reflect self-consistency of the LLM output against the prompt — not ground-truth fact-checking. trust_score is forced to 0.0 in that case. Treat verdict + trust_score + message as authoritative.
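The verdict table above maps naturally onto a small dispatch helper. A minimal sketch in plain Python — the `action_for` function and its return strings are illustrative conventions, not part of the Wauldo SDKs:

```python
# Map a verification verdict to a handling action, following the verdict table.
# Illustrative only; not part of the Wauldo SDKs.
def action_for(verdict: str, trust_score: float) -> str:
    if verdict == "SAFE" and trust_score >= 0.7:
        return "deliver"       # claims supported by uploaded sources
    if verdict == "UNCERTAIN":
        return "warn"          # show with warning / human review
    if verdict == "PARTIAL":
        return "scrub"         # display scrubbed version; see stripped_claims
    if verdict in ("BLOCK", "CONFLICT"):
        return "hold"          # do not surface / review before delivery
    return "low_trust"         # UNVERIFIED or anything unexpected

print(action_for("SAFE", 0.92))       # deliver
print(action_for("UNVERIFIED", 0.0))  # low_trust
```

Treating unknown verdicts as low-trust keeps the helper safe if new enum values are added later.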
Python quickstart (stdlib)
import json, time, urllib.request

BASE = "https://api.wauldo.com"
HEADERS = {
    "Authorization": "Bearer YOUR_KEY",
    "x-rapidapi-user": "my-tenant",
    "Content-Type": "application/json",
}

def _call(method, path, body=None):
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE + path, data=data, headers=HEADERS, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def post(path, body): return _call("POST", path, body)
def get(path): return _call("GET", path)

# Create agent
agent = post("/v1/agents", {
    "name": "my-bot",
    "wauldo_toml": '[agent]\nname = "my-bot"\n\n[model]\nprovider = "openrouter"\nname = "auto"',
})

# Upload docs
post("/v1/upload", {"content": "Your document text...", "filename": "doc.txt"})

# Run agent
run = post(f"/v1/agents/{agent['id']}/runs", {"input": "What is the refund policy?"})

# Poll for result
while True:
    task = get(f"/v1/tasks/{run['task_id']}")
    if task["status"] == "completed":
        print(task["result"])                       # Verified answer
        print(task["verification"]["verdict"])      # SAFE
        print(task["verification"]["trust_score"])  # 1.0
        break
    time.sleep(3)
Streaming (SSE) — GET /v1/tasks/:id/stream
Instead of polling, subscribe to Server-Sent Events to receive each workflow state transition as it completes. Ideal for long-running multi-state agents (RustArchitect, SecurityAuditor, etc.) where you want to stream reasoning in the UI. Each data: line is a JSON-encoded StateTransition.
| Event field | Meaning |
|---|---|
| state_name | e.g. Analysis, Tradeoffs, Answer. Synthetic TASK_COMPLETED / TASK_FAILED for already-terminal tasks. |
| to_state | Next state name, or null on final state. |
| raw_output | Full LLM output for the state (truncated to 8k chars). |
| condition | "Sequential execution", "Parallel group '…'" etc. |
| duration_ms | Wall time spent in the LLM call for this state. |
| prompt_tokens / completion_tokens | Rough estimates (~4 chars/token). |
| repair_count | Number of JSON repair passes applied on the final-state output. |
| success | true if the state completed without validation errors. |
The stream closes when the task reaches a terminal status (completed / failed / cancelled). After closure, call GET /v1/tasks/:id once to fetch the full verification block and final result. Connection TTL is 30 minutes; reconnect if needed — already-emitted events are not replayed, so resubscribers only see subsequent transitions.
# Python — consume SSE with the stdlib
import json, urllib.request

req = urllib.request.Request(
    f"https://api.wauldo.com/v1/tasks/{task_id}/stream",
    headers={
        "Authorization": "Bearer YOUR_KEY",
        "x-rapidapi-user": "my-tenant",
        "Accept": "text/event-stream",
    },
)
with urllib.request.urlopen(req) as resp:
    for raw in resp:
        line = raw.decode().rstrip()
        if line.startswith("data:"):
            ev = json.loads(line[5:].strip())
            print(f"{ev['state_name']:<16} {ev['duration_ms']:>5}ms {ev['completion_tokens']}tok")
// TypeScript — fetch + ReadableStream (browser, bun, or deno).
// Note: native EventSource cannot set custom headers, so fetch is used instead.
const headers = {
  "Authorization": "Bearer YOUR_KEY",
  "x-rapidapi-user": "my-tenant",
  "Accept": "text/event-stream",
};
const resp = await fetch(`https://api.wauldo.com/v1/tasks/${taskId}/stream`, { headers });
const reader = resp.body!.getReader();
const decoder = new TextDecoder();
let buf = "";
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buf += decoder.decode(value, { stream: true });
  const lines = buf.split("\n");
  buf = lines.pop()!; // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith("data:")) {
      const ev = JSON.parse(line.slice(5).trim());
      console.log(ev.state_name, ev.duration_ms, ev.completion_tokens);
    }
  }
}
How do Wauldo agent revisions work?
Every change to an agent's custom_preset mints an immutable, content-addressed revision (SHA-256). The agent points to one active revision; you can roll back or promote any past revision in O(1) — no LLM call, no rebuild. Modeled on AWS ECS task definitions: append-only history, atomic active pointer.
Why versioning matters
You tweak an agent the morning of a demo. It breaks. With revisions, rollback is one PATCH to the previous revision — your live runs flip back to the known-good prompt instantly. No re-validation, no re-deploy, no LLM cost.
| Trait | Behavior |
|---|---|
| immutable | Revisions are never mutated in place — content-addressed via SHA-256. |
| monotone rev | Per-agent counter, never reused even after prune. |
| implicit mint | POST /v1/agents with custom_preset mints rev 1; subsequent PATCH /v1/agents/:id with custom_preset mints the next rev. |
| cap | 50 revisions per agent. Oldest non-active revisions auto-pruned. |
| tenant-scoped | Revisions live under the tenant. Cross-tenant reads are rejected. |
| cascade delete | DELETE /v1/agents/:id purges all revisions atomically. |
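Content addressing means a revision's SHA-256 is a pure function of its payload: identical presets hash identically, so no-op mints can be detected client-side. A sketch of the idea — the exact canonicalization Wauldo applies server-side (key ordering, whitespace) is an assumption here, so treat the digest as illustrative, not as a way to predict the API's sha256:

```python
import hashlib, json

def preset_digest(custom_preset: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) makes the hash stable
    # regardless of dict insertion order. The server's canonicalization
    # may differ; this only illustrates content addressing.
    canon = json.dumps(custom_preset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canon.encode()).hexdigest()

v2 = {"version": "2.0", "workflow": {"states": []}}
reordered = {"workflow": {"states": []}, "version": "2.0"}  # same content, different order
assert preset_digest(v2) == preset_digest(reordered)  # identical content → identical digest
```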
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/agents/:id/revisions | Mint a new revision (rate-limited 5/min/tenant) |
| GET | /v1/agents/:id/revisions | List revisions newest-first |
| GET | /v1/agents/:id/revisions/:rev | Fetch one revision verbatim |
| PATCH | /v1/agents/:id/active-revision | Promote / rollback in O(1) — body {"rev": <n>} |
# Mint a new revision (becomes active by default)
curl -X POST https://api.wauldo.com/v1/agents/AGENT_ID/revisions \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{
"custom_preset": { "version": "2.0", "workflow": { "states": [...] } },
"message": "tighten triage prompt",
"set_active": true
}'
# Returns: { "rev": 4, "sha256": "abc...", "active_rev": 4 }
# List revisions, newest first
curl https://api.wauldo.com/v1/agents/AGENT_ID/revisions \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant"
# Rollback to a previous revision (no LLM cost, instant)
curl -X PATCH https://api.wauldo.com/v1/agents/AGENT_ID/active-revision \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"rev": 3}'
SDK examples
# Python
from wauldo.agents import AgentsClient
agents = AgentsClient(base_url="https://api.wauldo.com", api_key="YOUR_KEY", tenant="my-tenant")
rev = agents.create_revision("AGENT_ID", custom_preset=preset_v2, message="tighten triage")
print(rev.rev, rev.sha256)
# Rollback in one line
agents.set_active_revision("AGENT_ID", rev=3)
// TypeScript
import { AgentsClient } from "wauldo";
const agents = new AgentsClient({ baseUrl: "https://api.wauldo.com", apiKey: "YOUR_KEY", tenant: "my-tenant" });
const rev = await agents.createRevision("AGENT_ID", { customPreset: presetV2, message: "tighten triage" });
await agents.setActiveRevision("AGENT_ID", 3);
// Rust
use wauldo::{AgentsClient, CreateRevisionRequest};
let agents = AgentsClient::new("https://api.wauldo.com")
.with_api_key("YOUR_KEY")
.with_tenant("my-tenant");
let rev = agents.create_revision("AGENT_ID", CreateRevisionRequest {
custom_preset: preset_v2,
message: Some("tighten triage".into()),
set_active: true,
}).await?;
agents.set_active_revision("AGENT_ID", 3).await?;
How to configure Wauldo webhooks?
Subscribe to verdict and lifecycle events. When a task crosses an attention threshold (BLOCK / CONFLICT / UNVERIFIED), Wauldo fires a signed POST to your endpoint within seconds. Built for Slack handlers, on-call alerts, audit ingestion.
Event types
| Event | Fired when |
|---|---|
| task.completed | A task reaches a terminal verdict — payload carries verdict, support_score, halluc_rate, claims_count. |
| task.failed | A task errored out — payload carries error. |
| task.cancelled | A task was cancelled mid-execution. |
verification.alert | Auto-fired alongside task.completed when verdict is BLOCK / CONFLICT (severity high) or UNVERIFIED / INSUFFICIENT_CLAIMS / UNCERTAIN (severity medium). |
recommendation.new | A new Insights recommendation surfaced for the tenant. |
* | Wildcard subscription — receive every event. |
Reliability guarantees
| Trait | Behavior |
|---|---|
| at-least-once | Retries with exponential backoff (max 5 attempts, 1s → 60s). Your handler must be idempotent on X-Event-Id. |
| circuit breaker | Per-destination URL: 5 consecutive failures opens a 60s cooldown. A dead URL no longer drags every retry through 5 × 16 s of backoff. |
| DLQ | Final failures land in a dead-letter queue. Inspect via GET /v1/webhooks/dlq, replay via POST /v1/webhooks/dlq/:event_id/retry. |
| HMAC-SHA256 | When you register a secret, every POST carries X-Wauldo-Signature: sha256=<hex> over the raw body. Verify with the standard HMAC_SHA256(secret, raw_body) recipe. |
| SSRF guardrails | Private IPs (10/8, 172.16/12, 192.168/16, 127/8, IPv6 loopback / link-local / ULA / IPv4-mapped) are rejected at registration AND re-validated at DLQ retry. |
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/webhooks | Register a subscription |
| GET | /v1/webhooks | List subscriptions |
| DELETE | /v1/webhooks/:id | Remove a subscription |
| GET | /v1/webhooks/dlq | List failed deliveries |
| POST | /v1/webhooks/dlq/:event_id/retry | Replay a failed delivery |
| DELETE | /v1/webhooks/dlq/:event_id | Purge a DLQ entry |
# Register a webhook for verdict alerts
curl -X POST https://api.wauldo.com/v1/webhooks \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/wauldo-hook",
"events": ["verification.alert", "task.failed"],
"secret": "whsec_your_random_secret"
}'
# Verify a signature (Node.js example)
const crypto = require("crypto");
function verify(req) {
  const got = req.headers["x-wauldo-signature"];
  const expected = "sha256=" + crypto
    .createHmac("sha256", process.env.WEBHOOK_SECRET)
    .update(req.rawBody)
    .digest("hex");
  // timingSafeEqual throws on length mismatch, so guard first
  if (!got || got.length !== expected.length) return false;
  return crypto.timingSafeEqual(Buffer.from(got), Buffer.from(expected));
}
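The same check in Python, stdlib only — the header name and HMAC recipe follow the table above; how you obtain the raw body and header value depends on your web framework:

```python
import hmac, hashlib

def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    # Recompute HMAC-SHA256 over the raw (unparsed) body and compare in
    # constant time. Returns False for a missing or malformed header.
    expected = "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature_header or "", expected)
```

Pair this with deduplication on X-Event-Id: delivery is at-least-once, so your handler must treat a repeated event id as already processed.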
How do Wauldo workflows work?
Author multi-step pipelines as a state machine — Task / Choice / Wait / Pass / Fail / Succeed with explicit transitions. The runtime executes sequentially with bounded wall-clock time, persists every transition, and exposes per-state observability. Today's executor supports tool:<name> resources; agent chaining lands in a future release.
State types
| Type | Purpose | Required fields |
|---|---|---|
| Task | Invoke a registered tool. | resource, next |
| Choice | Branch on a JSONPath variable. | choices[], default |
| Wait | Pause up to 60 seconds. | seconds, next |
| Pass | Inject a constant payload. | result, next |
| Fail | Terminate with an error. | error |
| Succeed | Terminate with the current IO. | — |
Validation guarantees (at create time)
| Check | Behavior |
|---|---|
| cycle detection | DFS catches A→B→A and longer cycles before storage. start_at must reach every state. |
| transition targets | Every next must reference an existing state — no dangling pointers. |
| choice operators | Strict enum: eq, neq, gt, lt, contains. Unknown operators are rejected with 400 at create time, never at run time. |
| tenant cap | 100 workflows per tenant. Tenant-scoped — cross-tenant reads rejected. |
| durable | Persisted on the API host's local store — survives restart without re-uploading. |
Runtime guarantees (at execution time)
| Cap | Behavior |
|---|---|
| wall clock | Each run terminates within 60 seconds. Past the deadline the run is marked timed_out. |
| wait | A single Wait state cannot exceed 60 seconds. Longer values rejected at runtime. |
| transitions | Hard ceiling of 200 state visits per run — protects against runaway loops that slip past static cycle detection. |
| history | 5000 stored runs per tenant. Each run record is upserted on every state transition for full audit. |
| async | Runs are submit-and-poll. POST /runs returns 202 with an execution_id; poll GET /runs/:execution_id for status. |
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/workflows | Create a workflow definition. |
| GET | /v1/workflows | List workflows for the calling tenant. |
| GET | /v1/workflows/:id | Fetch one definition. |
| DELETE | /v1/workflows/:id | Remove a definition. |
| POST | /v1/workflows/:id/runs | Start an asynchronous execution. Returns 202 with an execution_id. |
| GET | /v1/workflows/:id/runs/:execution_id | Fetch the current state and output of a run. |
Define a workflow
Three states: compute via tool:calculator, branch on the result, terminate. State type values use PascalCase; transitions use next.
# Sequential pipeline: compute → branch → succeed
curl -X POST https://api.wauldo.com/v1/workflows \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "triage",
"start_at": "Compute",
"states": {
"Compute": {
"type": "Task",
"resource": "tool:calculator",
"next": "Route"
},
"Route": {
"type": "Choice",
"choices": [
{ "variable": "$.output", "operator": "contains", "value": "42", "next": "Done" }
],
"default": "Done"
},
"Done": { "type": "Succeed" }
}
}'
Run it and poll
Submit the run, capture the execution_id, and poll until status reaches a terminal value (succeeded, failed, timed_out).
# Start the run (202 Accepted)
curl -X POST https://api.wauldo.com/v1/workflows/$WF_ID/runs \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{ "input": { "operation": "add", "a": 21, "b": 21 } }'
# → { "execution_id": "wfr_...", "workflow_id": "wf_...", "status": "running" }
# Poll the run record
curl https://api.wauldo.com/v1/workflows/$WF_ID/runs/$EXECUTION_ID \
-H "Authorization: Bearer YOUR_KEY"
# → {
# "execution": {
# "id": "wfr_...",
# "status": "succeeded",
# "current_state": null,
# "input": { "operation": "add", "a": 21, "b": 21 },
# "output": { "output": "42" },
# "started_at": 1747...,
# "ended_at": 1747...,
# "error": null
# }
# }
SDK clients
The same create → start → poll loop without writing the HTTP calls yourself. The wait_for_run helper polls until the run is terminal, so you get the final WorkflowExecution back directly.
from wauldo.workflows import WorkflowsClient
wf = WorkflowsClient(base_url="https://api.wauldo.com", api_key="YOUR_KEY")
created = wf.create(
name="triage",
start_at="Compute",
states={
"Compute": {"type": "Task", "resource": "tool:calculator", "next": "Done"},
"Done": {"type": "Succeed"},
},
)
run = wf.start_run(created.id, input={"operation": "add", "a": 21, "b": 21})
final = wf.wait_for_run(created.id, run.execution_id)
print(final.status, final.output)  # succeeded {'output': '42'}
import { WorkflowsClient } from "wauldo";
const wf = new WorkflowsClient({ baseUrl: "https://api.wauldo.com", apiKey: "YOUR_KEY" });
const created = await wf.create({
name: "triage",
startAt: "Compute",
states: {
Compute: { type: "Task", resource: "tool:calculator", next: "Done" },
Done: { type: "Succeed" },
},
});
const run = await wf.startRun(created.id, { operation: "add", a: 21, b: 21 });
const final = await wf.waitForRun(created.id, run.execution_id);
console.log(final.status, final.output); // succeeded { output: "42" }
use wauldo::workflows::{CreateWorkflowRequest, WorkflowsClient};
use serde_json::json;
use std::collections::HashMap;
let wf = WorkflowsClient::new("https://api.wauldo.com")
.with_api_key("YOUR_KEY");
let mut states = HashMap::new();
states.insert("Compute".into(), json!({"type": "Task", "resource": "tool:calculator", "next": "Done"}));
states.insert("Done".into(), json!({"type": "Succeed"}));
let created = wf.create(CreateWorkflowRequest {
name: "triage".into(),
start_at: "Compute".into(),
states,
description: None,
}).await?;
let run = wf.start_run(&created.id, Some(json!({"operation": "add", "a": 21, "b": 21}))).await?;
let final_exec = wf.wait_for_run(&created.id, &run.execution_id, None, None).await?;
println!("{} {:?}", final_exec.status, final_exec.output);
Execution record fields
| Field | Description |
|---|---|
| id | Unique wfr_* identifier for this execution. |
| status | One of running, succeeded, failed, timed_out. |
| current_state | Name of the state being executed (while running). Null on terminal records. |
| input | The JSON body submitted to POST /runs. |
| output | Final IO value on success. Null when the run failed or timed out. |
| error | Human-readable error reason on terminal failure. Mirrors the Prometheus reason label. |
| started_at / ended_at | Unix seconds. ended_at is null while the run is in flight. |
JSONPath subset
Variables in Choice.variable and Task.output_path use a small JSONPath subset. Missing fields fall through to the default branch; invalid syntax is rejected.
| Expression | Resolves to |
|---|---|
| $ | The full current IO. |
| $.field | A top-level field. |
| $.a.b.c | A nested field path. |
| $.arr[0] | An array element by index. |
| $.users[0].name | Nested element field. |
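The subset above is small enough to resolve in a few lines of Python. A sketch of the semantics, not the real executor — here a missing field resolves to None, which a Choice would route to its default branch:

```python
import re

def resolve(path: str, payload):
    # Supports the documented subset: $, $.field, $.a.b.c, $.arr[0], $.users[0].name
    if path == "$":
        return payload
    cur = payload
    for part in path.lstrip("$").strip(".").split("."):
        m = re.fullmatch(r"(\w+)\[(\d+)\]", part)  # e.g. users[0]
        try:
            if m:
                cur = cur[m.group(1)][int(m.group(2))]
            else:
                cur = cur[part]
        except (KeyError, IndexError, TypeError):
            return None  # missing field -> Choice falls through to default
    return cur

doc = {"output": "42", "users": [{"name": "Ada"}], "a": {"b": {"c": 7}}}
print(resolve("$.output", doc))         # 42
print(resolve("$.users[0].name", doc))  # Ada
print(resolve("$.a.b.c", doc))          # 7
```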
How to use Wauldo OpenAI middleware?
Drop-in wrapper for the OpenAI Python SDK (and any OpenAI-compatible client). Three lines and every chat.completions.create() response carries a verified .wauldo namespace — verdict, support score, hallucination rate, claim count.
Install
pip install 'wauldo[openai]'
Pulls openai >= 1.0, < 2.0 as an extra. Doesn't conflict with an existing OpenAI install.
Usage
from openai import OpenAI
from wauldo.openai import with_verification
client = OpenAI() # or AsyncOpenAI, or any OpenAI-compat client
verified = with_verification( # wraps it
client,
wauldo_api_key="tig_live_...",
fact_check_mode="lexical", # or "hybrid", "semantic"
)
response = verified.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Capital of France?"}],
)
# Standard OpenAI ChatCompletion, plus:
response.wauldo.verdict # "SAFE" | "UNVERIFIED" | "CONFLICT" | ...
response.wauldo.support_score # 0.94
response.wauldo.halluc_rate # 0.0
response.wauldo.claims_count # 7
response.wauldo.fact_check_mode # "lexical"
response.wauldo.error # None when verdict attached cleanly
Behavior contract
| Trait | Behavior |
|---|---|
| non-invasive | Wraps the client via a thin proxy. Every other attribute (client.api_key, client.base_url, etc.) passes through unchanged. |
| async auto-detected | If you pass AsyncOpenAI, the wrapper returns an awaitable proxy automatically. No is_async= flag. |
| fail-open by default | Wauldo down / timeout / unexpected response shape attaches response.wauldo.error + logs a warning. The OpenAI response itself is never broken. |
| raise_on_error=True | Opt in to bubble verification failures as WauldoVerificationError exceptions instead. |
| streaming pass-through | v1: stream=True returns the OpenAI generator with response.wauldo = None. Post-stream verdict is a v2 feature. |
| no extra deps | Uses stdlib urllib for the /v1/fact-check POST — no requests / httpx pulled at runtime. |
How does Wauldo memory work?
Key-value memory with namespace isolation. Store conversation context, user preferences, or any structured data per tenant. Supports semantic search.
What your agent remembers
Your agent persists context across calls via /v1/memory/:namespace. The endpoint is a generic key-value store with semantic search — you can use any namespace name you want. To stay aligned with the broader agent ecosystem (LangChain, CrewAI), we recommend four conventional namespaces below. They are conventions only, not enforced by the API.
| Type | Namespace | TTL | Typical use case |
|---|---|---|---|
| Short-term | short_term | Client-managed | Chat history per session, windowed context. Conceptually FIFO — truncate or delete on the client side as the window fills. |
| Long-term | long_term | Persistent | User facts and preferences that should survive across sessions. Semantic search over keys + values. |
| Entity | entity | Persistent | Extracted entities (people, places, concepts) stored as structured JSON. Semantic search returns the closest entity record. |
| Contextual | contextual | Task-scoped | Scratch pad tied to a single task. Delete the keys explicitly when the task completes. |
# short_term — append a session message
curl -X POST https://api.wauldo.com/v1/memory/short_term \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"key": "session_abc:msg_017", "value": "user: how do I cancel my plan?", "tags": ["session_abc"]}'
# long_term — upsert a user preference
curl -X POST https://api.wauldo.com/v1/memory/long_term \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"key": "user_42:tone", "value": "prefers concise replies, no emoji", "tags": ["user_42"]}'
# entity — store a structured entity record
curl -X POST https://api.wauldo.com/v1/memory/entity \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"key": "person:ada_lovelace", "value": "{\"type\":\"person\",\"name\":\"Ada Lovelace\",\"role\":\"mathematician\",\"born\":1815}", "tags": ["person"]}'
# contextual — task-scoped scratch pad
curl -X POST https://api.wauldo.com/v1/memory/contextual \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"key": "task_t_xyz:plan", "value": "step 1: fetch invoice, step 2: parse total, step 3: refund", "tags": ["task_t_xyz"]}'
Namespace names are free-form — the API does not restrict you to these four (chat_history, user_profile, kb_facts, etc. all work). It also does not auto-rotate or auto-expire entries: TTL, FIFO truncation, and task-scoped cleanup are the client's responsibility via explicit DELETE /v1/memory/:namespace/:key calls.
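Since short_term windowing is client-managed, the eviction decision is yours. A minimal sketch of choosing which keys to DELETE once a session window fills — the key naming follows the session_abc:msg_NNN examples above, and the window size is an arbitrary choice:

```python
def keys_to_evict(keys: list[str], window: int) -> list[str]:
    # Keep the most recent `window` entries; everything older is evicted.
    # Keys are assumed to sort chronologically (zero-padded, e.g. msg_017).
    ordered = sorted(keys)
    return ordered[:-window] if len(ordered) > window else []

keys = [f"session_abc:msg_{i:03d}" for i in range(1, 7)]
print(keys_to_evict(keys, window=4))  # ['session_abc:msg_001', 'session_abc:msg_002']
```

Each evicted key then gets one DELETE /v1/memory/short_term/:key call.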
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/memory/:namespace | Create or update an entry |
| GET | /v1/memory/:namespace | List entries (paginated) |
| GET | /v1/memory/:namespace/:key | Get a specific entry |
| POST | /v1/memory/:namespace/search | Search entries by query |
| DELETE | /v1/memory/:namespace/:key | Delete an entry |
# Store a memory entry
curl -X POST https://api.wauldo.com/v1/memory/support-context \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"key": "user_preference", "value": "prefers email over phone", "tags": ["contact"]}'
# Search memory
curl -X POST https://api.wauldo.com/v1/memory/support-context/search \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"query": "contact preference", "limit": 5}'
How does Wauldo Agent-to-Agent communication work?
Invoke an agent from another agent. Enables chaining, delegation, and multi-agent workflows. The called agent runs through the same verification pipeline.
# Agent A calls Agent B
curl -X POST https://api.wauldo.com/v1/a2a/AGENT_B_ID \
-H "Authorization: Bearer YOUR_KEY" \
-H "x-rapidapi-user: my-tenant" \
-H "Content-Type: application/json" \
-d '{"input": "Summarize the refund policy", "context": {"caller_agent": "manager-bot"}}'
# Returns a task_id — poll GET /v1/tasks/:id for the verified result
What are Wauldo error codes?
| Code | Meaning | Action |
|---|---|---|
| 400 | Bad request | Check required parameters |
| 401 | Unauthorized | Check your API key or token |
| 413 | Payload too large | Body exceeds 10MB — split your document |
| 429 | Rate limited | Wait and retry, or upgrade plan |
| 500 | Internal error | Retry once. Contact support if persistent |
| 502 | LLM provider error | Retryable — auto-retried 2x internally |
| 503 | Service starting up | Retry after 10-15s (cold start) |
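The retryable codes above (429, 502, 503) suggest a simple backoff loop. A sketch with the HTTP call injected as a callable so it stays framework-agnostic — the delays and attempt count are illustrative, not an official recommendation:

```python
import time

RETRYABLE = {429, 502, 503}  # rate limit, provider error, cold start

def call_with_retry(do_request, attempts=4, base_delay=1.0, sleep=time.sleep):
    # do_request() returns (status_code, body). Retries on 429/502/503 with
    # exponential backoff; any other status is returned immediately.
    for attempt in range(attempts):
        status, body = do_request()
        if status not in RETRYABLE or attempt == attempts - 1:
            return status, body
        sleep(base_delay * (2 ** attempt))

# Simulated endpoint: cold start (503) twice, then success.
responses = iter([(503, None), (503, None), (200, {"ok": True})])
status, body = call_with_retry(lambda: next(responses), sleep=lambda s: None)
print(status, body)  # 200 {'ok': True}
```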
What are Wauldo rate limits?
| Plan | Requests/month | Premium AI calls | Price |
|---|---|---|---|
| Basic | 500 | — | Free |
| Pro | 10,000 | — | $19/mo |
| Ultra | 100,000 | — | $99/mo |
| Mega | Unlimited | — | $0.008/req |
Rate limits are per API key. Manage your subscription on RapidAPI.
What are Wauldo limits and quotas?
| Resource | Limit |
|---|---|
| Request body | 10 MB |
| Max chunks per upload | 5,000 |
| Embedding dimensions | 1 – 4,096 |
| Streaming response | 256 KB |
| SSE timeout | 1,800s (30 min) |
| Standard API timeout | 180s (3 min) |
| Source chunks per query | Max 3 (relevance-filtered) |
How to use the Wauldo interactive explorer?
Try the API directly without writing code: