API Documentation

Wauldo is a RAG API that returns verified answers with source citations and confidence scores. OpenAI SDK compatible. Built for zero hallucinations: every answer is checked for grounding against your sources.

Base URL: https://api.wauldo.com
Protocol: REST + SSE Streaming
Auth: RapidAPI Key or JWT
New here? Get a free API key on RapidAPI (300 requests/month, no credit card), then follow the Quick Start below.

Authentication

Two authentication methods are supported:

Option 1 — RapidAPI key (recommended)

Get your API key from RapidAPI and include it in every request:

// Headers
X-RapidAPI-Key: your_api_key
X-RapidAPI-Host: smart-rag-api.p.rapidapi.com

Option 2 — JWT Token

Authenticate with username/password to get a Bearer token:

curl -X POST https://api.wauldo.com/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "demo", "password": "demo_password"}'

# Response
{ "token": "eyJhbGciOiJIUzI1NiIs..." }

# Then use in all requests
Authorization: Bearer eyJhbGciOiJIUzI1NiIs...
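The same flow in Python, as a minimal sketch (assumes the `requests` library; the helper names are ours, not part of an SDK):

```python
import requests

BASE_URL = "https://api.wauldo.com"

def login(username, password):
    """POST /api/auth/login and return the JWT from the response."""
    resp = requests.post(
        f"{BASE_URL}/api/auth/login",
        json={"username": username, "password": password},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["token"]

def auth_headers(token):
    """Authorization header to attach to every subsequent request."""
    return {"Authorization": f"Bearer {token}"}

# Usage (live call):
#   token = login("demo", "demo_password")
#   requests.get(f"{BASE_URL}/v1/models", headers=auth_headers(token))
```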

Quick Start

Upload a document and get a verified answer in 2 API calls:

1. Upload your document

curl -X POST https://api.wauldo.com/v1/upload \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Section 4.2: Late payments incur a 2% monthly fee...",
    "filename": "contract.txt"
  }'

2. Ask a question

curl -X POST https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the late payment fee?", "top_k": 5}'

3. Get a verified answer

{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    { "content": "Section 4.2: Late payments incur a 2% monthly fee...", "score": 0.92 }
  ],
  "audit": {
    "confidence": 0.92,
    "grounded": true,
    "model": "qwen/qwen3.5-flash"
  }
}

OpenAI SDK Compatibility

Wauldo is a drop-in replacement for the OpenAI API. Just change the base_url — your existing code works as-is.

Python:

from openai import OpenAI

# Just swap the base_url — everything else is the same
client = OpenAI(
    base_url="https://api.wauldo.com/v1",
    api_key="your_jwt_token"
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)

for chunk in response:
    # delta.content is None on role/finish chunks
    print(chunk.choices[0].delta.content or "", end="")

TypeScript:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.wauldo.com/v1',
  apiKey: 'your_jwt_token',
});

const stream = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

cURL:

curl https://api.wauldo.com/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
    "stream": true
  }'
Supported endpoints: /v1/chat/completions, /v1/models — both work identically to OpenAI. Wauldo auto-selects the best model for each request unless you specify one.

Upload Document (text)

POST /v1/upload

Upload text content to be chunked, indexed, and made available for queries.

Request Body

Parameter | Type | Description
content (required) | string | Document text content (max 10MB)
filename (optional) | string | Filename for source tracking (e.g. report.txt)

Response 200

{
  "status": "success",
  "chunks_count": 12,
  "source": "report.txt"
}
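
A Python sketch of the same call, with the body assembled from the table above (assumes `requests`; the helper names are illustrative):

```python
import requests

def build_upload_payload(content, filename=None):
    """Assemble the /v1/upload body; filename is optional."""
    payload = {"content": content}
    if filename is not None:
        payload["filename"] = filename
    return payload

def upload_text(base_url, token, content, filename=None):
    """POST /v1/upload and return the parsed JSON response."""
    resp = requests.post(
        f"{base_url}/v1/upload",
        headers={"Authorization": f"Bearer {token}"},
        json=build_upload_payload(content, filename),
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. {"status": "success", "chunks_count": 12, ...}
```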

Upload File

POST /v1/upload/file

Upload a file directly using multipart form data.

Supported formats

.txt .md .csv .json .yaml .xml .html .rtf .py .js .ts .rs .java .go .cpp .sql .sh .css .toml .log

curl -X POST https://api.wauldo.com/v1/upload/file \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@contract.txt"

Response 200

{
  "status": "success",
  "chunks_count": 24,
  "source": "contract.txt",
  "file_size": 15234
}

Query

POST /v1/query

Ask a question against your uploaded documents. Returns a verified answer with sources, confidence score, and full audit trail.

Request Body

Parameter | Type | Description
query (required) | string | Your question
top_k (optional) | integer | Number of source chunks to retrieve (default: 5, max: 20)
stream (optional) | boolean | Enable SSE streaming — see Streaming guide
debug (optional) | boolean | Include retrieval funnel diagnostics — see Audit Trail
quality_mode (optional) | string | fast, balanced, or premium — see Quality Modes

Response 200

{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    {
      "content": "Section 4.2: Late payments incur a 2% monthly fee...",
      "score": 0.92,
      "source": "contract.txt"
    }
  ],
  "audit": {
    "confidence": 0.92,
    "confidence_label": "high",
    "grounded": true,
    "retrieval_path": "BM25Reranked",
    "model": "qwen/qwen3.5-flash",
    "latency_ms": 1420,
    "sources_used": 2,
    "sources_evaluated": 5
  }
}
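
A Python sketch of the call, assembling the body from the parameters above (assumes `requests`; helper names are illustrative):

```python
import requests

def build_query_body(question, top_k=5, quality_mode=None, debug=False):
    """Assemble the /v1/query body from the documented parameters."""
    body = {"query": question, "top_k": top_k}
    if quality_mode is not None:
        body["quality_mode"] = quality_mode
    if debug:
        body["debug"] = True
    return body

def query(base_url, token, question, **opts):
    """POST /v1/query; returns answer, sources, and audit trail."""
    resp = requests.post(
        f"{base_url}/v1/query",
        headers={"Authorization": f"Bearer {token}"},
        json=build_query_body(question, **opts),
        timeout=180,  # standard (non-streaming) API timeout is 180s
    )
    resp.raise_for_status()
    return resp.json()
```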

Chat Completions

POST /v1/chat/completions

OpenAI-compatible chat endpoint. Works with any OpenAI SDK. Supports streaming.

Request Body

Parameter | Type | Description
messages (required) | array | Array of {"role": "user"|"system"|"assistant", "content": "..."}
model (optional) | string | Model name or "auto" (default: auto-selected)
stream (optional) | boolean | Enable SSE streaming (recommended for UX)
temperature (optional) | number | Sampling temperature, 0.0 to 2.0 (default: 0.7)
max_tokens (optional) | integer | Maximum tokens in the response

List Models

GET /v1/models

Returns available models. OpenAI SDK compatible.

curl https://api.wauldo.com/v1/models \
  -H "Authorization: Bearer $TOKEN"

Collections

GET /v1/collections

List all document collections for the authenticated tenant.

DELETE /v1/collections/{name}

Delete a collection and all its chunks. Useful for re-uploading updated documents.
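
A Python sketch covering both endpoints (assumes `requests`; helper names are illustrative):

```python
import requests

def collection_url(base_url, name=None):
    """Build the collections endpoint URL; with a name, targets one collection."""
    url = f"{base_url}/v1/collections"
    return f"{url}/{name}" if name else url

def list_collections(base_url, token):
    """GET /v1/collections for the authenticated tenant."""
    resp = requests.get(
        collection_url(base_url),
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def delete_collection(base_url, token, name):
    """DELETE /v1/collections/{name}, e.g. before re-uploading updated docs."""
    resp = requests.delete(
        collection_url(base_url, name),
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
```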

Health

GET /health

Returns API health, RAG chunk count, Redis status, active provider, and uptime. No auth required.

{
  "status": "ok",
  "rag_chunks": 142,
  "redis": "connected",
  "provider": "openrouter",
  "uptime_seconds": 86400
}

SSE Streaming

When stream: true is set on /v1/query, the response is delivered as Server-Sent Events (SSE). This lets you show sources and stream the answer token-by-token for a great UX.

Event sequence

sources: sent first. Contains the retrieved source chunks with scores; display these immediately while the answer generates.
token: sent repeatedly. Each event carries one token of the answer; append to your UI in real time.
audit: sent once after all tokens. Contains the full audit trail (confidence, grounded, model, latency).
[DONE]: stream complete. Close the connection.

Example: consume the stream

Python:

import requests, json

resp = requests.post(
    "https://api.wauldo.com/v1/query",
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "What is the late fee?", "stream": True},
    stream=True
)

for line in resp.iter_lines():
    if not line:
        continue
    data = line.decode().removeprefix("data: ")
    if data == "[DONE]":
        break
    event = json.loads(data)

    if "sources" in event:
        print(f"Found {len(event['sources'])} sources")
    elif "token" in event:
        print(event["token"], end="")
    elif "audit" in event:
        print(f"\nConfidence: {event['audit']['confidence']}")

TypeScript:

const resp = await fetch('https://api.wauldo.com/v1/query', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${token}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What is the late fee?', stream: true })
});

const reader = resp.body.getReader();
const decoder = new TextDecoder();

read: while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  for (const line of decoder.decode(value, { stream: true }).split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6);
    if (data === '[DONE]') break read; // top-level return is invalid; exit both loops
    const event = JSON.parse(data);
    if (event.token) document.getElementById('answer').textContent += event.token;
  }
}

cURL:

curl -N https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the late fee?", "stream": true}'

# Output:
# data: {"sources": [...]}
# data: {"token": "The"}
# data: {"token": " contract"}
# data: {"token": " specifies"}
# ...
# data: {"audit": {"confidence": 0.92, "grounded": true, ...}}
# data: [DONE]

Audit Trail

Every query response includes an audit object that makes the answer self-verifiable. Use it to build trust indicators in your UI, flag low-confidence answers, or debug retrieval issues.

Audit fields

confidence: 0.0 to 1.0. How confident the system is in the answer, based on source relevance scores and fact-checking. Display as a percentage in your UI.
confidence_label: high (≥0.45), medium (≥0.20), or low (<0.20). Use this to color-code answers: green, yellow, red.
grounded: true or false. Whether the answer is fully supported by the retrieved sources. If false, the answer may contain information not in your documents.
retrieval_path: Which retrieval strategy was used: BM25Only, BM25Reranked, or DenseFull. See Retrieval Paths.
model: Which LLM generated the answer (e.g. qwen/qwen3.5-flash, openai/gpt-4.1-mini).
latency_ms: Total processing time in milliseconds (retrieval + LLM generation).
sources_used: Number of source chunks included in the LLM context.
sources_evaluated: Total chunks considered before filtering. Compare with sources_used to see filtering effectiveness.
Debug mode: Add "debug": true to your query to get the full retrieval funnel: candidates_found → candidates_after_tenant → candidates_after_score → sources_used. Useful for diagnosing "I uploaded a doc but the answer seems wrong" issues.
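
The confidence_label thresholds above can also be reproduced client-side, e.g. to color-code answers from the raw confidence value (illustrative sketch; the server computes the label for you):

```python
def confidence_label(confidence):
    """Map a confidence score to the documented label thresholds:
    high >= 0.45, medium >= 0.20, otherwise low."""
    if confidence >= 0.45:
        return "high"
    if confidence >= 0.20:
        return "medium"
    return "low"
```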

Using audit in your app

# Show a trust badge based on confidence
audit = response["audit"]

if audit["grounded"] and audit["confidence_label"] == "high":
    show_badge("Verified", color="green")      # Safe to display
elif audit["confidence_label"] == "medium":
    show_badge("Likely correct", color="yellow") # Show with caveat
else:
    show_badge("Low confidence", color="red")    # Warn the user

# Log for monitoring
log(model=audit["model"], latency=audit["latency_ms"], path=audit["retrieval_path"])

Quality Modes

Control the speed/quality tradeoff with the quality_mode parameter. If omitted, Wauldo auto-selects the best tier based on your query complexity and RAG confidence.

Fast: Gemini 2.0 Flash, ~2-4s latency, $0.10 / 1M tokens. Best for simple questions, chat, and summaries.

Premium: GPT-4.1, ~5-8s latency, $2.00 / 1M tokens. Best for complex analysis and critical accuracy.

RAG Quality tier: When RAG confidence is high (≥0.60), Wauldo automatically uses Qwen 3.5 Flash, optimized for document-grounded answers at $0.065/1M tokens. This tier is what drives our 100% RAG retrieval score.

# Explicitly set quality mode
curl -X POST https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze the financial implications",
    "quality_mode": "premium"
  }'

Plan limits: the Basic (free) plan caps at the balanced tier. Upgrade to Pro or higher for premium access.

Retrieval Paths

Wauldo uses cost-aware routing to choose the best retrieval strategy for each query. The retrieval_path in the audit trail tells you which was used.

BM25Only (BM25 score ≥ 0.45)

Fast keyword matching. Used when the query closely matches document terms. Fastest path (~10ms retrieval).

Example: "What is the late payment fee?" against a contract with those exact terms.

BM25Reranked (BM25 score ≥ 0.20)

BM25 retrieval + BGE neural reranking. Best balance of speed and accuracy. Catches semantic matches that keyword search might miss.

Example: "How much extra do I pay if I'm late?" — paraphrased query, similar meaning.

DenseFull (BM25 score < 0.20)

Full dense vector search with Reciprocal Rank Fusion (BM25 + cosine similarity). Most thorough but slowest path. Used when the query is conceptually related but uses different vocabulary.

Example: "financial penalties for overdue invoices" against a doc that says "late payment fee".

Multi-source merge: Regardless of path, all chunks scoring ≥0.20 are included (max 3 sources). Sources are labeled by relevance so the LLM can resolve conflicts deterministically: Source 1 always wins.
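
The routing thresholds above can be summarized in a small sketch (illustrative only; the actual routing happens server-side and is reported back via retrieval_path):

```python
def route_for_bm25_score(score):
    """Map a BM25 score to the documented retrieval path:
    >= 0.45 keyword-only, >= 0.20 reranked, otherwise full dense search."""
    if score >= 0.45:
        return "BM25Only"
    if score >= 0.20:
        return "BM25Reranked"
    return "DenseFull"
```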

Use Cases

Wauldo works best when you need verified, source-cited answers from your own documents.

Legal & Compliance

Upload contracts, policies, or regulations. Ask about specific clauses, obligations, or deadlines. Every answer cites the exact section.

Q: "What is the termination notice period?"
A: "60 days written notice (Section 12.3)"
   confidence: 0.95 | grounded: true
📚 Knowledge Base / Support

Upload product docs, FAQs, or runbooks. Build a support bot that gives accurate answers instead of hallucinating.

Q: "How do I reset my password?"
A: "Go to Settings > Security > Reset..."
   confidence: 0.88 | grounded: true
📈 Financial Analysis

Upload earnings reports, balance sheets, or market research. Extract specific numbers with source verification.

Q: "What was Q3 revenue?"
A: "$4.2M, up 23% YoY (page 3)"
   confidence: 0.91 | grounded: true
🛠 Technical Documentation

Upload API specs, architecture docs, or code. Get precise technical answers grounded in your actual documentation.

Q: "What's the max payload size?"
A: "10MB per request (API limits doc)"
   confidence: 0.93 | grounded: true

Python SDK

pip install wauldo

from wauldo import WauldoClient

client = WauldoClient("https://api.wauldo.com")
client.login("demo", "demo_password")

# Upload a document
client.rag_upload("Your document text...")

# Query with debug info
result = client.rag_query("What are the key points?", debug=True)

print(result.answer)
print(result.audit.confidence)        # 0.92
print(result.audit.grounded)          # True
print(result.audit.retrieval_path)    # "BM25Reranked"
print(result.sources)                 # [Source(...)]

# Chat (OpenAI-compatible)
reply = client.chat_simple("Explain quantum computing")

# Conversation with memory
conv = client.conversation(system="You are a helpful assistant")
conv.say("Hello!")
conv.say("What did I just say?")  # remembers context

TypeScript SDK

npm install wauldo

import { WauldoClient } from 'wauldo';

const client = new WauldoClient('https://api.wauldo.com');
await client.login('demo', 'demo_password');

// Upload & query
await client.ragUpload('Your document text...');
const result = await client.ragQuery('What are the key points?', 5, { debug: true });

console.log(result.answer);
console.log(result.audit.confidence);      // 0.92
console.log(result.audit.grounded);        // true
console.log(result.audit.retrievalPath);   // "BM25Reranked"

// Streaming chat
await client.chatStream(
  [{ role: 'user', content: 'Hello' }],
  { onToken: (t) => process.stdout.write(t) }
);

Rust SDK

cargo add wauldo

use wauldo::WauldoClient;

let client = WauldoClient::new("https://api.wauldo.com");
client.login("demo", "demo_password").await?;

// Upload & query with debug
client.rag_upload("Your document text...").await?;
let result = client.rag_query_debug("What are the key points?").await?;

println!("{}", result.answer);
println!("Confidence: {}", result.confidence());  // 0.92
println!("Grounded: {}", result.grounded());      // true

// Streaming chat
client.chat_stream(messages, |token| {
    print!("{}", token);
}).await?;

Error Codes

Code | Meaning | Action
400 | Bad request | Check required parameters
401 | Unauthorized | Check your API key or token
413 | Payload too large | Body exceeds 10MB — split your document
429 | Rate limited | Wait and retry, or upgrade plan
500 | Internal error | Retry once. Contact support if persistent
502 | LLM provider error | Retryable — auto-retried 2x internally
503 | Service starting up | Retry after 10-15s (cold start)
Auto-retry: The SDKs automatically retry 502/503/500 errors with exponential backoff. If you're using raw HTTP, retry these status codes up to 2 times with a 2s delay.
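
A minimal Python sketch of that raw-HTTP retry policy (assumes `requests`; the fixed 2s delay follows the note above, while the SDKs use exponential backoff):

```python
import time
import requests

RETRYABLE = {500, 502, 503}

def should_retry(status_code, attempt, max_retries=2):
    """Retry 500/502/503 while attempts remain, per the note above."""
    return status_code in RETRYABLE and attempt < max_retries

def post_with_retry(url, max_retries=2, delay=2.0, **kwargs):
    """POST, re-sending on retryable status codes up to max_retries times."""
    attempt = 0
    while True:
        resp = requests.post(url, **kwargs)
        if not should_retry(resp.status_code, attempt, max_retries):
            return resp
        attempt += 1
        time.sleep(delay)
```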

Rate Limits

Plan | Requests/month | Premium AI calls | Price
Basic | 300 | 50 | Free
Pro | 1,000 | 500 | $9/mo
Ultra | 10,000 | 5,000 | $29/mo
Mega | Unlimited | Unlimited | $0.002/req

Rate limits are per API key. Manage your subscription on RapidAPI.

Limits & Quotas

Resource | Limit
Request body | 10 MB
Max chunks per upload | 5,000
Embedding dimensions | 1 – 4,096
Streaming response | 256 KB
SSE timeout | 1,800s (30 min)
Standard API timeout | 180s (3 min)
Source chunks per query | Max 3 (all ≥0.20 score)

Interactive Explorer

Try the API directly, without writing code, in the interactive explorer.