
How to catch your first hallucination?

3 steps. Under 3 minutes. See Guard block a wrong answer.

1. Install + get your free API key
pip install wauldo
npm install wauldo

No install needed. Just curl.

Get free key on RapidAPI →
2. Verify an LLM response with Guard
from wauldo import HttpClient

client = HttpClient(
    base_url="https://smart-rag-api.p.rapidapi.com",
    api_key="YOUR_RAPIDAPI_KEY"
)

# Your LLM said this. Is it true?
result = client.guard(
    text="Returns are accepted within 60 days of purchase",
    source_context="Our return policy allows returns within 14 days."
)

print(result)  # verdict: rejected, reason: numerical_mismatch
import { HttpClient } from 'wauldo';

const client = new HttpClient({
  baseUrl: 'https://smart-rag-api.p.rapidapi.com',
  apiKey: 'YOUR_RAPIDAPI_KEY'
});

// Your LLM said this. Is it true?
const result = await client.guard(
  'Returns are accepted within 60 days of purchase',
  'Our return policy allows returns within 14 days.'
);

console.log(result); // verdict: rejected, reason: numerical_mismatch
curl -X POST https://smart-rag-api.p.rapidapi.com/v1/fact-check \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: smart-rag-api.p.rapidapi.com" \
  -H "Content-Type: application/json" \
  -d '{"text":"Returns are accepted within 60 days of purchase","source_context":"Our return policy allows returns within 14 days.","mode":"lexical"}'
3. Hallucination caught
{
  "verdict": "rejected",
  "action": "block",
  "reason": "numerical_mismatch",
  "confidence": 0.03,
  "supported": false
}

60 days vs 14 days. Guard caught it. Your user never sees the wrong answer.

Next: Upload your own documents and verify answers against them.

API Documentation

Wauldo is a RAG API that returns verified answers with source citations and confidence scores. OpenAI SDK compatible. Zero hallucinations.

Base URL: https://api.wauldo.com
Protocol: REST + SSE Streaming
Auth: RapidAPI Key or JWT
New here? Get a free API key on RapidAPI (500 requests/month, no credit card), then follow the Quick Start below.

How does Wauldo authentication work?

Two authentication methods are supported:

Option 1 — RapidAPI (recommended)

Get your API key from RapidAPI and include it in every request:

// Headers
X-RapidAPI-Key: your_api_key
X-RapidAPI-Host: smart-rag-api.p.rapidapi.com

Option 2 — Direct API Key

For self-hosted deployments, use a long-lived API key:

# Pass your API key as a Bearer token
curl -X POST https://api.wauldo.com/v1/fact-check \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "60 days", "source_context": "14 days", "mode": "lexical"}'

How to get started with Wauldo quickly?

Upload a document and get a verified answer in 2 API calls:

1. Upload your document

curl -X POST https://api.wauldo.com/v1/upload \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Section 4.2: Late payments incur a 2% monthly fee...",
    "filename": "contract.txt"
  }'
2. Ask a question

curl -X POST https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the late payment fee?", "top_k": 5}'
3. Get a verified answer

{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    { "content": "Section 4.2: Late payments incur a 2% monthly fee...", "score": 0.92 }
  ],
  "audit": {
    "confidence": 0.92,
    "grounded": true,
    "model": "auto"
  }
}

How does Wauldo's OpenAI SDK compatibility work?

Wauldo is a drop-in replacement for the OpenAI API. Just change the base_url — your existing code works as-is.

from openai import OpenAI

# Just swap the base_url — everything else is the same
client = OpenAI(
    base_url="https://api.wauldo.com/v1",
    api_key="your_jwt_token"
)

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="")
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.wauldo.com/v1',
  apiKey: 'your_jwt_token',
});

const stream = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
curl https://api.wauldo.com/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Explain quantum computing"}],
    "stream": true
  }'
Supported endpoints: /v1/chat/completions, /v1/models — both work identically to OpenAI. Wauldo auto-selects the best model for each request unless you specify one.

How to upload documents to Wauldo?

POST /v1/upload

Upload text content to be chunked, indexed, and available for queries.

Request Body

Parameter | Type | Description
content (required) | string | Document text content (max 10MB)
filename (optional) | string | Filename for source tracking (e.g. report.txt)

Response 200

{
  "status": "success",
  "chunks_count": 12,
  "source": "report.txt"
}

How to upload files to Wauldo?

POST /v1/upload/file

Upload a file directly using multipart form data.

Supported formats

.pdf .docx .txt .md .csv .json .yaml .xml .html .rtf .py .js .ts .rs .java .go .cpp .sql .sh .css .toml .log .png .jpg .gif .webp
curl -X POST https://api.wauldo.com/v1/upload/file \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@contract.txt"

Response 200

{
  "status": "success",
  "chunks_count": 24,
  "source": "contract.txt",
  "file_size": 15234
}

How to query Wauldo for verified answers?

POST /v1/query

Ask a question against your uploaded documents. Returns a verified answer with sources, confidence score, and full audit trail.

Request Body

Parameter | Type | Description
query (required) | string | Your question
top_k (optional) | integer | Number of source chunks to retrieve (default: 5, max: 20)
stream (optional) | boolean | Enable SSE streaming (see Streaming guide)
debug (optional) | boolean | Include retrieval funnel diagnostics (see Audit Trail)
quality_mode (optional) | string | fast, balanced, or premium (see Quality Modes)

Response 200

{
  "answer": "The contract specifies a 2% monthly late payment fee (Section 4.2).",
  "sources": [
    {
      "content": "Section 4.2: Late payments incur a 2% monthly fee...",
      "score": 0.92,
      "source": "contract.txt"
    }
  ],
  "audit": {
    "confidence": 0.92,
    "confidence_label": "high",
    "grounded": true,
    "retrieval_path": "BM25Reranked",
    "model": "auto",
    "latency_ms": 1420,
    "sources_used": 2,
    "sources_evaluated": 5
  }
}

How does Wauldo Chat Completions work?

POST /v1/chat/completions

OpenAI-compatible chat endpoint. Works with any OpenAI SDK. Supports streaming.

Request Body

Parameter | Type | Description
messages (required) | array | Array of {"role": "user"|"system"|"assistant", "content": "..."}
model (optional) | string | Model name or "auto" (default: auto-selected)
stream (optional) | boolean | Enable SSE streaming (recommended for UX)
temperature (optional) | number | Sampling temperature, 0.0 to 2.0 (default: 0.7)
max_tokens (optional) | integer | Maximum tokens in the response

How to list available models in Wauldo?

GET /v1/models

Returns available models. OpenAI SDK compatible.

curl https://api.wauldo.com/v1/models \
  -H "Authorization: Bearer $TOKEN"

How to use Wauldo collections?

GET /v1/collections

List all document collections for the authenticated tenant.

DELETE /v1/collections/{name}

Delete a collection and all its chunks. Useful for re-uploading updated documents.

How does Wauldo fact-check work?

POST /v1/fact-check

Verify text claims against source context. Returns a structured verdict with actionable decisions per claim.

Request

curl -X POST https://api.wauldo.com/v1/fact-check \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "text": "Returns are accepted within 60 days. Gift cards never expire.",
    "source_context": "Our policy allows returns within 14 days. Gift cards expire after 12 months.",
    "mode": "lexical"
  }'

Response

{
  "verdict": "rejected",
  "action": "block",
  "hallucination_rate": 1.0,
  "claims": [
    {
      "text": "Returns are accepted within 60 days.",
      "verdict": "rejected",
      "action": "block",
      "confidence": 0.3,
      "confidence_label": "very_low",
      "reason": "numerical_mismatch"
    },
    {
      "text": "Gift cards never expire.",
      "verdict": "rejected",
      "action": "block",
      "confidence": 0.2,
      "confidence_label": "very_low",
      "reason": "negation_conflict"
    }
  ]
}

Verification Modes

Mode | Speed | Accuracy | Requires
lexical | <1ms | Good (catches numbers, negations) | Nothing
hybrid | ~50ms | Better (adds semantic similarity) | Embedding model
semantic | ~100ms | Best (full embeddings) | Embedding model

Verdicts & Actions

Verdict | Action | Meaning
verified | allow | Claim matches source (confidence ≥ 0.7)
weak | review | Partial match, needs human review (0.4–0.7)
rejected | block | Contradiction or no evidence (< 0.4)

Rejection Reasons

Reason | Example
numerical_mismatch | "60 days" vs source says "14 days"
negation_conflict | "never expire" vs source says "12 months"
insufficient_evidence | Claim topic not found in source
partial_match | Some overlap but not enough to verify
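The verdict-to-action mapping above translates directly into a response gate for an AI pipeline. A hedged Python sketch: the claim shape follows the example response above, and gate_response is an illustrative helper, not part of the SDK.

```python
def gate_response(fact_check: dict) -> str:
    """Decide what to do with an LLM answer from a /v1/fact-check result.

    Returns "send", "review", or "block", following the verdict/action table:
    any rejected claim blocks the whole answer, any weak claim escalates it.
    """
    actions = {claim["action"] for claim in fact_check.get("claims", [])}
    if "block" in actions:
        return "block"    # at least one claim contradicts the source
    if "review" in actions:
        return "review"   # partial matches go to a human
    return "send"         # every claim verified

result = {
    "verdict": "rejected",
    "claims": [
        {"text": "Returns are accepted within 60 days.", "action": "block"},
        {"text": "Shipping is free.", "action": "allow"},
    ],
}
print(gate_response(result))  # → block
```

The most severe action wins, so one hallucinated number is enough to stop the whole answer from reaching the user.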

Use Cases

  • Customer support — verify agent responses before sending
  • Compliance — check documents against policy
  • Content moderation — detect false claims automatically
  • AI pipelines — validate LLM outputs before downstream use

How does Wauldo citation verify work?

POST /v1/verify

Verify that AI-generated text properly cites its sources. Detects uncited sentences, phantom citations (references to non-existent sources), and calculates citation coverage.

Request

curl -X POST https://api.wauldo.com/v1/verify \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "text": "Rust was released in 2010 [Source: rust_book]. It is fast [Source: fake_doc].",
    "sources": [
      {"name": "rust_book", "content": "Rust was first released in 2010 by Mozilla."}
    ],
    "threshold": 0.5
  }'

Response

{
  "citation_ratio": 1.0,
  "has_sufficient_citations": true,
  "sentence_count": 2,
  "citation_count": 2,
  "uncited_sentences": [],
  "citations": [
    {"citation": "[Source: rust_book]", "source_name": "rust_book", "is_valid": true},
    {"citation": "[Source: fake_doc]", "source_name": "fake_doc", "is_valid": false}
  ],
  "phantom_count": 1,
  "processing_time_ms": 0
}

Parameters

Field | Type | Required | Description
text | string | Yes | AI-generated text to verify (max 64 KB)
sources | array | No | Source chunks to validate citations against
threshold | number | No | Min citation ratio (0.0–1.0, default 0.5)

Citation Formats Detected

Format | Example
Source tag | [Source: doc1], [Ref: paper], [See: manual]
Numeric | [1], [2], [42]
Parenthetical | (Source: report), (Ref: study)
Footnote | ^1, ^12

Use Cases

  • RAG pipelines — ensure LLM responses cite retrieved chunks
  • Academic/legal — verify all claims are sourced
  • Hallucination detection — flag phantom citations referencing missing sources
  • Quality gates — block responses below citation threshold
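For the quality-gate use case, the response fields above are all you need for a blocking check. A sketch assuming the response shape shown; passes_citation_gate is an illustrative helper, not an SDK call.

```python
def passes_citation_gate(verify: dict, max_phantoms: int = 0) -> bool:
    """Block AI responses that cite missing sources or lack coverage.

    `verify` is a /v1/verify response. Phantom citations point at sources
    that were never provided, so by default none are tolerated.
    """
    return (
        verify.get("has_sufficient_citations", False)
        and verify.get("phantom_count", 0) <= max_phantoms
    )

# The example response above has sufficient coverage (ratio 1.0) but one
# phantom citation ([Source: fake_doc]), so the gate rejects it.
example = {"has_sufficient_citations": True, "phantom_count": 1}
print(passes_citation_gate(example))  # → False
```

Coverage and validity are independent signals: a response can cite every sentence and still fail because one citation references a source that does not exist.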

How to get insights from Wauldo?

GET /v1/insights

Get ROI metrics for your API key: token savings, estimated cost reduction, policy distribution, and validation latency. Track exactly how much value the pipeline delivers.

Request

curl https://api.wauldo.com/v1/insights \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "tig_key": "tig_abc123",
  "total_requests": 1842,
  "intelligence_requests": 1650,
  "fallback_requests": 192,
  "tokens": {
    "baseline_total": 2450000,
    "real_total": 1890000,
    "saved_total": 560000,
    "saved_percent_avg": 22.8
  },
  "cost": {
    "estimated_usd_saved": 1.12
  },
  "policy": {
    "reasoning_distribution": {"fast": 1200, "balanced": 380, "deep": 70},
    "rag_distribution": {"full": 900, "light": 550, "none": 200}
  },
  "validation": {
    "validation_distribution": {"strict": 500, "standard": 1150},
    "total_validation_tokens": 85000,
    "total_validation_latency_ms": 42000,
    "avg_validation_latency_ms": 25.5
  },
  "period": {
    "since": "2026-04-01T00:00:00Z",
    "until": "now"
  }
}

Response Fields

Field | Description
tokens.saved_total | Total tokens saved by the intelligence pipeline
tokens.saved_percent_avg | Average savings percentage across all requests
cost.estimated_usd_saved | Estimated cost savings in USD
policy.reasoning_distribution | How many requests used each reasoning mode (fast/balanced/deep)
validation.avg_validation_latency_ms | Average validation latency in milliseconds
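Note that tokens.saved_percent_avg is a per-request average, so the aggregate saving implied by the totals can differ slightly. A sketch computing the aggregate view from the example response above:

```python
def aggregate_savings_percent(insights: dict) -> float:
    """Aggregate token saving implied by the totals, as a percentage."""
    tokens = insights["tokens"]
    return 100.0 * tokens["saved_total"] / tokens["baseline_total"]

# Totals taken from the example /v1/insights response above.
insights = {"tokens": {"baseline_total": 2_450_000, "real_total": 1_890_000,
                       "saved_total": 560_000, "saved_percent_avg": 22.8}}

print(round(aggregate_savings_percent(insights), 1))  # → 22.9
# 560000/2450000 gives ~22.9% aggregate vs the 22.8% per-request average:
# close here, but heavier requests can pull the two numbers apart.
```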

Shareable Card

GET /v1/insights/share returns a standalone HTML page with your savings metrics, optimized for sharing on LinkedIn and Twitter (Open Graph tags included).

How to use Wauldo analytics?

GET /v1/analytics

Cache performance, token savings, cost tracking, and system prompt deduplication metrics. Monitor your API usage and optimization in real time.

Request

curl "https://api.wauldo.com/v1/analytics?minutes=60" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "cache": {
    "total_requests": 500,
    "result_store_hits": 45,
    "semantic_cache_hits": 120,
    "cache_misses": 335,
    "cache_hit_rate": 0.33,
    "avg_latency_ms": 180.5,
    "p95_latency_ms": 450.0,
    "p99_latency_ms": 890.0
  },
  "tokens": {
    "total_baseline": 125000,
    "total_real": 98000,
    "total_saved": 27000,
    "avg_savings_percent": 21.6
  },
  "cost": {
    "total_cost_usd": 0.25,
    "estimated_cost_saved_usd": 0.054,
    "cost_per_hour_usd": 0.25
  },
  "dedup": {
    "unique_system_prompts": 3,
    "total_requests": 500,
    "total_tokens_saved": 15000
  },
  "uptime_secs": 86400
}

Parameters

Field | Type | Required | Description
minutes | integer | No | Time window in minutes for cost metrics (default: 60). Cache, token, and dedup stats are cumulative since server start.

Traffic Monitoring

GET /v1/analytics/traffic returns per-tenant traffic stats: requests today, tokens used, success rate, average latency, and P95 latency. Useful for monitoring production workloads.

curl https://api.wauldo.com/v1/analytics/traffic \
  -H "Authorization: Bearer YOUR_TOKEN"

Traffic Response

{
  "total_requests_today": 3200,
  "total_tokens_today": 1450000,
  "top_tenants": [
    {
      "tenant_id": "user_abc",
      "requests_today": 850,
      "tokens_used": 380000,
      "success_rate": 0.98,
      "avg_latency_ms": 210
    }
  ],
  "error_rate": 0.02,
  "avg_latency_ms": 180,
  "p95_latency_ms": 450,
  "uptime_secs": 86400
}

How to check Wauldo API health?

GET /health

Returns API health, RAG chunk count, Redis status, active provider, and uptime. No auth required.

{
  "status": "ok",
  "rag_chunks": 142,
  "redis": "connected",
  "provider": "openrouter",
  "uptime_seconds": 86400
}
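A readiness probe only needs two of these fields. A minimal sketch, with the response shape taken from the example above; fetching it is left as a comment since the endpoint needs no auth:

```python
import json

def is_healthy(health: dict) -> bool:
    """Treat the API as ready only when status is ok AND Redis is reachable."""
    return health.get("status") == "ok" and health.get("redis") == "connected"

# In production, fetch the payload with e.g.:
#   requests.get("https://api.wauldo.com/health").json()
sample = json.loads('{"status": "ok", "redis": "connected", "provider": "openrouter"}')
print(is_healthy(sample))  # → True
```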

SSE Streaming

When stream: true is set on /v1/query, the response is delivered as Server-Sent Events (SSE). This lets you show sources and stream the answer token-by-token for a great UX.

Event sequence

sources Sent first — contains the retrieved source chunks with scores. Display these immediately while the answer generates.
token Sent repeatedly — each event contains one token of the answer. Append to your UI in real-time.
audit Sent once after all tokens — contains the full audit trail (confidence, grounded, model, latency).
[DONE] Stream complete. Close the connection.

Example: consume the stream

import requests, json

resp = requests.post(
    "https://api.wauldo.com/v1/query",
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "What is the late fee?", "stream": True},
    stream=True
)

for line in resp.iter_lines():
    if not line:
        continue
    data = line.decode().removeprefix("data: ")
    if data == "[DONE]":
        break
    event = json.loads(data)

    if "sources" in event:
        print(f"Found {len(event['sources'])} sources")
    elif "token" in event:
        print(event["token"], end="")
    elif "audit" in event:
        print(f"\nConfidence: {event['audit']['confidence']}")
const resp = await fetch('https://api.wauldo.com/v1/query', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${token}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What is the late fee?', stream: true })
});

const reader = resp.body.getReader();
const decoder = new TextDecoder();

read: while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  for (const line of decoder.decode(value, { stream: true }).split('\n')) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6);
    if (data === '[DONE]') break read;  // exit both loops
    const event = JSON.parse(data);
    if (event.token) document.getElementById('answer').textContent += event.token;
  }
}
curl -N https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is the late fee?", "stream": true}'

# Output:
# data: {"sources": [...]}
# data: {"token": "The"}
# data: {"token": " contract"}
# data: {"token": " specifies"}
# ...
# data: {"audit": {"confidence": 0.92, "grounded": true, ...}}
# data: [DONE]

Audit Trail

Every query response includes an audit object that makes the answer self-verifiable. Use it to build trust indicators in your UI, flag low-confidence answers, or debug retrieval issues.

Audit fields

confidence 0.0 to 1.0 — How confident the system is in the answer. Based on source relevance scores and fact-checking. Display as a percentage in your UI.
confidence_label high, medium, or low. Use this to color-code answers: green, yellow, red.
grounded true or false — Whether the answer is fully supported by the retrieved sources. If false, the answer may contain information not in your documents.
retrieval_path Which retrieval strategy was used: BM25Only, BM25Reranked, or DenseFull. See Retrieval Paths.
model Identifier of the model that generated the answer. Surfaced for audit and observability.
latency_ms Total processing time in milliseconds (retrieval + LLM generation).
sources_used Number of source chunks included in the LLM context.
sources_evaluated Total chunks considered before filtering. Compare with sources_used to see filtering effectiveness.
Debug mode: Add "debug": true to your query to get the full retrieval funnel: candidates_found → candidates_after_tenant → candidates_after_score → sources_used. Useful for diagnosing "I uploaded a doc but the answer seems wrong" issues.

Using audit in your app

# Show a trust badge based on confidence
audit = response["audit"]

if audit["grounded"] and audit["confidence_label"] == "high":
    show_badge("Verified", color="green")      # Safe to display
elif audit["confidence_label"] == "medium":
    show_badge("Likely correct", color="yellow") # Show with caveat
else:
    show_badge("Low confidence", color="red")    # Warn the user

# Log for monitoring
log(model=audit["model"], latency=audit["latency_ms"], path=audit["retrieval_path"])

Quality Modes

Control the speed/quality tradeoff with the quality_mode parameter. If omitted, Wauldo auto-selects the best tier based on your query complexity and RAG confidence.

Fast: lightweight model, ~2-4s latency. Best for simple questions, chat, summaries.
Premium: premium model, ~5-8s latency. Best for complex analysis, critical accuracy.
RAG quality: When retrieval confidence is high, Wauldo auto-selects a model optimized for document-grounded answers. See /pricing for per-tier pricing.
# Explicitly set quality mode
curl -X POST https://api.wauldo.com/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze the financial implications",
    "quality_mode": "premium"
  }'
Plan limits: Basic (free) plan caps at balanced tier. Upgrade to Pro or higher for premium access.

Retrieval Paths

Wauldo picks a retrieval strategy per query based on signal strength. The retrieval_path in the audit trail tells you which was used.

BM25Only strong keyword match

Fast keyword matching. Used when the query closely matches document terms. Fastest path (~10ms retrieval).

Example: "What is the late payment fee?" against a contract with those exact terms.

BM25Reranked moderate keyword match

Lexical retrieval + neural reranking. Best balance of speed and accuracy. Catches semantic matches that keyword search might miss.

Example: "How much extra do I pay if I'm late?" — paraphrased query, similar meaning.

DenseFull weak keyword match

Full dense vector search with rank fusion. Most thorough but slowest path. Used when the query is conceptually related but uses different vocabulary.

Example: "financial penalties for overdue invoices" against a doc that says "late payment fee".

Multi-source merge: Regardless of path, the top relevant chunks are included (max 3 sources). Sources are labeled by relevance so the LLM can resolve conflicts deterministically: Source 1 always wins.
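Since retrieval_path appears in every audit object, you can surface it in logs or dashboards. A small sketch: the path names and latency notes come from this section, while PATH_NOTES is an illustrative lookup, not an API field.

```python
# Rough characteristics of each retrieval path, per the descriptions above.
PATH_NOTES = {
    "BM25Only": "keyword match, fastest (~10ms retrieval)",
    "BM25Reranked": "lexical retrieval + neural rerank, balanced",
    "DenseFull": "dense vectors + rank fusion, most thorough",
}

def describe_retrieval(audit: dict) -> str:
    """One-line log message explaining which strategy served a query."""
    path = audit.get("retrieval_path", "unknown")
    return f"{path}: {PATH_NOTES.get(path, 'unrecognized path')}"

print(describe_retrieval({"retrieval_path": "BM25Reranked", "latency_ms": 1420}))
```

Tracking the path distribution over time tells you whether your users' queries mostly hit the fast keyword path or keep falling through to dense search.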

Use Cases

Wauldo works best when you need verified, source-cited answers from your own documents.

Legal & Compliance

Upload contracts, policies, or regulations. Ask about specific clauses, obligations, or deadlines. Every answer cites the exact section.

Q: "What is the termination notice period?"
A: "60 days written notice (Section 12.3)"
   confidence: 0.95 | grounded: true

Knowledge Base / Support

Upload product docs, FAQs, or runbooks. Build a support bot that gives accurate answers instead of hallucinating.

Q: "How do I reset my password?"
A: "Go to Settings > Security > Reset..."
   confidence: 0.88 | grounded: true

Financial Analysis

Upload earnings reports, balance sheets, or market research. Extract specific numbers with source verification.

Q: "What was Q3 revenue?"
A: "$4.2M, up 23% YoY (page 3)"
   confidence: 0.91 | grounded: true

Technical Documentation

Upload API specs, architecture docs, or code. Get precise technical answers grounded in your actual documentation.

Q: "What's the max payload size?"
A: "10MB per request (API limits doc)"
   confidence: 0.93 | grounded: true

How to install Wauldo Python SDK from PyPI?

pip install wauldo
from wauldo import HttpClient

client = HttpClient(base_url="https://api.wauldo.com", api_key="YOUR_API_KEY")

# Guard — catch hallucinations in 3 lines
result = client.guard(
    text="Returns accepted within 60 days.",
    source_context="Our policy: returns within 14 days.",
)
print(result.verdict)              # "rejected"
print(result.claims[0].reason)    # "numerical_mismatch"

# RAG — upload, ask, verify
client.rag_upload(content="Your document text...", filename="doc.txt")
result = client.rag_query("What are the key points?")
print(result.answer)
print(result.sources)

How to install Wauldo TypeScript SDK from npm?

npm install wauldo
import { HttpClient } from 'wauldo';

const client = new HttpClient({
  baseUrl: 'https://api.wauldo.com',
  apiKey: 'YOUR_API_KEY',
});

// Guard — catch hallucinations
const result = await client.guard(
  'Returns accepted within 60 days.',
  'Our policy: returns within 14 days.',
);
console.log(result.verdict);            // "rejected"
console.log(result.claims[0]?.reason);  // "numerical_mismatch"

// RAG — upload, ask, verify
await client.ragUpload('Your document text...', 'doc.txt');
const answer = await client.ragQuery('What are the key points?');
console.log(answer.answer);

How to use Wauldo Rust SDK from crates.io?

cargo add wauldo
use wauldo::{HttpClient, ChatRequest, ChatMessage};

let client = HttpClient::with_key("https://api.wauldo.com", "YOUR_API_KEY")?;

// Guard — catch hallucinations
let result = client.guard(
    "Returns accepted within 60 days.",
    "Our policy: returns within 14 days.",
    None,
).await?;
println!("Verdict: {}", result.verdict);  // "rejected"

// RAG — upload, ask, verify
client.rag_upload("Your document text...", None).await?;
let result = client.rag_query("What are the key points?", None).await?;
println!("{}", result.answer);

How to deploy agents with Wauldo?

Create custom AI agents that verify every response before delivery. Upload documents, configure behavior, run queries — every answer is fact-checked.

Quick Start — Create and run an agent in 60 seconds

# 1. Create an agent
curl -X POST https://api.wauldo.com/v1/agents \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "support-bot",
    "description": "Customer support agent",
    "wauldo_toml": "[agent]\nname = \"support-bot\"\n\n[model]\nprovider = \"openrouter\"\nname = \"auto\"",
    "agents_md": "# Support Bot\nAnswer questions based ONLY on uploaded documents."
  }'

# Returns: { "id": "ag_abc123", "name": "support-bot", ... }

# 2. Upload a document
curl -X POST https://api.wauldo.com/v1/upload \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"content": "Returns accepted within 14 days...", "filename": "policy.txt"}'

# 3. Run the agent
curl -X POST https://api.wauldo.com/v1/agents/ag_abc123/runs \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"input": "What is the return policy?", "verification_mode": "balanced"}'

# Returns: { "task_id": "t_xyz", "status": "queued" }

# 4. Get the verified result
curl https://api.wauldo.com/v1/tasks/t_xyz \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant"

# Returns:
# {
#   "result": "Returns are accepted within 14 days.",
#   "verification": { "verdict": "SAFE", "trust_score": 1.0 }
# }

Agent Configuration (wauldo.toml)

wauldo.toml declares the agent's identity: which model to call, how strict verification should be, where the agent runs, and whether it remembers prior runs. It's the single file that controls behavior across all /v1/agents/:id/runs calls. AGENTS.md (optional) layers natural-language behavior instructions on top.

Minimal example

Two required sections, defaults applied for everything else.

[agent]
name = "support-bot"

[model]
provider = "openrouter"
name = "auto"

Full example (all sections)

[agent]
name = "support-bot"
description = "Handles customer support questions"
instructions = "./AGENTS.md"     # behavior file path
skills = "./skills/"             # skills directory
mcp = "./mcp.json"                # MCP server config

[model]
provider = "openrouter"           # openrouter | openai | anthropic | ollama
name = "auto"                    # "auto" for smart routing, or any provider model id
fallback = ["auto"]                # tried in order if primary fails
temperature = 0.2

[sandbox]
type = "none"                     # none | docker | daytona | modal | runloop

[verification]
mode = "balanced"                 # strict | balanced | permissive
min_trust_score = 0.6            # 0.0 – 1.0; reject below this

[deploy]
target = "local"                  # local | fly | render | selfhost
region = "cdg"

[memory]
enabled = true
namespace = "support"
auto_write = true

Field reference

Field | Type | Required | Default | Description
agent.name | string | required | - | [a-zA-Z0-9_-] only. Identifier shown in dashboards and logs.
agent.description | string | optional | "" | One-line summary of what the agent does.
agent.instructions | string | optional | "./AGENTS.md" | Path to the markdown file with behavior rules.
agent.skills | string | optional | "./skills/" | Directory of optional skill files.
agent.mcp | string | optional | "./mcp.json" | MCP server configuration file.
model.provider | string | required | - | One of openrouter, openai, anthropic, ollama.
model.name | string | required | - | Model identifier, or "auto" for cost-aware routing.
model.fallback | string[] | optional | [] | Models tried in order if the primary fails.
model.temperature | number | optional | null | Sampling temperature passed through to the provider.
sandbox.type | enum | optional | "none" | One of none, docker, daytona, modal, runloop. Where tool calls execute.
verification.mode | enum | optional | "balanced" | One of strict, balanced, permissive. Default for runs that don't override.
verification.min_trust_score | number | optional | 0.6 | In [0.0, 1.0]. Below this, results are flagged.
deploy.target | enum | optional | "local" | One of local, fly, render, selfhost.
deploy.region | string | optional | null | Deploy-target-specific region code.
memory.enabled | bool | optional | false | When true, the agent reads/writes a memory namespace across runs.
memory.namespace | string | optional | "" | Logical bucket separating memory between agents.
memory.auto_write | bool | optional | false | When true, every successful run is auto-saved to memory.

Pass the file's contents as the wauldo_toml string field on POST /v1/agents. Validation runs server-side: missing required fields or out-of-range values return 400.

AGENTS.md (optional)

# Support Bot

You are a customer support agent for Acme Corp.

## Rules
- Answer questions based ONLY on the uploaded documents.
- If the answer is not in your sources, say "I don't have this information."
- Never invent facts or numbers.
- Be concise and professional.

Available Presets

A preset is a built-in multi-state workflow that shapes how the agent reasons before answering. Pass the preset name as the preset field on POST /v1/agents (set at agent creation) or as {"preset": "..."} on POST /v1/agents/:id/runs to override per-run. If omitted, runs default to general_task. wauldo.toml + AGENTS.md control identity, model, and tone — the multi-state workflow comes from the preset.

Preset | Description | Typical use case | States
general_task | Single-state grounded Q&A. No side effects unless explicitly asked. | Default chat, support bots, simple lookups | 1
planner_executor | Plan-then-execute. Decomposes the query into ordered steps with tool hints and dependencies, executes each step, then synthesises a cited answer. ReAct-style autonomous decomposition. | Multi-step research, anything that benefits from explicit planning before tool calls | 3
rust_backend_architect | Senior Rust engineer. Analysis → Tradeoffs → Mitigations → Implementation → Validation. | Backend design review, architecture critique | 5
rag_data_engineer | RAG pipeline expert. Audit, chunking strategy, embeddings, retrieval, eval plan. | RAG tuning, retrieval quality work | 5
security_auditor | Threat modelling, vulnerability assessment, OWASP/CWE-tagged mitigations. | Security review of code or architecture | 5
data_analyst | Data profiling, exploratory analysis, statistical modelling, executive summary. | KPI dashboards, business insights from data | 5
growth_hacker | Distribution strategy, channel ROI, launch sequences, pricing positioning. | OSS / dev-tool go-to-market planning | 5

Invoke a preset via the API

curl -X POST https://api.wauldo.com/v1/agents/ag_abc123/runs \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Critique my Axum service auth layer",
    "preset": "rust_backend_architect",
    "verification_mode": "balanced"
  }'

# Returns: { "task_id": "t_xyz", "status": "queued" }
# Stream state-by-state via GET /v1/tasks/t_xyz/stream

Custom workflows

Beyond the built-in presets, you can send your own workflow inline via the custom_preset field on POST /v1/agents. The format matches the built-in presets: workflow.states (allowed_tools, parallel_group, required_outputs), transitions, system, guardrails. The server enforces hard limits: max 50 states, JSON payload no larger than 256 KB, no transition cycles, and every state's allowed_tools must reference a tool currently registered.

Limit | Value | Why
max_states | 50 | Cap memory + linear-scan validation cost
max_size | 256 KB | Reject payload bombs at parse time
cycles | rejected | DFS check across transitions avoids infinite loops
unknown tools | rejected | Defense-in-depth: the agent silently skips unknowns at runtime, but the API rejects upfront so you get a clear error

Inline a custom workflow

curl -X POST https://api.wauldo.com/v1/agents \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-triage-agent",
    "wauldo_toml": "[agent]\nname=\"triage\"\n[model]\nprovider=\"openrouter\"\nname=\"auto\"",
    "custom_preset": {
      "version": "2.0",
      "metadata": { "name": "MyTriage", "strict_mode": false },
      "system": { "role": "Issue triage assistant" },
      "workflow": {
        "default_steps": ["Classify", "Answer"],
        "states": {
          "Classify": { "allowed_tools": ["wikipedia"] },
          "Answer":   { "allowed_tools": [] }
        },
        "transitions": [
          { "from_state": "Classify", "to_state": "Answer", "condition_trigger": "always" }
        ]
      },
      "output_formats": { "default": { "schema": { "schema_type": "object", "required": [], "properties": {} } } },
      "guardrails": { "forbidden_behaviors": ["leak credentials"] }
    }
  }'

# custom_preset takes precedence over the built-in `preset` field if both are set.

Need an even larger graph or stricter limits raised? contact@wauldo.com — quotas can be tuned per tenant.

Endpoints

MethodPathDescription
POST/v1/agentsCreate an agent. Body: {"name","description","wauldo_toml","agents_md"?,"preset"?}
GET/v1/agentsList your agents
GET/v1/agents/:idGet agent details
PATCH/v1/agents/:idUpdate agent config. Body: partial fields
DELETE/v1/agents/:idDelete agent
POST/v1/agents/:id/runsRun agent. Body: {"input": "...", "verification_mode": "strict"|"balanced"|"permissive"} → returns {task_id, status}

All authenticated endpoints require Authorization: Bearer YOUR_KEY and x-rapidapi-user: YOUR_TENANT_ID. Poll GET /v1/tasks/:id or stream GET /v1/tasks/:id/stream to retrieve the run result.

Verification Modes

Naming note — trust_score vs support_score. The API JSON returns the field as trust_score for backward compatibility with v0.x clients. The Python, TypeScript, and Rust SDKs expose it as support_score. Both refer to the same value: the 0–1 fraction of claims supported by the source documents you uploaded.

Every agent run goes through the verification pipeline. Control the strictness:

ModeBehavior
strictUnverified answers are blocked. Safest.
balancedUnverified answers marked as partial. Default.
permissiveAll answers returned with support score. Most lenient.

Verdict enum

Each completed task returns a verification.verdict. Pair it with verification.trust_score (0.0 – 1.0) and the optional verification.message for display.

verdictWhenRecommended action
SAFEtrust_score ≥ 0.7, claims supported by uploaded sourcesDeliver as-is
UNCERTAIN0.4 ≤ trust_score < 0.7Show with warning / human review
PARTIALMix of supported + unsupported claimsDisplay scrubbed version; see stripped_claims
BLOCKHallucination detected OR prompt injectionDo not surface to users
CONFLICTContradictory numerical values in outputReview before delivery
UNVERIFIEDNo source documents uploaded — or no claim found sufficient support in the sourcesUpload docs via /v1/upload to enable real verification; treat as low-trust until then

Note on UNVERIFIED: when verification_source = "prompt_only", the returned confidence and hallucination_rate reflect self-consistency of the LLM output against the prompt — not ground-truth fact-checking. trust_score is forced to 0.0 in that case. Treat verdict + trust_score + message as authoritative.
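The verdict table above maps naturally onto a small dispatch helper. A sketch (illustrative names, not part of the SDK) of the recommended actions:

```python
def action_for(verification: dict) -> str:
    """Map a task's verification block to a delivery action per the verdict table."""
    verdict = verification.get("verdict", "UNVERIFIED")
    if verdict == "SAFE":
        return "deliver"                # claims supported, deliver as-is
    if verdict in ("UNCERTAIN", "CONFLICT"):
        return "human_review"           # show with warning / review before delivery
    if verdict == "PARTIAL":
        return "deliver_scrubbed"       # display scrubbed version; see stripped_claims
    if verdict == "BLOCK":
        return "suppress"               # hallucination or injection: never surface
    return "low_trust"                  # UNVERIFIED: upload sources via /v1/upload first
```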

Python SDK

import json, time, urllib.request

BASE = "https://api.wauldo.com"
HEADERS = {
    "Authorization": "Bearer YOUR_KEY",
    "x-rapidapi-user": "my-tenant",
    "Content-Type": "application/json",
}

def post(path, body):
    req = urllib.request.Request(BASE + path, data=json.dumps(body).encode(),
                                 headers=HEADERS, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def get(path):
    req = urllib.request.Request(BASE + path, headers=HEADERS)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Create agent
agent = post("/v1/agents", {
    "name": "my-bot",
    "wauldo_toml": '[agent]\nname = "my-bot"\n\n[model]\nprovider = "openrouter"\nname = "auto"',
})

# Upload docs
post("/v1/upload", {"content": "Your document text...", "filename": "doc.txt"})

# Run agent
run = post(f"/v1/agents/{agent['id']}/runs", {"input": "What is the refund policy?"})

# Poll for result
while True:
    task = get(f"/v1/tasks/{run['task_id']}")
    if task["status"] == "completed":
        print(task["result"])                      # Verified answer
        print(task["verification"]["verdict"])     # SAFE
        print(task["verification"]["trust_score"])  # 1.0
        break
    time.sleep(3)

Streaming (SSE) — GET /v1/tasks/:id/stream

Instead of polling, subscribe to Server-Sent Events to receive each workflow state transition as it completes. Ideal for long-running multi-state agents (RustArchitect, SecurityAuditor, etc.) where you want to stream reasoning in the UI. Each data: line is a JSON-encoded StateTransition.

Event fieldMeaning
state_namee.g. Analysis, Tradeoffs, Answer. Synthetic TASK_COMPLETED / TASK_FAILED for already-terminal tasks.
to_stateNext state name, or null on final state.
raw_outputFull LLM output for the state (truncated to 8k chars).
condition"Sequential execution", "Parallel group '…'" etc.
duration_msWall time spent in the LLM call for this state.
prompt_tokens / completion_tokensRough estimates (~4 chars/token).
repair_countNumber of JSON repair passes applied on the final-state output.
successtrue if the state completed without validation errors.

The stream closes when the task reaches a terminal status (completed / failed / cancelled). After closure, call GET /v1/tasks/:id once to fetch the full verification block and final result. Connection TTL is 30 minutes; reconnect if needed — already-emitted events are not replayed, so resubscribers only see subsequent transitions.

# Python — consume SSE with the stdlib
import json, urllib.request

req = urllib.request.Request(
    f"https://api.wauldo.com/v1/tasks/{task_id}/stream",
    headers={
        "Authorization": "Bearer YOUR_KEY",
        "x-rapidapi-user": "my-tenant",
        "Accept": "text/event-stream",
    },
)
with urllib.request.urlopen(req) as resp:
    for raw in resp:
        line = raw.decode().rstrip()
        if line.startswith("data:"):
            ev = json.loads(line[5:].strip())
            print(f"{ev['state_name']:<16} {ev['duration_ms']:>5}ms  {ev['completion_tokens']}tok")
// TypeScript — fetch + ReadableStream (EventSource cannot send custom headers)
const headers = {
  "Authorization": "Bearer YOUR_KEY",
  "x-rapidapi-user": "my-tenant",
  "Accept": "text/event-stream",
};

const resp = await fetch(`https://api.wauldo.com/v1/tasks/${taskId}/stream`, { headers });
const reader = resp.body!.getReader();
const decoder = new TextDecoder();
let buf = "";
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buf += decoder.decode(value, { stream: true });
  const lines = buf.split("\n");
  buf = lines.pop()!; // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith("data:")) {
      const ev = JSON.parse(line.slice(5).trim());
      console.log(ev.state_name, ev.duration_ms, ev.completion_tokens);
    }
  }
}

How do Wauldo agent revisions work?

Every change to an agent's custom_preset mints an immutable, content-addressed revision (SHA-256). The agent points to one active revision; you can roll back or promote any past revision in O(1) — no LLM call, no rebuild. Modeled on AWS ECS task definitions: append-only history, atomic active pointer.

Why versioning matters

You tweak an agent the morning of a demo. It breaks. With revisions, rollback is one PATCH to the previous revision — your live runs flip back to the known-good prompt instantly. No re-validation, no re-deploy, no LLM cost.

TraitBehavior
immutableRevisions are never mutated in place — content-addressed via SHA-256.
monotone revPer-agent counter, never reused even after prune.
implicit mintPOST /v1/agents with custom_preset mints rev 1; subsequent PATCH /v1/agents/:id with custom_preset mints the next rev.
cap50 revisions per agent. Oldest non-active revisions auto-pruned.
tenant-scopedRevisions live under the tenant. Cross-tenant reads are rejected.
cascade deleteDELETE /v1/agents/:id purges all revisions atomically.

Endpoints

MethodPathDescription
POST/v1/agents/:id/revisionsMint a new revision (rate-limited 5/min/tenant)
GET/v1/agents/:id/revisionsList revisions newest-first
GET/v1/agents/:id/revisions/:revFetch one revision verbatim
PATCH/v1/agents/:id/active-revisionPromote / rollback in O(1) — body {"rev": <n>}
# Mint a new revision (becomes active by default)
curl -X POST https://api.wauldo.com/v1/agents/AGENT_ID/revisions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "custom_preset": { "version": "2.0", "workflow": { "states": [...] } },
    "message": "tighten triage prompt",
    "set_active": true
  }'

# Returns: { "rev": 4, "sha256": "abc...", "active_rev": 4 }

# List revisions, newest first
curl https://api.wauldo.com/v1/agents/AGENT_ID/revisions \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant"

# Rollback to a previous revision (no LLM cost, instant)
curl -X PATCH https://api.wauldo.com/v1/agents/AGENT_ID/active-revision \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"rev": 3}'

SDK examples

# Python
from wauldo.agents import AgentsClient
agents = AgentsClient(base_url="https://api.wauldo.com", api_key="YOUR_KEY", tenant="my-tenant")

rev = agents.create_revision("AGENT_ID", custom_preset=preset_v2, message="tighten triage")
print(rev.rev, rev.sha256)

# Rollback in one line
agents.set_active_revision("AGENT_ID", rev=3)
// TypeScript
import { AgentsClient } from "wauldo";
const agents = new AgentsClient({ baseUrl: "https://api.wauldo.com", apiKey: "YOUR_KEY", tenant: "my-tenant" });

const rev = await agents.createRevision("AGENT_ID", { customPreset: presetV2, message: "tighten triage" });
await agents.setActiveRevision("AGENT_ID", 3);
// Rust
use wauldo::{AgentsClient, CreateRevisionRequest};
let agents = AgentsClient::new("https://api.wauldo.com")
    .with_api_key("YOUR_KEY")
    .with_tenant("my-tenant");

let rev = agents.create_revision("AGENT_ID", CreateRevisionRequest {
    custom_preset: preset_v2,
    message: Some("tighten triage".into()),
    set_active: true,
}).await?;

agents.set_active_revision("AGENT_ID", 3).await?;

How to use Wauldo cost tags?

Attach team, env, project, billing labels to an agent's wauldo.toml. Every LLM call originating from that agent emits cost metrics tagged with those labels — Prometheus / Grafana can then slice spend per team or per project without guessing from agent names.

How it works

TraitBehavior
declarativeTags live in wauldo.toml, not in every API call — set once, applied everywhere.
sanitizedValues must match [a-zA-Z0-9._-]{1,32}. Invalid inputs coerced to "default" and counted in wauldo_cost_tag_rejected_total so you can spot misconfigured agents.
low cardinality4 fixed label keys. PII or unbounded values are blocked at ingest — your /metrics endpoint stays cheap to scrape.
unset = "default"Missing tags fall back to "default", never null — Grafana queries never need a special case for un-tagged spend.
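The sanitization rule can be reproduced client-side to catch misconfigured tags before they are silently coerced. A small sketch (assuming the documented [a-zA-Z0-9._-]{1,32} pattern and the four fixed keys; the helper name is illustrative):

```python
import re

TAG_RE = re.compile(r"^[a-zA-Z0-9._-]{1,32}$")
TAG_KEYS = ("team", "env", "project", "billing")

def sanitize_tags(tags: dict) -> dict:
    """Coerce invalid or missing tag values to "default", mirroring ingest behavior."""
    out = {}
    for key in TAG_KEYS:
        value = tags.get(key, "default")
        out[key] = value if isinstance(value, str) and TAG_RE.match(value) else "default"
    return out
```

Anything this helper coerces would also increment wauldo_cost_tag_rejected_total server-side, so fixing it locally keeps that counter quiet.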

Configure

# wauldo.toml — declared per agent
[agent]
name = "product-search"

[agent.tags]
team    = "frontend"
env     = "prod"
project = "search-relevance"
billing = "internal"
# Then create the agent — tags from wauldo.toml are picked up automatically.
curl -X POST https://api.wauldo.com/v1/agents \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "product-search",
    "wauldo_toml": "[agent]\nname=\"product-search\"\n[agent.tags]\nteam=\"frontend\"\nenv=\"prod\"\nproject=\"search-relevance\"\nbilling=\"internal\"",
    "preset": "rag_grounded"
  }'

Read the metrics

Two Prometheus counters surface tagged spend (admin-scoped /metrics):

# Cost in micro-USD broken down by tags + model
wauldo_llm_cost_by_tag_micro_usd_total{model, tag_team, tag_env, tag_project, tag_billing}

# Token volume, same breakdown
wauldo_llm_tokens_by_tag_total{model, tag_team, tag_env, tag_project, tag_billing, kind}

How to configure Wauldo webhooks?

Subscribe to verdict and lifecycle events. When a task crosses an attention threshold (BLOCK / CONFLICT / UNVERIFIED), Wauldo fires a signed POST to your endpoint within seconds. Built for Slack handlers, on-call alerts, audit ingestion.

Event types

EventFired when
task.completedA task reaches a terminal verdict — payload carries verdict, support_score, halluc_rate, claims_count.
task.failedA task errored out — payload carries error.
task.cancelledA task was cancelled mid-execution.
verification.alertAuto-fired alongside task.completed when verdict is BLOCK / CONFLICT (severity high) or UNVERIFIED / INSUFFICIENT_CLAIMS / UNCERTAIN (severity medium).
recommendation.newA new Insights recommendation surfaced for the tenant.
*Wildcard subscription — receive every event.

Reliability guarantees

TraitBehavior
at-least-onceRetries with exponential backoff (max 5 attempts, 1s → 60s). Your handler must be idempotent on X-Event-Id.
circuit breakerPer-destination URL: 5 consecutive failures opens a 60s cooldown. A dead URL no longer drags every retry through 5 × 16 s of backoff.
DLQFinal failures land in a dead-letter queue. Inspect via GET /v1/webhooks/dlq, replay via POST /v1/webhooks/dlq/:event_id/retry.
HMAC-SHA256When you register a secret, every POST carries X-Wauldo-Signature: sha256=<hex> over the raw body. Verify with the standard HMAC_SHA256(secret, raw_body) recipe.
SSRF guardrailsPrivate IPs (10/8, 172.16/12, 192.168/16, 127/8, IPv6 loopback / link-local / ULA / IPv4-mapped) are rejected at registration AND re-validated at DLQ retry.

Endpoints

MethodPathDescription
POST/v1/webhooksRegister a subscription
GET/v1/webhooksList subscriptions
DELETE/v1/webhooks/:idRemove a subscription
GET/v1/webhooks/dlqList failed deliveries
POST/v1/webhooks/dlq/:event_id/retryReplay a failed delivery
DELETE/v1/webhooks/dlq/:event_idPurge a DLQ entry
# Register a webhook for verdict alerts
curl -X POST https://api.wauldo.com/v1/webhooks \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/wauldo-hook",
    "events": ["verification.alert", "task.failed"],
    "secret": "whsec_your_random_secret"
  }'
# Verify a signature (Node.js example)
const crypto = require("crypto");
function verify(req) {
  const got = req.headers["x-wauldo-signature"];
  const expected = "sha256=" + crypto
    .createHmac("sha256", process.env.WEBHOOK_SECRET)
    .update(req.rawBody)
    .digest("hex");
  return crypto.timingSafeEqual(Buffer.from(got), Buffer.from(expected));
}
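The same verification in Python, plus the X-Event-Id dedupe that the at-least-once guarantee requires. A sketch with an in-memory set (use a persistent store such as Redis in production):

```python
import hmac, hashlib

seen_event_ids = set()  # replace with a persistent store in production

def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Recompute HMAC_SHA256(secret, raw_body) and compare in constant time."""
    expected = "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature_header, expected)

def handle_once(event_id: str, process) -> bool:
    """At-least-once delivery means duplicates: dedupe on X-Event-Id."""
    if event_id in seen_event_ids:
        return False
    seen_event_ids.add(event_id)
    process()
    return True
```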

How do Wauldo workflows work?

Author multi-step pipelines as a state machine — Task / Choice / Wait / Pass / Fail / Succeed with explicit transitions. The runtime executes sequentially with bounded wall-clock, persists every transition, and exposes per-state observability. Today's executor supports tool:<name> resources; agent chaining lands in a future release.

State types

TypePurposeRequired fields
TaskInvoke a registered tool.resource, next
ChoiceBranch on a JSONPath variable.choices[], default
WaitPause up to 60 seconds.seconds, next
PassInject a constant payload.result, next
FailTerminate with an error.error
SucceedTerminate with the current IO.

Validation guarantees (at create time)

CheckBehavior
cycle detectionDFS catches A→B→A and longer cycles before storage. start_at must reach every state.
transition targetsEvery next must reference an existing state — no dangling pointers.
choice operatorsStrict enum: eq, neq, gt, lt, contains. Unknown operators are 400 at create time, never at run time.
tenant cap100 workflows per tenant. Tenant-scoped — cross-tenant reads rejected.
durablePersisted on the API host's local store — survives restart without re-uploading.
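The strict Choice operator enum can be sketched as a small evaluator (illustrative Python, not the server implementation; the server rejects unknown operators with a 400 at create time rather than raising at run time):

```python
def evaluate_choice(actual, operator: str, expected) -> bool:
    """Evaluate one Choice rule against the strict operator enum."""
    if operator == "eq":
        return actual == expected
    if operator == "neq":
        return actual != expected
    if operator == "gt":
        return actual > expected
    if operator == "lt":
        return actual < expected
    if operator == "contains":
        return expected in actual
    raise ValueError(f"unknown operator {operator!r}")
```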

Runtime guarantees (at execution time)

CapBehavior
wall clockEach run terminates within 60 seconds. Past the deadline the run is marked timed_out.
waitA single Wait state cannot exceed 60 seconds. Longer values rejected at runtime.
transitionsHard ceiling of 200 state visits per run — protects against runaway loops that slip past static cycle detection.
history5000 stored runs per tenant. Each run record is upserted on every state transition for full audit.
asyncRuns are submit-and-poll. POST /runs returns 202 with an execution_id; poll GET /runs/:execution_id for status.

Endpoints

MethodPathDescription
POST/v1/workflowsCreate a workflow definition.
GET/v1/workflowsList workflows for the calling tenant.
GET/v1/workflows/:idFetch one definition.
DELETE/v1/workflows/:idRemove a definition.
POST/v1/workflows/:id/runsStart an asynchronous execution. Returns 202 with an execution_id.
GET/v1/workflows/:id/runs/:execution_idFetch the current state and output of a run.

Define a workflow

Three states: compute via tool:calculator, branch on the result, terminate. State type values use PascalCase; transitions use next.

# Sequential pipeline: compute → branch → succeed
curl -X POST https://api.wauldo.com/v1/workflows \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "triage",
    "start_at": "Compute",
    "states": {
      "Compute": {
        "type": "Task",
        "resource": "tool:calculator",
        "next": "Route"
      },
      "Route": {
        "type": "Choice",
        "choices": [
          { "variable": "$.output", "operator": "contains", "value": "42", "next": "Done" }
        ],
        "default": "Done"
      },
      "Done": { "type": "Succeed" }
    }
  }'

Run it and poll

Submit the run, capture the execution_id, and poll until status reaches a terminal value (succeeded, failed, timed_out).

# Start the run (202 Accepted)
curl -X POST https://api.wauldo.com/v1/workflows/$WF_ID/runs \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "input": { "operation": "add", "a": 21, "b": 21 } }'

# → { "execution_id": "wfr_...", "workflow_id": "wf_...", "status": "running" }

# Poll the run record
curl https://api.wauldo.com/v1/workflows/$WF_ID/runs/$EXECUTION_ID \
  -H "Authorization: Bearer YOUR_KEY"

# → {
#     "execution": {
#       "id": "wfr_...",
#       "status": "succeeded",
#       "current_state": null,
#       "input":  { "operation": "add", "a": 21, "b": 21 },
#       "output": { "output": "42" },
#       "started_at": 1747...,
#       "ended_at":   1747...,
#       "error": null
#     }
# }

SDK clients

Same create → start → poll loop without writing the HTTP yourself. The wait_for_run helper polls until terminal so you get the final WorkflowExecution back directly.

from wauldo.workflows import WorkflowsClient

wf = WorkflowsClient(base_url="https://api.wauldo.com", api_key="YOUR_KEY")

created = wf.create(
    name="triage",
    start_at="Compute",
    states={
        "Compute": {"type": "Task", "resource": "tool:calculator", "next": "Done"},
        "Done": {"type": "Succeed"},
    },
)

run = wf.start_run(created.id, input={"operation": "add", "a": 21, "b": 21})
final = wf.wait_for_run(created.id, run.execution_id)
print(final.status, final.output)  # succeeded {'result': 42.0}
import { WorkflowsClient } from "wauldo";

const wf = new WorkflowsClient({ baseUrl: "https://api.wauldo.com", apiKey: "YOUR_KEY" });

const created = await wf.create({
  name: "triage",
  startAt: "Compute",
  states: {
    Compute: { type: "Task", resource: "tool:calculator", next: "Done" },
    Done: { type: "Succeed" },
  },
});

const run = await wf.startRun(created.id, { operation: "add", a: 21, b: 21 });
const final = await wf.waitForRun(created.id, run.execution_id);
console.log(final.status, final.output); // succeeded { result: 42 }
use wauldo::workflows::{CreateWorkflowRequest, WorkflowsClient};
use serde_json::json;
use std::collections::HashMap;

let wf = WorkflowsClient::new("https://api.wauldo.com")
    .with_api_key("YOUR_KEY");

let mut states = HashMap::new();
states.insert("Compute".into(), json!({"type": "Task", "resource": "tool:calculator", "next": "Done"}));
states.insert("Done".into(), json!({"type": "Succeed"}));

let created = wf.create(CreateWorkflowRequest {
    name: "triage".into(),
    start_at: "Compute".into(),
    states,
    description: None,
}).await?;

let run = wf.start_run(&created.id, Some(json!({"operation": "add", "a": 21, "b": 21}))).await?;
let final_exec = wf.wait_for_run(&created.id, &run.execution_id, None, None).await?;
println!("{} {:?}", final_exec.status, final_exec.output);

Execution record fields

FieldDescription
idUnique wfr_* identifier for this execution.
statusOne of running, succeeded, failed, timed_out.
current_stateName of the state being executed (while running). Null on terminal records.
inputThe JSON body submitted to POST /runs.
outputFinal IO value on success. Null when the run failed or timed out.
errorHuman-readable error reason on terminal failure. Mirrors the Prometheus reason label.
started_at / ended_atUnix seconds. ended_at is null while the run is in flight.

JSONPath subset

Variables in Choice.variable and Task.output_path use a small JSONPath subset. Missing fields fall through to the default branch; invalid syntax is rejected.

ExpressionResolves to
$The full current IO.
$.fieldA top-level field.
$.a.b.cA nested field path.
$.arr[0]An array element by index.
$.users[0].nameNested element field.
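The subset above is small enough to sketch in a few lines of Python (illustrative, not the server's resolver; missing fields return None so a Choice can fall through to its default branch, and invalid syntax raises):

```python
import re

_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")

def resolve(path: str, io):
    """Resolve the documented JSONPath subset against the current IO value."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    pos, current = 1, io
    while pos < len(path):
        m = _TOKEN.match(path, pos)
        if not m:
            raise ValueError(f"invalid syntax at offset {pos}")
        field, index = m.group(1), m.group(2)
        if field is not None:           # $.field step
            if not isinstance(current, dict) or field not in current:
                return None
            current = current[field]
        else:                           # [n] array-index step
            idx = int(index)
            if not isinstance(current, list) or idx >= len(current):
                return None
            current = current[idx]
        pos = m.end()
    return current
```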

How to share Wauldo runs?

Publish any verified run as a public URL — https://wauldo.com/r/<id>. The page renders the verdict, claim-by-claim breakdown, sources, and timeline. Drop the link in Slack, a PR, or a bug report and the recipient sees what your agent actually said — without an account, without your prompt, without your tools.

Privacy contract

TraitBehavior
private by defaultRuns are never public until you call POST /v1/tasks/:id/share — explicit, per run, never global.
strict whitelistThe public payload exposes verdict, support_score, halluc_rate, claims_count, the answer, claim breakdown, source URLs, journal phase names + durations. Everything else stays in your tenant — never custom_preset, never wauldo.toml, never the system prompt, never tool args / results.
unguessable id128 bits of entropy (r_ + 32 hex). No login required to view; the id is the credential.
noindex by defaultThe public page sets X-Robots-Tag: noindex,nofollow so Google doesn't index your shares. Opt-in indexing is a future flag.
TTL30 days for free-tier tenants; expires_at: null (no expiration) on paid tenants. Computed at publish time so a tier change never retroactively expires already-shared runs.
cap1000 live shared runs per tenant. POST returns 429 above the cap.
idempotentRe-publishing an already-shared run returns the existing id without bumping the cap.

Endpoints

MethodPathDescription
POST/v1/tasks/:task_id/sharePublish a run (auth, idempotent). Returns {share_id, url, expires_at}.
GET/v1/runs/:share_idPublic read — no auth. Whitelisted payload only.
DELETE/v1/tasks/:task_id/shareUnpublish (auth, idempotent).
# Publish a completed run
curl -X POST https://api.wauldo.com/v1/tasks/TASK_ID/share \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{}'

# Returns: { "share_id": "r_abc...", "url": "https://wauldo.com/r/r_abc...", "expires_at": 1780797026156 }

# Anyone can read (no auth)
curl https://api.wauldo.com/v1/runs/r_abc...

# Or open the public page directly:
# https://wauldo.com/r/r_abc...

SDK examples

# Python (wauldo >= 0.13)
from wauldo.agents import AgentsClient
agents = AgentsClient(base_url="https://api.wauldo.com", api_key="YOUR_KEY", tenant="my-tenant")

share = agents.share_task("TASK_ID")
print(share.url)        # https://wauldo.com/r/r_abc...
print(share.expires_at) # epoch ms or None
agents.unshare_task("TASK_ID")
// TypeScript (wauldo >= 0.12)
import { AgentsClient } from "wauldo";
const agents = new AgentsClient({ baseUrl: "https://api.wauldo.com", apiKey: "YOUR_KEY", tenant: "my-tenant" });

const share = await agents.shareTask("TASK_ID");
console.log(share.url);
await agents.unshareTask("TASK_ID");
// Rust (wauldo >= 0.12)
use wauldo::AgentsClient;
let agents = AgentsClient::new("https://api.wauldo.com")
    .with_api_key("YOUR_KEY")
    .with_tenant("my-tenant");

let share = agents.share_task("TASK_ID").await?;
println!("{}", share.url);
agents.unshare_task("TASK_ID").await?;

How to use Wauldo OpenAI middleware?

Drop-in wrapper for the OpenAI Python SDK (and any OpenAI-compatible client). Three lines and every chat.completions.create() response carries a verified .wauldo namespace — verdict, support score, hallucination rate, claim count.

Install

pip install 'wauldo[openai]'

Pulls openai >= 1.0, < 2.0 as an extra. Doesn't conflict with an existing OpenAI install.

Usage

from openai import OpenAI
from wauldo.openai import with_verification

client = OpenAI()                # or AsyncOpenAI, or any OpenAI-compat client
verified = with_verification(    # wraps it
    client,
    wauldo_api_key="tig_live_...",
    fact_check_mode="lexical",  # or "hybrid", "semantic"
)

response = verified.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Capital of France?"}],
)

# Standard OpenAI ChatCompletion, plus:
response.wauldo.verdict          # "SAFE" | "UNVERIFIED" | "CONFLICT" | ...
response.wauldo.support_score    # 0.94
response.wauldo.halluc_rate      # 0.0
response.wauldo.claims_count     # 7
response.wauldo.fact_check_mode  # "lexical"
response.wauldo.error            # None when verdict attached cleanly

Behavior contract

TraitBehavior
non-invasiveWraps the client via a thin proxy. Every other attribute (client.api_key, client.base_url, etc.) passes through unchanged.
async auto-detectedIf you pass AsyncOpenAI, the wrapper returns an awaitable proxy automatically. No is_async= flag.
fail-open by defaultWauldo down / timeout / unexpected response shape attaches response.wauldo.error + logs a warning. The OpenAI response itself is never broken.
raise_on_error=TrueOpt in to bubble verification failures as WauldoVerificationError exceptions instead.
streaming pass-throughv1: stream=True returns the OpenAI generator with response.wauldo = None. Post-stream verdict is a v2 feature.
no extra depsUses stdlib urllib for the /v1/fact-check POST — no requests / httpx pulled at runtime.

How does Wauldo memory work?

Key-value memory with namespace isolation. Store conversation context, user preferences, or any structured data per tenant. Supports semantic search.

What your agent remembers

Your agent persists context across calls via /v1/memory/:namespace. The endpoint is a generic key-value store with semantic search — you can use any namespace name you want. To stay aligned with the broader agent ecosystem (LangChain, CrewAI), we recommend four conventional namespaces below. They are conventions only, not enforced by the API.

TypeNamespaceTTLTypical use case
Short-termshort_termClient-managedChat history per session, windowed context. Conceptually FIFO — truncate or delete on the client side as the window fills.
Long-termlong_termPersistentUser facts and preferences that should survive across sessions. Semantic search over keys + values.
EntityentityPersistentExtracted entities (people, places, concepts) stored as structured JSON. Semantic search returns the closest entity record.
ContextualcontextualTask-scopedScratch pad tied to a single task. Delete the keys explicitly when the task completes.
# short_term — append a session message
curl -X POST https://api.wauldo.com/v1/memory/short_term \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"key": "session_abc:msg_017", "value": "user: how do I cancel my plan?", "tags": ["session_abc"]}'

# long_term — upsert a user preference
curl -X POST https://api.wauldo.com/v1/memory/long_term \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"key": "user_42:tone", "value": "prefers concise replies, no emoji", "tags": ["user_42"]}'

# entity — store a structured entity record
curl -X POST https://api.wauldo.com/v1/memory/entity \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"key": "person:ada_lovelace", "value": "{\"type\":\"person\",\"name\":\"Ada Lovelace\",\"role\":\"mathematician\",\"born\":1815}", "tags": ["person"]}'

# contextual — task-scoped scratch pad
curl -X POST https://api.wauldo.com/v1/memory/contextual \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"key": "task_t_xyz:plan", "value": "step 1: fetch invoice, step 2: parse total, step 3: refund", "tags": ["task_t_xyz"]}'
Convention, not enforcement. The API does not require these four names — pick whatever fits your domain (chat_history, user_profile, kb_facts, etc.). It also does not auto-rotate or auto-expire entries: TTL, FIFO truncation, and task-scoped cleanup are the client's responsibility via explicit DELETE /v1/memory/:namespace/:key calls.
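Since TTL and FIFO truncation are the client's responsibility, a short_term window can be managed with a couple of pure helpers (illustrative names; pair keys_to_evict with DELETE /v1/memory/short_term/:key calls):

```python
def keys_to_evict(ordered_keys: list, window: int) -> list:
    """Client-managed FIFO: keep only the newest `window` entries.
    `ordered_keys` must be oldest-first; issue a DELETE for each returned key."""
    if window <= 0:
        return list(ordered_keys)
    return ordered_keys[:-window] if len(ordered_keys) > window else []

def next_key(session: str, n: int) -> str:
    """Key convention from the examples above, e.g. session_abc:msg_017."""
    return f"{session}:msg_{n:03d}"
```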

Endpoints

MethodPathDescription
POST/v1/memory/:namespaceCreate or update an entry
GET/v1/memory/:namespaceList entries (paginated)
GET/v1/memory/:namespace/:keyGet a specific entry
POST/v1/memory/:namespace/searchSearch entries by query
DELETE/v1/memory/:namespace/:keyDelete an entry
# Store a memory entry
curl -X POST https://api.wauldo.com/v1/memory/support-context \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"key": "user_preference", "value": "prefers email over phone", "tags": ["contact"]}'

# Search memory
curl -X POST https://api.wauldo.com/v1/memory/support-context/search \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"query": "contact preference", "limit": 5}'

How does Wauldo Agent-to-Agent communication work?

Invoke an agent from another agent. Enables chaining, delegation, and multi-agent workflows. The called agent runs through the same verification pipeline.

# Agent A calls Agent B
curl -X POST https://api.wauldo.com/v1/a2a/AGENT_B_ID \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "x-rapidapi-user: my-tenant" \
  -H "Content-Type: application/json" \
  -d '{"input": "Summarize the refund policy", "context": {"caller_agent": "manager-bot"}}'

# Returns a task_id — poll GET /v1/tasks/:id for the verified result

What are Wauldo error codes?

CodeMeaningAction
400Bad requestCheck required parameters
401UnauthorizedCheck your API key or token
413Payload too largeBody exceeds 10MB — split your document
429Rate limitedWait and retry, or upgrade plan
500Internal errorRetry once. Contact support if persistent
502LLM provider errorRetryable — auto-retried 2x internally
503Service starting upRetry after 10-15s (cold start)
Auto-retry: The SDKs automatically retry 502/503/500 errors with exponential backoff. If you're using raw HTTP, retry these status codes up to 2 times with a 2s delay.
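For raw HTTP, the retry advice above can be wrapped in a small helper. A sketch (the `call` and `sleep` parameters are injectable assumptions for testability; delays follow a 2 s exponential backoff):

```python
import time

RETRYABLE = {500, 502, 503}

def with_retries(call, attempts: int = 3, base_delay: float = 2.0, sleep=time.sleep):
    """Retry a raw-HTTP call on 500/502/503 with exponential backoff.
    `call` returns a (status, body) tuple."""
    for attempt in range(attempts):
        status, body = call()
        if status not in RETRYABLE:
            return status, body
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body
```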

What are Wauldo rate limits?

PlanRequests/monthPremium AI callsPrice
Basic500Free
Pro10,000$19/mo
Ultra100,000$99/mo
MegaUnlimited$0.008/req

Rate limits are per API key. Manage your subscription on RapidAPI.

What are Wauldo limits and quotas?

ResourceLimit
Request body10 MB
Max chunks per upload5,000
Embedding dimensions1 – 4,096
Streaming response256 KB
SSE timeout1,800s (30 min)
Standard API timeout180s (3 min)
Source chunks per queryMax 3 (relevance-filtered)

How to use the Wauldo interactive explorer?

Try the API directly without writing code: