Bot quotes a 30-day refund window. Your policy is 14 days. Customer screenshots it. Legal opens a ticket.
Confident tone. Fabricated terms. A real customer on the other side reading it as policy. One screenshot on Twitter, and the cost outweighs the savings for months.
Bot recommends Product A for use case B. Product A doesn't support use case B. Churn or chargeback follows.
LLMs are trained to sound helpful. Helpful + wrong is more dangerous than unhelpful + correct.
The reply goes through Wauldo with the retrieved KB articles as source_context. Each claim in the draft is scored against what the KB actually says. If the verdict comes back as UNVERIFIED or PARTIAL — or if support_score drops below your chosen threshold — the reply is flagged, escalated to a human, or rewritten before it ever reaches the customer.
No bot reply leaves the system if it's not grounded. The LLM stays. The confident fabrications don't.
import os

from wauldo import Wauldo

w = Wauldo(api_key=os.environ["WAULDO_API_KEY"])

def send_reply(customer_msg: str, kb_chunks: list[str]) -> str:
    # llm, escalate_to_human, and FALLBACK_RESPONSE come from your own stack
    draft = llm.generate(customer_msg, context=kb_chunks)
    verified = w.fact_check(
        text=draft,
        source_context="\n\n".join(kb_chunks),
        mode="lexical",  # ~1s, fast enough for chat
    )
    if verified.verdict != "SAFE":
        # Hold the draft: a human sees it before the customer does
        escalate_to_human(customer_msg, draft, verified)
        return FALLBACK_RESPONSE
    return draft
Customer: "What's your refund policy for digital downloads?"
Bot draft: "You have 60 days to request a full refund on any digital purchase, no questions asked."
KB says: "Digital products are non-refundable."
Wauldo: BLOCK, support_score 0.08. Bot never sent it. Support lead got a dashboard flag instead of a chargeback dispute.
Lexical mode runs at ~1s on a typical 300-word reply. That's inside the human expectation window for a thoughtful response. For async email / ticket flows, hybrid mode (3-5s) gives you cross-lingual grounding without blocking the agent.
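The latency split above suggests routing by channel. A minimal sketch, assuming your stack already labels each conversation with a channel name (the labels here are hypothetical; the mode strings and latency budgets are the ones quoted above):

```python
def pick_mode(channel: str) -> str:
    """Choose a verification mode per channel.

    Live chat needs the ~1s lexical path; async email/ticket flows
    can afford hybrid's 3-5s for cross-lingual grounding.
    """
    return "lexical" if channel == "chat" else "hybrid"
```

Pass the result as the `mode` argument to `fact_check` instead of hard-coding `"lexical"`.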
Wauldo sits between your LLM and your channel. Any support stack that gives you a programmatic hook before outbound — Intercom Fin, Zendesk AI agents, Front automations, or a custom wrapper — can integrate in an afternoon. You keep your LLM provider, you keep your KB, you add one HTTP call.
Yes. Return the support_score and verdict to your own policy layer. Block, rewrite, warn, or escalate — your call. Most teams start with the verdict=SAFE gate and tune a numeric floor after a week or two of real traffic.
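A minimal sketch of that policy layer, assuming the verdict/score fields shown earlier. The 0.6 floor and the action names are placeholders to tune against your own traffic, not Wauldo defaults:

```python
def policy_action(verdict: str, support_score: float, floor: float = 0.6) -> str:
    """Map a Wauldo verdict + support_score to an outbound action."""
    if verdict != "SAFE":
        return "escalate"   # a human sees the draft before the customer does
    if support_score < floor:
        return "warn"       # grounded, but too weakly supported to auto-send
    return "send"
```

Start with the `verdict != "SAFE"` gate alone, then add the numeric floor once you have a week or two of real scores to calibrate against.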
Hybrid mode uses a multilingual embedding model. Cross-lingual grounding is tested for FR / EN / ES. Other languages are supported via the lexical fallback: token overlap works across scripts even when the semantic model doesn't cover the target language natively.
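To see why token overlap is script-agnostic, here is a crude illustrative stand-in for lexical grounding (not Wauldo's actual scoring): it compares surface tokens only, so it works the same whatever alphabet the text uses.

```python
def token_overlap_score(claim: str, source: str) -> float:
    """Fraction of claim tokens that also appear in the source text.

    A deliberately naive sketch: lowercase, whitespace-split, set
    intersection. No language model involved, so no language gap.
    """
    claim_tokens = set(claim.lower().split())
    source_tokens = set(source.lower().split())
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & source_tokens) / len(claim_tokens)
```

A fabricated "60-day refund" claim scores near zero against a KB that says "digital products are non-refundable", because almost none of its tokens appear in the source.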
Requests are scoped to your Wauldo API key and never pooled. A private self-host option is available for SOC2 / HIPAA flows — contact sales. The SDKs are MIT-licensed with a tested request path, so you can audit exactly what leaves your infrastructure.
Free on RapidAPI. 300 verifications/month.