Most "AI support" projects fail the same way: the bot answers confidently, gets it wrong, and there's no clean path to a human. Deflection done right is answer what you can prove, hand off the rest with context. Here's the playbook we recommend for a first agent.
The use case
Your support inbox is full of the same questions — billing, password resets, "where's my order," plan limits. A grounded agent can resolve most of these and route the genuinely hard ones to a person, with the whole conversation and the cited sources attached.
Step 1 — Ground the agent
Create an agent and attach knowledge: help-center articles, policy docs, and a crawl of your pricing and docs pages. Answers are retrieved from your approved library (with citations) instead of generic web text, so the agent stops inventing refund policies.
Step 2 — The system prompt
Keep it short, specific, and escalation-first:
You are the support agent for Acme. Answer only from the attached knowledge.
- If the answer isn't in the knowledge, say so and escalate — never guess.
- For refunds over $100, account changes, or anything legal/billing-sensitive,
escalate to a human instead of acting.
- Be concise. Cite the article you used.
- Match the customer's language and a calm, helpful tone.This pairs with guardrails (allowed/blocked topics, max response length) so policy is enforced, not just suggested.
Step 3 — Tools
Give the agent only what it needs:
- Knowledge search — retrieve and cite.
- Escalate / Resolve — built-in tools to hand off or close a thread.
- A custom HTTP tool for "look up order status" against your backend (with auth and rate limits), or an MCP connector to your helpdesk (e.g. Zendesk).
Step 4 — The handoff workflow
On a drag-and-drop canvas, wire a simple flow:
- Trigger — new conversation on web, WhatsApp, or email.
- KB search + LLM — draft a grounded answer.
- Manager branch — confidence high and topic allowed → reply; otherwise → escalate.
- Hand off — route to the right team (round-robin or least-busy) with the transcript and citations attached.
Because the agent runs across channels, the customer never repeats themselves and your team sees one timeline.
Measure, then expand
Run this on one painful queue first. Track deflection rate, handle time on escalations, and customer sentiment for two weeks. When it holds, add the next queue — don't boil the ocean on day one.
Next: see tool calling with MCP to wire your helpdesk and CRM, and build a triage workflow with AI to generate the canvas above in minutes.