THE CALL COMES IN
Water is coming through the kitchen ceiling. The homeowner grabs a phone and dials the only plumber whose magnet is still on the fridge. It's the middle of the night. Two rings. Then a calm voice: "Hi, thanks for calling — tell me what's going on." Nobody got out of bed. The call was already answered.
WHY THIS CALL MATTERS
The missed call is the lost job.
Most missed calls happen on nights and weekends and during emergencies, exactly when the customer is most desperate and most likely to dial the next number instead. TalkTuit answers all of them.
Answers every call, 24/7.
Nights, weekends, holidays. The phone never goes to voicemail again.
Books the job, captures the details.
It understands the caller, schedules the work, and writes down exactly what's needed.
Rushes real emergencies to a human.
Gas, carbon monoxide, flooding, a burst pipe — it gets a live on-call person on the line in seconds.
DIAGRAM 1 OF 6 · THE CAST
Meet the team behind the call.
Six pieces do the live work of that 2 AM call together. Five are best-in-class outside services, the same kind of technology you've already seen elsewhere, and the sixth, the orchestrator, is code we run ourselves. Each does one job a person would understand.
- The Phone Line (Twilio) answers the call and streams the caller's audio in.
- The Ears (Deepgram) turn that speech into text, live.
- The Brain (Claude) understands the text, decides what to say, and fills in one clean record of the call.
- The Mouth (ElevenLabs) turns the brain's words back into a human voice that goes out through the phone line.
- The Orchestrator (Pipecat, our code) sits underneath and moves audio and text between the ears, brain, and mouth.
- The Memory (DynamoDB) stores the one structured record the Brain writes for each call.
HOW THE LINE EVEN CONNECTED
Before the homeowner heard "Hi, thanks for calling," something happened in about half a second. The phone company knocked on a tiny door, a one-line answer came back — "open a live audio stream over here" — and then the real connection took over for the whole call. Two pieces, two very different jobs.
DIAGRAM 2 OF 6 · THE HANDSHAKE
What happens when someone calls.
You never see this — it's the reason the call connects in about half a second and never drops. (One for the engineers: a short request fits a tiny function; holding a call open for minutes needs a long-running container. That difference is the whole reason both exist.)
- The caller dials and reaches Twilio, the phone line.
- Phase 1, the doorbell, lasts only milliseconds: Twilio makes one short request to a small Lambda function (a tiny program that wakes up, answers, and exits), which answers with a one-line instruction that means "open a live audio line here," then steps out and is never in the audio path.
- Phase 2, the live call, lasts about five minutes: Twilio opens a persistent connection to a Fargate container (a long-running program that stays alive for the whole call) that holds the call open and streams audio both ways the entire time.
- When the call ends, one structured record is written to DynamoDB.
- The point: a short stateless request suits a tiny function, while holding a call open for minutes requires a long-running container.
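For the engineers, the doorbell step can be sketched in a few lines. This is a minimal illustration, not the production function: the stream URL is a made-up placeholder, and the response shape assumes the Lambda sits behind an HTTP front end (such as API Gateway or a function URL) that accepts a status code, headers, and body. The `<Connect><Stream>` instruction is Twilio's TwiML for opening a two-way audio stream.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical WebSocket address of the long-running container (an assumption).
STREAM_URL = "wss://calls.example.com/stream"


def build_twiml(stream_url: str) -> str:
    """Return the one-line answer: 'open a live audio stream over here'."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f"<Response><Connect><Stream url={quoteattr(stream_url)}/></Connect></Response>"
    )


def handler(event, context):
    """Phase 1, the doorbell: answer the phone line's short request, then exit."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/xml"},
        "body": build_twiml(STREAM_URL),
    }
```

The function holds no state and never touches audio. Everything after this answer happens over the persistent connection to the container.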
DIAGRAM 3 OF 6 · THE LIVE LOOP
Inside one live call.
One turn, step by step. The whole loop runs in about a second, then repeats for the next turn — a typical call is 15 to 40 turns. The caller can interrupt anytime, and actions run in the background, so the conversation never waits.
- Step 1, HEAR: Deepgram turns the caller's speech into text.
- Step 2, CHECK: plain code, with no AI involved, instantly scans the text for life-safety emergencies.
- Step 3, THINK: Claude decides what to do, using Haiku for routine turns or Sonnet for emergencies.
- Step 4, ACT: it fires tools such as book the job, text the owner, or write to the CRM. These run in the background (async), so the conversation never waits for them.
- Step 5, SPEAK: ElevenLabs replies in a human voice.
- The loop returns to step 1 for the next turn and repeats 15 to 40 times in a typical call.
- The caller can interrupt anytime; the loop simply restarts at step 1.
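The five steps above can be sketched as one async function. Every function here is a hypothetical stand-in (no real speech, model, or CRM calls), the keyword tuple is an illustrative subset, and interruption handling is left out. The point is only the shape: tools are scheduled as background tasks, so the spoken reply does not wait for them.

```python
import asyncio

EMERGENCY_WORDS = ("carbon monoxide", "flood", "burst")  # illustrative subset


async def hear(audio: str) -> str:
    return audio.lower()  # stand-in for speech-to-text


def check(text: str) -> bool:
    return any(word in text for word in EMERGENCY_WORDS)  # plain code, no AI


async def think(text: str, emergency: bool) -> dict:
    tier = "sonnet" if emergency else "haiku"  # stand-in for the model call
    tools = ["forward_emergency"] if emergency else ["book_job"]
    return {"reply": f"[{tier}] reply to: {text}", "tools": tools}


async def run_tool(name: str, log: list) -> None:
    await asyncio.sleep(0.01)  # pretend the tool takes a while
    log.append(name)


async def speak(reply: str) -> str:
    return reply  # stand-in for text-to-speech


async def one_turn(audio: str, background: set, log: list) -> str:
    text = await hear(audio)                  # Step 1, HEAR
    emergency = check(text)                   # Step 2, CHECK
    decision = await think(text, emergency)   # Step 3, THINK
    for name in decision["tools"]:            # Step 4, ACT: schedule, don't wait
        background.add(asyncio.create_task(run_tool(name, log)))
    return await speak(decision["reply"])     # Step 5, SPEAK


async def demo():
    background, log = set(), []
    spoken = await one_turn("My water heater needs a tune-up", background, log)
    done_before_reply = list(log)      # tools have not finished yet
    await asyncio.gather(*background)  # let the background work complete
    return spoken, done_before_reply, log
```

Running `asyncio.run(demo())` returns the reply first and shows the tool log still empty at that moment, which is the "conversation never waits" property in miniature.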
THE PART THAT MATTERS AT 2 AM
The homeowner says the word "burst." The receptionist doesn't guess and doesn't shrug it off as a routine booking. A fast check flags it instantly, the smarter brain takes the turn, and it starts dialing a real on-call person. And here's the part that matters most: it does not say "help is coming" until a human is actually on the line. If the first number doesn't pick up, it falls back and keeps trying. It never lies to someone standing in rising water.
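The "no promise until a human answers" rule can be sketched as follows. This is a simplified illustration under stated assumptions: `dial` is a hypothetical callback that returns True only when a person actually picks up, the phone numbers are placeholders, and `max_rounds` is an invented retry limit.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class ForwardResult:
    answered_by: Optional[str]  # the number a human picked up on, if any
    attempts: List[str] = field(default_factory=list)

    @property
    def may_say_help_is_coming(self) -> bool:
        # The promise is allowed only once a human is actually on the line.
        return self.answered_by is not None


def forward_emergency(
    numbers: Sequence[str],
    dial: Callable[[str], bool],
    max_rounds: int = 3,
) -> ForwardResult:
    """Try each number in order; on no-answer or busy, fall back and retry."""
    attempts: List[str] = []
    for _ in range(max_rounds):
        for number in numbers:
            attempts.append(number)
            if dial(number):
                return ForwardResult(number, attempts)
    return ForwardResult(None, attempts)
```

Because the result carries an explicit `may_say_help_is_coming` flag, the reply text can be gated on it rather than on the model's optimism.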
DIAGRAM 4 OF 6 · THE SAFETY FORK
Emergencies are handled in code, not hope.
A prompt can't route itself to a smarter model, so the emergency decision lives in plain, deterministic code — not in something the AI might get wrong.
- Every caller utterance hits a fast keyword check — plain code, no AI, instant.
- MISS (calm path): routine turns go to Haiku, the everyday tier of the brain (cheaper and faster), which is structurally forbidden from declaring an emergency on a guess.
- If Haiku is unsure, the turn escalates and is re-run on the smarter Sonnet model (the dashed arrow in the diagram).
- HIT (emergency path): keywords like gas, carbon monoxide, flood, burst pipe, sewage, or no-heat-while-freezing force the turn onto Sonnet, the smarter tier of the brain — the only one allowed to forward an emergency.
- Sonnet forwards the call to a real human, and that forward is treated as a step that can fail.
- Success: a human is engaged, and only then do we tell the caller help is coming.
- Failure (ring-no-answer or busy): it falls back to 911 or the posted emergency line and retries.
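The fork itself can be sketched in plain code. The patterns below are an illustrative subset of the keywords listed above (the no-heat-while-freezing case needs outside context and is omitted), and the tool-set trick is one assumed way to make "structurally forbidden" concrete: the routine tier is simply never handed the forwarding tool.

```python
import re

# Illustrative subset of the emergency keywords; word boundaries avoid
# false hits such as "gasket" matching "gas".
EMERGENCY_PATTERNS = [
    r"\bgas\b",
    r"\bcarbon monoxide\b",
    r"\bflood(ing|ed)?\b",
    r"\bburst\b",
    r"\bsewage\b",
]
_EMERGENCY_RE = re.compile("|".join(EMERGENCY_PATTERNS), re.IGNORECASE)


def is_emergency(utterance: str) -> bool:
    """Fast deterministic check: plain code, no AI."""
    return _EMERGENCY_RE.search(utterance) is not None


def pick_tier(utterance: str, haiku_unsure: bool = False) -> str:
    """HIT forces Sonnet; MISS goes to Haiku unless Haiku asked to escalate."""
    if is_emergency(utterance) or haiku_unsure:
        return "sonnet"
    return "haiku"


def allowed_tools(tier: str) -> set:
    """Only the Sonnet tier is ever given the forwarding tool."""
    tools = {"book_job", "text_owner", "write_crm"}
    if tier == "sonnet":
        tools.add("forward_emergency")
    return tools
```

For example, `pick_tier("A pipe just burst in my kitchen")` selects the Sonnet tier, while a tune-up request stays on Haiku with no way to forward an emergency.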
WHY IT NEVER SOUNDS LIKE IT'S THINKING
The fillers hide the thinking time.
Predictable lines — the greeting, little fillers like "one moment" or "let me check that" — are recorded once and replayed instantly. Only the words that genuinely have to be invented for this caller use the live brain. The fillers hide the thinking time, so there's never dead air.
Gold = the instant, pre-recorded line that buys time. Plain = the live brain inventing the part that's unique to this caller.
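One way to picture the trick in code: start the live reply in the background, play a cached filler immediately, then play the live audio when it is ready. This is a sketch with stand-ins only. The byte strings are fake audio, `live_reply` simulates brain and voice latency with a short sleep, and `play` is a hypothetical output callback.

```python
import asyncio

# Recorded once, replayed instantly (fake audio bytes for illustration).
PRERECORDED = {
    "greeting": b"<audio: Hi, thanks for calling>",
    "one_moment": b"<audio: one moment>",
}


async def live_reply(text: str) -> bytes:
    await asyncio.sleep(0.05)  # stand-in for model and text-to-speech latency
    return f"<audio: {text}>".encode()


async def respond(text: str, play) -> None:
    pending = asyncio.create_task(live_reply(text))  # start thinking right away
    play(PRERECORDED["one_moment"])                  # instant filler covers the gap
    play(await pending)                              # then the part unique to this caller


def demo() -> list:
    played = []
    asyncio.run(respond("I can get a plumber out at 8 AM.", played.append))
    return played
```

The filler is always first in the played list, so the caller hears a voice while the live part is still being generated.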
DIAGRAM 5 OF 6 · THE MOAT
Why it gets better every month.
Every call becomes one clean, structured record. Across many shops those records roll up, anonymized, into a per-trade brain that gets sharper as volume grows.
- Each call becomes one structured record whose key fields are fixed dropdowns — trade, problem, urgency, outcome — never free text.
- Across many shops, those identical records converge, anonymized, so no raw customer data leaves a shop.
- They roll up into one per-trade brain that gets sharper as call volume grows.
- The receptionist is the wedge. The data is the asset.
Because the key fields are fixed dropdowns from call one, never free text, every call is comparable across every shop and every month.
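A record with fixed-dropdown fields, and a rollup that keeps only those fields, might look like the sketch below. The specific dropdown values, the `shop_id` and `caller_phone` fields, and the counting rollup are all assumptions for illustration. Dropping identifiers and counting is a crude stand-in for anonymization, not a full privacy design.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Trade(Enum):
    PLUMBING = "plumbing"
    HVAC = "hvac"


class Problem(Enum):
    BURST_PIPE = "burst_pipe"
    NO_HEAT = "no_heat"
    TUNE_UP = "tune_up"


class Urgency(Enum):
    ROUTINE = "routine"
    EMERGENCY = "emergency"


class Outcome(Enum):
    BOOKED = "booked"
    FORWARDED = "forwarded"


@dataclass(frozen=True)
class CallRecord:
    shop_id: str        # stays inside the shop
    caller_phone: str   # stays inside the shop
    trade: Trade
    problem: Problem
    urgency: Urgency
    outcome: Outcome


def rollup(records) -> Counter:
    """Count dropdown combinations only; shop and caller identifiers are dropped."""
    return Counter((r.trade, r.problem, r.urgency, r.outcome) for r in records)
```

Since every key field is an enum, a value outside the dropdown (for example `Trade("roofing")`) raises a `ValueError` instead of quietly creating a new category, which is what keeps records comparable across shops.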
DIAGRAM 6 OF 6 · THE SYSTEM
One brain on the call. A fleet behind it.
The phone call is single-agent on purpose — one brain, kept lean and fast. The business that runs around the calls is multi-agent: a fleet of supporting agents that onboard shops, sync CRMs, chase reviews, research, self-heal, and roll every call up into the per-trade brain. None of them sit in the live call.
- On the phone, in real time: one call agent (Claude) with two gears — Haiku for routine turns, Sonnet for emergencies — plus its tools book_job, forward_emergency, text_owner and write_crm. One agent per call, kept lean because extra agents add lag and failure risk on a live call.
- Behind the scenes, offline: a fleet of separate agents runs the business — an onboarding agent that builds the call agents, an orchestrator, a CRM/GoHighLevel sync agent, a reviews agent, a researcher agent, self-healing workflows, and a nightly vertical-brain rollup that is the data moat.
- Each call feeds its record to the system; the fleet builds, configures, and tunes the call agent.
- So the system is single-agent where latency matters (the call) and multi-agent where it does not (everything else).
THE SAME CALL, EVERY NIGHT
Every call gets this. 24/7. Every shop.
The 2 AM burst pipe, the Tuesday tune-up, the Saturday no-heat call — every one of them gets answered on the second ring, booked, and saved as one clean record. No call goes to voicemail. No job gets lost.