I’ve been building LobbyStack, an open-source AI receptionist for small businesses.
It answers inbound calls, handles SMS conversations, books/reschedules/cancels appointments, takes messages, stores transcripts/recordings, and transfers calls to humans.
How we’re using Convex/components:
- RAG: indexes business knowledge from snippets, uploaded docs, and imported website pages. Voice/SMS can search that indexed knowledge before answering.
- Agent: generates grounded SMS replies with per-conversation threads, business snapshots, retrieved knowledge, and appointment tools.
- Workflow: runs multi-step jobs like refreshing the business context snapshot, syncing appointments to external calendars, creating post-booking notifications, and importing website knowledge.
- Workpool: separates higher-priority knowledge reindexing from heavier bulk work like document extraction and website page indexing.
- Rate Limiter: protects onboarding, phone verification, number claiming, feedback/test notification actions, and web voice starts.
- Action Retrier: retries notification delivery, including booking confirmations and scheduled reminders.
- Crons: registers per-business calendar reconciliation jobs.
- Polar component: checkout, subscriptions, customer mapping, webhooks, and metered usage events.
- Resend component: transactional email sending.
- Firecrawl scrape component: pulls website content into the knowledge pipeline.
We had to use a separate voice gateway because the live call path is a long-lived WebSocket bridge, not a normal backend request. Twilio Media Streams opens a socket to us, we keep another socket open to OpenAI Realtime, and the gateway continuously passes audio frames between them while handling interruptions, tool calls, recording assembly, and cleanup when either side disconnects.
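To make the bridge shape concrete, here is a rough sketch of the relay loop. It is not the actual LobbyStack gateway code. `SocketLike`, `FakeSocket`, and `bridgeCall` are hypothetical names used only for illustration, and the sockets are stubbed so the snippet runs without a WebSocket library. The message shapes follow my understanding of Twilio Media Streams (`start`, `media`, `stop`, `clear`) and the OpenAI Realtime events (`input_audio_buffer.append`, `response.audio.delta` or `response.output_audio.delta` depending on API version, `input_audio_buffer.speech_started`), so check them against the current docs. Tool calls and recording assembly are left out.

```typescript
// Hypothetical minimal socket shape so the sketch does not depend on a WebSocket library.
interface SocketLike {
  send(data: string): void;
  close(): void;
  onMessage(handler: (data: string) => void): void;
  onClose(handler: () => void): void;
}

// In-memory stand-in for a real socket, useful for trying the bridge locally.
class FakeSocket implements SocketLike {
  sent: string[] = [];
  closed = false;
  private messageHandlers: Array<(data: string) => void> = [];
  private closeHandlers: Array<() => void> = [];

  send(data: string): void {
    this.sent.push(data);
  }
  close(): void {
    if (this.closed) return;
    this.closed = true;
    this.closeHandlers.forEach((h) => h());
  }
  onMessage(handler: (data: string) => void): void {
    this.messageHandlers.push(handler);
  }
  onClose(handler: () => void): void {
    this.closeHandlers.push(handler);
  }
  // Simulates a message arriving from the remote side.
  receive(data: string): void {
    this.messageHandlers.forEach((h) => h(data));
  }
}

// Relays audio between the telephony socket and the model socket,
// and tears both down when either side goes away.
function bridgeCall(twilio: SocketLike, model: SocketLike): { isClosed: () => boolean } {
  let streamSid: string | null = null;
  let closed = false;

  const shutdown = (): void => {
    if (closed) return; // guard against re-entry from the close handlers
    closed = true;
    twilio.close();
    model.close();
  };

  twilio.onMessage((raw) => {
    if (closed) return;
    const msg = JSON.parse(raw);
    if (msg.event === "start") {
      streamSid = msg.start.streamSid; // needed to address audio back to the call
    } else if (msg.event === "media") {
      // Caller audio (base64) goes straight into the model's input buffer.
      model.send(JSON.stringify({ type: "input_audio_buffer.append", audio: msg.media.payload }));
    } else if (msg.event === "stop") {
      shutdown();
    }
  });

  model.onMessage((raw) => {
    // Model output is dropped until the call's streamSid is known.
    if (closed || streamSid === null) return;
    const msg = JSON.parse(raw);
    if (msg.type === "response.audio.delta" || msg.type === "response.output_audio.delta") {
      twilio.send(JSON.stringify({ event: "media", streamSid, media: { payload: msg.delta } }));
    } else if (msg.type === "input_audio_buffer.speech_started") {
      // Caller interrupted: ask Twilio to drop audio it has buffered but not yet played.
      twilio.send(JSON.stringify({ event: "clear", streamSid }));
    }
  });

  twilio.onClose(shutdown);
  model.onClose(shutdown);
  return { isClosed: () => closed };
}
```

The point of the sketch is that both sockets and the per-call state (`streamSid`, the closed flag) have to live in one long-running process for the whole call, which is why this part sits outside the normal request/response backend.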