This one grew beyond what we originally planned. We started with "let users send SMS from the dashboard." Then someone on the team said "what about calls?" And then "what about voice agents that can actually handle conversations?" Eight months later, we have a full communications module with 10 dashboard tabs, 20 AI agent tools, and 300+ text-to-speech voices.

The Communications module (v4.0) is now 26,500+ lines of code across 29 files using 18 database tables. It is, by a wide margin, the largest module in NeuroGen. We will walk through what it does and why we built it this way.

  • 10 dashboard tabs
  • 300+ TTS voices
  • 20 agent tools
  • 18 database tables

The 10 Tabs

When you open the Communications module, you see 10 tabs. Each one handles a different piece of the phone system. We spent a lot of time on this layout because cramming phone, SMS, email, and automation into one screen does not work. People need to focus on one thing at a time.

Dialer
SMS Inbox
Contacts
Campaigns
Call History
Phone Numbers
Calendar
Email
Autoresponder
Settings

The Dialer handles outbound and inbound calls. The SMS Inbox is a two-way messaging interface. Contacts is your address book with tagging and segmentation. Campaigns let you run multi-step outreach sequences. Call History shows every call with duration, recording links, and transcripts. Phone Numbers is a marketplace where you can buy and manage your numbers. Calendar handles scheduling. Email supports both SMTP and SendGrid. Autoresponder sets up automatic replies based on schedules and rules. Settings configures the whole system.

Voice Agents: AI on the Phone

This is the headline feature. Voice agents are AI assistants that can handle real phone conversations in real time. When someone calls your number (or when the agent makes an outbound call), the AI picks up and talks.

Under the hood, we connect the voice stream to OpenAI's Realtime API. The caller speaks, the audio streams to our server, we pipe it to the Realtime API, and the response streams back as speech. Latency is noticeable but workable: most turns get a response in under a second.

The voice agent has access to the same tools any NeuroGen agent has. It can look up information in a Knowledge Base, check a calendar, create a contact record, or trigger an automation. The voice part is the interface, but the agent underneath is the same multi-tool system that powers the text-based agents. It just happens to communicate by voice.

Voice Activity Detection (VAD)

The agent knows when the caller is speaking and when they have stopped. VAD parameters are configurable per agent: sensitivity threshold, silence duration before the agent responds, and interrupt handling. You can tune these based on your use case. A customer service agent should be patient and wait for the caller to finish. A quick survey agent can jump in faster.
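As a rough illustration of how per-agent VAD settings might be modeled, here is a minimal sketch. The class name, field names, defaults, and the two presets are all assumptions made for this example, not the actual NeuroGen schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VadConfig:
    """Hypothetical per-agent VAD settings; names and defaults are assumptions."""
    threshold: float = 0.5        # speech probability (0..1) above which audio counts as speech
    silence_ms: int = 500         # silence required before the agent starts its reply
    allow_interrupt: bool = True  # whether caller speech cuts off the agent mid-sentence

    def __post_init__(self) -> None:
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        if self.silence_ms <= 0:
            raise ValueError("silence_ms must be positive")


def should_respond(cfg: VadConfig, ms_since_caller_spoke: int) -> bool:
    """True once the caller has been silent long enough for this agent."""
    return ms_since_caller_spoke >= cfg.silence_ms


# Illustrative presets for the two use cases described above.
PATIENT_SUPPORT = VadConfig(threshold=0.6, silence_ms=900)
QUICK_SURVEY = VadConfig(threshold=0.5, silence_ms=300)
```

With these presets, 400 ms of silence is enough for the survey agent to speak, while the support agent keeps waiting.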

Configurable Parameters

Temperature, max duration, credit pre-check, and voice selection are all configurable per voice agent. You can set a low temperature for factual interactions (appointment booking, FAQ) or higher for more conversational use cases. Max duration is tier-gated to prevent runaway costs.

300+ TTS Voices, Zero API Cost

We use Edge-TTS (version 7.2.7) for text-to-speech. This is a free library that accesses the Microsoft Edge browser's TTS engine. No API key required. No per-character billing. Over 300 voices across dozens of languages.
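For readers who have not used the library, a minimal sketch of a synthesis call follows. It assumes the `edge-tts` Python package is installed; `VoiceSettings` and `synthesize` are hypothetical names for this example, and the default voice is just one of the available options. The package expects rate and pitch as signed strings such as `+10%` and `-5Hz`, which the helper properties build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical per-assistant voice settings (names are assumptions)."""
    voice: str = "en-US-AriaNeural"
    rate_percent: int = 0   # speaking rate offset in percent
    pitch_hz: int = 0       # pitch offset in hertz

    @property
    def rate(self) -> str:
        return f"{self.rate_percent:+d}%"

    @property
    def pitch(self) -> str:
        return f"{self.pitch_hz:+d}Hz"


async def synthesize(text: str, out_path: str, settings: VoiceSettings) -> None:
    """Write `text` as speech to `out_path` (needs network access)."""
    import edge_tts  # third-party, imported lazily: pip install edge-tts

    communicate = edge_tts.Communicate(
        text, settings.voice, rate=settings.rate, pitch=settings.pitch
    )
    await communicate.save(out_path)


# Usage (not run here because it calls the online service):
#   import asyncio
#   asyncio.run(synthesize("Hello from NeuroGen.", "hello.mp3", VoiceSettings(rate_percent=10)))
```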

TTS is available everywhere in the platform, not just in voice agents. Any assistant message in the standalone chat has a speaker button that reads it aloud. Agents can proactively generate speech through the TTS toolkit. The generated audio files are stored in user_storage/{user_id}/tts/ with a configurable TTL (default 24 hours).
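The storage layout and TTL described above could be handled with something like the sketch below. The function names, and the choice of file modification time as the age signal, are assumptions for illustration; only the `user_storage/{user_id}/tts/` path and the 24-hour default come from the text.

```python
import time
from pathlib import Path
from typing import List, Optional

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # default TTL of 24 hours


def tts_dir(user_id: str, storage_root: Path = Path("user_storage")) -> Path:
    """Directory holding one user's generated audio files."""
    return storage_root / user_id / "tts"


def purge_expired(
    directory: Path,
    ttl_seconds: int = DEFAULT_TTL_SECONDS,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete files older than the TTL and return the paths that were removed."""
    now = time.time() if now is None else now
    removed: List[Path] = []
    if not directory.is_dir():
        return removed
    for path in directory.iterdir():
        # Assumption: age is measured from the file's last modification time.
        if path.is_file() and now - path.stat().st_mtime > ttl_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

A periodic job could call `purge_expired(tts_dir(user_id))` for each active user.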

Each TTS generation costs 2 credits (configurable via the TTS_CREDIT_COST environment variable). Demo tier users are limited to 5 generations per day. The voice, rate, and pitch are all configurable per assistant in the database.
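A minimal sketch of the charging rule, assuming an in-memory counter: the `TTS_CREDIT_COST` variable, the default of 2 credits, and the demo limit of 5 per day come from the text, while `TtsQuota`, `try_charge`, and the tier string `"demo"` are hypothetical names for this example.

```python
import os
from collections import defaultdict
from datetime import date
from typing import Tuple

DEMO_DAILY_LIMIT = 5


def tts_credit_cost() -> int:
    """Credits per generation, read from the environment with a default of 2."""
    return int(os.environ.get("TTS_CREDIT_COST", "2"))


class TtsQuota:
    """Tracks generations per user per day (in memory, for illustration only)."""

    def __init__(self) -> None:
        self._counts = defaultdict(int)  # (user_id, day) -> generations so far

    def try_charge(self, user_id: str, tier: str, credits: int, today: date) -> Tuple[bool, int]:
        """Return (allowed, remaining_credits) for one TTS generation."""
        cost = tts_credit_cost()
        key = (user_id, today)
        if tier == "demo" and self._counts[key] >= DEMO_DAILY_LIMIT:
            return False, credits  # demo users are capped per day
        if credits < cost:
            return False, credits  # not enough credits for one generation
        self._counts[key] += 1
        return True, credits - cost
```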

IVR Call Trees

For businesses that need a phone menu system ("Press 1 for sales, press 2 for support"), we built IVR (Interactive Voice Response) call trees. You design the tree in the dashboard with a visual builder. Each node can play a message, collect input, route to a department, or hand off to a voice agent.

The IVR system handles the common patterns: business hours routing (send to voicemail after 6pm), department selection, callback requests, and hold music. But because it runs on top of our agent infrastructure, you can also drop an AI voice agent into any node of the tree. "Press 3 to speak with our AI assistant" is a real option.
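To make the tree structure concrete, here is a small sketch of nodes, business-hours routing, and keypad traversal. The node fields, action names, and the 9am opening hour are assumptions for this example; the 6pm voicemail cutoff and the menu options mirror the ones mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class IvrNode:
    name: str
    message: str
    action: str = "menu"  # assumed kinds: "menu", "route", "voice_agent", "voicemail"
    children: dict = field(default_factory=dict)  # keypad digit -> IvrNode


MAIN_MENU = IvrNode(
    "main",
    "Press 1 for sales, press 2 for support, press 3 to speak with our AI assistant.",
    children={
        "1": IvrNode("sales", "Connecting you to sales.", action="route"),
        "2": IvrNode("support", "Connecting you to support.", action="route"),
        "3": IvrNode("assistant", "Handing you to our AI assistant.", action="voice_agent"),
    },
)
AFTER_HOURS = IvrNode("voicemail", "We are closed. Please leave a message.", action="voicemail")


def entry_node(hour: int, open_hour: int = 9, close_hour: int = 18) -> IvrNode:
    """Business-hours routing: main menu while open, voicemail otherwise."""
    return MAIN_MENU if open_hour <= hour < close_hour else AFTER_HOURS


def walk(node: IvrNode, digits: str) -> IvrNode:
    """Follow keypad presses; an unknown digit leaves the caller on the current menu."""
    for digit in digits:
        if node.action != "menu":
            break  # reached a terminal node such as a route or voice agent
        next_node = node.children.get(digit)
        if next_node is None:
            break
        node = next_node
    return node
```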

Autoresponders and Sequences

The Autoresponder tab handles automated email replies. It supports both SMTP (direct email server connection) and SendGrid (for higher volume). You set up rules: if an email comes in during business hours, send one reply. If it comes in after hours, send a different one. If it matches certain keywords, route it to a specific template.
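The rule evaluation can be pictured roughly as follows. This is a sketch under assumptions: the template names, the sample keywords, the 9am to 6pm window, and the choice to check keywords before business hours are all illustrative, not the actual rule engine.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Tuple


@dataclass(frozen=True)
class AutoresponderRules:
    """Hypothetical rule set; every value here is an example, not a real default."""
    open_at: time = time(9, 0)
    close_at: time = time(18, 0)
    keyword_templates: Tuple[Tuple[str, str], ...] = (
        ("refund", "refund_reply"),
        ("pricing", "pricing_reply"),
    )
    business_hours_template: str = "business_hours_reply"
    after_hours_template: str = "after_hours_reply"


def pick_template(rules: AutoresponderRules, subject: str, body: str, received_at: datetime) -> str:
    """Choose a reply template for one incoming email."""
    text = f"{subject} {body}".lower()
    # Assumption: keyword rules take priority over the time-of-day rules.
    for keyword, template in rules.keyword_templates:
        if keyword in text:
            return template
    if rules.open_at <= received_at.time() < rules.close_at:
        return rules.business_hours_template
    return rules.after_hours_template
```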

Sequences are multi-step automated outreach. Define a series of touchpoints (SMS on day 1, email on day 3, call on day 7) and assign contacts. The system executes each step on schedule, tracks responses, and stops the sequence when someone replies or takes the desired action.
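A sequence like the one above reduces to a list of dated steps plus a stop condition. The sketch below assumes day 1 is the enrollment day and uses hypothetical names (`Step`, `due_steps`); how the real scheduler stores progress is not described in this post.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class Step:
    day: int      # 1-based day of the sequence (day 1 is the enrollment day)
    channel: str  # "sms", "email", or "call"


SEQUENCE = (Step(1, "sms"), Step(3, "email"), Step(7, "call"))


def due_steps(
    steps: Sequence[Step],
    enrolled_on: date,
    today: date,
    completed: int,
    replied: bool,
) -> List[Step]:
    """Steps that should run today, given how many have already been executed."""
    if replied:
        return []  # a reply stops the sequence
    due = []
    for step in steps[completed:]:
        if enrolled_on + timedelta(days=step.day - 1) <= today:
            due.append(step)
    return due
```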

Lead and call scoring: Every interaction gets scored. Lead scoring combines engagement signals (opened email, replied to SMS, answered call) into a single score. Call scoring evaluates call quality based on duration, talk-to-listen ratio, and whether the call objective was met. Both feed into the Campaigns view for prioritization.
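To show the shape of the two scores, here is a sketch with made-up weights. The signals come from the description above; the point values, the 5-minute duration cap, the idea that a balanced talk-to-listen ratio is best, and the 30/30/40 split are assumptions chosen only for the example.

```python
from typing import Iterable

# Assumed point values per engagement signal.
LEAD_WEIGHTS = {"opened_email": 1, "replied_sms": 3, "answered_call": 5}


def lead_score(events: Iterable[str]) -> int:
    """Sum the weights of all recorded engagement events; unknown events score 0."""
    return sum(LEAD_WEIGHTS.get(event, 0) for event in events)


def call_score(duration_s: float, talk_s: float, listen_s: float, objective_met: bool) -> float:
    """Score a call from 0 to 100 using duration, talk/listen balance, and outcome."""
    duration_part = min(duration_s / 300.0, 1.0)  # full marks at 5 minutes or more
    if talk_s > 0 and listen_s > 0:
        ratio = talk_s / listen_s
        balance_part = min(ratio, 1.0 / ratio)    # 1.0 when talk and listen time are equal
    else:
        balance_part = 0.0
    objective_part = 1.0 if objective_met else 0.0
    return 30 * duration_part + 30 * balance_part + 40 * objective_part
```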

Tier-Gated Call Duration

Voice agent calls burn credits continuously. To prevent surprise bills, we enforce max call duration limits by tier. When a call reaches its limit, the agent wraps up the conversation gracefully.

Tier | Monthly Price | Max Call Duration
Demo | $0 | 2 minutes
Starter | $47 | 5 minutes
Professional | $97 | 15 minutes
Business | $297 | 30 minutes
Enterprise | $997 | 60 minutes

Before a call connects, the system runs a credit pre-check. If your account does not have enough credits for at least the minimum call duration, the call will not start. This prevents the frustrating situation where a call drops 30 seconds in because credits ran out.
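The tier limits and the pre-check combine roughly as in the sketch below. The per-tier minutes come from the table above; the function names, the `credits_per_minute` rate, and the one-minute minimum are assumptions, since this post does not state the actual billing rate or minimum duration.

```python
# Max call duration per tier, in minutes (from the tier table).
MAX_CALL_MINUTES = {
    "demo": 2,
    "starter": 5,
    "professional": 15,
    "business": 30,
    "enterprise": 60,
}


def can_start_call(credits: int, credits_per_minute: int, min_minutes: int = 1) -> bool:
    """Credit pre-check: require enough credits for the assumed minimum duration."""
    return credits >= credits_per_minute * min_minutes


def max_call_seconds(tier: str, credits: int, credits_per_minute: int) -> int:
    """Seconds the call may run: the lower of the tier cap and what credits can cover."""
    tier_cap = MAX_CALL_MINUTES[tier] * 60
    affordable = credits * 60 // credits_per_minute
    return min(tier_cap, affordable)
```

For example, an Enterprise account with 100 credits at an assumed 10 credits per minute is capped at 10 minutes by its balance, not by the 60-minute tier limit.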

20 Tools for AI Agents

The communications module exposes 20 tools to the AI agent framework. Any agent you build can use these tools to interact with the phone system. Some examples:

  • make_call and send_sms for outbound communication
  • check_call_history and get_sms_thread for looking up past interactions
  • create_contact and update_contact for managing the address book
  • schedule_callback for setting up future outbound calls
  • get_phone_numbers for checking available numbers
  • start_sequence for launching automated outreach on a contact

This means you can build agents that handle communication as part of a larger workflow. For example, an agent can take a customer complaint, create a ticket, schedule a callback, and send a confirmation SMS, all through the agent tool system and all credit-tracked.
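One way to picture credit-tracked tools is a small registry that wraps each call. This is a sketch, not the NeuroGen agent framework: `ToolRegistry`, the per-call costs, the function signatures, and the stub return values are all hypothetical. Only the tool names `schedule_callback` and `send_sms` come from the list above.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToolRegistry:
    """Maps tool names to callables and counts credits spent (illustrative only)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[Callable[..., Any], int]] = {}
        self.credits_used = 0

    def register(self, name: str, cost: int) -> Callable:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = (fn, cost)
            return fn
        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        fn, cost = self._tools[name]
        result = fn(**kwargs)
        self.credits_used += cost  # charge only after the tool succeeds
        return result


tools = ToolRegistry()


@tools.register("schedule_callback", cost=1)
def schedule_callback(contact_id: str, when: str) -> dict:
    # Stub: a real tool would persist the callback.
    return {"contact_id": contact_id, "callback_at": when}


@tools.register("send_sms", cost=1)
def send_sms(to: str, body: str) -> dict:
    # Stub: a real tool would hand the message to the SMS provider.
    return {"to": to, "body": body, "status": "queued"}


def handle_complaint(contact_id: str, phone: str, when: str) -> List[dict]:
    """Schedule a callback, then confirm it by SMS."""
    callback = tools.call("schedule_callback", contact_id=contact_id, when=when)
    sms = tools.call("send_sms", to=phone, body=f"We will call you back at {callback['callback_at']}.")
    return [callback, sms]
```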

The Phone Number Marketplace

The Phone Numbers tab connects to our number inventory. You can search for numbers by area code, pattern, or capabilities (voice, SMS, MMS). Purchase happens directly in the dashboard. Purchased numbers show up immediately and can be assigned to voice agents, IVR trees, or used as caller ID for outbound calls.
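The search filters can be sketched as a simple filter over an inventory. Everything here is an assumption for illustration: the `PhoneNumber` type, the sample inventory (fictional 555-01xx numbers), treating numbers as US `+1` E.164 strings, and interpreting "pattern" as a substring of the national number.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class PhoneNumber:
    e164: str                # for example "+14155550100"
    capabilities: frozenset  # subset of {"voice", "sms", "mms"}


INVENTORY = [
    PhoneNumber("+14155550100", frozenset({"voice", "sms"})),
    PhoneNumber("+14155550199", frozenset({"voice", "sms", "mms"})),
    PhoneNumber("+12125550123", frozenset({"voice"})),
]


def search_numbers(
    inventory: Iterable[PhoneNumber],
    area_code: Optional[str] = None,
    pattern: Optional[str] = None,
    capabilities: Iterable[str] = (),
) -> List[PhoneNumber]:
    """Return numbers matching every filter that was supplied."""
    needed = frozenset(capabilities)
    matches = []
    for number in inventory:
        # Assumption: US numbers, so the national part follows the "+1" prefix.
        national = number.e164[2:] if number.e164.startswith("+1") else number.e164.lstrip("+")
        if area_code and not national.startswith(area_code):
            continue
        if pattern and pattern not in national:
            continue
        if not needed <= number.capabilities:
            continue
        matches.append(number)
    return matches
```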

Number management includes caller ID configuration, forwarding rules, and usage monitoring. You can see how many calls and messages each number has handled and what they are costing.

Set Up Your First Voice Agent

Create an AI-powered phone agent in minutes. Pick a voice, assign a number, and start taking calls.
