Twilio Segment: Customer Data Platform (CDP)
What is Segment?
One-liner: Segment collects customer data from every touchpoint (web, mobile, server), unifies it into complete customer profiles, and activates it across 700+ downstream tools in real time.
The Problem It Solves: Without Segment, companies have fragmented customer data across dozens of tools (analytics, email, ads, CRM). Each tool has incomplete data. Marketing sends irrelevant emails. Support doesn't know the customer's history. Ads target people who already converted.
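To make "collects customer data from every touchpoint" concrete, here is a minimal sketch of what a Source emits. The field names (`type`, `userId`, `anonymousId`, `event`, `messageId`) follow Segment's public Track spec; the helper itself is illustrative, not SDK code.

```python
import uuid
from datetime import datetime, timezone

def build_track_event(user_id, event, properties, anonymous_id=None):
    """Assemble a Segment-style Track payload (field names per Segment's public spec)."""
    return {
        "type": "track",
        "userId": user_id,
        "anonymousId": anonymous_id,           # pre-login device identifier, if any
        "event": event,                        # e.g. "Checkout Started"
        "properties": properties,
        "messageId": str(uuid.uuid4()),        # used downstream for deduplication
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

evt = build_track_event("john@acme.com", "Checkout Started", {"cart_value": 129.99})
```

The `messageId` is the key detail: it is what the deduplication and identity-resolution stages described below key on.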
How Segment Works: The Data Flow
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ SEGMENT CDP ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ SOURCES │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Web │ │ Mobile │ │ Server │ │ Cloud │ │Warehouse│ │ │
│ │ │Analytics│ │iOS/Andr │ │ Node.js │ │ Stripe │ │Snowflake│ │ │
│ │ │ .js │ │ SDKs │ │ Python │ │Salesforce│ │ BigQuery│ │ │
│ │ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │ │
│ │ │ │ │ │ │ │ │
│ └────────┼────────────┼────────────┼────────────┼────────────┼────────────────┘ │
│ │ │ │ │ │ │
│ ▼ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ TRACKING API (TAPI) │ │
│ │ Go servers • 800K RPS • 30ms • 99.9999% │ │
│ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Shard 1 Shard 2 Shard 3 Shard N │ │ │
│ │ │ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │ │
│ │ │ │ NLB │ │ NLB │ │ NLB │ │ NLB │ │ │ │
│ │ │ │ ALB │ │ ALB │ │ ALB │ │ ALB │ │ │ │
│ │ │ │NSQD │ │NSQD │ │NSQD │ │NSQD │ ← Local buffer │ │ │
│ │ │ └──┬──┘ └──┬──┘ └──┬──┘ └──┬──┘ │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ └──────┼────────────┼────────────┼────────────┼───────────────────────┘ │ │
│ └─────────┼────────────┼────────────┼────────────┼───────────────────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ KAFKA BACKBONE │ │
│ │ Primary Cluster ◄────► Secondary Cluster (Multi-tier failover) │ │
│ │ │ │
│ │ ┌────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Partitioning by messageId → Same ID always same consumer │ │ │
│ │ └────────────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────┼────────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ ┌─────────────────┐ │
│ │ DEDUPLICATION │ │ IDENTITY RESOLUTION │ │ TRANSFORMATION │ │
│ │ │ │ │ │ │ │
│ │ RocksDB local │ │ DynamoDB Global │ │ Custom JSON │ │
│ │ 60B keys │ │ Tables for ms │ │ parser (zero- │ │
│ │ 1.5TB disk │ │ resolution │ │ allocation) │ │
│ │ 4-week window │ │ Graph-based merge │ │ │ │
│ │ Bloom filters │ │ │ │ │ │
│ └─────────────────┘ └─────────────────────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ UNIFIED PROFILES │ │
│ │ ┌───────────────────────────────────────────────────────────────────┐ │ │
│ │ │ user_id: "john@acme.com" │ │ │
│ │ │ anonymous_ids: ["abc123", "xyz789"] │ │ │
│ │ │ traits: { name: "John", plan: "enterprise", ltv: 50000 } │ │ │
│ │ │ events: [page_viewed, product_added, checkout_started, ...] │ │ │
│ │ │ computed_traits: { predicted_churn_risk: 0.12, ... } │ │ │
│ │ └───────────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────┼────────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ CENTRIFUGE │ │
│ │ (Reliable delivery to 700+ destinations) │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Directors (80-300 at peak) │ │ │
│ │ │ ┌─────────────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ MySQL/RDS as job queue │ │ │ │
│ │ │ │ • jobs table (immutable, append-only) │ │ │ │
│ │ │ │ • job_state_transitions table │ │ │ │
│ │ │ │ • 4-hour retry window, exponential backoff │ │ │ │
│ │ │ │ • S3 archival for undelivered │ │ │ │
│ │ │ └─────────────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ DESTINATIONS │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │Analytics│ │Warehouse│ │ Ads │ │ Email │ │ CRM │ │ A/B │ │ │
│ │ │Google │ │Snowflake│ │Meta Ads │ │ Braze │ │Salesforce│ │Amplitude│ │ │
│ │ │Mixpanel │ │BigQuery │ │Google │ │Iterable │ │HubSpot │ │Optimizly│ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ │ │
│ │ Cloud Mode: Segment servers translate & send (smaller page load) │ │
│ │ Device Mode: Direct API calls from client (for tools requiring it) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────────┘
Why Segment Matters to Twilio
1. Data for Personalization
Segment provides the customer context that makes Twilio communications relevant. "Welcome back, John!" vs "Dear Customer".
2. Channel Orchestration
Segment's Journeys can trigger Twilio SMS/Email/WhatsApp based on real-time customer behavior (cart abandonment → instant SMS).
3. AI Foundation
AI needs data. Segment's unified profiles feed ML models for predictive traits, churn scoring, and audience targeting.
Scale Metrics
- 6T rows/month delivered to warehouses
- 99.9999% API availability
- #1 in IDC CDP market share (4 consecutive years)
Interview Talking Point
"Segment solves the fragmented customer data problem. Companies have dozens of tools—analytics, email, ads, CRM—each with incomplete customer data. Segment sits at the center: it collects events from every touchpoint, resolves identity across devices and sessions, builds unified profiles, and then activates that data to 700+ downstream tools in real-time.
For Twilio, Segment is the data layer that makes communications intelligent. Without it, you're sending 'Dear Customer' emails. With it, you know the customer's name, their purchase history, their predicted churn risk, and you can trigger an SMS the moment they abandon their cart.
The technical architecture is impressive—400K events per second through a shard-based Tracking API, Kafka backbone for durability, RocksDB for 60-billion-key deduplication, and Centrifuge for reliable delivery to destinations with 4-hour retry windows."
AI Identity Thesis: The Stytch Acquisition (Verified)
Is the AI Identity Thesis Valid?
Yes, definitively. The research confirms that AI agent authentication is a real, unsolved problem that traditional identity systems (OAuth, SAML) weren't designed for. Multiple sources validate this:
- Cloud Security Alliance (CSA) has published a framework specifically for "Agentic AI Identity & Access Management"
- AWS launched "Amazon Bedrock AgentCore Identity" in 2025 for the same purpose
- Anthropic's MCP protocol mandates OAuth 2.1 for agent authorization
- Academic research (arXiv papers) proposes zero-trust frameworks for AI agents
- Orca Security reports non-human identities outnumber humans 50:1 (projected 80:1 by 2027)
Why Traditional Auth Fails for AI Agents
| Aspect | Human Users | AI Agents | Problem |
| --- | --- | --- | --- |
| Session Duration | Persistent (hours/days) | Ephemeral (seconds/minutes) | OAuth assumes persistent sessions |
| Consent | Interactive approval | Delegated, autonomous | No human in the loop for consent |
| Access Scope | Coarse-grained roles | Task-specific, granular | Agents need fine-grained, ephemeral tokens |
| Multi-Agent | Single user | Chains of agents calling agents | Delegation chains, trust propagation |
| Revocation | Manual by user | Real-time, risk-based | Need instant revocation on suspicious behavior |
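To make the contrast concrete, here is a minimal sketch of an ephemeral, task-scoped agent credential with instant revocation. The `AgentToken` shape and function names are assumptions for illustration, not Stytch's actual API.

```python
import time
import secrets
from dataclasses import dataclass

@dataclass
class AgentToken:
    # Hypothetical shape: just the properties the table above calls for.
    token: str
    scopes: frozenset      # task-specific grants, e.g. {"read:customer"}
    expires_at: float      # ephemeral: measured in seconds, not days
    revoked: bool = False

def issue(scopes, ttl_seconds=60):
    """Issue a short-lived, narrowly scoped credential for one task."""
    return AgentToken(secrets.token_urlsafe(32), frozenset(scopes),
                      time.time() + ttl_seconds)

def authorize(tok, needed_scope):
    """Every call re-checks revocation, expiry, and scope."""
    return (not tok.revoked) and time.time() < tok.expires_at \
        and needed_scope in tok.scopes

tok = issue({"read:customer"})
assert authorize(tok, "read:customer")
assert not authorize(tok, "create:ticket")   # scope not granted
tok.revoked = True                           # real-time revocation on a risk signal
assert not authorize(tok, "read:customer")
```

A human session would instead be long-lived and coarse-grained; the point of the sketch is that agent credentials invert every row of the table.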
What Stytch Brings
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ STYTCH: AI AGENT AUTHENTICATION │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ MODEL CONTEXT PROTOCOL (MCP) │ │
│ │ (Anthropic's open standard for AI tools) │ │
│ │ │ │
│ │ Claude ◄────────────► MCP Client ◄────────────► MCP Server │ │
│ │ ChatGPT (Your App) │ │
│ │ Custom LLM │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ STYTCH CONNECTED APPS │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ OAuth 2.1 + OIDC for Agents │ │ │
│ │ │ • Dynamic client registration │ │ │
│ │ │ • Scoped access tokens (read:customer, create:ticket, etc.) │ │ │
│ │ │ • Human-in-the-loop step-up authentication │ │ │
│ │ │ • Token lifecycle management (issue, validate, revoke) │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Agent-Specific Features │ │ │
│ │ │ • Ephemeral, task-scoped credentials │ │ │
│ │ │ • Delegation chains for multi-agent workflows │ │ │
│ │ │ • Real-time revocation tied to risk signals │ │ │
│ │ │ • Audit trails for agent actions │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ INTEGRATION WITH TWILIO │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│ │ │ Twilio Verify │ │ Twilio Channels │ │ Twilio Segment │ │ │
│ │ │ (Human 2FA) │ │ (Voice,SMS,Email)│ │ (User Profiles) │ │ │
│ │ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │ │
│ │ │ │ │ │ │
│ │ └──────────────────────┼──────────────────────┘ │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ UNIFIED IDENTITY LAYER │ │ │
│ │ │ Humans + AI Agents + Bot Detection │ │ │
│ │ │ "Distinguish humans, trusted agents, │ │ │
│ │ │ and rogue agents" │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────────┘
Stytch's Current Capabilities
MCP Integration
- Native support for Claude, ChatGPT connectors
- OAuth 2.1 as mandated by MCP spec
- Dynamic client registration for new agents
- Drop-in consent UI for user approval
- Org-level admin policy controls
Agent-Ready Auth Primitives
- Scoped tokens (read:customers, write:tickets, etc.)
- Human-in-the-loop step-up for sensitive operations
- Token lifecycle: issue, validate, revoke
- Free for first 10K active users and agents
Strategic Rationale
Why Twilio + Stytch Makes Sense:
- Channel-Aware Identity: Twilio's channels (Voice, SMS, Email) + Stytch's auth = verification, step-up, and recovery that works worldwide across channels.
- Pre-Market Positioning: AI agents are nascent. Owning the identity layer before the market matures creates lock-in.
- Developer Trust: Twilio's 10M developers already trust the platform. Stytch extends that trust to the agent ecosystem.
- Competitive Moat: Auth0 (Okta) doesn't have communications. Twilio now has communications + identity.
Interview Talking Point
"The Stytch acquisition validates a real market need. Traditional OAuth was designed for humans—persistent sessions, interactive consent, coarse-grained roles. AI agents are fundamentally different: ephemeral, autonomous, operating in multi-agent chains where one agent calls another.
The research confirms this isn't hype. AWS launched Bedrock AgentCore Identity. The Cloud Security Alliance published an Agentic AI IAM framework. Anthropic's MCP protocol mandates OAuth 2.1 for agent authorization. Non-human identities already outnumber humans 50:1 in enterprise environments.
Stytch brings MCP-ready OAuth, scoped tokens, human-in-the-loop step-up, and the primitives for agent-to-agent delegation. Combined with Twilio's channels for out-of-band verification and Segment's user profiles, Twilio can offer a unified identity layer that distinguishes humans from trusted agents from rogue agents. That's a defensible moat as AI agents proliferate."
Kafka as Backbone & Centrifuge System Deep Dive
How Kafka is Used
Kafka is the durable, ordered message backbone for Twilio Segment. It serves three critical functions:
- Durability: Data persisted across availability zones—never lost, even if consumers crash
- Ordering: Messages partitioned by ID ensure same-ID messages go to same consumer
- Replay: Consumers can replay from any offset for recovery or reprocessing
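The ordering guarantee follows from key-based partitioning: a stable hash of the key picks the partition, and one consumer owns each partition. A minimal sketch (Kafka's default partitioner actually uses murmur2; md5 here is just a stand-in for any stable hash):

```python
import hashlib

def partition_for(message_id: str, num_partitions: int) -> int:
    # Stable hash of the key -> same messageId always maps to the same partition,
    # and therefore to the same consumer.
    digest = hashlib.md5(message_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

p1 = partition_for("msg-abc123", 16)
p2 = partition_for("msg-abc123", 16)
assert p1 == p2   # deterministic: retries of the same event land on one consumer
```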
Kafka Architecture in Segment
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ KAFKA AT TWILIO SEGMENT │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ MULTI-TIER KAFKA FAILOVER │ │
│ │ │ │
│ │ Each TAPI shard has: │ │
│ │ ┌──────────────────────┐ ┌──────────────────────┐ │ │
│ │ │ PRIMARY CLUSTER │◄──►│ SECONDARY CLUSTER │ │ │
│ │ │ (Same AZ) │ │ (Different AZ) │ │ │
│ │ └──────────────────────┘ └──────────────────────┘ │ │
│ │ │ │ │ │
│ │ ▼ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ FAILOVER LOGIC (in Replicated service) │ │ │
│ │ │ • Monitor broker health │ │ │
│ │ │ • Switch to secondary if primary degrades beyond threshold │ │ │
│ │ │ • Auto-revert when primary recovers │ │ │
│ │ │ • Dynamic routing via Kubernetes ConfigMaps │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ DEDUPLICATION ARCHITECTURE │ │
│ │ │ │
│ │ ┌────────────────┐ │ │
│ │ │ Kafka Input │──────► Partition by messageId │ │
│ │ │ Topic │ (same ID → same consumer) │ │
│ │ └────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ DEDUP WORKER (Go) │ │ │
│ │ │ │ │ │
│ │ │ 1. Read message from input topic │ │ │
│ │ │ 2. Query RocksDB: "Have I seen this messageId?" │ │ │
│ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ RocksDB (embedded, local EBS) │ │ │ │
│ │ │ │ • 60 billion keys │ │ │ │
│ │ │ │ • 1.5 TB on disk │ │ │ │
│ │ │ │ • 4-week deduplication window │ │ │ │
│ │ │ │ • Bloom filters: "definitely not seen" fast path │ │ │ │
│ │ │ │ • LSM tree: append-only, efficient compaction │ │ │ │
│ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │
│ │ │ 3. If NEW: add to RocksDB, publish to output topic │ │ │
│ │ │ 4. If DUPLICATE: discard, update offset only │ │ │
│ │ │ │ │ │
│ │ └────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌────────────────┐ │ │
│ │ │ Kafka Output │──────► To downstream processing │ │
│ │ │ Topic │ (transformations, destinations) │ │
│ │ └────────────────┘ │ │
│ │ │ │
│ │ KEY INSIGHT: Kafka output topic is the SOURCE OF TRUTH │ │
│ │ On restart, worker reads output topic first to rebuild RocksDB state │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────────┘
Deep Dive: RocksDB as a High-Performance Cache
The Key Question: If Kafka is the source of truth, why do we need RocksDB?
Answer: Kafka is the source of truth for correctness and recovery. RocksDB is the performance index for real-time lookups.
To deduplicate, you need to answer "Have I seen this messageId before?" for every incoming message. Consider the alternatives:
| Approach | Verdict |
| --- | --- |
| Scan Kafka output topic | Way too slow: billions of messages, sequential scan |
| Keep all IDs in memory | 60 billion keys × ~40 bytes ≈ 2.4 TB of RAM |
| Query remote DB (DynamoDB) | Network latency on every message kills throughput |
| RocksDB (embedded) | Works: local disk, no network hop, Bloom filters for the fast "not seen" path |
RocksDB vs Traditional Cache
| Aspect | Traditional Cache (Redis/Memcached) | RocksDB Here |
| --- | --- | --- |
| Location | Remote (network hop) | Embedded (local disk) |
| Durability | Ephemeral (lost on restart) | Persistent (EBS-backed) |
| Cache miss | Query source of truth | Never: it has ALL data for this partition |
| Rebuild | Cold start, gradual population | Read Kafka output topic once on startup |
| Failover | Need replica cluster | Just reattach the EBS volume |
Why Kafka is Still "Source of Truth"
For crash recovery and consistency:
Normal operation:
1. Read message from Kafka INPUT topic
2. Query RocksDB: "seen this ID?"
3. If new: write to RocksDB, publish to Kafka OUTPUT topic
Crash scenario:
- Worker writes to RocksDB, crashes before publishing to Kafka OUTPUT
- OR: Worker publishes to Kafka OUTPUT, crashes before writing to RocksDB
Recovery:
- On restart, worker reads Kafka OUTPUT topic first
- Rebuilds RocksDB state from what's actually been published
- Now RocksDB and Kafka OUTPUT are consistent
Kafka OUTPUT is the commit point. If it's in Kafka OUTPUT, it's been processed. RocksDB is just a fast cache that can be rebuilt.
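The recovery logic above fits in a few lines, with a dict standing in for RocksDB and lists standing in for the Kafka topics. This is an illustration of the pattern, not Segment's Go implementation.

```python
def run_dedup(input_topic, output_topic, seen):
    """One pass of the dedup worker; `seen` is the local RocksDB stand-in."""
    for msg in input_topic:
        if msg["messageId"] in seen:
            continue                      # duplicate: discard, advance offset only
        seen[msg["messageId"]] = True     # record locally...
        output_topic.append(msg)          # ...then publish (the commit point)

def recover(output_topic):
    """On restart, rebuild local state from the OUTPUT topic, the source of truth."""
    return {msg["messageId"]: True for msg in output_topic}

# A mobile retry duplicated "a":
inp = [{"messageId": "a"}, {"messageId": "b"}, {"messageId": "a"}]
out, seen = [], {}
run_dedup(inp, out, seen)
assert [m["messageId"] for m in out] == ["a", "b"]
assert recover(out).keys() == seen.keys()   # local state is reconstructible from Kafka
```

Whichever side the worker crashed on, replaying `recover()` makes RocksDB agree with what was actually published.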
Why Deduplication is Needed
Mobile clients retry on network failure, creating duplicates. At scale, even 0.6% duplicates cause real problems:
- 400K events/second × 0.6% = 2,400 duplicate events/second
- Over a month = ~6 billion duplicates
- Business impact: double-counted revenue, duplicate emails sent, wrong analytics
Why They Switched from Memcached
"We no longer have a set of large Memcached instances which require failover replicas. Instead we use embedded RocksDB databases which require zero coordination."
- No network hop: Memcached ~1ms, RocksDB local ~0.01ms
- No coordination: No distributed cache invalidation
- No failover replicas: Cheaper, simpler ops
- Bloom filters: For new messages, often zero disk reads
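A minimal Bloom filter shows why that fast path works: it can answer "definitely not seen" with zero false negatives, at the cost of occasional false positives that fall through to a real lookup. This is an illustration, not RocksDB's internal filter implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size, self.k = size_bits, num_hashes
        self.bits = 0                     # a Python int as a big bitset

    def _positions(self, key: str):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False -> definitely new: skip the disk read entirely.
        # True  -> maybe seen: fall through to the RocksDB lookup.
        return all(self.bits & (1 << pos) for pos in self._positions(key))

bf = BloomFilter()
bf.add("msg-1")
assert bf.might_contain("msg-1")          # added keys are never missed
```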
Scale Numbers:
- Nearly 1 million messages/second through Kafka
- 0.6% of events are duplicates (from mobile retries)
- 200 billion messages processed in the first 3 months
- 100x improvement over the previous Memcached-based approach
- Zero coordination required (embedded RocksDB vs. Memcached replicas)
Centrifuge: Reliable Delivery to External APIs
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ CENTRIFUGE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ PROBLEM: Deliver to 700+ third-party APIs where "dozens are failing at any time" │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ WHY NOT TRADITIONAL QUEUES? │ │
│ │ │ │
│ │ ✗ Single Queue: One slow endpoint backs up everything │ │
│ │ ✗ Per-Destination Queue: Big customers dominate, block small customers │ │
│ │ ✗ Per-Source Queue: 42K sources × 2.1 destinations = 88K queues │ │
│ │ (Kafka/RabbitMQ/NSQ can't handle this cardinality) │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ SOLUTION: DATABASE-AS-QUEUE │ │
│ │ │ │
│ │ MySQL/RDS provides flexibility that queues don't: │ │
│ │ "Change QoS by running a single SQL statement" │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ JOBS TABLE │ │ │
│ │ │ • Immutable, append-only │ │ │
│ │ │ • KSUID primary key (k-sortable by timestamp, globally unique) │ │ │
│ │ │ • Stores: payload, endpoint, headers, expiration │ │ │
│ │ │ • Single index on KSUID (no expensive updates) │ │ │
│ │ │ • Median payload: ~5KB, max: 750KB │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ JOB_STATE_TRANSITIONS TABLE │ │ │
│ │ │ • Also immutable (no updates, ever) │ │ │
│ │ │ • States: awaiting_scheduling → executing → succeeded/discarded │ │ │
│ │ │ → awaiting_retry → archiving → archived │ │ │
│ │ │ • Tracks retry count with exponential backoff │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ DIRECTORS │ │
│ │ (80-300 running at peak load) │ │
│ │ │ │
│ │ Each Director: │ │
│ │ ┌─────────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ 1. Acquires exclusive DB lock via Consul session │ │ │
│ │ │ 2. Caches jobs in-memory for fast access │ │ │
│ │ │ 3. Accepts RPC requests, logs jobs, executes HTTP │ │ │
│ │ │ 4. On success (200): mark succeeded, expire from cache │ │ │
│ │ │ 5. On transient failure (5xx, timeout): mark awaiting_retry │ │ │
│ │ │ 6. On permanent failure (4xx): mark discarded │ │ │
│ │ │ 7. Scales horizontally based on CPU │ │ │
│ │ └─────────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ JOBDB MANAGER │ │
│ │ │ │
│ │ • Dynamically provisions/retires databases matching compute scaling │ │
│ │ • Cycles JobDBs every ~30 minutes at ~100% fill │ │
│ │ • Uses TABLE DROP instead of random DELETEs (massive perf win) │ │
│ │ • Drains in-flight retry jobs before decommissioning │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ RETRY & ARCHIVAL │ │
│ │ │ │
│ │ • 4-hour retry window with exponential backoff │ │
│ │ • 1.5% of all messages succeed on retry (that's 163M extra events!) │ │
│ │ • Undelivered messages archived to S3 │ │
│ │ • March 2018: absorbed 85M events during 90-min third-party outage │ │
│ │ flushed entire queue in 30 min post-recovery │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────────────┐ │
│ │ PRODUCTION SCALE │ │
│ │ │ │
│ │ • 400,000 outbound HTTP requests/second sustained │ │
│ │ • 2 million requests/second in load testing │ │
│ │ • 340 billion jobs executed monthly │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────────┘
Why Database-as-Queue?
| Approach | Pros | Cons |
| --- | --- | --- |
| Kafka/RabbitMQ/NSQ | Purpose-built for messaging | Can't handle 88K queues; limited reordering |
| MySQL/RDS (Centrifuge) | Flexible QoS via SQL, handles cardinality, reorder anytime | Needs careful schema design (immutable rows) |
Key Design Principles
- Immutable rows: No updates, ever. Append-only design eliminates write contention.
- No JOINs: Per-job queries allow massive parallelization.
- Write-heavy, small working set: Cache in memory, expire immediately after delivery.
- TABLE DROP over DELETE: Cycle databases and drop tables instead of random deletes.
Deep Dive: Why Database-as-Queue?
The Core Problem: 88,000 Logical Queues
Segment has 42K sources sending to an average of 2.1 destinations each. That's 88K source×destination pairs, each needing fair scheduling and isolated failure handling.
Why Traditional Queues Fail
| Approach | What Happens | Why It Fails |
| --- | --- | --- |
| Single Queue | All messages in one FIFO queue | One slow/failing endpoint backs up everything: Google Analytics slow → Salesforce blocked |
| Per-Destination Queue | One queue for Google Analytics, one for Salesforce, etc. | Big customers dominate: Walmart's 10M events/day block a small startup's 1K |
| Per-Source×Destination Queue | Separate queue for each source-destination pair | 42K × 2.1 ≈ 88K queues; Kafka/RabbitMQ/NSQ can't handle this cardinality |
The Database-as-Queue Insight
Key Quote: "Change QoS by running a single SQL statement."
With a database, you can reorder, prioritize, and schedule jobs with the full flexibility of SQL. Need to deprioritize a failing destination, or push transactional messages ahead of promotional ones? Change the WHERE and ORDER BY of the scheduling query. Dedicated queues don't give you that.
Immutable Rows Design
The secret to making MySQL work at queue scale: never UPDATE, only INSERT.
-- Jobs table: immutable after insert
CREATE TABLE jobs (
ksuid CHAR(27) PRIMARY KEY, -- K-Sortable globally unique ID
payload BLOB, -- The actual data (median 5KB, max 750KB)
endpoint VARCHAR(255),
headers TEXT,
expires_at TIMESTAMP
);
-- State transitions: also immutable
CREATE TABLE job_state_transitions (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  job_ksuid CHAR(27),
  from_state ENUM('awaiting_scheduling','executing','awaiting_retry','archiving'),
  -- 'awaiting_scheduling' and 'archiving' must be valid TO states too:
  -- a job's first transition enters awaiting_scheduling (which the query
  -- below filters on), and archival passes through archiving → archived.
  to_state ENUM('awaiting_scheduling','executing','succeeded','discarded',
                'awaiting_retry','archiving','archived'),
  transitioned_at TIMESTAMP,
  retry_count INT DEFAULT 0
);
-- Finding jobs to execute: single SQL query
SELECT j.ksuid, j.payload, j.endpoint
FROM jobs j
JOIN (
SELECT job_ksuid, MAX(id) as latest
FROM job_state_transitions
GROUP BY job_ksuid
) latest ON j.ksuid = latest.job_ksuid
JOIN job_state_transitions jst ON jst.id = latest.latest
WHERE jst.to_state = 'awaiting_scheduling'
AND j.expires_at > NOW()
ORDER BY j.ksuid -- KSUID is time-ordered!
LIMIT 1000;
Why Immutable Rows?
- No write contention: INSERTs are append-only, no row locks
- No index fragmentation: KSUID is time-ordered, so inserts are sequential
- Simple queries: Join to find latest state, no complex UPDATE logic
- Audit trail for free: Every state transition is recorded
The TABLE DROP Trick
A naive database-as-queue deletes completed jobs row by row, which fragments indexes and requires expensive vacuuming.
Centrifuge's approach:
- Create a new JobDB every ~30 minutes
- New jobs go to the newest database
- Old databases drain their retry queues
- When a database is empty:
DROP TABLE jobs; DROP TABLE job_state_transitions;
Result: Zero random deletes. DROP TABLE is instant regardless of table size.
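The cycling pattern can be sketched as follows; the class and method names are invented for illustration, not Centrifuge's actual code.

```python
from collections import deque

class JobDB:
    def __init__(self, name):
        self.name, self.jobs = name, deque()
    def drained(self):
        return not self.jobs

class JobDBManager:
    """Writes go to the newest DB; old DBs drain their retries;
    empty DBs are dropped wholesale (the DROP TABLE equivalent)."""
    def __init__(self):
        self.dbs, self.counter = [], 0
    def cycle(self):                      # called ~every 30 minutes at ~100% fill
        self.counter += 1
        self.dbs.append(JobDB(f"jobdb-{self.counter}"))
    def write(self, job):
        self.dbs[-1].jobs.append(job)     # new jobs only hit the newest DB
    def reap(self):
        # Discard whole drained DBs: no row-by-row DELETEs, ever.
        self.dbs = [db for db in self.dbs
                    if not db.drained() or db is self.dbs[-1]]

mgr = JobDBManager()
mgr.cycle(); mgr.write("job-1")
mgr.cycle()                               # next interval: jobdb-2 takes writes
mgr.dbs[0].jobs.popleft()                 # jobdb-1 drains its last in-flight job
mgr.reap()
assert [db.name for db in mgr.dbs] == ["jobdb-2"]
```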
Directors Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ DIRECTOR LIFECYCLE │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. ACQUIRE LOCK │
│ └─► Consul session locks specific JobDB │
│ (Only one Director per database) │
│ │
│ 2. LOAD JOBS INTO MEMORY │
│ └─► SELECT awaiting_scheduling jobs from JobDB │
│ └─► Cache in-memory for fast access │
│ │
│ 3. ACCEPT RPC REQUESTS │
│ └─► Upstream Kafka consumer calls Director to log new job │
│ └─► INSERT into jobs table + job_state_transitions │
│ │
│ 4. EXECUTE HTTP REQUESTS │
│ └─► Pop job from in-memory queue │
│ └─► Make HTTP call to external API │
│ └─► Handle response: │
│ ├─► 200 OK: INSERT state → 'succeeded' │
│ ├─► 5xx/timeout: INSERT state → 'awaiting_retry' │
│ └─► 4xx: INSERT state → 'discarded' │
│ │
│ 5. RETRY LOOP │
│ └─► Background thread scans awaiting_retry │
│ └─► Exponential backoff (1s, 2s, 4s, 8s, ... up to 4 hours) │
│ └─► Re-queue for execution │
│ │
│ 6. ARCHIVAL │
│ └─► After 4 hours: INSERT state → 'archiving' │
│ └─► Write to S3 for later replay │
│ └─► INSERT state → 'archived' │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Scale: 80-300 Directors running at peak, scaled by CPU utilization
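The retry loop's backoff schedule (step 5) can be computed directly; this sketch assumes a 1-second base delay doubling until the 4-hour window closes, matching the "1s, 2s, 4s, 8s, ..." sequence above.

```python
def backoff_schedule(base=1.0, window_seconds=4 * 3600):
    """Yield retry delays: 1s, 2s, 4s, ... while they still fit in the window."""
    delay, elapsed = base, 0.0
    while elapsed + delay <= window_seconds:
        elapsed += delay
        yield delay
        delay *= 2

delays = list(backoff_schedule())
assert delays[:4] == [1.0, 2.0, 4.0, 8.0]
assert sum(delays) <= 4 * 3600   # every retry lands inside the 4-hour window
```

With these assumptions a job gets 13 retry attempts before it moves to archival.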
Real-World Validation
March 2018 Outage Test:
- A major third-party destination went down for 90 minutes
- Centrifuge absorbed 85 million events into retry queues
- When destination recovered, flushed entire backlog in 30 minutes
- No data lost, no manual intervention required
Interview Talking Point
"Kafka is the durable backbone for Segment—it ensures data is never lost by persisting across availability zones, provides ordering guarantees through message partitioning, and enables replay for recovery. Each shard has primary and secondary Kafka clusters with automatic failover.
The deduplication layer is clever: they partition by messageId so the same ID always goes to the same consumer, which narrows the search space. Each worker runs an embedded RocksDB with Bloom filters—for new messages, the filter returns 'definitely not seen' without hitting disk. They maintain 60 billion keys across 1.5TB with a 4-week dedup window.
Centrifuge is even more interesting. Traditional message queues can't handle their cardinality—42K sources times 2.1 destinations means 88K queues. They built a database-as-queue using MySQL with immutable, append-only tables. Directors acquire exclusive locks via Consul, cache jobs in memory, execute HTTP requests, and scale horizontally based on CPU. The key insight: TABLE DROP is dramatically faster than random DELETEs, so they cycle entire databases every 30 minutes. Result: 400K outbound requests/second sustained, 2M in load testing."
SMS Messaging Architecture
Message Flow: API to Carrier
┌─────────────────────────────────────────────────────────────────────────────────────┐
│ TWILIO SMS MESSAGING PIPELINE │
├─────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ YOUR APPLICATION │ │
│ │ │ │
│ │ POST /Messages │ │
│ │ { "To": "+1555...", "From": "+1888...", "Body": "Hello!" } │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ TWILIO API LAYER │ │
│ │ │ │
│ │ ┌────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ 1. Authentication (API Key / Auth Token) │ │ │
│ │ │ 2. Validation (E.164 format, sender eligibility) │ │ │
│ │ │ 3. Compliance check (TrustHub verification status) │ │ │
│ │ │ 4. Rate limit check (concurrent request limits) │ │ │
│ │ └────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ ACCOUNT QUEUE │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ • FIFO ordering │ │ │
│ │ │ • 4-hour maximum queue time (then expires with 30001) │ │ │
│ │ │ • Rate limited by sender type: │ │ │
│ │ │ - Short Code: 100+ MPS │ │ │
│ │ │ - Toll-Free: 3 MPS (upgradeable) │ │ │
│ │ │ - 10DLC: Variable by trust score │ │ │
│ │ │ - Long Code: 1 MPS │ │ │
│ │ │ • Queue capacity = 4 hours × MPS × 60 × 60 │ │ │
│ │ │ (Short Code 100 MPS = 1,440,000 message segments) │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ SUPER NETWORK │ │
│ │ (4,800 carrier connections worldwide) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ INTELLIGENT ROUTING │ │ │
│ │ │ • 900+ data points analyzed per message │ │ │
│ │ │ • 3.2 billion data points monitored daily │ │ │
│ │ │ • 4x redundant routes per destination (average) │ │ │
│ │ │ • Automatic failover if primary route degrades │ │ │
│ │ │ • Traffic Optimization Engine prioritizes by urgency │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ SMPP GATEWAY │ │ │
│ │ │ • Short Message Peer-to-Peer protocol │ │ │
│ │ │ • Connects to carrier SMSCs (Short Message Service Centers) │ │ │
│ │ │ • Handles PDU encoding, segmentation, concatenation │ │ │
│ │ └─────────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ CARRIER NETWORK │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ AT&T │ │ Verizon │ │ T-Mobile │ │ Vodafone │ │ │
│ │ │ SMSC │ │ SMSC │ │ SMSC │ │ SMSC │ │ │
│ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ │
│ │ │ │ │ │ │ │
│ │ └────────────────┴────────────────┴────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ 📱 End User Device │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────────────┐ │
│ │ DELIVERY FEEDBACK LOOP │ │
│ │ │ │
│ │ Carrier DLR ──► Status Callback Webhook ──► Your Application │ │
│ │ (Delivery (queued → sending → sent (Update CRM, │ │
│ │ Receipt) → delivered/failed) trigger next) │ │
│ │ │ │
│ │ Event Streams ──► Kinesis/S3 ──► Data Warehouse (Analytics) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────────────┘
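The queue-capacity formula from the diagram (4 hours × MPS × 3600) is simple enough to check directly:

```python
def queue_capacity(mps: float, window_hours: int = 4) -> int:
    """Max segments an account queue holds before messages expire with error 30001."""
    return int(mps * window_hours * 3600)

# Figures from the diagram above; 10DLC throughput varies by trust score.
assert queue_capacity(100) == 1_440_000   # Short Code at 100 MPS
assert queue_capacity(3) == 43_200        # Toll-Free at 3 MPS
assert queue_capacity(1) == 14_400        # Long Code at 1 MPS
```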
Enterprise Messaging Architecture (10-Step Pattern)
1. Application Event Ingestion: Internal API with auth gates incoming requests
2. Event Processing: Evaluate CDP data, determine eligibility, set routing
3. Internal Queue: Rate-limit throughput, prioritize (transactional > promotional)
4. Twilio API: REST over HTTPS with API Keys + Public Key Client Validation
5. Account Routing: Map to the appropriate subaccount based on sender
6. Compliance Layer: TrustHub verifies carrier registration
7. Super Network: Optimize carrier selection for delivery
8. Status Callbacks: Real-time feedback on message status
9. Inbound Handling: Webhooks for 2-way messaging and compliance keywords
10. Data Pipeline: Engagement metrics back to CDP/CRM
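Step 4 maps onto Twilio's public Messages API: a form-encoded POST to `/2010-04-01/Accounts/{AccountSid}/Messages.json` with basic auth. The sketch below assembles such a request without sending it; the account SID and callback URL are placeholders.

```python
def build_send_message_request(account_sid, to, from_, body, status_callback=None):
    """Assemble (but don't send) a Twilio Messages API request.
    Endpoint path and field names follow Twilio's public REST API docs."""
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
    form = {"To": to, "From": from_, "Body": body}
    if status_callback:
        form["StatusCallback"] = status_callback   # step 8: delivery feedback loop
    return url, form

url, form = build_send_message_request(
    "AC" + "X" * 32,                      # placeholder Account SID
    to="+15555551234", from_="+18885551234", body="Hello!",
    status_callback="https://example.com/sms/status",
)
assert url.endswith("/Messages.json")
assert form["To"] == "+15555551234"
```

Each delivery status change (queued → sending → sent → delivered/failed) then arrives as a webhook POST to the `StatusCallback` URL, closing the feedback loop in step 8.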