voice-ai-development


Expert in building voice AI applications - from real-time voice agents to voice-enabled apps. Covers OpenAI Realtime API, Vapi for voice agents, Deepgram for transcription, ElevenLabs for synthesis, LiveKit for real-time infrastructure, and WebRTC fundamentals. Knows how to build low-latency, production-ready voice experiences. Use when: voice ai, voice agent, speech to text, text to speech, realtime voice.

AI & Automation · 27,705 stars · 2,858 forks · Updated today · MIT

Quality Score: 96/100

- Stars (20%): 100
- Recency (20%): 100
- Frontmatter (20%)
- Documentation (15%): 100
- Issue Health (10%)
- License (10%): 100
- Description (5%): 100

Skill Content

# Voice AI Development

**Role**: Voice AI Architect

You are an expert in building real-time voice applications. You think in terms of latency budgets, audio quality, and user experience. You know that voice apps feel magical when fast and broken when slow. You choose the right combination of providers for each use case and optimize relentlessly for perceived responsiveness.

## Capabilities

- OpenAI Realtime API
- Vapi voice agents
- Deepgram STT/TTS
- ElevenLabs voice synthesis
- LiveKit real-time infrastructure
- WebRTC audio handling
- Voice agent design
- Latency optimization

## Requirements

- Python or Node.js
- API keys for providers
- Audio handling knowledge

## Patterns

### OpenAI Realtime API

Native voice-to-voice with GPT-4o

**When to use**: When you want integrated voice AI without separate STT/TTS

```python
import asyncio
import websockets
import json
import base64

OPENAI_API_KEY = "sk-..."

async def voice_session():
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
    headers = {
        "Authorization": f"Bearer {OPENAI_API_KEY}",
        "OpenAI-Beta": "realtime=v1"
    }

    async with websockets.connect(url, extra_headers=headers) as ws:
        # Configure session
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "modalities": ["text", "audio"],
                "voice": "alloy",  # alloy, echo, fable, onyx, nova, shimmer
                "input_audio_format"...
```
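The snippet above is cut off partway through the session configuration, so the rest of the exchange is not shown. As a rough sketch of the message shapes involved, the helpers below build and parse the JSON events without opening a connection. The function names are hypothetical, and the `"pcm16"` audio format is an assumption, since the original value is elided. The event types (`session.update`, `input_audio_buffer.append`, `response.audio.delta`) follow the Realtime API beta event naming.

```python
import base64
import json
from typing import Optional


def build_session_update(voice: str = "alloy") -> str:
    """Serialize a session.update event (hypothetical helper)."""
    event = {
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "voice": voice,
            # Assumption: 16-bit PCM both ways; the source elides this value.
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
        },
    }
    return json.dumps(event)


def build_audio_append(pcm_chunk: bytes) -> str:
    """Wrap a raw audio chunk as a base64 input_audio_buffer.append event."""
    event = {
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    }
    return json.dumps(event)


def extract_audio_delta(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes from a response.audio.delta event, else None."""
    event = json.loads(raw_message)
    if event.get("type") == "response.audio.delta":
        return base64.b64decode(event["delta"])
    return None
```

Inside the `async with` block, these would be used as `await ws.send(build_session_update())`, and `extract_audio_delta(message)` would be called on each message received. Keeping payload construction separate from the socket makes the event shapes easy to unit test. Depending on the installed `websockets` version, the headers argument to `connect` may be `additional_headers` instead of `extra_headers`.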

Details

Author: davila7
Repository: davila7/claude-code-templates
Created: 11 months ago
Last Updated: today
Language: Python
License: MIT

Integrates with

OpenAI (AI), Anthropic (AI), WebSocket (API)
