podcast-generation


Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.

AI & Automation | 2,541 stars | 295 forks | Updated yesterday | MIT


Quality Score: 93/100

- Stars (weight 20%): 100
- Recency (weight 20%): 100
- Frontmatter (weight 20%): 70
- Documentation (weight 15%): 100
- Issue Health (weight 10%): 50
- License (weight 10%): 100
- Description (weight 5%): 100

Skill Content

# Podcast Generation with GPT Realtime Mini

Generate real audio narratives from text content using Azure OpenAI's Realtime API.

## Quick Start

1. Configure environment variables for Realtime API
2. Connect via WebSocket to Azure OpenAI Realtime endpoint
3. Send text prompt, collect PCM audio chunks + transcript
4. Convert PCM to WAV format
5. Return base64-encoded audio to frontend for playback

## Environment Configuration

```env
AZURE_OPENAI_AUDIO_API_KEY=your_realtime_api_key
AZURE_OPENAI_AUDIO_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_OPENAI_AUDIO_DEPLOYMENT=gpt-realtime-mini
```

**Note**: Endpoint should NOT include `/openai/v1/` - just the base URL.

## Core Workflow

### Backend Audio Generation

```python
from openai import AsyncOpenAI
import base64

# Convert HTTPS endpoint to WebSocket URL
ws_url = endpoint.replace("https://", "wss://") + "/openai/v1"

client = AsyncOpenAI(
    websocket_base_url=ws_url,
    api_key=api_key
)

audio_chunks = []
transcript_parts = []

async with client.realtime.connect(model="gpt-realtime-mini") as conn:
    # Configure for audio-only output
    await conn.session.update(session={
        "output_modalities": ["audio"],
        "instructions": "You are a narrator. Speak naturally."
    })

    # Send text to narrate
    await conn.conversation.item.create(item={
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": prompt}]
    })

    await conn.resp...
```
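Steps 4 and 5 of the Quick Start (wrap the collected PCM in a WAV container, then base64-encode it for the frontend) can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's own implementation: the helper names `pcm_to_wav` and `encode_audio_chunks` are hypothetical, and the audio format (24 kHz, 16-bit, mono) is an assumption that should be checked against the output format configured for your Realtime session. It also assumes `audio_chunks` already holds raw PCM bytes, that is, any base64 audio deltas have been decoded before being appended.

```python
import base64
import io
import wave

# Assumed PCM format; verify against your Realtime session's audio output settings.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
CHANNELS = 1            # mono


def pcm_to_wav(pcm_bytes: bytes, sample_rate: int = SAMPLE_RATE_HZ) -> bytes:
    """Wrap raw PCM samples in a WAV (RIFF) container, entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


def encode_audio_chunks(audio_chunks: list[bytes]) -> str:
    """Join raw PCM chunks, convert to WAV, and return a base64 string."""
    wav_bytes = pcm_to_wav(b"".join(audio_chunks))
    return base64.b64encode(wav_bytes).decode("ascii")
```

On the frontend, a string produced this way could be played by assigning it to an audio element as a `data:audio/wav;base64,...` URL; for long narrations, returning a file or streaming response may be a better fit than a single base64 payload.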

Details

- Author: microsoft
- Repository: microsoft/skills
- Created: 4 months ago
- Last Updated: yesterday
- Language: TypeScript
- License: MIT
