whisper

Featured

OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.

AI & Automation 27,984 stars 2901 forks Updated today MIT

Install

View on GitHub

Quality Score: 99/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Whisper - Robust Speech Recognition OpenAI's multilingual speech recognition model. ## When to use Whisper **Use when:** - Speech-to-text transcription (99 languages) - Podcast/video transcription - Meeting notes automation - Translation to English - Noisy audio transcription - Multilingual audio processing **Metrics**: - **72,900+ GitHub stars** - 99 languages supported - Trained on 680,000 hours of audio - MIT License **Use alternatives instead**: - **AssemblyAI**: Managed API, speaker diarization - **Deepgram**: Real-time streaming ASR - **Google Speech-to-Text**: Cloud-based ## Quick start ### Installation ```bash # Requires Python 3.8-3.11 pip install -U openai-whisper # Requires ffmpeg # macOS: brew install ffmpeg # Ubuntu: sudo apt install ffmpeg # Windows: choco install ffmpeg ``` ### Basic transcription ```python import whisper # Load model model = whisper.load_model("base") # Transcribe result = model.transcribe("audio.mp3") # Print text print(result["text"]) # Access segments for segment in result["segments"]: print(f"[{segment['start']:.2f}s - {segment['end']:.2f}s] {segment['text']}") ``` ## Model sizes ```python # Available models models = ["tiny", "base", "small", "medium", "large", "turbo"] # Load specific model model = whisper.load_model("turbo") # Fastest, good quality ``` | Model | Parameters | English-only | Multilingual | Speed | VRAM | |-------|------------|--------------|--------------|-------|------| | tiny | 39M | ✓ | ✓ | ~32...

Details

Author
davila7
Repository
davila7/claude-code-templates
Created
11 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category