
openai-whisper

Speech-to-text transcription via OpenAI Whisper. Supports two modes — Local CLI (no API key, runs on-device) and Cloud API (fast, scalable, requires OPENAI_API_KEY). Use when the user needs to transcribe audio files, translate speech, or convert audio to text.
rkz91/coco · ★ 3 · AI & Automation · score 72
Install: claude install-skill rkz91/coco
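The choice between the two modes named in the description can be sketched as a small shell helper. This is a minimal sketch, not part of the skill itself: the function name `pick_whisper_mode` is hypothetical, and preferring the local CLI whenever it is installed is an assumption rather than documented behavior.

```bash
# Hypothetical helper: prints "local", "cloud", or "none".
# Assumption: prefer the on-device CLI when the `whisper` binary is on PATH,
# otherwise fall back to the Cloud API if OPENAI_API_KEY is set.
pick_whisper_mode() {
  if command -v whisper >/dev/null 2>&1; then
    echo "local"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "cloud"
  else
    echo "none"
    return 1
  fi
}
```

A caller might capture the result with `mode=$(pick_whisper_mode)` and branch on it. The non-zero return status for `none` lets a script stop early when neither mode is usable.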
# OpenAI Whisper: Speech-to-Text

Transcribe audio files using OpenAI's Whisper model. Two modes are available, depending on your needs:

| Mode | Latency | Cost | Privacy | Setup |
|------|---------|------|---------|-------|
| Local CLI | Slower (on-device GPU/CPU) | Free | Audio never leaves machine | Install `whisper` binary |
| Cloud API | Fast | Per-minute pricing | Audio sent to OpenAI | `OPENAI_API_KEY` required |

---

## Mode 1: Local CLI

Run Whisper locally with no API key required. Models download to `~/.cache/whisper` on first run.

### Quick Start

```bash
whisper /path/audio.mp3 --model medium --output_format txt --output_dir .
```

### Common Commands

```bash
# Transcribe to text file
whisper /path/audio.mp3 --model medium --output_format txt --output_dir .

# Transcribe with translation to English
# (use a multilingual model; the default turbo model is not trained for translation)
whisper /path/audio.m4a --model medium --task translate --output_format srt

# Transcribe with specific language
whisper /path/audio.wav --model large --language en --output_format json
```

### Model Selection

| Model | Speed | Accuracy | VRAM |
|-------|-------|----------|------|
| `tiny` | Fastest | Lowest | ~1 GB |
| `base` | Fast | Low | ~1 GB |
| `small` | Medium | Good | ~2 GB |
| `medium` | Slow | Better | ~5 GB |
| `large` | Slowest | Best | ~10 GB |
| `turbo` | Fast | Good (default) | ~6 GB |

### Output Formats

- `txt`: Plain text transcript
- `srt`: SubRip subtitle format with timestamps
- `vtt`: WebVTT subtitle format
- `json`: Detailed JSON with word-level timesta