transcribe

Solid

Transcribe audio and video files using the configured speech-to-text provider

AI & Automation · 648 stars · 94 forks · Updated today · MIT

Quality Score: 89/100

Stars (20%): 94
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 52
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

Transcribe audio and video files using the configured speech-to-text provider. Supports multiple STT providers including OpenAI Whisper, Deepgram, and Google Gemini; the active provider is selected in Settings under Speech-to-Text (`services.stt`).

## Usage Notes

- The tool accepts a `file_path` (absolute path to a local audio or video file) to transcribe.
- Supported formats: any video (mp4, mov, etc.) or audio (mp3, wav, m4a, etc.) file.
- For video files, audio is automatically extracted via ffmpeg before transcription.
- Large files are automatically split into chunks for processing.
- If no STT provider credentials are configured, the tool will return an error with setup instructions.
- The STT provider (`services.stt`) is shared between transcription and telephony call paths.

## Maintenance

When adding or modifying an STT provider, follow the onboarding checklist at `assistant/docs/stt-provider-onboarding.md`. That document covers the daemon catalog, config schema, adapter wiring, client catalog parity, and required tests.
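
The pre-processing described in the usage notes (absolute `file_path`, audio extraction for video, chunking of large files) could be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual implementation: `classifyMedia`, `ffmpegExtractArgs`, and `planChunks` are hypothetical names, the extension sets only cover the examples listed above, and the mono 16 kHz output and time-based chunking are assumed rather than documented.

```typescript
// Illustrative sketch only: every name here is hypothetical, not the skill's real API.
type MediaKind = "audio" | "video";

interface Chunk {
  startSec: number;
  endSec: number;
}

// Example extensions taken from the usage notes; the real tool accepts more.
const VIDEO_EXTENSIONS = new Set(["mp4", "mov"]);
const AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a"]);

// Accept POSIX absolute paths ("/...") and Windows drive paths ("C:\...").
function isAbsolutePath(filePath: string): boolean {
  return filePath.startsWith("/") || /^[A-Za-z]:[\\/]/.test(filePath);
}

// Reject relative paths, then classify the file by its extension.
function classifyMedia(filePath: string): MediaKind {
  if (!isAbsolutePath(filePath)) {
    throw new Error(`file_path must be absolute, got: ${filePath}`);
  }
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot + 1).toLowerCase() : "";
  if (VIDEO_EXTENSIONS.has(ext)) return "video";
  if (AUDIO_EXTENSIONS.has(ext)) return "audio";
  throw new Error(`unrecognized media extension: "${ext}"`);
}

// ffmpeg arguments for extracting an audio track from a video file.
// -vn drops the video stream; -ac 1 and -ar 16000 (assumed values) request
// mono audio at 16 kHz; -y overwrites an existing output file.
function ffmpegExtractArgs(inputPath: string, outputPath: string): string[] {
  return ["-y", "-i", inputPath, "-vn", "-ac", "1", "-ar", "16000", outputPath];
}

// Split [0, durationSec) into consecutive windows of at most chunkSec seconds.
function planChunks(durationSec: number, chunkSec: number): Chunk[] {
  if (durationSec < 0 || chunkSec <= 0) {
    throw new Error("durationSec must be >= 0 and chunkSec must be > 0");
  }
  const chunks: Chunk[] = [];
  for (let start = 0; start < durationSec; start += chunkSec) {
    chunks.push({ startSec: start, endSec: Math.min(start + chunkSec, durationSec) });
  }
  return chunks;
}
```

For example, `planChunks(25, 10)` yields three windows covering 0 to 10, 10 to 20, and 20 to 25 seconds, and `classifyMedia("clip.mp3")` throws because the path is not absolute.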

Details

Author: vellum-ai
Repository: vellum-ai/vellum-assistant
Created: 4 months ago
Last Updated: today
Language: TypeScript
License: MIT
