← ClaudeAtlas

watchlisted

Watch a video from YouTube, Instagram, X/Twitter, Vimeo, TikTok or any of ~1800 yt-dlp sites (or a local path). Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls the transcript from captions (or local mlx-whisper fallback, no API key), and hands the result to Claude so it can answer questions about what's in the video.
mathiaschu/watch · ★ 14 · Data & Documents · score 84
Install: claude install-skill mathiaschu/watch
# /watch — Claude watches a video You don't have a video input; this skill gives you one. A Python script downloads the video, extracts frames as JPEGs, gets a timestamped transcript (native captions first, then **local mlx-whisper** as fallback — runs on-device, no API and no key), and prints frame paths. You then `Read` each frame path to see the images and combine them with the transcript to answer the user. ## Step 0 — Setup preflight (runs every `/watch` invocation, silent on success) **Python interpreter:** every `python3 ...` command in this skill is for macOS/Linux. On **Windows**, substitute `python` — the `python3` command on Windows is the Microsoft Store stub and will not run the script. Before every `/watch` run, verify that dependencies are in place: ```bash python3 "${CLAUDE_SKILL_DIR}/scripts/setup.py" --check ``` This is a <100ms lookup. On exit 0, the script emits **nothing** — proceed to Step 1 without comment. **Do NOT announce "setup is complete" to the user** — they don't need a status message on every turn. The only acceptable user-visible output from Step 0 is when remediation is required. On non-zero exit, follow the table: | Exit | Meaning | Action | |------|---------|--------| | `2` | Missing binaries (`ffmpeg` / `ffprobe` / `yt-dlp`) | Run installer | | `3` | No local whisper engine (`mlx-whisper` / `openai-whisper`) | Run installer, then tell user the `pip3` command it prints | | `4` | Both missing | Run installer | The installer is idemp