Install: `claude install-skill apresmoi/jianglens`
# Colab Video Pipeline
Use this skill to process Jiang Lens video sources through the Colab notebooks in `ops/notebooks/colab/`.
## Safety Boundary
- Do not commit Google cookies, YouTube cookies, HuggingFace tokens, Colab browser profiles, audio, or video.
- Local browser auth belongs under `ops/secrets/browser-profiles/colab/`.
- YouTube `yt-dlp` cookies belong in Google Drive at `/content/drive/MyDrive/jianglens/cookies.txt`.
- HuggingFace auth should use Colab userdata key `HF_TOKEN` when possible.
- Stop and ask the maintainer when you hit a Google login, 2FA, CAPTCHA, an ambiguous account chooser, quota exhaustion, or an unexpected paid-credit prompt.
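
As a minimal sketch of the `HF_TOKEN` guidance above, a notebook cell could read the token from Colab userdata and fall back to an environment variable when it runs outside Colab. The helper name `load_hf_token` and the environment-variable fallback are assumptions for illustration, not part of the skill's notebooks.

```python
import os


def load_hf_token():
    """Return the HuggingFace token, or None if it is not configured.

    Hypothetical helper: prefers the Colab userdata key HF_TOKEN and falls
    back to the HF_TOKEN environment variable (the fallback is an assumption).
    """
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get("HF_TOKEN")
    try:
        return userdata.get("HF_TOKEN")
    except Exception:
        # Secret missing or notebook access not granted: fall back quietly.
        return os.environ.get("HF_TOKEN")
```

Reading the token this way keeps it out of the notebook source, which matches the rule against committing HuggingFace tokens.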
## Drive Layout
The canonical Drive root is:
```text
/content/drive/MyDrive/jianglens/
  _colab_envs/
  _hf_home/
  cookies.txt
  youtube/
    _config.json
    <channel-or-handle>/
      _channel.json
      <video-id>/
        audio.wav
        metadata.youtube.json  # optional; local import can create this by video id
        dump.json
        grouped.json
        transcription.json
```
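
The layout above can be sketched as a small path helper, for example to see which artifacts a video folder still lacks. This is an illustration only: the helper names are hypothetical, and the nesting `youtube/<channel-or-handle>/<video-id>/` is an assumption about how the tree is structured.

```python
from pathlib import PurePosixPath

# Canonical Drive root as stated in this document.
DRIVE_ROOT = PurePosixPath("/content/drive/MyDrive/jianglens")

# Per-video artifacts; metadata.youtube.json is optional and therefore omitted.
REQUIRED_ARTIFACTS = ("audio.wav", "dump.json", "grouped.json", "transcription.json")


def video_dir(channel, video_id):
    """Build the per-video folder path (the nesting is an assumption)."""
    return DRIVE_ROOT / "youtube" / channel / video_id


def missing_artifacts(present_names):
    """Return the required artifact names absent from a folder listing."""
    present = set(present_names)
    return [name for name in REQUIRED_ARTIFACTS if name not in present]
```

In a Colab runtime with Drive mounted, the folder listing could come from `os.listdir(str(video_dir(channel, video_id)))`; the helpers themselves only manipulate strings, so they can also be run locally.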
Local text artifact sync uses:
```bash
./ops/notebooks/colab/sync-drive.sh --dry-run
./ops/notebooks/colab/sync-drive.sh
```
## Notebook Order
1. `YouTube_Manager.ipynb`: register channels, filter, download `audio.wav` into Drive.
2. `Pyannote_4_Pipeline-GPT-5.3.ipynb`: produce `dump.json` and `grouped.json`.
3. `Whisper_Transcription.ipynb`: produce `transcription.json`.
4. `sync-drive.sh`: copy text artifacts from Drive to `content/sources/raw