study-img
Install: `claude install-skill 2362094903-ops/study-assistant-skills`
# Study Image Reading
**Output language: ALL learner-facing content MUST be in Simplified Chinese.**
## Step 1: try native vision first (zero config, zero cost)
Use the Read tool on the image file:
- **You can perceive the image** (the model is multimodal) → work directly from what you see; the bundled script is unnecessary. Still follow the per-scenario output requirements in "Modes" below (scanned pages get a full transcription with LaTeX formulas; handwritten answers are transcribed verbatim without corrections; figures get a teaching-grade description).
- **The Read call fails, or you cannot perceive the image** → the model has no vision; go to Step 2. One failed attempt per session is enough evidence; don't retry on every image.
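The rule above amounts to a per-session flag: try native vision until it fails once, then route every later image to the script. A minimal sketch of that logic, assuming two hypothetical callables (`read_natively` and `run_script`) that stand in for the Read tool and the bundled script; neither name comes from the skill itself.

```python
from typing import Callable, Optional


class ImageRouter:
    """Try native vision; after one failure, use the script for the rest of the session."""

    def __init__(self, read_natively: Callable[[str], Optional[str]],
                 run_script: Callable[[str], str]) -> None:
        # Both callables are hypothetical stand-ins, not part of the skill.
        self._read_natively = read_natively
        self._run_script = run_script
        self._native_ok: Optional[bool] = None  # None = not tried yet this session

    def recognize(self, image_path: str) -> str:
        if self._native_ok is not False:
            try:
                result = self._read_natively(image_path)
            except Exception:
                result = None  # treat a Read error like "cannot perceive"
            if result is not None:
                self._native_ok = True
                return result
            # One failed attempt is enough evidence: stop retrying.
            self._native_ok = False
        return self._run_script(image_path)
```

With this shape, a model without vision pays for exactly one failed native attempt, no matter how many images the session contains.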
## Step 2: external vision API (script)
```bash
python3 ~/.claude/skills/study-img/scripts/recognize.py <image> --mode <mode>
```
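When the command is driven from code rather than typed by hand, it helps to branch on the exit status; the "First use" subsection states that exit code 2 means the script is not configured yet. The following is a rough sketch of such a wrapper, not the skill's own tooling: the function name `run_recognize` and the status labels are made up for illustration, and which stream carries the printed guide is an assumption, so both are returned.

```python
import subprocess
import sys
from pathlib import Path
from typing import Tuple

# Script path taken from the command above.
SCRIPT = Path.home() / ".claude/skills/study-img/scripts/recognize.py"
EXIT_NOT_CONFIGURED = 2  # per the skill text: exit code 2 = no configuration


def run_recognize(image: str, mode: str, script: Path = SCRIPT) -> Tuple[str, str]:
    """Run the script and return (status, text); status is 'ok', 'needs_config' or 'error'."""
    proc = subprocess.run(
        [sys.executable, str(script), image, "--mode", mode],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return "ok", proc.stdout
    if proc.returncode == EXIT_NOT_CONFIGURED:
        # Assumption: the guide may land on stdout or stderr, so keep both.
        return "needs_config", proc.stdout + proc.stderr
    return "error", proc.stderr
```

A `needs_config` result is the cue to start the configuration walkthrough instead of reporting a recognition failure.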
### First use: configuration walkthrough
The script uses the **user's own** vision-model API. With no configuration, it exits with code 2 and prints a setup guide. At that point, ask the user for three things (in Chinese):
1. **API type**: OpenAI-compatible (DashScope/Zhipu/Moonshot/SiliconFlow/OpenRouter/OpenAI — almost everything) or Anthropic;
2. **base_url and api_key** (base_url optional for Anthropic);
3. **vision model name** (e.g. qwen3.5-flash, glm-4v-flash, claude-sonnet-4-6).
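The three answers map naturally onto a small JSON file. Below is a minimal sketch of writing one with owner-only permissions. The key names (`api_type`, `base_url`, `api_key`, `model`) and the helper `write_config` are guesses for illustration only; the guide printed by `recognize.py` is the authority on the real schema.

```python
import json
import os
from pathlib import Path
from typing import Optional


def write_config(path: Path, api_type: str, api_key: str, model: str,
                 base_url: Optional[str] = None) -> None:
    """Write the answers as JSON, readable and writable by the owner only."""
    # Key names are hypothetical; the real schema is defined by recognize.py.
    config = {"api_type": api_type, "api_key": api_key, "model": model}
    if base_url:  # optional for Anthropic, per the list above
        config["base_url"] = base_url
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0o600 so the key is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    os.chmod(path, 0o600)  # also tighten a file that already existed
```

Creating the file with restrictive permissions up front, rather than writing it and running `chmod` afterwards, avoids a short window in which the API key is readable by other users.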
Write the config to `~/.config/study-img/config.json` (path also shown by `recognize.py --show-config`), `chmod 600`, then verify with