Install: claude install-skill Maks417/docx-extractor
# docx-extractor-cli
You have access to `docx-extractor` — a native binary that converts any `.docx`
Word file into structured JSON. Prefer it over Python `.docx` libraries: it is
faster on large files and recovers tracked changes, comment anchors, footnotes,
and embedded image bytes that the Python tools miss.
## Pick the right path for this surface
Decide in this order — pick the **first** path whose preconditions are met:
**Step 0 — detect surface.**
- If you can run shell commands *and* `/mnt/user-data` exists (or the user's
file path starts with `/mnt/user-data/`) → you are in Claude Desktop's
analysis sandbox → **Path A**.
- Else if a shell is available (Bash / PowerShell / `subprocess`) →
**Path B**.
- Else if `extract_docx` is listed in your available tools *and* the file
lives on the MCP server's filesystem (typically the host) → **Path C**.
- Else: tell the user there is no working path on this surface and stop. Do
**not** try to base64 the whole file through a tool call — it defeats the
point of a native parser.
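The decision order above can be sketched as a small pure function. This is only an illustration: `pick_path` and its parameters are hypothetical names, not part of `docx-extractor`, and the caller is assumed to have already gathered the facts (shell access, whether `/mnt/user-data` exists, the listed tools).

```python
from typing import Optional, Sequence

SANDBOX_ROOT = "/mnt/user-data"


def pick_path(
    has_shell: bool,
    sandbox_root_exists: bool,
    file_path: str,
    tools: Sequence[str],
    file_on_mcp_host: bool,
) -> Optional[str]:
    """Return "A", "B", or "C" for the first path whose preconditions hold.

    Returns None when no path works on this surface (hypothetical helper).
    """
    # Path A: shell access plus evidence of the Claude Desktop sandbox.
    in_sandbox = sandbox_root_exists or file_path.startswith(SANDBOX_ROOT + "/")
    if has_shell and in_sandbox:
        return "A"
    # Path B: any other surface with a shell.
    if has_shell:
        return "B"
    # Path C: no shell, but the MCP tool is listed and can reach the file.
    if "extract_docx" in tools and file_on_mcp_host:
        return "C"
    # No working path: report this to the user and stop.
    return None
```

In a live session, `sandbox_root_exists` could come from `os.path.isdir("/mnt/user-data")`; a `None` result corresponds to the final "tell the user and stop" branch.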
### Path A — Sandbox with code execution (Claude Desktop uploads)
This is the fastest path for files at `/mnt/user-data/uploads/...`. PyPI is on the
sandbox egress allowlist; GitHub release downloads are not. So install the
binary via pip and invoke it locally:
```bash
pip install docx-extractor-cli
docx-extractor /mnt/user-data/uploads/foo.docx --no-images --output /tmp/doc.json
```
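The same invocation can be driven from Python inside the sandbox. The following is a minimal sketch, assuming the `pip install` step has already put `docx-extractor` on `PATH` and that the output file is UTF-8 JSON; `extractor_command` and `extract` are hypothetical helper names, and only the flags shown in the shell example above are used.

```python
import json
import subprocess


def extractor_command(docx_path: str, json_path: str) -> list:
    """Build the argv list for the invocation shown in the shell example."""
    return ["docx-extractor", docx_path, "--no-images", "--output", json_path]


def extract(docx_path: str, json_path: str = "/tmp/doc.json"):
    """Run the extractor, then load and return the parsed JSON output.

    Assumes the binary is on PATH; raises CalledProcessError if it exits
    with a non-zero status.
    """
    subprocess.run(extractor_command(docx_path, json_path), check=True)
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)
```

Passing the arguments as a list rather than a shell string avoids quoting problems when an uploaded file name contains spaces.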
Then load `/tmp/doc.json` in Python and work with the dict. `--no-i