processing-docx

Solid

Processes Word document files (.docx). Creates, edits, annotates, tracks revisions, analyzes OOXML structure, and preserves formatting for contracts, policies, academic papers, and business documents. Use when working with .docx files or Word documents. Do NOT use for PDFs, spreadsheets, presentations, or plain text files.

Data & Documents · 228 stars · 30 forks · Updated today · MIT

Quality Score: 94/100

Stars (20%): 79
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 80
License (10%): 100
Description (5%): 100

Skill Content

# DOCX Processing

`.docx` is a ZIP archive of XML and resources. Different tasks have different tools and workflows.

## Workflow Decision

| Intent | Workflow | Reference |
|--------|----------|-----------|
| **Read/analyze text only** | pandoc → markdown | [raw-xml-access.md](references/raw-xml-access.md) |
| **Read structure, comments, media, formatting** | unpack → raw XML | [raw-xml-access.md](references/raw-xml-access.md) |
| **Create new document** | docx-js (JS/TS) | [docx-js.md](docx-js.md) |
| **Edit own document, simple changes** | Document library (Python) | [ooxml.md](ooxml.md) |
| **Edit someone else's document** | Redlining (tracked changes) | [redlining.md](references/redlining.md) |
| **Legal / academic / business / gov docs** | Redlining (REQUIRED) | [redlining.md](references/redlining.md) |
| **Visual analysis** | soffice → PDF → pdftoppm | [raw-xml-access.md](references/raw-xml-access.md) |

## Text Extraction (Quick)

```bash
pandoc --track-changes=all path-to-file.docx -o output.md
# --track-changes=accept/reject/all
```

## Create New Document

1. **MANDATORY: READ ENTIRE FILE**: `docx-js.md` (~500 lines). NEVER set range limits.
2. Create JS/TS file using Document, Paragraph, TextRun components.
3. Export with `Packer.toBuffer()`.

## Edit Existing Document (Own, Simple)

1. **MANDATORY: READ ENTIRE FILE**: `ooxml.md` (~600 lines). NEVER set range limits.
2. `python ooxml/scripts/unpack.py <office_file> <output_dir>`
3. Run Python script using Docum...
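
The skill content opens by noting that `.docx` is a ZIP archive of XML and resources. A minimal sketch of what that means in practice, using only the Python standard library, is shown here. It is not the skill's own `unpack.py` script or its Document library. The helper names `list_parts`, `paragraph_texts`, and `make_sample_docx` are hypothetical, and the sample package is a stripped-down stand-in for demonstration (it lacks `[Content_Types].xml` and relationship parts, so Word would not open it).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by the w:p (paragraph) and w:t (text) elements.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of all parts (ZIP members) in the package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return zf.namelist()


def paragraph_texts(docx_bytes: bytes) -> list[str]:
    """Return the text of each paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it.
        paragraphs.append("".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t")))
    return paragraphs


def make_sample_docx() -> bytes:
    """Build a tiny in-memory package for the demo (hypothetical stand-in,
    not a complete Word document)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r>"
        '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document_xml)
    return buf.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx()
    print(list_parts(sample))
    print(paragraph_texts(sample))
```

For a real file, reading the bytes with `open(path, "rb").read()` and passing them to the same functions should work the same way. Note that this sketch only collects `w:t` elements; text inside tracked deletions is stored in `w:delText` and is skipped, so for anything involving revisions the pandoc or redlining workflows above remain the better fit.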

Details

Author: telagod
Repository: telagod/code-abyss
Created: 4 months ago
Last Updated: today
Language: JavaScript
License: MIT
