Install: claude install-skill belumume/claude-skills
# DOCX Advanced Patterns Skill
Specialized patterns for python-docx that handle complex document structures not covered by basic `.text` extraction.
## When to Use This Skill
Invoke this skill when working with DOCX files that have:
- Nested tables within table cells
- Forms with checkbox options
- Complex multi-row cell layouts
- Checklists with embedded options
- Cell content that doesn't appear with `.text` property
**Use alongside** the official `docx` skill for comprehensive document handling.
## Core Pattern: Nested Table Extraction
### Problem
python-docx's `cell.text` property returns only the text of paragraphs that sit directly in the cell; it **does not** traverse tables nested inside the cell.
**Symptom:**
```python
cell.text # Returns: '' or '\n'
# But cell visually contains content!
```
### Detection
Check if a cell contains nested tables:
```python
if cell.tables:
    print(f"Found {len(cell.tables)} nested table(s)")
    # Cell has nested content - need special extraction
```
### Solution (Simple)
```python
def extract_cell_content_with_nested_tables(cell):
    """
    Extract all text from a cell, including text from nested tables.

    Args:
        cell: python-docx _Cell object

    Returns:
        str: Combined text from cell paragraphs and nested tables
    """
    text_parts = []

    # Get direct paragraph text (not inside nested tables)
    for para in cell.paragraphs:
        para_text = para.text.strip()
        if para_text:
            text_parts.append(para_text)