doc-importer
Import external documents (PDF, DOCX, PPTX, XLSX, HTML) into editable markdown for rewriting or project integration.
Data & Documents 308 stars
27 forks Updated today MIT
Skill Content
# Document Importer
Import external documents into editable markdown.
## When To Use
- User provides a DOCX, PPTX, XLSX, PDF, or HTML file
to convert into project documentation
- User wants to extract content from a document for
rewriting or remediation
- User has a slide deck or spreadsheet to turn into
markdown documentation
## When NOT To Use
- Academic paper analysis: use `tome:papers`
- Web article knowledge intake: use
`memory-palace:knowledge-intake`
- Content already in markdown: use `scribe:doc-generator`
remediation mode directly
## Import Workflow
### Step 1: Identify Source
Determine the source document:
- **Local file path**: verify it exists with the Read tool
- **URL**: verify accessibility
- **User description**: confirm format and location
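
As a rough illustration of the checks above, the sketch below classifies a source string as a URL or a local path and records whether a local file exists. The helper name `classify_source` and the extension set are assumptions made for this example (the set mirrors the formats listed earlier, with `.htm` added); they are not part of the skill. URL accessibility is left unchecked here because that requires a network request.

```python
from pathlib import Path
from urllib.parse import urlparse

# Formats named in this skill; ".htm" is an added assumption.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".htm"}


def classify_source(source: str) -> dict:
    """Hypothetical helper: describe a source as a URL or a local file."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        extension = Path(parsed.path).suffix.lower()
        # "exists" stays None: accessibility must be verified by fetching.
        return {
            "kind": "url",
            "exists": None,
            "extension": extension,
            "supported": extension in SUPPORTED_EXTENSIONS,
        }
    path = Path(source).expanduser()
    extension = path.suffix.lower()
    return {
        "kind": "file",
        "exists": path.is_file(),
        "extension": extension,
        "supported": extension in SUPPORTED_EXTENSIONS,
    }
```

A source that comes back with `exists` set to `False` or `supported` set to `False` is a reasonable point to stop and confirm the format and location with the user.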
### Step 2: Convert to Markdown
Apply the `leyline:document-conversion` protocol:
1. Construct URI from source (file path or URL)
2. Try the markitdown MCP tool for best quality
3. If unavailable, use native tool fallbacks
4. If format unsupported, inform user
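
The four steps above amount to a fallback chain. The sketch below shows one way to express that, assuming each conversion tool is wrapped in a plain callable that takes a URI and returns markdown, returns `None` when it cannot handle the input, or raises when it is unavailable. The names `to_uri` and `convert_with_fallbacks` are hypothetical, and the sketch deliberately does not call the markitdown MCP tool or reproduce the `leyline:document-conversion` protocol itself.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence
from urllib.parse import urlparse

# A converter takes a URI and returns markdown, or None if it cannot handle it.
Converter = Callable[[str], Optional[str]]


def to_uri(source: str) -> str:
    """Turn a file path or URL into a URI (step 1 of the list above)."""
    if urlparse(source).scheme in ("http", "https", "file"):
        return source
    return Path(source).expanduser().resolve().as_uri()


def convert_with_fallbacks(
    source: str, converters: Sequence[tuple[str, Converter]]
) -> tuple[str, str]:
    """Try each (name, converter) pair in order; return the first success."""
    uri = to_uri(source)
    for name, convert in converters:
        try:
            markdown = convert(uri)
        except Exception:
            # Assumption: any exception means the tool is unavailable.
            continue
        if markdown is not None:
            return name, markdown
    # Nothing worked: the caller should inform the user.
    raise ValueError(f"No converter could handle {uri}")
```

Listing the preferred tool first and the native fallbacks after it keeps the priority order visible in one place, and the returned name records which tool produced the markdown.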
### Step 3: Structural Cleanup
After conversion, normalize the markdown:
- Ensure ATX headings (`# style`, not setext underlines)
- Wrap prose lines at 80 characters per
`leyline:markdown-formatting`
- Fix broken tables (align columns, add headers)
- Remove conversion artifacts (page numbers,
headers/footers, watermarks, repeated logos)
- Preserve all substantive content
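
A minimal sketch of three of these cleanup rules follows: setext headings become ATX headings, standalone "Page N" or "Page N of M" lines are dropped, and prose paragraphs are rewrapped at 80 characters. The function name `normalize_markdown` and the regular expressions are assumptions for illustration. Headings, list items, blockquotes, table rows, and fenced code pass through untouched, so table repair and the full `leyline:markdown-formatting` rules are out of scope here.

```python
import re
import textwrap

SETEXT_UNDERLINE = re.compile(r"^(=+|-+)\s*$")
# Assumed artifact pattern: "Page 3" or "Page 3 of 10" alone on a line.
PAGE_NUMBER = re.compile(r"^\s*page\s+\d+(?:\s+of\s+\d+)?\s*$", re.IGNORECASE)
# Lines that start a heading, list item, blockquote, or table row.
STRUCTURAL = re.compile(r"^\s*(?:#{1,6}\s|[-*+]\s|\d+[.)]\s|>|\|)")


def normalize_markdown(text: str, width: int = 80) -> str:
    out: list[str] = []
    paragraph: list[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the buffered prose paragraph, rewrapped to the target width.
        if paragraph:
            out.extend(textwrap.wrap(" ".join(paragraph), width=width))
            paragraph.clear()

    for line in text.splitlines():
        if line.lstrip().startswith("```"):
            flush()
            in_fence = not in_fence
            out.append(line)
        elif in_fence:
            out.append(line)  # never touch code
        elif PAGE_NUMBER.match(line):
            continue  # conversion artifact
        elif SETEXT_UNDERLINE.match(line):
            if paragraph:
                # Underlined text becomes an ATX heading (= is h1, - is h2).
                level = "#" if line.startswith("=") else "##"
                out.append(f"{level} {' '.join(paragraph)}")
                paragraph.clear()
            else:
                out.append(line)  # a rule with no text above it; keep as-is
        elif not line.strip() or STRUCTURAL.match(line):
            flush()
            out.append(line)
        else:
            paragraph.append(line.strip())
    flush()
    return "\n".join(out)
```

Because the function only rewraps and relabels existing text, it keeps all substantive content; the one deletion it performs is the page-number pattern, which is worth tightening or loosening per source document.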
### Step 4: Sanitize External Content
Apply the `leyline:content-saniti...
Details
- Author: athola
- Repository: athola/claude-night-market
- Created: 6 months ago
- Last Updated: today
- Language: Python
- License: MIT
Similar Skills
Semantically similar based on skill content — not just same category
- docx-to-md (fabioc-aloha; Data & Documents; Listed; 0; Updated yesterday): Convert Word documents (.docx) to clean Markdown with image extraction and pandoc cleanup
- kb-import (techwolf-ai; Data & Documents; Listed; 81; Updated today): Import knowledge from existing documents into structured KB entries. Reads source documents (Markdown, PDF, DOCX, plain text), extracts key information, and creates properly formatted KB entries with YAML frontmatter.
- markitdown (aiskillstore; Data & Documents; Solid; 353; Updated today): Convert files and office documents to Markdown. Supports PDF, DOCX, PPTX, XLSX, images (with OCR), audio (with transcription), HTML, CSV, JSON, XML, ZIP, YouTube URLs, EPubs and more.