doc-importer

Solid

Import external documents (PDF, DOCX, PPTX, XLSX, HTML) into editable markdown for rewriting or project integration.

Data & Documents | 308 stars | 27 forks | Updated today | MIT

Quality Score: 96/100

Stars (weight 20%): 83
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Document Importer

Import external documents into editable markdown.

## When To Use

- User provides a DOCX, PPTX, XLSX, PDF, or HTML file to convert into project documentation
- User wants to extract content from a document for rewriting or remediation
- User has a slide deck or spreadsheet to turn into markdown documentation

## When NOT To Use

- Academic paper analysis: use `tome:papers`
- Web article knowledge intake: use `memory-palace:knowledge-intake`
- Content already in markdown: use `scribe:doc-generator` remediation mode directly

## Import Workflow

### Step 1: Identify Source

Determine the source document:

- **Local file path**: verify it exists with Read tool
- **URL**: verify accessibility
- **User description**: confirm format and location

### Step 2: Convert to Markdown

Apply the `leyline:document-conversion` protocol:

1. Construct URI from source (file path or URL)
2. Try the markitdown MCP tool for best quality
3. If unavailable, use native tool fallbacks
4. If format unsupported, inform user

### Step 3: Structural Cleanup

After conversion, normalize the markdown:

- Ensure ATX headings (`# style`, not setext underlines)
- Wrap prose lines at 80 characters per `leyline:markdown-formatting`
- Fix broken tables (align columns, add headers)
- Remove conversion artifacts (page numbers, headers/footers, watermarks, repeated logos)
- Preserve all substantive content

### Step 4: Sanitize External Content

Apply the `leyline:content-saniti...
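
As a rough illustration of the first cleanup item in Step 3 (ATX headings instead of setext underlines), the sketch below rewrites underlined headings in converted markdown. It is not taken from the skill itself: the function name `setext_to_atx` and both regular expressions are hypothetical, and the logic is deliberately simplified. It only treats the single line directly above an underline as heading text and skips anything inside fenced code blocks.

```python
import re

# A setext underline is a line made only of "=" (level 1) or "-" (level 2).
SETEXT_UNDERLINE = re.compile(r"^(=+|-+)\s*$")
# Opening or closing marker of a fenced code block.
FENCE = re.compile(r"^\s*(```|~~~)")
# Lines that already carry markdown structure and cannot be heading text.
STRUCTURAL = re.compile(r"^\s*(#|[-*+]\s|\d+[.)]\s|\||>|```|~~~)")


def setext_to_atx(text):
    """Rewrite setext headings as ATX headings (hypothetical helper).

    Simplification: only the one line directly above the underline is
    used as the heading text, and fenced code blocks are left untouched.
    """
    out = []
    in_fence = False
    for line in text.splitlines():
        if FENCE.match(line):
            # Toggle fence state and copy the marker line unchanged.
            in_fence = not in_fence
            out.append(line)
            continue
        match = None if in_fence else SETEXT_UNDERLINE.match(line)
        prev = out[-1] if out else ""
        if match and prev.strip() and not STRUCTURAL.match(prev):
            # "=" underlines map to "#", "-" underlines map to "##".
            level = 1 if match.group(1)[0] == "=" else 2
            out[-1] = "#" * level + " " + prev.strip()
        else:
            # Includes "---" after a blank line, which is a thematic break.
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Title\n=====\n\nIntro text.\n\nSection\n-------\n"
    print(setext_to_atx(sample))
```

The other Step 3 items (line wrapping, table repair, artifact removal) would need their own passes, and a real implementation would more likely rely on a markdown parser than on line-level regular expressions.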

Details

Author: athola
Repository: athola/claude-night-market
Created: 6 months ago
Last Updated: today
Language: Python
License: MIT
