pdf-processing-pro

Production-ready PDF processing with forms, tables, OCR, validation, and batch operations. Use when working with complex PDF workflows in production environments, processing large volumes of PDFs, or requiring robust error handling and validation.
anbeime/skill · ★ 1,332 · Data & Documents · score 83
Install: claude install-skill anbeime/skill
# PDF Processing Pro

Production-ready PDF processing toolkit with pre-built scripts, comprehensive error handling, and support for complex workflows.

## Quick start

### Extract text from PDF

```python
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    text = pdf.pages[0].extract_text()
    print(text)
```

### Analyze PDF form (using included script)

```bash
python scripts/analyze_form.py input.pdf --output fields.json
# Returns: JSON with all form fields, types, and positions
```

### Fill PDF form with validation

```bash
python scripts/fill_form.py input.pdf data.json output.pdf
# Validates all fields before filling, includes error reporting
```

### Extract tables from PDF

```bash
python scripts/extract_tables.py report.pdf --output tables.csv
# Extracts all tables with automatic column detection
```

## Features

### ✅ Production-ready scripts

All scripts include:

- **Error handling**: Graceful failures with detailed error messages
- **Validation**: Input validation and type checking
- **Logging**: Configurable logging with timestamps
- **Type hints**: Full type annotations for IDE support
- **CLI interface**: `--help` flag for all scripts
- **Exit codes**: Proper exit codes for automation

### ✅ Comprehensive workflows

- **PDF Forms**: Complete form processing pipeline
- **Table Extraction**: Advanced table detection and extraction
- **OCR Processing**: Scanned PDF text extraction
- **Batch Operations**: Process multiple PDFs efficiently
- **Val