ocr-document-processor

Extract text from images and scanned PDFs using OCR. Supports 100+ languages, table detection, structured output (Markdown/JSON), and batch processing.
rvanbaalen/skills · ★ 0 · Data & Documents · score 60

Install: claude install-skill rvanbaalen/skills

# OCR Document Processor

Extract text from images, scanned PDFs, and photographs using Optical Character Recognition (OCR). Supports multiple languages, structured output formats, and intelligent document parsing.

## Core Capabilities

- **Image OCR**: Extract text from PNG, JPEG, TIFF, BMP images
- **PDF OCR**: Process scanned PDFs page by page
- **Multi-language**: Support for 100+ languages
- **Structured Output**: Plain text, Markdown, JSON, or HTML
- **Table Detection**: Extract tabular data to CSV/JSON
- **Batch Processing**: Process multiple documents at once
- **Quality Assessment**: Confidence scoring for OCR results

## Quick Start

```python
from scripts.ocr_processor import OCRProcessor

# Simple text extraction
processor = OCRProcessor("document.png")
text = processor.extract_text()
print(text)

# Extract to structured format
result = processor.extract_structured()
print(result['text'])
print(result['confidence'])
print(result['blocks'])  # Text blocks with positions
```

## Core Workflow

### 1. Basic Text Extraction

```python
from scripts.ocr_processor import OCRProcessor

# From image
processor = OCRProcessor("scan.png")
text = processor.extract_text()

# From PDF
processor = OCRProcessor("scanned.pdf")
text = processor.extract_text()  # All pages

# Specific pages
text = processor.extract_text(pages=[1, 2, 3])
```

### 2. Structured Extraction

```python
# Get detailed results
result = processor.extract_structured()

# Result contains:
# - text: Full extra