Install: claude install-skill zartin790/llm_system_template_agents_skills_patterns_tools_prompts
# PDF Processing
## Quick start
Use pdfplumber to extract text from PDFs:
```python
import pdfplumber

# Open the PDF and print the text of the first page
with pdfplumber.open("document.pdf") as pdf:
    text = pdf.pages[0].extract_text()
    print(text)
```
## Extracting tables
Extract tables from PDFs with automatic detection:
```python
import pdfplumber

with pdfplumber.open("report.pdf") as pdf:
    page = pdf.pages[0]
    # Each table is a list of rows; each row is a list of cell values
    tables = page.extract_tables()
    for table in tables:
        for row in table:
            print(row)
```
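The rows returned by `extract_tables()` are plain lists, so it is often convenient to key each row by the header. The following is a minimal sketch, assuming the first row of the table is a header row; `table_to_records` is a hypothetical helper name, not part of pdfplumber, and the sample data stands in for a real extracted table.

```python
def table_to_records(table):
    """Convert a table (list of rows) into a list of dicts keyed by the header row.

    Assumes the first row holds the column names. Empty cells, which
    pdfplumber reports as None, are normalized to empty strings.
    """
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cleaned = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, cleaned)))
    return records


# Sample data in the same shape as one entry of page.extract_tables()
sample = [["Name", "Qty"], ["Widget", "3"], ["Gadget", None]]
records = table_to_records(sample)
print(records)
```

If a table has no header row, or the header spans several rows, this helper would need adjusting before use.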
## Extracting all pages
Extract text from every page of a multi-page document:
```python
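import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    page_texts = []
    for page in pdf.pages:
        # extract_text() may yield None or "" for pages without a text layer
        page_texts.append(page.extract_text() or "")
    full_text = "\n\n".join(page_texts)
    print(full_text)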
```
## Form filling
For PDF form filling, see [FORMS.md](FORMS.md) for the complete guide including field analysis and validation.
## Merging PDFs
Combine multiple PDF files:
```python
from pypdf import PdfWriter

# PdfWriter.append() replaces the deprecated PdfMerger class (removed in pypdf 5.0)
writer = PdfWriter()
for path in ["file1.pdf", "file2.pdf", "file3.pdf"]:
    writer.append(path)
writer.write("merged.pdf")
writer.close()
```
## Splitting PDFs
Extract specific pages or ranges:
```python
from pypdf import PdfReader, PdfWriter

reader = PdfReader("input.pdf")
writer = PdfWriter()

# Extract pages 2-5 (1-based); reader.pages is zero-indexed, so use indices 1-4
for page_num in range(1, 5):
    writer.add_page(reader.pages[page_num])

with open("output.pdf", "wb") as output:
    writer.write(output)
```
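Because `reader.pages` is zero-indexed while people usually talk about pages starting at 1, off-by-one mistakes are easy to make when splitting. One way to keep the conversion in a single place is a small parser like the sketch below. `parse_page_ranges` and its `"2-5, 8"` spec format are illustrative assumptions, not part of pypdf.

```python
def parse_page_ranges(spec, total_pages):
    """Convert a 1-based page spec such as "2-5, 8" into zero-based indices.

    Raises ValueError if a page or range falls outside 1..total_pages.
    """
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_s, end_s = part.split("-", 1)
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(part)
        if start < 1 or end > total_pages or start > end:
            raise ValueError(
                f"invalid page range {part!r} for a {total_pages}-page document"
            )
        # 1-based inclusive range -> zero-based indices
        indices.extend(range(start - 1, end))
    return indices


print(parse_page_ranges("2-5, 8", total_pages=10))  # [1, 2, 3, 4, 7]
```

The resulting indices can be passed to `reader.pages[...]` in a loop like the one above, with `total_pages` taken from `len(reader.pages)`.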
## Available packages
- **pdfplumber** - Text and table extraction