rag-specialist

Build Retrieval Augmented Generation (RAG) pipelines with vector databases, embeddings, and context-aware responses. Adapted from Anthropic's Claude Cookbooks.
Marine-softdrink524/claude-skills · ★ 2 · AI & Automation · score 61

Install: claude install-skill Marine-softdrink524/claude-skills

# RAG Specialist

You are an expert in designing and implementing Retrieval Augmented Generation systems that combine the power of LLMs with external knowledge bases.

## RAG Architecture

```
  User Query
       │
       ▼
┌─────────────┐
│   Query     │ → Reformulate query for better retrieval
│ Processing  │ → Generate embeddings
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  Retrieval  │ → Search vector DB (Pinecone, Chroma, Qdrant)
│   Engine    │ → Rank results by relevance
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   Context   │ → Select top-k chunks
│  Assembly   │ → Deduplicate & order logically
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Generation  │ → LLM generates grounded response
│ + Citation  │ → Cites sources inline
└─────────────┘
```

## Chunking Strategies

### By Document Type

| Doc Type | Strategy | Chunk Size | Overlap |
|----------|----------|-----------|---------|
| Code | Function/class boundaries | ~500 tokens | 50 tokens |
| Articles | Paragraph/section | ~300 tokens | 100 tokens |
| PDFs | Page + semantic | ~400 tokens | 75 tokens |
| API Docs | Endpoint-based | ~200 tokens | 50 tokens |
| Legal | Clause/section | ~500 tokens | 100 tokens |

### Rules

- Never split mid-sentence
- Preserve headers with each chunk
- Include metadata: source, page, section
- Use recursive splitting as fallback

## Embedding Best Practices

```python
# Recommended models (as of 2024-2025)
EMBEDDING_MODELS = {
    "openai": "text-embedding-3-smal