rag-architect

RAG system design: chunking strategies, embedding model selection, vector store choice, retrieval patterns, reranking, evaluation — production-grade retrieval-augmented generation
Claudient/Claudient · ★ 4 · AI & Automation · score 65

Install: claude install-skill Claudient/Claudient

# RAG Architect Skill

## When to activate

- Designing a Retrieval-Augmented Generation system from scratch
- Choosing between chunking strategies for your document type
- Selecting an embedding model and vector store
- Improving RAG accuracy (reducing hallucinations, improving relevance)
- Setting up evaluation metrics for your RAG pipeline
- Deciding between naive RAG vs. advanced patterns (HyDE, multi-query, etc.)

## When NOT to use

- Simple FAQ bots with < 50 documents: prompt engineering is enough
- When your data fits in the context window: just stuff it in
- Real-time data that changes every minute: RAG on stale indexes won't help

## Instructions

### Design the architecture

```
Design a RAG architecture for this use case:

Data: [describe: PDFs / web pages / database records / code / emails / etc.]
Volume: [X documents, total ~XMB/GB]
Query types: [factual lookup / synthesis / comparison / analysis]
Latency requirement: [< Xs response time]
Accuracy requirement: [what's the cost of a wrong answer?]
Stack: [Python / Node.js / preferred cloud]
Budget: [self-hosted / managed service / no constraint]

Design:
1. Ingestion pipeline (how data gets in)
2. Chunking strategy (how to split documents)
3. Embedding model (what converts text to vectors)
4. Vector store (where vectors live)
5. Retrieval strategy (how to find relevant chunks)
6. Reranking (optional but powerful)
7. Generation (prompt + model + context assembly)
8. Evaluation (how to measure if it's working)
```
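
To make the design list above concrete, here is a minimal sketch of how steps 2 to 5 and the context-assembly part of step 7 fit together. It is an illustration under loud assumptions, not the skill's own implementation: the "embedding" is a toy bag-of-words count standing in for a real embedding model, the "vector store" is a plain Python list, and all function names (`chunk_text`, `embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. Reranking, the generation call, and evaluation are left out.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Step 2: split text into fixed-size word windows that overlap."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Step 3 (toy stand-in): bag-of-words counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Step 5: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Step 7 (context assembly only): number the chunks and append the question."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up sample documents, purely for illustration.
docs = [
    "Invoices are due within 30 days of the billing date.",
    "The API rate limit is 100 requests per minute per key.",
    "Refunds are processed within five business days.",
]

# Step 4 (toy stand-in): the "vector store" is a list of (chunk, vector) pairs.
index = [(chunk, embed(chunk)) for doc in docs for chunk in chunk_text(doc)]

question = "What is the API rate limit?"
print(build_prompt(question, retrieve(question, index, k=1)))
```

In a real system each stand-in would be swapped for the component chosen during the design: a learned embedding model in place of `embed`, a vector database in place of the list, and a reranker between `retrieve` and `build_prompt`. Keeping each stage behind its own function makes those swaps, and per-stage evaluation, easier.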