new-pipeline

Scaffold a new data pipeline under pipelines/ with the pipeline directory structure and required docs. Use when the user wants to create a new data processing pipeline.
hsigstad/research-kit · ★ 0 · Data & Documents · score 75
Install: claude install-skill hsigstad/research-kit
# Scaffold a new data pipeline

Create a new pipeline repository under `pipelines/`.

## Finding the workspace root

The workspace root is the directory that contains `CLAUDE.md` alongside `projects/`, `pipelines/`, `data_catalog/`, and `research/`. If the current directory is inside a project or pipeline, search upward to find the root. Use `$ROOT` to refer to this directory in all paths below.

## Step 1: Gather information

Ask the user:

1. **Pipeline slug**: short directory name (e.g., `brazil`, `justica`, `politica`)
2. **Purpose**: what data does this pipeline clean/process?
3. **Input data**: raw data sources (check `$ROOT/data_catalog/` for existing ones)
4. **Output**: what cleaned datasets does it produce?

## Step 2: Create the directory structure

Follow the pipeline structure from `$ROOT/research/rules/project_docs_contract.md`:

```
$ROOT/pipelines/<slug>/
  README.md
  CLAUDE.md
  .claude/
    settings.local.json
  docs/
    summary.md
    thinking.md
    todo.md
    data.md
    decisions.md
    archive.md
  source/
  build/
    .gitkeep
```

Note: pipelines do NOT have `paper/`, `talk/`, or the full set of project docs (no `meetings.md`, `feedback.md`, `literature.md`, `institutions.md`, `methods.md`, `results.md`).

## Step 3: Populate files

### CLAUDE.md

```markdown
# <Pipeline Name> Pipeline

## Purpose
<from user input>

## Input
<data sources>

## Output
<cleaned datasets produced>

## Shared code
- Uses `diarios` module for court/legal data utilities; check `$ROOT/research/meta/diarios_api.md` before wr