dbt-data-quality-gate

Enforce data quality, testing, contracts, and PII governance in a dbt project, gated by checks that actually run over dbt's compiled artifacts (target/manifest.json, target/run_results.json) — both plain JSON, so the gate is stdlib-only Python with no warehouse connection. Use when the user wants to add a data-quality CI gate, require tests/descriptions/owners on dbt models, enforce data contracts, check source freshness, find untagged PII columns, set a minimum test count or test pass-rate, or harden a data pipeline before merge. Triggers: "dbt", "data quality", "data contracts", "PII", "data tests", "freshness", "data pipeline gate".
NeuralMedic-DE/claude-skills · ★ 0 · AI & Automation · score 73

Install: claude install-skill NeuralMedic-DE/claude-skills
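
As an illustration of the description above, here is a minimal sketch of a stdlib-only gate over dbt's artifacts. It assumes the usual artifact layout (a `nodes` mapping in `target/manifest.json`, a `results` list in `target/run_results.json`). The rule ids `DQ001` and `DQ002`, the severity labels, and the function names are hypothetical and are not taken from the skill's own script.

```python
import json
from pathlib import Path


def check_descriptions(manifest):
    """Return one blocking finding per model whose description is empty."""
    findings = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue
        if not (node.get("description") or "").strip():
            findings.append({
                "rule": "DQ001",  # hypothetical rule id
                "severity": "error",
                "node": unique_id,
                "message": "model has no description",
            })
    return findings


def check_pass_rate(run_results, minimum=1.0):
    """Compare the share of passing tests against a minimum threshold."""
    tests = [
        r for r in run_results.get("results", [])
        if r.get("unique_id", "").startswith("test.")
    ]
    if not tests:
        # Nothing to measure: report it, but do not block.
        return [{
            "rule": "DQ002",  # hypothetical rule id
            "severity": "warn",
            "node": None,
            "message": "no test results found",
        }]
    passed = sum(1 for r in tests if r.get("status") == "pass")
    rate = passed / len(tests)
    if rate < minimum:
        return [{
            "rule": "DQ002",
            "severity": "error",
            "node": None,
            "message": f"test pass-rate {rate:.2%} is below {minimum:.2%}",
        }]
    return []


def run_gate(target_dir="target", minimum_pass_rate=1.0):
    """Read both artifacts, print findings, return a process exit code."""
    target = Path(target_dir)
    manifest = json.loads((target / "manifest.json").read_text(encoding="utf-8"))
    run_results = json.loads((target / "run_results.json").read_text(encoding="utf-8"))
    findings = check_descriptions(manifest) + check_pass_rate(run_results, minimum_pass_rate)
    for f in findings:
        print(f"[{f['severity']}] {f['rule']} {f['node'] or '-'}: {f['message']}")
    # Only findings with blocking severity fail the gate.
    return 1 if any(f["severity"] == "error" for f in findings) else 0
```

In CI this could be wired up as `sys.exit(run_gate())` after `dbt build` has written the artifacts. A zero exit code only means the encoded rules held for that run.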

# dbt data-quality gate (verified over artifacts)

Hold a dbt project to a data-quality and governance policy and **prove it**: conformance is gated by a script that reads dbt's own compiled artifacts, maps each breach to a rule id + severity, and exits non-zero on blocking failures, not by assertion.

## Core principle

**Quality is enforced, not assumed.** The loop is: run the gate → triage by severity → fix the root cause (add a test, a description, a tag, an owner) → re-run, until the blocking-severity count is zero.

**Be honest about scope (this is the rule that keeps the skill correct):** tests only assert what you encode. A green gate means **your declared expectations held**, not that the data is correct, complete, or compliant. PII detection by column name is heuristic: it misses unnamed/encoded PII and false-positives on lookalikes. Freshness and volume anomalies need runtime data, not just the manifest. This **assists** data governance; it is **not** a guarantee of data correctness or GDPR compliance. → `references/01-data-contracts-and-quality.md`

## When to use vs. not

- Use for: adding a data-quality / data-contract CI gate to a dbt project; requiring tests, descriptions, owners, and freshness on models and sources; enforcing not_null/unique on keys; finding untagged likely-PII columns; setting a minimum test count or a test pass-rate threshold.
- Not for: profiling raw data values or detecting drift/anomalies at the row level (needs a runtime data-