Install: claude install-skill vermapragya/analytics-skill
# Data Quality Audit
## When to use this skill
Use **before** trusting data for analysis, modeling, or reporting. Triggers:
- "Audit this table"
- "Check the data quality of…"
- "Can I trust this data?"
- "Is fct_orders fresh?"
- "Are there duplicates in…"
- Before kicking off any modeling skill (`logistic-regression`, `survival-analysis`, etc.)
## Required inputs
| Input | Why it matters |
|---|---|
| Table name | What to audit |
| Stated primary key | What grain rows should be at |
| Expected freshness | How recent data should be |
| Critical columns | Columns where null/garbage breaks downstream |
| Foreign key columns | For referential integrity |
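
As a rough illustration, the inputs above could be gathered into one structure before the audit starts. This is a Python sketch only: the `AuditInputs` class, its field names, and every example value except the table name `fct_orders` are assumptions made for illustration, not part of the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditInputs:
    """Hypothetical container for the audit inputs in the table above."""

    table: str                               # table name: what to audit
    primary_key: tuple[str, ...]             # stated primary key: the expected row grain
    freshness_sla_hours: float               # expected freshness, expressed in hours
    critical_columns: tuple[str, ...] = ()   # columns where null/garbage breaks downstream
    foreign_keys: tuple[str, ...] = ()       # columns to check for referential integrity

    def missing(self) -> list[str]:
        """Return the names of the core inputs that were left empty."""
        gaps = []
        if not self.table:
            gaps.append("table")
        if not self.primary_key:
            gaps.append("primary_key")
        if self.freshness_sla_hours <= 0:
            gaps.append("freshness_sla_hours")
        return gaps


# Example values are made up; only the table name appears in this document.
inputs = AuditInputs(
    table="fct_orders",
    primary_key=("order_id",),
    freshness_sla_hours=24,
    critical_columns=("customer_id", "order_total"),
    foreign_keys=("customer_id",),
)
print(inputs.missing())
```

An empty list from `missing()` means the table, key, and freshness inputs are present; anything else is worth asking for before running the checks.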
## Workflow
Run these checks in order. If any check fails critically, stop and report before continuing.
### 1. Row count and time range
```sql
select
    count(*) as row_count,
    min(<time_col>) as earliest,
    max(<time_col>) as latest,
    datediff('hour', max(<time_col>), current_timestamp) as hours_since_latest
from <table>;
```
**Expected:** non-zero rows, latest timestamp within freshness SLA.
**Fail if:** row count = 0, or `hours_since_latest` > freshness SLA × 1.5.
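
The pass/fail rule above can be sketched as a small Python function applied to the query's output. The function name and return shape are hypothetical. Two behaviors are assumptions of this sketch rather than rules stated here: the `WARN` status for the band between the SLA and 1.5 × the SLA, and treating a missing `hours_since_latest` as a failure.

```python
from typing import Optional, Tuple


def check_freshness(
    row_count: int,
    hours_since_latest: Optional[float],
    sla_hours: float,
) -> Tuple[str, str]:
    """Apply the step 1 rule to one row of query output; returns (status, reason)."""
    if row_count == 0:
        return ("FAIL", "table has no rows")
    if hours_since_latest is None:
        # Assumption: rows exist but every timestamp is null, so freshness is unknown.
        return ("FAIL", "no non-null timestamps")
    fail_threshold = sla_hours * 1.5
    if hours_since_latest > fail_threshold:
        return ("FAIL", "latest row is older than 1.5 x the freshness SLA")
    if hours_since_latest > sla_hours:
        # Assumption: past the SLA but under the fail threshold is flagged, not failed.
        return ("WARN", "latest row is past the SLA but under the fail threshold")
    return ("PASS", "latest row is within the freshness SLA")


print(check_freshness(row_count=1200, hours_since_latest=30, sla_hours=24))
```

With a 24-hour SLA the fail threshold is 36 hours, so data that is 30 hours old is flagged but does not stop the audit, while data that is 40 hours old does.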
### 2. Primary key uniqueness
```sql
select <pk_cols>, count(*) as dupes
from <table>
group by <pk_cols>
having count(*) > 1
limit 100;
```
**Expected:** zero rows.
**Fail if:** any duplicates. Investigate before proceeding.
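
To see the check behave on a known case, the sketch below runs the same query shape against an in-memory SQLite table from Python. The `find_duplicate_keys` helper, the column names, and the toy rows are all hypothetical; SQLite stands in for whatever warehouse the real table lives in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table fct_orders (order_id integer, line_no integer, amount real)")
conn.executemany(
    "insert into fct_orders values (?, ?, ?)",
    [(1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5), (2, 1, 7.5)],  # last two rows collide
)


def find_duplicate_keys(conn, table, pk_cols, limit=100):
    """Return (pk values..., dupes) for every key that appears more than once."""
    cols = ", ".join(pk_cols)
    # Identifiers cannot be bound as parameters, so table and column names
    # are interpolated directly; pass only trusted names here.
    sql = (
        f"select {cols}, count(*) as dupes from {table} "
        f"group by {cols} having count(*) > 1 limit {int(limit)}"
    )
    return conn.execute(sql).fetchall()


dupes = find_duplicate_keys(conn, "fct_orders", ["order_id", "line_no"])
print(dupes)  # [(2, 1, 2)]
```

The stated key determines the result: grouping the same toy table by `order_id` alone reports both orders as duplicated, which is why the primary key has to be confirmed before reading this check as a failure.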
### 3. Null check on critical columns
```sql
select
sum(case when <col_1> is null then 1 else 0 end) as nulls_col_1,
sum(case when <col