schema-designlisted

Data modeling for analytical workloads — star schema, snowflake schema, one big table (OBT), slowly changing dimensions (SCD), normalization tradeoffs, grain definition, and surrogate key strategies. Use this skill whenever the user is designing or reviewing a data warehouse schema, planning a fact/dimension table layout, deciding how to model a business entity (customer, order, event, product), or asking how to handle historical changes to dimension attributes. Also trigger when the user asks about dbt model design, table granularity, or how to structure data for BI tools like Looker, Tableau, or Power BI. Get this right before building the pipeline — a bad schema is expensive to fix later.
Methasit-Pun/data_engineer_claude_skills · ★ 1 · Web & Frontend · score 62

Install: claude install-skill Methasit-Pun/data_engineer_claude_skills

# Schema Design for Analytical Workloads ## Start with the Grain The single most important decision in schema design is the **grain**: what does one row represent? Be explicit and precise. Bad grain definition: "orders" Good grain definition: "one row per order line item, at the time of fulfillment" Every column in a fact table must be true at that grain. If you try to mix grains in one table (order-level and line-item-level facts), you'll produce incorrect aggregations and confuse every analyst who touches it. --- ## Star Schema — Default Choice for Analytics A star schema has one central fact table surrounded by dimension tables. It's optimized for BI tool query patterns — simple JOINs, fast aggregations. ``` fct_orders ├── order_id (PK) ├── customer_key (FK → dim_customers) ├── product_key (FK → dim_products) ├── date_key (FK → dim_date) ├── quantity └── revenue_usd dim_customers ├── customer_key (surrogate PK) ├── customer_id (natural/source key) ├── name ├── country └── segment dim_date ├── date_key ├── full_date ├── year, month, week, day_of_week └── is_holiday ``` **Rules:** - Fact tables contain measures (numbers you aggregate) and foreign keys to dimensions - Dimension tables contain descriptive attributes (text, categories, dates) - Dimension tables are denormalized — repeat values rather than normalizing them out --- ## Slowly Changing Dimensions (SCD) What happens when a customer changes their country, or a product