Install: `claude install-skill Rockfish-Data/tacklebox`
# Generate from schema
Use `rockfish.actions.GenerateFromDataSchema` to produce synthetic datasets from a schema specification.
## When to use this skill
Use when the user wants to generate synthetic tabular or time-series data with:
- Specific column types (IDs, categoricals, numeric distributions).
- Derived columns (computed from other columns — e.g. mapping or sampling).
- Stateful behavior (state machines or timeseries).
- Cross-entity relationships (foreign keys, including composite keys).
- Realistic PII-like values (names, emails, addresses, SSNs).
If the user wants to inject *scenarios* (spikes, outages, ramps, shifts) into an existing time-series dataset, use the `inject-scenarios` skill instead.
## Concept
`rockfish.actions.GenerateFromDataSchema` takes a `DataSchema` and produces one synthetic dataset per `Entity`. A schema is a tree:
```
DataSchema
├── entities: list[Entity]
│   ├── name, cardinality
│   ├── columns: list[Column]
│   │   ├── name, data_type
│   │   ├── column_type (independent | derived | stateful | foreign_key)
│   │   └── domain (id | categorical | uniform_dist | state_machine
│   │             | timeseries | named_entity_provider | ...)
│   └── (optional) timestamp
├── entity_relationships: list[EntityRelationship]
└── (optional) global_timestamp
```
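To make the tree concrete, here is a minimal sketch of a schema written as a plain Python dict. The key names (`entities`, `name`, `cardinality`, `columns`, `data_type`, `column_type`, `domain`, `entity_relationships`) come from the tree above. Everything else is an assumption made for illustration: the `users` entity, the `data_type` strings, the shape of each `domain` value, and the `summarize` helper are hypothetical, not the format Rockfish requires. The sketch only walks the structure; it does not call `rockfish`.

```python
import json

# Column types listed in the schema tree above.
COLUMN_TYPES = {"independent", "derived", "stateful", "foreign_key"}

# Hypothetical schema. Key names follow the tree; the value shapes
# (especially the "domain" payloads) are assumptions, not the real format.
schema = {
    "entities": [
        {
            "name": "users",
            "cardinality": 100,
            "columns": [
                {
                    "name": "user_id",
                    "data_type": "string",
                    "column_type": "independent",
                    "domain": {"type": "id"},
                },
                {
                    "name": "plan",
                    "data_type": "string",
                    "column_type": "independent",
                    "domain": {"type": "categorical", "values": ["free", "pro"]},
                },
            ],
        }
    ],
    "entity_relationships": [],
}


def summarize(schema: dict) -> dict:
    """Map each entity name to its column names, rejecting unknown column types."""
    summary = {}
    for entity in schema["entities"]:
        names = []
        for column in entity["columns"]:
            if column["column_type"] not in COLUMN_TYPES:
                raise ValueError(
                    f"unknown column_type {column['column_type']!r} "
                    f"in {entity['name']}.{column['name']}"
                )
            names.append(column["name"])
        summary[entity["name"]] = names
    return summary


if __name__ == "__main__":
    print(json.dumps(summarize(schema), indent=2))
```

A local check like this can catch a misspelled `column_type` before the schema is handed to the generation action. Consult the Rockfish documentation for the exact field shapes.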
## How to use
1. Construct a `DataSchema` matching the user's data requirements. Two equivalent forms:
- **JSON dict** — convenient for simple cases, language-agnostic.