
clean-data-xls

Clean up messy spreadsheet data — trim whitespace, fix inconsistent casing, convert numbers-stored-as-text, standardize dates, remove duplicates, and flag mixed-type columns. Use when data is messy, inconsistent, or needs prep before analysis. Triggers on "clean this data", "clean up this sheet", "normalize this data", "fix formatting", "dedupe", "standardize this column", and "this data is messy".
Borjani1577/claude-office-skills · ★ 0 · Data & Documents · score 62
Install: claude install-skill Borjani1577/claude-office-skills
# Clean Data

Clean messy data in the active sheet or a specified range.

## Preflight: Dependency Check

Before starting, verify required libraries are installed and install any that are missing.

```bash
python3 -c "import openpyxl" 2>/dev/null || python3 -m pip install openpyxl
```

**Important**: Do not skip this step. The workflow below will fail without these libraries.

## Environment

- **If running inside Excel (Office Add-in / Office JS):** Use Office JS directly. Read via `range.values`, write helper-column formulas via `range.formulas = [["=TRIM(A2)"]]`. The in-place vs helper-column decision still applies.
- **If operating on a standalone `.xlsx` file:** Use Python and `openpyxl`.

## Workflow

### Step 1: Scope

- If a range is given, such as `A1:F200`, use it.
- Otherwise use the full used range of the active sheet.
- Profile each column: detect its dominant type (text vs number vs date) and identify outliers.

### Step 2: Detect issues

| Issue | What to look for |
|---|---|
| Whitespace | Leading/trailing spaces, double spaces |
| Casing | Inconsistent casing in categorical columns like `usa`, `USA`, `Usa` |
| Number-as-text | Numeric values stored as text; stray `$`, `,`, `%` in number cells |
| Dates | Mixed formats in the same column like `3/8/26`, `2026-03-08`, `March 8 2026` |
| Duplicates | Exact-duplicate rows and near-duplicates caused by case or whitespace differences |
| Blanks | Empty cells in otherwise-populated columns |
| Mixed types | A column