pandas-pro

Solid

Performs pandas DataFrame operations for data analysis, manipulation, and transformation. Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation tasks such as joining DataFrames on multiple keys, pivoting tables, resampling time series, handling NaN values with interpolation or forward-fill, groupby aggregations, type conversion, or performance optimization of large datasets.

Data & Documents 9,537 stars 808 forks Updated 1 week ago MIT

Quality Score: 94/100

Score weights: Stars 20%, Recency 20%, Frontmatter 20%, Documentation 15%, Issue Health 10%, License 10%, Description 5%

Skill Content

# Pandas Pro

Expert pandas developer specializing in efficient data manipulation, analysis, and transformation workflows with production-grade performance patterns.

## Core Workflow

1. **Assess data structure**: Examine dtypes, memory usage, missing values, data quality:

   ```python
   print(df.dtypes)
   print(df.memory_usage(deep=True).sum() / 1e6, "MB")
   print(df.isna().sum())
   print(df.describe(include="all"))
   ```

2. **Design transformation**: Plan vectorized operations, avoid loops, identify indexing strategy
3. **Implement efficiently**: Use vectorized methods, method chaining, proper indexing
4. **Validate results**: Check dtypes, shapes, null counts, and row counts:

   ```python
   assert result.shape[0] == expected_rows, f"Row count mismatch: {result.shape[0]}"
   assert result.isna().sum().sum() == 0, "Unexpected nulls after transform"
   assert set(result.columns) == expected_cols
   ```

5. **Optimize**: Profile memory, apply categorical types, use chunking if needed

## Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| DataFrame Operations | `references/dataframe-operations.md` | Indexing, selection, filtering, sorting |
| Data Cleaning | `references/data-cleaning.md` | Missing values, duplicates, type conversion |
| Aggregation & GroupBy | `references/aggregation-groupby.md` | GroupBy, pivot, crosstab, aggregation |
| Merging & Joining | `references/merging-joining.md` ...
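
Steps 2, 3, and 5 of the Core Workflow above come without code, so here is a minimal sketch of what they might look like in practice. It is not taken from the skill itself: the sample data, the column names (`region`, `units`, `unit_price`), and the derived `revenue` column are all hypothetical, chosen only to illustrate a vectorized, method-chained transform followed by a categorical-dtype memory check. Chunked reading is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names and values are illustrative only.
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "region": rng.choice(["north", "south", "east", "west"], size=n),
    "units": rng.integers(1, 50, size=n),
    "unit_price": rng.uniform(1.0, 20.0, size=n).round(2),
})

# Steps 2-3: a vectorized, method-chained transform (no Python-level row loop).
result = (
    df
    .assign(revenue=lambda d: d["units"] * d["unit_price"])
    .query("revenue > 0")
    .groupby("region", as_index=False)
    .agg(total_revenue=("revenue", "sum"), orders=("units", "size"))
    .sort_values("total_revenue", ascending=False)
    .reset_index(drop=True)
)

# Step 5: profile memory, then store the low-cardinality string column as a category.
before_mb = df.memory_usage(deep=True).sum() / 1e6
optimized = df.astype({"region": "category"})
after_mb = optimized.memory_usage(deep=True).sum() / 1e6
print(f"{before_mb:.2f} MB -> {after_mb:.2f} MB")
```

The categorical conversion tends to pay off only when a column has few distinct values relative to its length, as `region` does here; for high-cardinality columns it may not reduce memory at all.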

Details

Author: Jeffallan
Repository: Jeffallan/claude-skills
Created: 7 months ago
Last Updated: 1 week ago
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

pandas-pro

Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation, missing value handling, groupby operations, or performance optimization.

2 Updated today

zacklecon

AI & Automation Solid

pandas-dataframe-analyzer

Automated DataFrame analysis skill for statistical summaries, missing value detection, data type inference, and memory optimization recommendations.

1,160 Updated today

a5c-ai

Data & Documents Listed

data-wrangler

Production-grade tabular data manipulation using pandas & openpyxl. This skill should be used when editing, creating, filtering, sorting, merging, pivoting, deduplicating, validating, or transforming CSV, Excel (xlsx/xls), JSON, Parquet, or TSV files. Supports 18 operations via CLI scripts, advanced Excel formatting (multi-sheet, freeze, auto-filter, validation, styling), and file-converter integration for format pipelines.

24 Updated 2 days ago

georgekhananaev

Data & Documents Listed

python-data-patterns

Pandas, Polars, and PySpark idioms for production data engineering — chunked reads, memory-safe transforms, vectorized operations, type optimization, and performance patterns. Use this skill whenever the user is writing a Python data transformation script and running into memory issues, slow performance, or correctness bugs with large datasets. Also trigger when the user asks how to handle large CSV/Parquet files, process data in batches, use Polars instead of Pandas, optimize a PySpark job, or reduce DataFrame memory usage. If you see someone iterating row-by-row over a DataFrame, this skill should trigger immediately.

0 Updated 5 days ago

Methasit-Pun

Data & Documents Listed

transforming-data

Transform raw data into analytical assets using ETL/ELT patterns, SQL (dbt), Python (pandas/polars/PySpark), and orchestration (Airflow). Use when building data pipelines, implementing incremental models, migrating from pandas to polars, or orchestrating multi-step transformations with testing and quality checks.

368 Updated 5 months ago

ancoleman