fabric-pyspark-perf-remediate
Install: `claude install-skill PatrickGallucci/fabric-skills`
# Microsoft Fabric PySpark Performance Remediation
Systematic guide for diagnosing and resolving Apache Spark performance problems in Microsoft Fabric Data Engineering workloads, including notebooks, Spark Job Definitions, and pipeline activities.
## When to Use This Skill
Activate when encountering any of these scenarios:
- PySpark notebook cells take unexpectedly long to execute
- Spark Job Definitions exceed expected duration or fail with timeouts
- Out-of-memory (OOM) errors on driver or executors
- Excessive shuffle read/write in Spark UI stage details
- Data skew causing individual tasks to run much longer than peers
- Delta Lake table writes are slow or produce many small files
- Fabric capacity utilization is high or jobs are queued/throttled
- Need to choose between resource profiles (readHeavy vs writeHeavy)
- Deciding whether to enable autotune, native execution engine, or Optimized Write
- Interpreting Spark UI metrics (stages, tasks, storage, SQL plan)
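
For the data skew scenario above, "much longer than peers" can be made concrete by comparing the slowest task in a stage to the median task. The following is a minimal sketch in plain Python, assuming you have copied per-task durations (in seconds) from the Spark UI stage details. The helper names `skew_ratio` and `looks_skewed` and the default threshold of 3.0 are hypothetical choices for illustration, not part of Fabric or Spark.

```python
from statistics import median


def skew_ratio(task_durations_s):
    """Return the slowest task duration divided by the median task duration."""
    if not task_durations_s:
        raise ValueError("need at least one task duration")
    mid = median(task_durations_s)
    if mid <= 0:
        raise ValueError("median duration must be positive")
    return max(task_durations_s) / mid


def looks_skewed(task_durations_s, threshold=3.0):
    """Flag a stage as skewed when the ratio meets an assumed threshold.

    The threshold is an illustrative assumption; tune it for your workload.
    """
    return skew_ratio(task_durations_s) >= threshold


# Hypothetical durations for five tasks in one stage, in seconds.
durations = [10.0, 12.0, 11.0, 13.0, 60.0]
print(skew_ratio(durations))    # 5.0
print(looks_skewed(durations))  # True
```

The median is used instead of the mean because a single straggler inflates the mean and so hides the very outlier the check is meant to expose.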
## Prerequisites
- Access to a Microsoft Fabric workspace with Data Engineering/Science experience
- Fabric capacity (F2 or higher) with Spark compute enabled
- Familiarity with PySpark DataFrames and Spark SQL
- Access to Spark UI via the Monitoring Hub or notebook session details
## Quick Diagnostic Workflow
Follow this triage sequence to identify the root cause:
1. **Check capacity status** - Is the Fabric capacity throttled or overloaded? See Monitoring Hub for queued jobs and CU utilization.
2. **Identi