optimize-for-gpu

Solid

GPU-accelerate Python code using CuPy, Numba CUDA, Warp, cuDF, cuML, cuGraph, KvikIO, cuCIM, cuxfilter, cuVS, cuSpatial, and RAFT. Use whenever the user mentions GPU/CUDA/NVIDIA acceleration, or wants to speed up NumPy, pandas, scikit-learn, scikit-image, NetworkX, GeoPandas, or Faiss workloads. Covers physics simulation, differentiable rendering, mesh ray casting, particle systems (DEM/SPH/fluids), vector/similarity search, GPUDirect Storage file IO, interactive dashboards, geospatial analysis, medical imaging, and sparse eigensolvers. Also use when you see CPU-bound Python code (loops, large arrays, ML pipelines, graph analytics, image processing) that would benefit from GPU acceleration, even if not explicitly requested.

Data & Documents 26,817 stars 2,774 forks Updated today MIT

Quality Score: 96/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# GPU Optimization for Python with NVIDIA

You are an expert GPU optimization engineer. Your job is to help users write new GPU-accelerated code or transform their existing CPU-bound Python code to run on NVIDIA GPUs for dramatic speedups, often 10x to 1000x for suitable workloads.

## When This Skill Applies

- User wants to speed up numerical/scientific Python code
- User is working with large arrays, matrices, or dataframes
- User mentions CUDA, GPU, NVIDIA, or parallel computing
- User has NumPy, pandas, SciPy, scikit-learn, NetworkX, or scipy.sparse.linalg code that processes large datasets
- User needs low-level GPU primitives (sparse eigensolvers, device memory management, multi-GPU communication)
- User is doing machine learning (training, inference, hyperparameter tuning, preprocessing)
- User is doing graph analytics (centrality, community detection, shortest paths, PageRank, etc.)
- User is doing vector search, nearest neighbor search, similarity search, or building a RAG pipeline
- User has Faiss, Annoy, ScaNN, or sklearn NearestNeighbors code that could be GPU-accelerated
- User wants GPU-accelerated interactive dashboards, cross-filtering, or exploratory data analysis on large datasets
- User is doing geospatial analysis (point-in-polygon, spatial joins, trajectory analysis, distance calculations) with GeoPandas or shapely
- User is doing image processing, computer vision, or medical imaging (filtering, segmentation, morphology, feature detection) with scikit-i...
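
As a rough illustration of the kind of NumPy-to-GPU transformation the excerpt above describes, the sketch below runs the same array code on CuPy when a usable GPU is present and falls back to NumPy otherwise. It is not taken from the skill itself: the helper names `row_norms` and `to_host` are hypothetical, and the fallback strategy is an assumption made so the example runs on machines without CUDA.

```python
import numpy as np

# Assumption: CuPy is optional. If it is missing, or no usable GPU is found,
# fall back to NumPy so the same code still runs on the CPU.
try:
    import cupy as cp
    cp.zeros(1)  # touches the device; raises if no working GPU is available
    xp = cp
except Exception:
    xp = np


def row_norms(a):
    """Euclidean norm of each row, computed on the active backend (xp)."""
    return xp.sqrt((a * a).sum(axis=1))


def to_host(a):
    """Return a NumPy array, copying from GPU memory when CuPy is in use."""
    if xp is np:
        return np.asarray(a)
    return xp.asnumpy(a)


# Hypothetical input data, chosen only for illustration.
data = xp.asarray([[3.0, 4.0], [6.0, 8.0]])
norms = to_host(row_norms(data))
print(norms)
```

Because CuPy mirrors much of the NumPy API, only the module alias `xp` changes between the two paths. Copies between host and device are explicit (`to_host` here), which is usually where the cost of small workloads hides.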

Details

Author: K-Dense-AI
Repository: K-Dense-AI/scientific-agent-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

cuda-toolkit

Deep integration with NVIDIA CUDA toolkit for kernel development, compilation, and debugging. Execute nvcc compilation with optimization flags analysis, generate and validate CUDA kernel code, analyze PTX/SASS assembly output, and configure execution parameters.

1,160 Updated today
a5c-ai
AI & Automation Listed

matlab-performance-optimizer

Optimize MATLAB code for better performance through vectorization, memory management, and profiling. Use when user requests optimization, mentions slow code, performance issues, speed improvements, or asks to make code faster or more efficient.

5 Updated today
LiHongwei-cn
API & Backend Listed

matlab-performance-optimizer

Optimize MATLAB code for better performance through vectorization, memory management, and profiling. Use when user requests optimization, mentions slow code, performance issues, speed improvements, or asks to make code faster or more efficient.

117 Updated 2 days ago
matlab
AI & Automation Solid

gpu-benchmarking

Expert skill for automated GPU performance benchmarking and regression detection. Design micro-benchmarks, measure kernel execution time with CUDA events, calculate achieved vs theoretical performance, generate comparison reports, detect regressions in CI/CD, and profile power/thermal characteristics.

1,160 Updated today
a5c-ai
AI & Automation Solid

gpu-resource-optimizer

Optimize gpu resource optimizer operations. Auto-activating skill for ML Deployment. Triggers on: gpu resource optimizer. Part of the ML Deployment skill category. Use when working with gpu resource optimizer functionality. Trigger with phrases like "gpu resource optimizer", "gpu optimizer", "gpu".

2,274 Updated today
jeremylongshore