
data-loading

Optimize the data loading pipeline to prevent GPU starvation. Use when setting up a DataLoader or data preprocessing.
thada2402/AutoResearchClaw · ★ 1 · Data & Documents · score 73
Install: claude install-skill thada2402/AutoResearchClaw
## Efficient Data Loading Best Practices

1. Use `num_workers = min(8, os.cpu_count())` for the DataLoader.
2. Enable `pin_memory=True` when training on a GPU.
3. Use `persistent_workers=True` to avoid re-spawning worker processes every epoch (this requires `num_workers > 0`).
4. Pre-compute and cache transformations when possible.
5. For image data: use `torchvision.transforms.v2` (faster).
6. For large datasets: consider memory-mapped files or WebDataset.
7. Profile with `torch.utils.bottleneck` to find I/O bottlenecks.
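Items 1 to 3 can be sketched as a small helper that builds the keyword arguments for `torch.utils.data.DataLoader`. This is a minimal sketch, not part of the skill itself: the helper name `dataloader_kwargs` and its `max_workers` parameter are hypothetical, and the fallback to one worker when `os.cpu_count()` returns `None` is an added assumption.

```python
import os


def dataloader_kwargs(use_gpu: bool, max_workers: int = 8) -> dict:
    """Build DataLoader keyword arguments (hypothetical helper, for illustration)."""
    # os.cpu_count() can return None, so fall back to a single worker (assumption).
    num_workers = min(max_workers, os.cpu_count() or 1)
    return {
        "num_workers": num_workers,
        # Pinned host memory only helps when batches are copied to a GPU.
        "pin_memory": use_gpu,
        # DataLoader rejects persistent_workers=True when num_workers == 0.
        "persistent_workers": num_workers > 0,
    }
```

With PyTorch installed, the result would be unpacked into the loader, for example `DataLoader(dataset, batch_size=64, **dataloader_kwargs(torch.cuda.is_available()))`, where `dataset` and the batch size are placeholders.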