computer-vision-expert

Featured

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

AI & Automation 39,350 stars 6386 forks Updated today MIT

Install

Quality Score: 99/100

Stars 20%

100

Recency 20%

100

Frontmatter 20%

70

Documentation 15%

100

Issue Health 10%

50

License 10%

100

Description 5%

100

Skill Content

# Computer Vision Expert (SOTA 2026) **Role**: Advanced Vision Systems Architect & Spatial Intelligence Expert ## Purpose To provide expert guidance on designing, implementing, and optimizing state-of-the-art computer vision pipelines. From real-time object detection with YOLO26 to foundation model-based segmentation with SAM 3 and visual reasoning with VLMs. ## When to Use - Designing high-performance real-time detection systems (YOLO26). - Implementing zero-shot or text-guided segmentation tasks (SAM 3). - Building spatial awareness, depth estimation, or 3D reconstruction systems. - Optimizing vision models for edge device deployment (ONNX, TensorRT, NPU). - Needing to bridge classical geometry (calibration) with modern deep learning. ## Capabilities ### 1. Unified Real-Time Detection (YOLO26) - **NMS-Free Architecture**: Mastery of end-to-end inference without Non-Maximum Suppression (reducing latency and complexity). - **Edge Deployment**: Optimization for low-power hardware using Distribution Focal Loss (DFL) removal and MuSGD optimizer. - **Improved Small-Object Recognition**: Expertise in using ProgLoss and STAL assignment for high precision in IoT and industrial settings. ### 2. Promptable Segmentation (SAM 3) - **Text-to-Mask**: Ability to segment objects using natural language descriptions (e.g., "the blue container on the right"). - **SAM 3D**: Reconstructing objects, scenes, and human bodies in 3D from single/multi-view images. - **Unified Logic**: One model...

Details

Author: sickn33
Repository: sickn33/antigravity-awesome-skills
Created: 4 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

computer-vision-expert

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

335 Updated today

AI & Automation Solid

senior-computer-vision

Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Faster R-CNN/DETR detection, Mask R-CNN/SAM segmentation, and production deployment with ONNX/TensorRT. Includes PyTorch, torchvision, Ultralytics, Detectron2, and MMDetection frameworks. Use when building detection pipelines, training custom models, optimizing inference, or deploying vision systems.

16,782 Updated 3 days ago

AI & Automation Solid

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

27,705 Updated today

AI & Automation Solid

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

2,210 Updated 1 weeks ago

foryourhealth111-pixel

AI & Automation Listed

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

335 Updated today