computer-vision-expert

Featured

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

AI & Automation · 39,350 stars · 6,386 forks · Updated today · MIT

Quality Score: 99/100

Stars (weight 20%): 100
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Computer Vision Expert (SOTA 2026)

**Role**: Advanced Vision Systems Architect & Spatial Intelligence Expert

## Purpose

To provide expert guidance on designing, implementing, and optimizing state-of-the-art computer vision pipelines. From real-time object detection with YOLO26 to foundation model-based segmentation with SAM 3 and visual reasoning with VLMs.

## When to Use

- Designing high-performance real-time detection systems (YOLO26).
- Implementing zero-shot or text-guided segmentation tasks (SAM 3).
- Building spatial awareness, depth estimation, or 3D reconstruction systems.
- Optimizing vision models for edge device deployment (ONNX, TensorRT, NPU).
- Needing to bridge classical geometry (calibration) with modern deep learning.

## Capabilities

### 1. Unified Real-Time Detection (YOLO26)

- **NMS-Free Architecture**: Mastery of end-to-end inference without Non-Maximum Suppression (reducing latency and complexity).
- **Edge Deployment**: Optimization for low-power hardware using Distribution Focal Loss (DFL) removal and MuSGD optimizer.
- **Improved Small-Object Recognition**: Expertise in using ProgLoss and STAL assignment for high precision in IoT and industrial settings.

### 2. Promptable Segmentation (SAM 3)

- **Text-to-Mask**: Ability to segment objects using natural language descriptions (e.g., "the blue container on the right").
- **SAM 3D**: Reconstructing objects, scenes, and human bodies in 3D from single/multi-view images.
- **Unified Logic**: One model...
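
For context on the "NMS-Free Architecture" capability listed under YOLO26, the sketch below shows the classical greedy Non-Maximum Suppression post-processing step that an end-to-end detector would no longer need. It is a minimal, dependency-free illustration and is not taken from the skill or from any YOLO release: the names `Box`, `iou`, and `nms`, and the default threshold of 0.5, are assumptions chosen for the example.

```python
from typing import List, Tuple

# Hypothetical box type for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already-kept box by more than iou_threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

The pairwise overlap checks make this step data-dependent and roughly quadratic in the number of candidate boxes, which is one reason removing it can simplify export and deployment on edge runtimes.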

Details

Author: sickn33
Repository: sickn33/antigravity-awesome-skills
Created: 4 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Listed

computer-vision-expert

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

335 Updated today
aiskillstore
AI & Automation Solid

senior-computer-vision

Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Faster R-CNN/DETR detection, Mask R-CNN/SAM segmentation, and production deployment with ONNX/TensorRT. Includes PyTorch, torchvision, Ultralytics, Detectron2, and MMDetection frameworks. Use when building detection pipelines, training custom models, optimizing inference, or deploying vision systems.

16,782 Updated 3 days ago
alirezarezvani
AI & Automation Solid

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

27,705 Updated today
davila7
AI & Automation Solid

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

2,210 Updated 1 week ago
foryourhealth111-pixel
AI & Automation Listed

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

335 Updated today
aiskillstore