vector-index-tuning

Solid

Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.

AI & Automation 36,649 stars 3968 forks Updated today MIT

Install

View on GitHub

Quality Score: 93/100

Stars 20%
100
Recency 20%
100
Frontmatter 20%
70
Documentation 15%
100
Issue Health 10%
50
License 10%
100
Description 5%
100

Skill Content

# Vector Index Tuning Guide to optimizing vector indexes for production performance. ## When to Use This Skill - Tuning HNSW parameters - Implementing quantization - Optimizing memory usage - Reducing search latency - Balancing recall vs speed - Scaling to billions of vectors ## Core Concepts ### 1. Index Type Selection ``` Data Size Recommended Index ──────────────────────────────────────── < 10K vectors → Flat (exact search) 10K - 1M → HNSW 1M - 100M → HNSW + Quantization > 100M → IVF + PQ or DiskANN ``` ### 2. HNSW Parameters | Parameter | Default | Effect | | ------------------ | ------- | ---------------------------------------------------- | | **M** | 16 | Connections per node, ↑ = better recall, more memory | | **efConstruction** | 100 | Build quality, ↑ = better index, slower build | | **efSearch** | 50 | Search quality, ↑ = better recall, slower search | ### 3. Quantization Types ``` Full Precision (FP32): 4 bytes × dimensions Half Precision (FP16): 2 bytes × dimensions INT8 Scalar: 1 byte × dimensions Product Quantization: ~32-64 bytes total Binary: dimensions/8 bytes ``` ## Templates ### Template 1: HNSW Parameter Tuning ```python import numpy as np from typing import List, Tuple import time def benchmark_hnsw_parameters( vectors: np.ndarray, queries: np.ndarray, ground_truth...

Details

Author
wshobson
Repository
wshobson/agents
Created
10 months ago
Last Updated
today
Language
Python
License
MIT

Integrates with

Similar Skills

Semantically similar based on skill content — not just same category