Vector DBs for AI Search (2025)
Curated comparison for RAG, hybrid search, and agentic workloads. Pick a scenario to highlight recommendations, view a compact score matrix, and grab ready-to-run compose files.
Highlighted scenario: ship fast with low ops and great DX.
Index Tuning Cheat-Sheet
HNSW
Knobs: M, efConstruction, efSearch
Start with: M=32 • efC=200 • ef=64
↑ef ⇒ ↑recall and ↑latency. Handles filtered search well; memory grows with M (see the FAISS sketch after this cheat-sheet).
IVF-Flat
Knobs: nlist, nprobe
Start with: nlist≈√N • nprobe≈1–5% nlist
Good at large N; tune nprobe for recall/latency.
IVF-PQ / OPQ
Knobs: nlist, nprobe, m, bits
Start with: m dividing dim (e.g., dim=768 → m=96) • 8 bits/subvector
Big memory savings; slight recall loss vs Flat.
DiskANN / tiered
Knobs: cache %, prefetch
Start with: warm cache for steady p95
SSD-backed; good for 100M+ vectors with cost control.
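As a concrete reference, here is a minimal sketch of these starting points in FAISS (an assumed library choice with synthetic data; the knob names map directly onto most engines' index configs):

# Applying the cheat-sheet starting points in FAISS (assumes `pip install faiss-cpu numpy`).
import numpy as np
import faiss

dim = 768
xb = np.random.rand(20_000, dim).astype("float32")   # placeholder corpus
xq = np.random.rand(10, dim).astype("float32")        # placeholder queries

# HNSW: M=32, efConstruction=200, efSearch=64
hnsw = faiss.IndexHNSWFlat(dim, 32)
hnsw.hnsw.efConstruction = 200
hnsw.add(xb)
hnsw.hnsw.efSearch = 64             # raise for recall, at a latency cost
D, I = hnsw.search(xq, 20)

# IVF-PQ: nlist ≈ sqrt(N), m must divide dim, 8 bits per subvector
nlist = int(np.sqrt(len(xb)))       # ≈ 141 here
quantizer = faiss.IndexFlatL2(dim)
ivfpq = faiss.IndexIVFPQ(quantizer, dim, nlist, 96, 8)  # m=96 divides dim=768
ivfpq.train(xb)
ivfpq.add(xb)
ivfpq.nprobe = max(1, nlist // 50)  # ~1–2% of nlist
D, I = ivfpq.search(xq, 20)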
Quick Sizing & Cost Estimator
Rules of thumb only. Real footprints vary with M/ef, nlist/nprobe, PQ code size, metadata, and filtering structures.
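As a back-of-the-envelope starting point, the sketch below estimates only the dominant terms (raw float32 vectors, HNSW links, PQ codes); the constants are assumptions, not measurements:

# Rough memory estimator for the rules of thumb above.
# Real systems add metadata, filter structures, and allocator overhead on top.

def estimate_bytes(n_vectors: int, dim: int, index: str = "hnsw-flat",
                   M: int = 32, pq_m: int = 96) -> int:
    raw = n_vectors * dim * 4                  # float32 vectors
    if index == "hnsw-flat":
        links = n_vectors * M * 2 * 4          # ~2*M neighbor ids at 4 bytes each (rough)
        return raw + links
    if index == "ivf-pq":
        codes = n_vectors * pq_m               # 8 bits per subvector => 1 byte each
        return codes + n_vectors * 8           # + ids; centroids are negligible
    return raw                                 # flat / IVF-Flat: vectors dominate

if __name__ == "__main__":
    for kind in ("hnsw-flat", "ivf-pq"):
        gb = estimate_bytes(5_000_000, 768, kind) / 1e9
        print(f"{kind}: ~{gb:.1f} GB")         # e.g. ~16.6 GB vs ~0.5 GB for 5M x 768-d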
Benchmark Methodology (template)
{ "dataset": { "name": "internal-wiki", "N": 5000000, "dim": 768, "lang": "en" }, "query": { "k": 20, "filters": { "team": [ "eng", "support" ] }, "hybrid": { "alpha": 0.6 } }, "latency": { "budget_ms_p95": 80, "target_qps": 300, "warm": true }, "index": { "kind": "HNSW", "M": 32, "efConstruction": 200, "efSearch": 64 }, "writeMix": { "upserts_per_min": 200, "deletes_per_min": 10 }, "metrics": [ "recall@20", "p50", "p95", "p99", "error_rate" ], "rerank": { "enabled": true, "topK": 50 } }
Use the same harness across systems; keep recall targets constant and compare p95/p99 under the same filters/hybrid/rerank settings.
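A harness for this template only needs exact-search ground truth plus per-query latency samples; a minimal sketch of the metric helpers (names are illustrative, not tied to any specific tool):

# Metric helpers for the benchmark template above.
# `ground_truth` holds the exact top-k neighbor ids per query; `results` comes from the system under test.
import numpy as np

def recall_at_k(ground_truth: list[set], results: list[list], k: int = 20) -> float:
    hits = sum(len(gt & set(res[:k])) / min(k, len(gt))
               for gt, res in zip(ground_truth, results))
    return hits / len(ground_truth)

def latency_percentiles(samples_ms: list[float]) -> dict:
    arr = np.asarray(samples_ms)
    return {"p50": float(np.percentile(arr, 50)),
            "p95": float(np.percentile(arr, 95)),
            "p99": float(np.percentile(arr, 99))}

# Run the same queries (same filters / hybrid alpha / rerank topK) against every system,
# record per-query latency, then report recall@20 alongside p50/p95/p99.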
Hybrid Search Recipes
score = α * cosine(vec(q), vec(doc)) + (1 - α) * bm25(q, doc)
Start α at 0.6; tune by validation recall.
RRF(d) = Σ_s 1 / (k + rank_s(d)) # k≈60
Stable blend across multiple rankers (vector, keyword, metadata).
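Both recipes are a few lines of code; the sketch below (doc IDs and the k≈60 constant are illustrative) implements the α-blend and RRF over per-ranker rankings:

# Hybrid scoring sketches for the two recipes above.

def alpha_blend(cos_sim: float, bm25: float, alpha: float = 0.6) -> float:
    # score = alpha * cosine + (1 - alpha) * bm25; normalize BM25 to [0, 1] first.
    return alpha * cos_sim + (1 - alpha) * bm25

def rrf(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    # rankings: one ranked doc-id list per ranker (vector, keyword, metadata, ...).
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))

# Example: fuse a vector ranking with a keyword ranking.
fused = rrf([["d3", "d1", "d7"], ["d1", "d9", "d3"]])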
Qdrant · Recommended · Open Source
Rust-powered OSS, fast HNSW with filters
Open-source vector DB built for high RPS and low latency with robust filtering/payloads; managed cloud available. A minimal client sketch follows this card.
Pros
- Great speed/recall trade-off
- Simple filters/payloads
- Cloud option
Cons
- Fewer “platform” features
- Backups/sharding need planning
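A minimal qdrant-client sketch of the payload filtering mentioned above (collection name, payload field, and vector size are placeholders; assumes a local instance like the compose file further down):

# Qdrant sketch: create a collection, upsert with payload, search with a filter.
# Assumes `pip install qdrant-client`.
from qdrant_client import QdrantClient
from qdrant_client.models import (Distance, FieldCondition, Filter, MatchValue,
                                  PointStruct, VectorParams)

client = QdrantClient(url="http://localhost:6333")

client.recreate_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

client.upsert(
    collection_name="docs",
    points=[PointStruct(id=1, vector=[0.1] * 768, payload={"team": "eng"})],
)

hits = client.search(
    collection_name="docs",
    query_vector=[0.1] * 768,
    query_filter=Filter(must=[FieldCondition(key="team", match=MatchValue(value="eng"))]),
    limit=20,
)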
Pinecone · Recommended · Managed
Serverless, fully managed vector DB
Managed vector database with serverless scale, low-latency query paths, and strong ecosystem integrations for RAG/agents.
Pros
- Zero-ops scaling
- Great integrations
- Multi-tenant friendly
Cons
- Proprietary & usage-priced
- Less infra control
Weaviate · Recommended · Open Source
AI-native OSS with hybrid search & filters
Open-source vector DB with first-class hybrid (dense+sparse) search, filters, and flexible schema. Cloud or self-hosted.
Pros
- Hybrid + filters
- Cloud/on-prem
- Mature tooling
Cons
- More ops at large scale
- Tuning depth adds a learning curve
Milvus (Zilliz Cloud) · Open Source
Scale-first OSS; managed by Zilliz
Vector database for very large corpora and high throughput. Zilliz Cloud provides a managed route to reduce ops.
Pros
- Massive scale
- Active community + cloud
- Flexible indexes
Cons
- Heavier to operate on-prem
- Overkill for small apps
Elasticsearch · Search Engine
kNN in a battle-tested search engine
Dense vectors + kNN alongside keyword search and analytics. Strong hybrid when you already run Elastic.
Pros
- Keyword+vector in one stack
- Mature ecosystem
- Security/observability
Cons
- Vector feature constraints
- Pure ANN perf can trail dedicated vector DBs
Vespa · Search Engine
Serving engine with ANN, filters, and ranking pipelines
Production engine with HNSW ANN, advanced ranking pipelines, and strong filtering—great for large catalogs & complex scoring.
Pros
- Advanced ranking/hybrid
- Large-scale serving patterns
Cons
- Heavier to operate
- Higher learning curve
PostgreSQL + pgvector · Database Extension
Bring vector search to Postgres
Postgres extension adding IVFFlat/HNSW vector indexes. Keep vectors next to relational data for simpler stacks; a minimal SQL sketch follows this card.
Pros
- Single DB (SQL+vectors)
- Great for moderate scale
- Simple backups
Cons
- Lower peak perf than specialists
- Recall/latency hinge on tuning
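A minimal psycopg sketch of the "vectors next to relational data" pattern (table and column names are placeholders; HNSW requires pgvector 0.5+):

# pgvector sketch: vector column beside relational columns, HNSW index, filtered kNN query.
# Assumes `pip install psycopg` and the pgvector extension available in Postgres.
import psycopg

query_vec = [0.1] * 768
vec_literal = "[" + ",".join(map(str, query_vec)) + "]"

with psycopg.connect("dbname=app user=app") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            team text,
            body text,
            embedding vector(768)
        )""")
    conn.execute("""
        CREATE INDEX IF NOT EXISTS docs_embedding_hnsw
        ON docs USING hnsw (embedding vector_cosine_ops)
        WITH (m = 32, ef_construction = 200)""")
    conn.execute("SET hnsw.ef_search = 64")
    rows = conn.execute(
        "SELECT id, body FROM docs WHERE team = %s "
        "ORDER BY embedding <=> %s::vector LIMIT 20",
        ("eng", vec_literal),
    ).fetchall()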
OpenSearch · Search Engine
Elastic fork with k-NN plugin
Vector search via the k-NN plugin (HNSW/IVF). Solid open-source option, especially for AWS-centric stacks.
Pros
- AWS integration
- Hybrid search story
Cons
- Ops/tuning involved
- Peak ANN performance trails specialists
Quick docker-compose
Qdrant (minimal)
# Qdrant docker-compose (minimal)
version: "3.9"
services:
  qdrant:
    image: qdrant/qdrant:v1.11
    ports:
      - "6333:6333"   # REST
      - "6334:6334"   # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    environment:
      QDRANT__SERVICE__GRPC_PORT: 6334
# start: docker compose up -d
Milvus + deps (standalone)
# Milvus + dependencies (standalone) via docker compose
version: "3.9"
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      ETCD_AUTO_COMPACTION_MODE: revision
      ETCD_AUTO_COMPACTION_RETENTION: "1000"
      ETCD_QUOTA_BACKEND_BYTES: "4294967296"
    volumes:
      - ./volumes/etcd:/etcd
    command: >
      etcd
      -advertise-client-urls http://etcd:2379
      -listen-client-urls http://0.0.0.0:2379
      -data-dir /etcd
  minio:
    image: minio/minio:RELEASE.2024-05-10T01-41-38Z
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - ./volumes/minio:/data
    command: server /data --console-address ":9001"
  milvus:
    image: milvusdb/milvus:v2.4.4
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379   # wire Milvus to the etcd service above
      MINIO_ADDRESS: minio:9000   # wire Milvus to the MinIO service above
    depends_on: [etcd, minio]
    ports:
      - "19530:19530"   # gRPC
      - "9091:9091"     # HTTP
    volumes:
      - ./volumes/milvus:/var/lib/milvus
# start: docker compose up -d
Notes: adjust versions, mount persistent volumes, and lock down exposed ports. For production, add healthchecks, backups, metrics, and TLS.