Vector DBs for AI Search — 2025 field guide
Scenario-tuned recommendations for RAG, semantic search, and agent pipelines. Compare managed and open options, inspect fit scores, and launch with ready-to-run infrastructure recipes.
Latest snapshot
Q1 · 2025
Benchmarks refreshed monthly across OSS & managed offerings.
Scenario leader
Qdrant
Fit score 88/100, tuned for the startup MVP scenario.
Ship fast with low ops & great DX.
Index tuning cheat-sheet
Quick knobs for the most common vector search strategies. Adjust for dataset scale, filter complexity, and recall targets.
HNSW
Knobs: M, efConstruction, efSearch
Start with: M=32 • efConstruction=200 • efSearch=64
↑efSearch ⇒ ↑recall + ↑latency. Handles filters well; memory grows with M.
IVF-Flat
Knobs: nlist, nprobe
Start with: nlist≈√N • nprobe≈1–5% of nlist
Good at large N; tune nprobe for recall/latency.
IVF-PQ / OPQ
Knobs: nlist, nprobe, m, bits
Start with: m | dim (m divides dim evenly), 8 bits/subvector
Big memory savings; slight recall loss vs Flat.
Disk-ANN / tiered
Knobs: cache %, prefetch
Start with: warm cache for steady p95
SSD-backed; good for 100M+ with cost control.
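The IVF and PQ starting points above are simple arithmetic, so they can be sketched in a few lines. This is a minimal illustration, not a tuning tool: the function names (`ivf_start`, `pq_subvector_counts`, `pq_code_bytes`) and the `max_m` cap are hypothetical, and the values are only the cheat-sheet defaults to begin tuning from.

```python
import math


def ivf_start(n_vectors: int) -> dict:
    """Cheat-sheet defaults: nlist ≈ √N, nprobe ≈ 1–5% of nlist."""
    nlist = max(1, round(math.sqrt(n_vectors)))
    nprobe_low = max(1, math.ceil(nlist / 100))
    nprobe_high = max(nprobe_low, math.ceil(nlist * 5 / 100))
    return {"nlist": nlist, "nprobe_range": (nprobe_low, nprobe_high)}


def pq_subvector_counts(dim: int, max_m: int = 128) -> list[int]:
    """All m that divide dim evenly (the 'm | dim' rule), capped at max_m."""
    return [m for m in range(1, min(dim, max_m) + 1) if dim % m == 0]


def pq_code_bytes(m: int, bits: int = 8) -> int:
    """Bytes per compressed vector: m subvectors at `bits` bits each."""
    return math.ceil(m * bits / 8)


if __name__ == "__main__":
    print(ivf_start(5_000_000))
    print(pq_subvector_counts(768))
    print(pq_code_bytes(96))
```

For a 768-dimensional corpus, any value returned by `pq_subvector_counts(768)` is a valid `m`; larger `m` keeps more detail per vector at the cost of a bigger code.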
Quick sizing & cost estimator
Approximate memory usage for your vector corpus. Iterate on dimensions, dtype, and index strategy before provisioning hardware.
Rules of thumb only. Real footprints vary with M/ef, nlist/nprobe, PQ code size, metadata, and filtering structures.
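A rough version of the estimate can be reproduced offline. The sketch below is an approximation under stated assumptions, not the estimator behind this page: the HNSW figure assumes about 2×M base-layer links per vector at 4 bytes each, the IVF-PQ figure assumes an 8-byte id per vector, and both ignore metadata, codebooks, and filter structures. All function names are hypothetical.

```python
import math

# Bytes per component for common vector dtypes.
DTYPE_BYTES = {"float32": 4, "float16": 2, "int8": 1}


def raw_vector_bytes(n: int, dim: int, dtype: str = "float32") -> int:
    """Uncompressed vector storage: N x dim x bytes per component."""
    return n * dim * DTYPE_BYTES[dtype]


def hnsw_bytes(n: int, dim: int, dtype: str = "float32", M: int = 32) -> int:
    """Raw vectors plus graph links.

    Assumption: about 2*M base-layer neighbor ids per vector, 4 bytes each;
    upper layers, payloads, and filter indexes are ignored.
    """
    return raw_vector_bytes(n, dim, dtype) + n * 2 * M * 4


def ivf_pq_bytes(n: int, m: int, bits: int = 8, id_bytes: int = 8) -> int:
    """PQ codes plus one id per vector; centroids and codebooks are ignored."""
    return n * (math.ceil(m * bits / 8) + id_bytes)


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    n, dim = 5_000_000, 768
    rows = [
        ("Flat float32", raw_vector_bytes(n, dim)),
        ("Flat float16", raw_vector_bytes(n, dim, "float16")),
        ("HNSW float32, M=32", hnsw_bytes(n, dim)),
        ("IVF-PQ, m=96, 8 bits", ivf_pq_bytes(n, m=96)),
    ]
    for label, size in rows:
        print(f"{label:<22}{to_gib(size):8.2f} GiB")
```

Treat the output as a lower bound when provisioning; leave headroom for payloads, replicas, and index build overhead.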
Benchmark methodology (template)
Clone, tweak, and run to keep evaluations consistent across stacks.
{
  "dataset": {
    "name": "internal-wiki",
    "N": 5000000,
    "dim": 768,
    "lang": "en"
  },
  "query": {
    "k": 20,
    "filters": {
      "team": [
        "eng",
        "support"
      ]
    },
    "hybrid": {
      "alpha": 0.6
    }
  },
  "latency": {
    "budget_ms_p95": 80,
    "target_qps": 300,
    "warm": true
  },
  "index": {
    "kind": "HNSW",
    "M": 32,
    "efConstruction": 200,
    "efSearch": 64
  },
  "writeMix": {
    "upserts_per_min": 200,
    "deletes_per_min": 10
  },
  "metrics": [
    "recall@20",
    "p50",
    "p95",
    "p99",
    "error_rate"
  ],
  "rerank": {
    "enabled": true,
    "topK": 50
  }
}
Use the same harness across systems; keep recall targets constant and compare p95/p99 under the same filters/hybrid/rerank settings.
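The metrics named in the template (recall@k and latency percentiles) can be computed the same way for every system under test. The following is a minimal sketch, assuming exact ground-truth neighbors are available from a brute-force run and that per-query latencies are collected in milliseconds; the helper names and the nearest-rank percentile method are illustrative choices, not part of the template.

```python
import math


def recall_at_k(retrieved: list, ground_truth: list, k: int) -> float:
    """Share of the true top-k neighbors that appear in the first k results."""
    truth = set(ground_truth[:k])
    if not truth:
        raise ValueError("ground_truth must not be empty")
    return len(set(retrieved[:k]) & truth) / len(truth)


def percentile(samples: list, p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: list, budget_ms_p95: float) -> dict:
    """Latency summary matching the template's p50/p95/p99 metrics."""
    p95 = percentile(latencies_ms, 95)
    return {
        "p50": percentile(latencies_ms, 50),
        "p95": p95,
        "p99": percentile(latencies_ms, 99),
        "within_budget": p95 <= budget_ms_p95,
    }


if __name__ == "__main__":
    print(recall_at_k(["a", "b", "c", "d"], ["a", "c", "e", "f"], k=4))
    print(summarize(list(range(1, 101)), budget_ms_p95=80))
```

Averaging `recall_at_k` over the full query set gives the recall@20 figure; keeping that target fixed makes the p95/p99 comparison between systems meaningful.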
Hybrid search recipes
Blend neural and lexical signals for resilient retrieval.
score = α * cosine(vec(q), vec(doc)) + (1 - α) * bm25(q, doc)
Start α at 0.6; tune by validation recall.
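The weighted blend can be sketched as below. One assumption goes beyond the formula: BM25 scores are unbounded while cosine similarity is not, so this sketch min-max normalizes each score set per query before blending. The normalization step, the function names, and the choice to give a document missing from one result list a contribution of 0 from that signal are illustrative, not prescribed by the recipe.

```python
def min_max(scores: dict) -> dict:
    """Rescale scores to [0, 1]; a constant score set maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(dense: dict, lexical: dict, alpha: float = 0.6) -> dict:
    """score = alpha * dense + (1 - alpha) * lexical, on normalized scores."""
    dense_n, lexical_n = min_max(dense), min_max(lexical)
    docs = set(dense_n) | set(lexical_n)
    return {
        doc: alpha * dense_n.get(doc, 0.0) + (1 - alpha) * lexical_n.get(doc, 0.0)
        for doc in docs
    }


def rank(scores: dict) -> list:
    """Doc ids by descending score, ties broken by id for determinism."""
    return [doc for doc, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


if __name__ == "__main__":
    dense = {"a": 0.9, "b": 0.5, "c": 0.1}      # cosine similarities
    lexical = {"a": 2.0, "b": 12.0, "d": 7.0}   # BM25 scores
    print(rank(hybrid_scores(dense, lexical, alpha=0.6)))
```

Sweeping `alpha` over a small grid (for example 0.3 to 0.8) against a labeled validation set is one way to carry out the tuning step.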
RRF(d) = Σ_s 1 / (k + rank_s(d)) # k≈60
Stable blend across multiple rankers (vector, keyword, metadata).
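Reciprocal rank fusion needs only the rank positions from each ranker, which is why it is insensitive to score scales. A minimal sketch of the formula above, assuming each ranker returns an ordered list of document ids with the best match first; the function name `rrf` is illustrative.

```python
def rrf(rankings: list, k: int = 60) -> list:
    """Fuse ranked lists: RRF(d) = sum over rankers of 1 / (k + rank(d)).

    Ranks are 1-based; a document absent from a ranker adds nothing for it.
    Returns (doc, score) pairs, best first, ties broken by doc id.
    """
    scores = {}
    for ranking in rankings:
        for position, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + position)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]
    keyword_hits = ["b", "c", "a"]
    for doc, score in rrf([vector_hits, keyword_hits]):
        print(doc, round(score, 5))
```

Because only ranks matter, a third list (for example a metadata-boosted ranker) can be added to `rankings` without any rescaling.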
Qdrant · Recommended · Open Source
Rust-powered OSS, fast HNSW with filters
Open-source vector DB built for high RPS and low latency with robust filtering/payloads; managed cloud available.
Pros
- Great speed/recall trade-off
- Simple filters/payloads
- Cloud option
Cons
- Fewer “platform” features
- Backups/sharding need planning
Pinecone · Recommended · Managed
Serverless, fully managed vector DB
Managed vector database with serverless scale, low-latency query paths, and strong ecosystem integrations for RAG/agents.
Pros
- Zero-ops scaling
- Great integrations
- Multi-tenant friendly
Cons
- Proprietary & usage-priced
- Less infra control
Weaviate · Recommended · Open Source
AI-native OSS with hybrid search & filters
Open-source vector DB with first-class hybrid (dense+sparse) search, filters, and flexible schema. Cloud or self-hosted.
Pros
- Hybrid + filters
- Cloud/on-prem
- Mature tooling
Cons
- More ops at large scale
- Tuning depth/learning curve
Milvus (Zilliz Cloud) · Open Source
Scale-first OSS; managed by Zilliz
Vector database for very large corpora and high throughput. Zilliz Cloud provides a managed route to reduce ops.
Pros
- Massive scale
- Active community + cloud
- Flexible indexes
Cons
- Heavier to operate on-prem
- Overkill for small apps
Elasticsearch · Search Engine
kNN in a battle-tested search engine
Dense vectors + kNN alongside keyword search and analytics. Strong hybrid when you already run Elastic.
Pros
- Keyword+vector in one stack
- Mature ecosystem
- Security/observability
Cons
- Vector feature constraints
- Pure ANN perf can trail dedicated vector DBs
Vespa · Search Engine
Serving engine with ANN, filters, and ranking pipelines
Production engine with HNSW ANN, advanced ranking pipelines, and strong filtering. Well suited to large catalogs & complex scoring.
Pros
- Advanced ranking/hybrid
- Large-scale serving patterns
Cons
- Heavier to operate
- Higher learning curve
PostgreSQL + pgvector · Database Extension
Bring vector search to Postgres
Postgres extension adding IVFFlat and HNSW vector indexes. Keep vectors next to relational data for simpler stacks.
Pros
- Single DB (SQL+vectors)
- Great for moderate scale
- Simple backups
Cons
- Lower peak perf than specialists
- Recall/latency hinge on tuning
OpenSearch · Search Engine
Elasticsearch fork with k-NN plugin
Vector search via k-NN plugin (HNSW/IVF). Solid open-source option, especially for AWS-centric stacks.
Pros
- AWS integration
- Hybrid search story
Cons
- Ops/tuning involved
- Peak ANN perf trails specialists
Quick docker-compose
Kickstart evaluation locally. Swap versions, wire health checks, and snap in your observability toolkit.
Qdrant (minimal)
# Qdrant docker-compose (minimal)
version: "3.9"
services:
  qdrant:
    image: qdrant/qdrant:v1.11
    ports:
      - "6333:6333" # REST
      - "6334:6334" # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    environment:
      QDRANT__SERVICE__GRPC_PORT: 6334
# start: docker compose up -d
Milvus + deps (standalone)
# Milvus + dependencies (standalone) via docker compose
version: "3.9"
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      ETCD_AUTO_COMPACTION_MODE: revision
      ETCD_AUTO_COMPACTION_RETENTION: "1000"
      ETCD_QUOTA_BACKEND_BYTES: "4294967296"
    volumes:
      - ./volumes/etcd:/etcd
    # --data-dir points etcd at the mounted volume so state survives restarts
    command: >
      etcd -advertise-client-urls http://etcd:2379
      -listen-client-urls http://0.0.0.0:2379
      --data-dir /etcd
  minio:
    image: minio/minio:RELEASE.2024-05-10T01-41-38Z
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - ./volumes/minio:/data
    command: server /data --console-address ":9001"
  milvus:
    image: milvusdb/milvus:v2.4.4
    # the image needs an explicit command to start in standalone mode
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379 # metadata store
      MINIO_ADDRESS: minio:9000 # object storage
    depends_on: [etcd, minio]
    ports:
      - "19530:19530" # gRPC
      - "9091:9091" # HTTP (health/metrics)
    volumes:
      - ./volumes/milvus:/var/lib/milvus
# start: docker compose up -d
Notes: adjust versions, mount persistent volumes, and lock down exposed ports. For production, add healthchecks, backups, metrics, and TLS.