
Vector DBs for AI Search (2025)

Curated comparison for RAG, hybrid search, and agentic workloads. Pick a scenario to highlight recommendations, view a compact score matrix, and grab ready-to-run compose files.

Scenario: Ship fast with low ops & great DX.

Top pick: Qdrant (Recommended)
Fit score: 88/100

Index Tuning Cheat-Sheet

HNSW

ANN • in-memory

Knobs: M, efConstruction, efSearch

Start with: M=32 • efC=200 • ef=64

↑ef ⇒ ↑recall + ↑latency. Good filters; memory ↑ with M.
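
For concreteness, a minimal Python sketch of these HNSW knobs using FAISS (one common ANN library; the DBs below expose equivalent settings under their own names). The data and parameter values are illustrative only.

import faiss
import numpy as np

dim = 768
xb = np.random.rand(10_000, dim).astype("float32")   # toy corpus

index = faiss.IndexHNSWFlat(dim, 32)   # M=32 graph links per node
index.hnsw.efConstruction = 200        # build-time beam width
index.add(xb)

index.hnsw.efSearch = 64               # query-time beam width: raise for recall, costs latency
distances, ids = index.search(xb[:5], k=20)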

IVF-Flat

ANN • centroids

Knobs: nlist, nprobe

Start with: nlist≈√N • nprobe≈1–5% nlist

Good at large N; tune nprobe for recall/latency.
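
A matching FAISS sketch for the IVF-Flat knobs, again with illustrative values (nlist from √N, nprobe at a few percent of nlist):

import faiss
import numpy as np

dim, n = 768, 10_000
xb = np.random.rand(n, dim).astype("float32")

nlist = int(n ** 0.5)                         # nlist ≈ √N
quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFFlat(quantizer, dim, nlist)
index.train(xb)                               # learn the centroids
index.add(xb)

index.nprobe = max(1, nlist // 50)            # probe ~2% of lists; raise for recall
distances, ids = index.search(xb[:5], k=20)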

IVF-PQ / OPQ

compression • cost

Knobs: nlist, nprobe, m, bits

Start with: m such that m | dim • 8 bits/subvector

Big memory savings; slight recall loss vs Flat.
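
And the IVF-PQ variant, showing the compression knobs (m must divide dim; 8 bits per subvector as suggested above). Values are illustrative:

import faiss
import numpy as np

dim, n = 768, 20_000
xb = np.random.rand(n, dim).astype("float32")

nlist = int(n ** 0.5)
m, bits = 96, 8                               # 96 subvectors (768 / 96 = 8 dims each), 8 bits each
quantizer = faiss.IndexFlatL2(dim)
index = faiss.IndexIVFPQ(quantizer, dim, nlist, m, bits)
index.train(xb)                               # learns centroids and PQ codebooks
index.add(xb)                                 # stores compressed codes, not raw floats

index.nprobe = max(1, nlist // 20)
distances, ids = index.search(xb[:5], k=20)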

Disk-ANN / tiered

SSD • scale

Knobs: cache %, prefetch

Start with: warm cache for steady p95

SSD-backed; good for 100M+ with cost control.

Quick Sizing & Cost Estimator

Raw vectors: 7.15 GB
Index overhead: 10.73 GB
Total footprint: 17.88 GB

Rules of thumb only. Real footprints vary with M/ef, nlist/nprobe, PQ code size, metadata, and filtering structures.
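
A minimal sketch of the rule of thumb behind these numbers, assuming float32 vectors and an index overhead of roughly 1.5× the raw data. The overhead factor and example inputs are back-derived from the figures above and will vary with the knobs listed in the disclaimer:

def footprint_gib(n_vectors: int, dim: int, overhead_factor: float = 1.5) -> dict:
    raw = n_vectors * dim * 4 / 2**30        # 4 bytes per float32 component
    index = raw * overhead_factor            # graph links / centroids / codes (assumption)
    return {
        "raw_gib": round(raw, 2),
        "index_gib": round(index, 2),
        "total_gib": round(raw + index, 2),
    }

# e.g. 2.5M vectors at 768 dims reproduces the example figures:
# {'raw_gib': 7.15, 'index_gib': 10.73, 'total_gib': 17.88}
print(footprint_gib(2_500_000, 768))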

Benchmark Methodology (template)

{
  "dataset": {
    "name": "internal-wiki",
    "N": 5000000,
    "dim": 768,
    "lang": "en"
  },
  "query": {
    "k": 20,
    "filters": {
      "team": [
        "eng",
        "support"
      ]
    },
    "hybrid": {
      "alpha": 0.6
    }
  },
  "latency": {
    "budget_ms_p95": 80,
    "target_qps": 300,
    "warm": true
  },
  "index": {
    "kind": "HNSW",
    "M": 32,
    "efConstruction": 200,
    "efSearch": 64
  },
  "writeMix": {
    "upserts_per_min": 200,
    "deletes_per_min": 10
  },
  "metrics": [
    "recall@20",
    "p50",
    "p95",
    "p99",
    "error_rate"
  ],
  "rerank": {
    "enabled": true,
    "topK": 50
  }
}

Use the same harness across systems; keep recall targets constant and compare p95/p99 under the same filters/hybrid/rerank settings.
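
To make "same harness" concrete, here is a hedged sketch of the measurement loop: run_query is a placeholder for each system's client call, queries and the exact-search ground truth are assumed to exist, and the outputs map onto the metrics in the template above.

import time
import numpy as np

def recall_at_k(approx_ids, exact_ids, k=20):
    # Fraction of the exact top-k recovered by the ANN result.
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

def measure(queries, run_query, k=20):
    latencies, recalls = [], []
    for q in queries:                                   # q: {"text", "filters", "exact_ids"}
        t0 = time.perf_counter()
        ids = run_query(q["text"], k=k, filters=q.get("filters"))
        latencies.append((time.perf_counter() - t0) * 1000.0)
        recalls.append(recall_at_k(ids, q["exact_ids"], k))
    lat = np.asarray(latencies)
    return {
        "recall@k": float(np.mean(recalls)),
        "p50_ms": float(np.percentile(lat, 50)),
        "p95_ms": float(np.percentile(lat, 95)),
        "p99_ms": float(np.percentile(lat, 99)),
    }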

Hybrid Search Recipes

Weighted fusion
score = α * cosine(vec(q), vec(doc)) + (1 - α) * bm25(q, doc)

Start α at 0.6; tune by validation recall.
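
A small Python sketch of weighted fusion. One practical detail not in the formula: BM25 scores are unbounded, so this version min-max normalizes them before blending (an assumption, not part of the recipe above).

def weighted_fusion(cosine_scores: dict, bm25_scores: dict, alpha: float = 0.6) -> dict:
    lo, hi = min(bm25_scores.values()), max(bm25_scores.values())
    span = (hi - lo) or 1.0
    bm25_norm = {d: (s - lo) / span for d, s in bm25_scores.items()}
    docs = set(cosine_scores) | set(bm25_norm)
    return {
        d: alpha * cosine_scores.get(d, 0.0) + (1 - alpha) * bm25_norm.get(d, 0.0)
        for d in docs
    }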

Reciprocal Rank Fusion (RRF)
RRF(d) = Σ_s 1 / (k + rank_s(d))   # k≈60

Stable blend across multiple rankers (vector, keyword, metadata).
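
And a matching sketch of RRF over any number of rankers; rankings is a list of ranked document-ID lists (vector, keyword, metadata), with k defaulting to 60 as above.

def rrf(rankings: list[list[str]], k: int = 60) -> dict:
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))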

Qdrant • Recommended • Open Source

Rust-powered OSS, fast HNSW with filters

Open-source vector DB built for high RPS and low latency with robust filtering/payloads; managed cloud available.

Score matrix (Fit: 88/100)
Recall 4/5 • Latency 5/5 • Cost 5/5 • Ops 4/5 • Hybrid 4/5

Pros

  • Great speed/recall trade-off
  • Simple filters/payloads
  • Cloud option

Cons

  • Fewer “platform” features
  • Backups/sharding need planning

Tags: RAG • HNSW • Hybrid • Filters • Scale

Pinecone • Recommended • Managed

Serverless, fully managed vector DB

Managed vector database with serverless scale, low-latency query paths, and strong ecosystem integrations for RAG/agents.

Score matrix (Fit: 85/100)
Recall 4/5 • Latency 5/5 • Cost 3/5 • Ops 5/5 • Hybrid 4/5

Pros

  • Zero-ops scaling
  • Great integrations
  • Multi-tenant friendly

Cons

  • Proprietary & usage-priced
  • Less infra control

Tags: RAG • HNSW • Hybrid • Filters • Scale

Weaviate • Recommended • Open Source

AI-native OSS with hybrid search & filters

Open-source vector DB with first-class hybrid (dense+sparse) search, filters, and flexible schema. Cloud or self-hosted.

Score matrix (Fit: 79/100)
Recall 4/5 • Latency 4/5 • Cost 4/5 • Ops 3/5 • Hybrid 5/5

Pros

  • Hybrid + filters
  • Cloud/on-prem
  • Mature tooling

Cons

  • More ops at large scale
  • Tuning depth/learning curve

Tags: RAG • HNSW • Hybrid • Filters • Scale

Milvus (Zilliz Cloud) • Open Source

Scale-first OSS; managed by Zilliz

Vector database for very large corpora and high throughput. Zilliz Cloud provides a managed route to reduce ops.

Score matrix (Fit: 79/100)
Recall 5/5 • Latency 4/5 • Cost 4/5 • Ops 3/5 • Hybrid 4/5

Pros

  • Massive scale
  • Active community + cloud
  • Flexible indexes

Cons

  • Heavier to operate on-prem
  • Overkill for small apps

Tags: RAG • HNSW • Hybrid • Filters • Scale

Elasticsearch • Search Engine

kNN in a battle-tested search engine

Dense vectors + kNN alongside keyword search and analytics. Strong hybrid when you already run Elastic.

Score matrix (Fit: 75/100)
Recall 4/5 • Latency 3/5 • Cost 3/5 • Ops 4/5 • Hybrid 5/5

Pros

  • Keyword+vector in one stack
  • Mature ecosystem
  • Security/observability

Cons

  • Vector feature constraints
  • Pure ANN performance can trail dedicated vector DBs

Tags: RAG • HNSW • Hybrid • Filters • Scale

Vespa • Search Engine

Serving engine with ANN, filters, and ranking pipelines

Production engine with HNSW ANN, advanced ranking pipelines, and strong filtering—great for large catalogs & complex scoring.

Score matrix (Fit: 73/100)
Recall 5/5 • Latency 4/5 • Cost 3/5 • Ops 2/5 • Hybrid 5/5

Pros

  • Advanced ranking/hybrid
  • Large-scale serving patterns

Cons

  • Heavier to operate
  • Higher learning curve

Tags: RAG • HNSW • Hybrid • Filters • Scale

PostgreSQL + pgvector • Database Extension

Bring vector search to Postgres

Postgres extension adding IVFFlat/HNSW vector indexes. Keep vectors next to relational data for simpler stacks.

Score matrix (Fit: 69/100)
Recall 3/5 • Latency 3/5 • Cost 4/5 • Ops 4/5 • Hybrid 3/5

Pros

  • Single DB (SQL+vectors)
  • Great for moderate scale
  • Simple backups

Cons

  • Lower peak perf than specialists
  • Recall/latency hinge on tuning

Tags: RAG • HNSW • Hybrid • Filters • Scale

OpenSearch • Search Engine

Elastic fork with k-NN plugin

Vector search via the k-NN plugin (HNSW/IVF). Solid open-source option, especially in AWS-centric stacks.

Score matrix (Fit: 68/100)
Recall 3/5 • Latency 3/5 • Cost 4/5 • Ops 3/5 • Hybrid 4/5

Pros

  • AWS integration
  • Hybrid search story

Cons

  • Ops/tuning involved
  • Peak ANN performance trails specialists

Tags: RAG • HNSW • Hybrid • Filters • Scale

Quick docker-compose

Qdrant (minimal)

# Qdrant docker-compose (minimal)
version: "3.9"
services:
  qdrant:
    image: qdrant/qdrant:v1.11
    ports:
      - "6333:6333" # REST
      - "6334:6334" # gRPC
    volumes:
      - ./qdrant_storage:/qdrant/storage
    environment:
      QDRANT__SERVICE__GRPC_PORT: 6334
# start:  docker compose up -d
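
Once the container is up, a quick smoke test with the official Python client (pip install qdrant-client; the collection name and vector size below are placeholders):

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")
client.create_collection(
    collection_name="docs",                                   # placeholder name
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)
print(client.get_collections())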

Milvus + deps (standalone)

# Milvus + dependencies (standalone) via docker compose
version: "3.9"
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      ETCD_AUTO_COMPACTION_MODE: revision
      ETCD_AUTO_COMPACTION_RETENTION: "1000"
      ETCD_QUOTA_BACKEND_BYTES: "4294967296"
    volumes:
      - ./volumes/etcd:/etcd
    command: >
      etcd -advertise-client-urls http://etcd:2379
           -listen-client-urls http://0.0.0.0:2379
           -data-dir /etcd
  minio:
    image: minio/minio:RELEASE.2024-05-10T01-41-38Z
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - ./volumes/minio:/data
    command: server /data --console-address ":9001"
  milvus:
    image: milvusdb/milvus:v2.4.4
    command: ["milvus", "run", "standalone"]
    depends_on: [etcd, minio]
    environment:
      ETCD_ENDPOINTS: etcd:2379    # point standalone at its deps
      MINIO_ADDRESS: minio:9000
    ports:
      - "19530:19530"  # gRPC
      - "9091:9091"    # HTTP
    volumes:
      - ./volumes/milvus:/var/lib/milvus
# start:  docker compose up -d
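
As with Qdrant, a quick smoke test once the stack is healthy (pip install pymilvus):

from pymilvus import connections, utility

connections.connect(alias="default", host="localhost", port="19530")
print(utility.get_server_version())   # confirms the gRPC endpoint answers
print(utility.list_collections())     # empty list on a fresh install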

Notes: adjust versions, mount persistent volumes, and lock ports. For production, add healthchecks, backups, metrics, and TLS.