Question 1

What does recall@k mean in a pgvector benchmark?

Accepted Answer

Recall@k is the fraction of the true k nearest neighbors that a vector index returns. If exact KNN says the top-10 neighbors are A, B, C, ..., J and pgvector's HNSW index returns A, B, C, ..., I plus one wrong row, that's recall@10 = 9/10 = 0.9. pgvector-bench computes recall by running the same ORDER BY ... LIMIT k query twice, once with the index and once as an exact sequential scan (enable_indexscan = off), and comparing the result sets.
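
As a rough illustration of the comparison described above, here is a minimal Python sketch. The function name recall_at_k and the letter IDs are made up for the example; this is not pgvector-bench's own code, only the set arithmetic behind the metric.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k neighbors that also appear in the index's top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])    # ground truth from the sequential scan
    found = set(approx_ids[:k])   # what the index returned
    return len(truth & found) / k


exact = list("ABCDEFGHIJ")           # exact top-10: A through J
approx = list("ABCDEFGHI") + ["Z"]   # index result: nine correct rows plus one wrong row
print(recall_at_k(exact, approx, k=10))  # 0.9
```

Order within the top-k is ignored here: only membership counts, which matches the "comparing the result sets" wording above.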

Question 2

How does ef_search affect pgvector query speed and recall?

Accepted Answer

In pgvector HNSW, ef_search (set per session as hnsw.ef_search) controls the size of the candidate list the graph traversal keeps while searching. Lower values are faster but miss some true neighbors; higher values are slower but raise recall. A sweep we measured on a 250k × 1536 dataset (one Starter node, 4 clients) showed ef_search=40 → ~0.88 recall at ~5 ms p95, ef_search=80 → ~0.93 recall at ~6 ms p95, ef_search=120 → ~0.96 recall at ~7 ms p95, and ef_search=200 → ~0.99 recall, at which point the working set spills out of cache and latency rises sharply. pgvector-bench sweeps arbitrary values with --ef-search 40,80,120,200 and prints the tradeoff table for your own workload.
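
One way to read a sweep like this is to pick the lowest ef_search that meets a recall target. The Python sketch below does that with the figures quoted above; the helper names are hypothetical, and the 0.95 target is only an example. The one real pgvector detail it relies on is the hnsw.ef_search session setting.

```python
# (ef_search, recall, p95_ms) rows copied from the sweep quoted above.
# The text gives no p95 figure for ef_search=200, so it is left as None.
SWEEP = [
    (40, 0.88, 5.0),
    (80, 0.93, 6.0),
    (120, 0.96, 7.0),
    (200, 0.99, None),
]


def smallest_ef_search(sweep, target_recall):
    """Return the lowest ef_search whose measured recall meets the target, else None."""
    for ef_search, recall, _p95_ms in sorted(sweep, key=lambda row: row[0]):
        if recall >= target_recall:
            return ef_search
    return None


def set_ef_search_sql(ef_search):
    """Build the session-level SET statement for pgvector's hnsw.ef_search."""
    return f"SET hnsw.ef_search = {int(ef_search)};"


choice = smallest_ef_search(SWEEP, target_recall=0.95)
print(choice)                     # 120
print(set_ef_search_sql(choice))  # SET hnsw.ef_search = 120;
```

These numbers come from one dataset on one machine, so the chosen value is only meaningful after repeating the sweep on your own workload.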

Question 3

Can I benchmark pgvector HNSW vs IVFFlat with this tool?

Accepted Answer

Yes. pgvector-bench detects whichever index type already exists on the target column (HNSW or IVFFlat) and reports latency / throughput / recall against it. Create the same data with both index types in separate tables and run the tool twice to compare. For HNSW the relevant tunables are m and ef_construction (build) plus ef_search (query). For IVFFlat the build tunable is lists and the query tunable is probes.
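
To make the two-table setup concrete, here is a small Python sketch that builds the pgvector DDL for each index type. The table names items_hnsw and items_ivfflat and the column name embedding are hypothetical; the parameter values shown are pgvector's documented defaults, not recommendations. Identifiers are interpolated as plain strings, so this is only suitable for trusted names.

```python
def hnsw_index_sql(table, column, m=16, ef_construction=64):
    """CREATE INDEX statement for an HNSW index using cosine distance."""
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def ivfflat_index_sql(table, column, lists=100):
    """CREATE INDEX statement for an IVFFlat index using cosine distance."""
    return (
        f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
        f"WITH (lists = {lists});"
    )


# Query-time tunables are session settings rather than index options.
QUERY_TUNABLES = {
    "hnsw": "SET hnsw.ef_search = 40;",
    "ivfflat": "SET ivfflat.probes = 1;",
}

print(hnsw_index_sql("items_hnsw", "embedding"))
print(ivfflat_index_sql("items_ivfflat", "embedding"))
```

IVFFlat derives its lists from the rows present at build time, so the index is normally created after the data is loaded.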

Question 4

What is a realistic p95 latency target for pgvector at 1M vectors?

Accepted Answer

On well-tuned PostgreSQL with HNSW on local NVMe and a typical embedding model (768–1536 dims, cosine), p95 latency for 10-NN queries is usually in the 2–8 ms range at ef_search=100. Network-attached SSDs (AWS gp3, GCP pd-balanced) typically add 5–20 ms because HNSW graph traversal is pointer-chasing: every cache miss is a network round-trip. If you see p95 over 50 ms, the bottleneck is almost always storage, or a shared_buffers setting too small to hold the index.
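
For readers checking their own timings against these ranges, the Python sketch below computes a p95 from raw latency samples using the nearest-rank method. That method is an assumption made for the example (tools differ, and some interpolate), and the sample values are invented.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Invented timings in milliseconds: eighteen fast queries, one slower, one outlier.
latencies_ms = [3.0] * 18 + [7.5, 48.0]
print(percentile(latencies_ms, 95))  # 7.5
```

With 20 samples the single 48 ms outlier sits above the 95th percentile, which is why p95 is usually reported alongside p99 or the maximum when tail behavior matters.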

Question 5

How long does a pgvector-bench run take?

Accepted Answer

A default run is ~2–4 minutes: ~5 seconds for connect + introspection, ~10 seconds for warmup + latency (1000 queries single-threaded), ~24 seconds for the throughput ramp (three 8-second levels), ~10–30 seconds for the recall sample depending on dataset size. With --synthetic at 100k rows, add ~30 seconds for index build. Larger datasets and the --ef-search sweep multiply recall time by N (one pass per ef_search value).

Question 6

Does pgvector-bench work with Supabase, Neon, RDS, and other managed Postgres?

Accepted Answer

Yes — any PostgreSQL with the vector extension installed. Tested against Supabase Pro, Neon Launch, AWS RDS PostgreSQL with pgvector, GCP Cloud SQL with pgvector, and self-hosted PostgreSQL 14/15/16/17. The tool only opens a Postgres connection to the URL you pass; it does not need IAM credentials, an SDK, or a control-plane API. If your provider exposes a Postgres connection string and the vector extension is installed, it works.
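
If you want to confirm the extension is installed before running a benchmark, a check along these lines should work on any provider. This is a sketch, not what pgvector-bench does internally: it assumes a Python DB-API 2.0 connection (for example one from psycopg), and the function name is hypothetical. The query itself reads the standard pg_extension catalog.

```python
def vector_extension_version(conn):
    """Return the installed pgvector version string, or None if the extension is absent.

    conn is assumed to be any DB-API 2.0 connection to the target database.
    """
    cur = conn.cursor()
    try:
        cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector'")
        row = cur.fetchone()
    finally:
        cur.close()
    return row[0] if row else None


# Example usage with psycopg 3 (requires a reachable database, so left as a comment):
#   import psycopg
#   with psycopg.connect("postgresql://user:password@host:5432/dbname") as conn:
#       print(vector_extension_version(conn))
```

A result of None means the extension still has to be enabled in that database with CREATE EXTENSION vector.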

Benchmark pgvector. Trust the numbers.
