Question 1

What is pgvector?

Accepted Answer

pgvector is an open-source PostgreSQL extension that adds a vector data type and vector similarity search. It lets you store embeddings (arrays of floats produced by an AI model) in a regular table column and query for nearest neighbours using distance operators, with optional HNSW or IVFFlat indexes for speed. It turns PostgreSQL into a capable vector database without adding a separate system.
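To make the distance operators concrete, here is a minimal pure-Python sketch of what the three most common ones compute. pgvector exposes them as <-> (Euclidean distance), <#> (negative inner product), and <=> (cosine distance). The function names are illustrative only and not part of pgvector; this shows the arithmetic, not how the extension implements it.

```python
import math


def _check_same_length(a, b):
    # Distances are only defined between vectors of equal dimension.
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a, b):
    """Euclidean distance: what the <-> operator returns."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Inner product, negated: what the <#> operator returns.

    Negating means a smaller value is a closer match, so an
    ascending ORDER BY still puts the best match first.
    """
    _check_same_length(a, b)
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: what the <=> operator returns."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

With all three, a smaller value means a closer match, which is why nearest-neighbour queries sort ascending by the operator's result.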

Question 2

What is pgvector used for?

Accepted Answer

The common uses are retrieval-augmented generation (RAG), semantic search, recommendation systems, deduplication, and image or audio similarity. Anywhere you have embeddings and want to find the most similar items (often combined with relational filters like tenant, language, or recency), pgvector handles it in SQL.
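As a rough illustration of "similarity plus a relational filter", the sketch below does in plain Python what a filtered nearest-neighbour query does in SQL: restrict to one tenant, sort by cosine distance, keep the top k. The docs table, its columns, and the sample data are hypothetical. The SQL string shows the assumed equivalent query and is not executed here.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    id: int
    tenant: str
    embedding: list


def cosine_distance(a, b):
    # Same quantity pgvector's <=> operator computes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(docs, query, tenant, k):
    """Exact (brute-force) top-k search restricted to one tenant."""
    candidates = [d for d in docs if d.tenant == tenant]
    candidates.sort(key=lambda d: cosine_distance(d.embedding, query))
    return [d.id for d in candidates[:k]]


# Hypothetical sample rows with toy 2-dimensional embeddings.
DOCS = [
    Doc(1, "acme", [1.0, 0.0]),
    Doc(2, "acme", [0.7, 0.7]),
    Doc(3, "acme", [0.0, 1.0]),
    Doc(4, "globex", [1.0, 0.1]),
]

# Assumed SQL equivalent (hypothetical table and column names).
EQUIVALENT_SQL = """
SELECT id
FROM docs
WHERE tenant = %(tenant)s
ORDER BY embedding <=> %(query)s
LIMIT %(k)s;
"""

top_ids = nearest(DOCS, [1.0, 0.0], "acme", 2)
```

Document 4 points in almost the same direction as the query but belongs to another tenant, so the filter excludes it before ranking.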

Question 3

What is the difference between HNSW and IVFFlat in pgvector?

Accepted Answer

HNSW builds a multilayer navigable graph and is the usual production default: it gives a better speed-recall trade-off at query time, at the cost of slower builds and more memory. IVFFlat partitions vectors into lists; it is faster and cheaper to build, but it needs tuning (lists at build time, probes at query time) and its lists are derived from the data present when the index is created, so it suits datasets that change little. For most workloads, use HNSW; choose IVFFlat when build time and memory matter more than peak query speed.
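The tuning knobs can be sketched as a small helper that produces the index DDL for each type. The starting points for lists (rows / 1000 up to one million rows, sqrt(rows) above that) and probes (sqrt(lists)) are the ones suggested in the pgvector README, and m = 16 and ef_construction = 64 are pgvector's defaults. The table and column names are hypothetical, and the helper only builds strings. It assumes trusted identifiers and does no quoting.

```python
import math


def suggested_lists(row_count):
    """IVFFlat starting point: rows / 1000 up to 1M rows, sqrt(rows) above."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return max(1, round(math.sqrt(row_count)))


def suggested_probes(lists):
    """IVFFlat query-time starting point: sqrt(lists)."""
    return max(1, round(math.sqrt(lists)))


def hnsw_index_sql(table, column, m=16, ef_construction=64):
    # Identifiers are interpolated as-is: only pass trusted names.
    return (
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = {m}, ef_construction = {ef_construction});"
    )


def ivfflat_index_sql(table, column, lists):
    return (
        f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
        f"WITH (lists = {lists});"
    )


def ivfflat_probes_sql(probes):
    # Session-level setting applied before running queries.
    return f"SET ivfflat.probes = {probes};"


rows = 500_000  # hypothetical table size
lists = suggested_lists(rows)
probes = suggested_probes(lists)
```

Treat these values as starting points only: raising probes improves recall and slows queries, and the right balance depends on the data.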

Question 4

How do I install pgvector?

Accepted Answer

On self-hosted Postgres you build or install the extension, then run CREATE EXTENSION vector in each database that needs it. On most managed services the extension is already available and only needs to be enabled. On Rivestack, pgvector 0.8.x ships pre-installed and tuned on every database, so CREATE EXTENSION vector is a no-op and you can start inserting embeddings immediately.

Question 5

How many dimensions does pgvector support?

Accepted Answer

pgvector supports up to 16,000 dimensions for the vector type, and up to 2,000 dimensions when the column carries an HNSW or IVFFlat index with the default operator classes. In practice most teams use 384 to 1,536 dimensions to match popular embedding models such as OpenAI's text-embedding-3-small (1,536 dimensions) or open-source sentence transformers.
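The limits above, together with pgvector's documented storage cost of 4 * dimensions + 8 bytes per vector value, allow a quick back-of-the-envelope check. The sketch below is an estimate of raw vector data only: it ignores row overhead, other columns, and the index itself, and the function names are illustrative.

```python
MAX_VECTOR_DIMS = 16_000   # storage limit for the vector type
MAX_INDEXED_DIMS = 2_000   # limit for an indexed vector column (default ops)


def vector_bytes(dims):
    """Bytes per stored vector value: 4 per float32 dimension plus 8 overhead."""
    if not 1 <= dims <= MAX_VECTOR_DIMS:
        raise ValueError(f"vector supports 1 to {MAX_VECTOR_DIMS} dimensions")
    return 4 * dims + 8


def can_index(dims):
    """True if a vector column of this size fits under the index limit."""
    return 1 <= dims <= MAX_INDEXED_DIMS


def raw_size_gib(rows, dims):
    """Raw vector data in GiB, excluding row overhead and indexes."""
    return rows * vector_bytes(dims) / 2**30


per_vector = vector_bytes(1536)            # 6152 bytes
estimate = raw_size_gib(10_000_000, 1536)  # roughly 57 GiB
```

A 3,072-dimension embedding can therefore be stored but not indexed as a plain vector column, which is one reason teams often pick models at or below 1,536 dimensions.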

Question 6

Is pgvector fast enough for production?

Accepted Answer

Yes. With an HNSW index, pgvector serves tens of millions of vectors with single-digit-millisecond latency on appropriate hardware. The main factors are memory for the index, storage latency on cache misses (NVMe helps a lot here), and the HNSW parameters: m and ef_construction at build time, and hnsw.ef_search at query time. See our NVMe vs cloud SSD benchmarks for measured numbers.

Question 7

Does pgvector replace a dedicated vector database?

Accepted Answer

For most teams, yes. Keeping vectors in PostgreSQL means filters, joins, and tenancy are plain SQL and there is no second datastore to sync. Dedicated vector databases still make sense at very large scale (hundreds of millions to billions of vectors) or for pure vector workloads with no relational context.

Question 8

Where can I host pgvector?

Accepted Answer

Anywhere you run PostgreSQL: self-hosted, Supabase, Neon, RDS, Aiven, or a focused managed service. The differences that matter for vector search are storage latency, memory headroom, which pgvector version is supported, and whether HNSW is tuned. Rivestack runs pgvector on dedicated local NVMe with tuning and backups included. See the managed pgvector page.

pgvector:
vector search, built into PostgreSQL.

Stop overpaying for pgvector you don't control.
