Question 1

Can I use PostgreSQL for AI workloads?

Accepted Answer

Yes. With the pgvector extension, PostgreSQL stores high-dimensional embeddings and searches them with HNSW or IVFFlat indexes. That nearest-neighbour search is the core of retrieval-augmented generation (RAG), semantic search, and recommendation systems. Because the vectors live next to your relational data, you can filter, join, and scope by tenant in standard SQL without running a separate vector database.
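
As a rough illustration of what "vectors next to relational data" looks like, here is a minimal sketch. The table, column, and index names (documents, tenant_id, embedding, documents_embedding_hnsw) are hypothetical, and the %(name)s placeholders assume a psycopg-style driver. The SQL itself uses pgvector's documented vector type, hnsw index method, and <=> cosine-distance operator.

```python
# Illustrative sketch: table, column, and index names are hypothetical.

SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS documents (
    id        bigserial PRIMARY KEY,
    tenant_id bigint NOT NULL,
    content   text   NOT NULL,
    embedding vector(1536)
);

CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, content
FROM documents
WHERE tenant_id = %(tenant_id)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def search_params(tenant_id, query_embedding, k=5):
    """Build the named parameters for SEARCH_SQL.

    The embedding is sent in pgvector's text format, e.g. '[0.1,0.2]',
    and cast to vector inside the query.
    """
    literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    return {"tenant_id": tenant_id, "query": literal, "k": k}
```

With a driver such as psycopg you would pass SEARCH_SQL and search_params(...) to execute(); the tenant filter and the similarity ordering run in the same statement.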

Question 2

Is PostgreSQL good enough for RAG, or do I need a dedicated vector database?

Accepted Answer

For most teams PostgreSQL is the simpler and cheaper choice. RAG retrieval is nearest-neighbour search with filters, which pgvector handles well into the tens of millions of vectors. A dedicated vector database earns its keep at very large scale (hundreds of millions to billions of vectors) or when the workload is pure vector search with no relational context. If your AI feature also needs users, documents, permissions, or metadata, keeping it all in Postgres removes an entire system from your stack.

Question 3

How many embeddings can PostgreSQL handle?

Accepted Answer

It depends on how much of the HNSW index fits in RAM. The practical limits are memory for the index, storage IOPS under random reads, and your latency target. Our measured in-RAM fast-search ladder for 1536-dimension vectors: a 4 GB node serves ~300K vectors (we measured 250K at recall@10 0.90, p50 2.6 ms, 4 clients), an 8 GB node ~600K, and a 16 GB Scale node ~1M. The Scale node builds and serves a full 1M × 1536 index hot at ~3,600 QPS with p50 ~2.3 ms (recall@10 0.75, ef_search=80, 16 clients). Larger sets still fit on the NVMe disk, but once the index spills out of RAM, search becomes disk-bound and latency climbs. HNSW index builds are also memory-constrained (a 1M-vector index will not build on less than 16 GB), so benchmark your own dataset before committing.
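
For a first sanity check before benchmarking, you can estimate the raw size of the vectors themselves. The sketch below relies on pgvector's documented storage cost of 4 bytes per dimension plus 8 bytes per vector. It is a lower bound only: the HNSW index holds the vectors plus graph links, and the table rows and PostgreSQL's own memory come on top, none of which this estimate models. The function names are made up for illustration.

```python
def vector_bytes(dimensions: int) -> int:
    """Storage for one pgvector `vector` value: 4 bytes per dimension + 8 bytes."""
    return 4 * dimensions + 8


def raw_vector_gib(num_vectors: int, dimensions: int) -> float:
    """Raw vector payload in GiB. A floor, not an index-size prediction."""
    return num_vectors * vector_bytes(dimensions) / 2**30


if __name__ == "__main__":
    for n in (250_000, 1_000_000):
        gib = raw_vector_gib(n, 1536)
        print(f"{n:>9,} x 1536: about {gib:.2f} GiB of raw vectors")
```

For 1M × 1536 this comes to roughly 5.7 GiB of raw vectors before any index overhead, which is consistent with that dataset needing the 16 GB node rather than a smaller one.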

Question 4

Which embedding models work with PostgreSQL?

Accepted Answer

Any of them. pgvector stores a plain array of floats, so embeddings from OpenAI, Cohere, Voyage, Google, or your own open-source model all work. You just match the vector column dimension to the model output (for example vector(1536) for OpenAI text-embedding-3-small). You can keep multiple embedding columns in one table if you run more than one model.
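
A dimension mismatch is the most common mistake here, so it can help to check it in the application before the insert reaches the database. This is a minimal sketch with a hypothetical helper, to_vector_literal. It assumes you send embeddings in pgvector's text format ('[1,2,3]') and that the column is vector(1536).

```python
import math

# Must equal N in the column's vector(N); 1536 is the example dimension above.
EMBEDDING_DIM = 1536


def to_vector_literal(values, expected_dim=EMBEDDING_DIM):
    """Validate an embedding and format it as pgvector text, e.g. '[1.0,2.5]'."""
    floats = [float(v) for v in values]
    if len(floats) != expected_dim:
        raise ValueError(
            f"embedding has {len(floats)} dimensions, column expects {expected_dim}"
        )
    # pgvector rejects NaN and infinite elements, so fail early with a clear error.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("embedding contains NaN or infinite values")
    return "[" + ",".join(repr(v) for v in floats) + "]"
```

If you run two models, you would call this once per column with each model's own expected_dim.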

Question 5

How do I keep vectors in sync with my application data?

Accepted Answer

You do not have to. That is the main advantage of PostgreSQL for AI: the embedding is a column on the same row as the content, written in the same transaction. There is no second datastore to dual-write, no eventual-consistency window, and no reconciliation job. When you delete a record, its vector goes with it; when you update the content, you write the new embedding in the same statement.
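
To make the single-write idea concrete, here is a small sketch. The documents table and its columns are hypothetical, and the placeholders assume a psycopg-style driver. Content and embedding travel in one INSERT ... ON CONFLICT statement, so they commit or roll back together.

```python
# Illustrative sketch: the documents table and its columns are hypothetical.

UPSERT_SQL = """
INSERT INTO documents (id, content, embedding)
VALUES (%(id)s, %(content)s, %(embedding)s::vector)
ON CONFLICT (id) DO UPDATE
SET content   = EXCLUDED.content,
    embedding = EXCLUDED.embedding;
"""

# Deleting the row removes its embedding too; there is no second store to clean up.
DELETE_SQL = "DELETE FROM documents WHERE id = %(id)s;"


def upsert_statement(doc_id, content, embedding):
    """Return (sql, params) that write content and embedding in one statement."""
    literal = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    return UPSERT_SQL, {"id": doc_id, "content": content, "embedding": literal}


# Possible usage, assuming psycopg 3 and a connection string in `dsn`:
#
#   import psycopg
#   with psycopg.connect(dsn) as conn:   # commits on successful exit
#       conn.execute(*upsert_statement(1, "hello", embedding))
```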

Question 6

Does pgvector support metadata filtering for RAG?

Accepted Answer

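Yes, and more cleanly than vector-only databases. Because filtering is just a SQL WHERE clause, you combine vector similarity with B-tree indexes on the columns you filter on (tenant_id, language, document type, recency) in one query, and the planner chooses between filtering first and scanning the vector index first. With an HNSW or IVFFlat index the filter is applied after the index scan, so a very selective filter can return fewer rows than the LIMIT; pgvector 0.8.0 and later offer iterative index scans for that case.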
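
A filtered search can be sketched as a small query builder. Everything named here is an assumption for illustration: the documents table, the filter columns (tenant_id, language, doc_type), and the psycopg-style placeholders. Column names come from a fixed allowlist, so only values are ever passed as parameters.

```python
# Hypothetical filterable columns; adjust to your own schema.
ALLOWED_FILTERS = {"tenant_id", "language", "doc_type"}


def build_filtered_search(query_embedding, filters, k=10):
    """Return (sql, params) for a similarity search with equality filters."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter columns: {sorted(unknown)}")

    literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    params = {"query": literal, "k": k}

    parts = ["SELECT id, content", "FROM documents"]
    clauses = []
    for column in sorted(filters):  # sorted for a stable, predictable SQL string
        clauses.append(f"{column} = %({column})s")
        params[column] = filters[column]
    if clauses:
        parts.append("WHERE " + " AND ".join(clauses))

    # <=> is pgvector's cosine-distance operator.
    parts.append("ORDER BY embedding <=> %(query)s::vector")
    parts.append("LIMIT %(k)s")
    return "\n".join(parts), params
```

Calling build_filtered_search(vec, {"tenant_id": 7, "language": "en"}, k=5) yields one statement that the planner can serve with the B-tree indexes, the vector index, or both.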

Question 7

Why does NVMe matter for AI on PostgreSQL?

Accepted Answer

HNSW graph traversal is dominated by random reads once the index no longer fits in RAM. Local NVMe delivers far lower random-read latency than general-purpose cloud block storage, and tail latency on vector search depends on exactly that. On a Starter node (2 vCPU / 4 GB) we measure ~1,185 QPS at recall@10 0.93 (ef_search=80) with p50 3.2 ms at 4 clients on a 250K × 1536 HNSW path. On a Scale node the same path reaches ~4,465 QPS at 16 clients, and Scale also builds and serves a full 1M × 1536 index hot at ~3,600 QPS with p50 ~4.2 ms, so you get high throughput and low p50 at once. The full methodology, including the recall and client count behind every figure, is in the NVMe vs cloud SSD benchmark.

Question 8

Can I migrate my AI workload from Pinecone or Supabase?

Accepted Answer

Yes. Supabase and Neon are standard PostgreSQL with pgvector, so migration is pg_dump / pg_restore for smaller databases or logical replication for always-on workloads, typically 30 to 60 minutes with no application changes. Pinecone is not Postgres, so you re-insert your embeddings into a vector column once; we help map your metadata filters back to SQL columns before you start.
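
For the dump-and-restore path, the commands might look like the sketch below. The connection strings and file name are placeholders, and the helper names are made up. The flags are standard pg_dump and pg_restore options. It also assumes the pgvector extension is available on the target, since the restore recreates it.

```python
import shlex


def dump_command(source_dsn, dump_path):
    """pg_dump in custom format, which pg_restore can load in parallel."""
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",              # skip ownership that may not exist on the target
        f"--file={dump_path}",
        f"--dbname={source_dsn}",
    ]


def restore_command(target_dsn, dump_path, jobs=4):
    """pg_restore into the target database using `jobs` parallel workers."""
    return [
        "pg_restore",
        "--no-owner",
        f"--jobs={jobs}",
        f"--dbname={target_dsn}",
        dump_path,
    ]


if __name__ == "__main__":
    # Placeholder connection strings; substitute your own.
    for cmd in (
        dump_command("postgresql://user@old-host/app", "app.dump"),
        restore_command("postgresql://user@new-host/app", "app.dump"),
    ):
        print(shlex.join(cmd))  # run with subprocess.run(cmd, check=True)
```

For always-on workloads, logical replication replaces this two-step copy; the sketch covers only the smaller-database case.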

Question 9

Does this work with LangChain and LlamaIndex?

Accepted Answer

Yes. Both frameworks ship first-class PGVector vector stores. Point them at your Rivestack connection string and they handle inserts and similarity queries for you. Because it is plain PostgreSQL underneath, you can also drop to raw SQL whenever you need joins or filters the framework abstraction does not expose.

Question 10

What does it cost to run AI on PostgreSQL?

Accepted Answer

A dedicated Rivestack VM with pgvector, NVMe, and daily backups starts at $29/month (Solo), flat, not metered by query volume. The free tier is enough to prototype RAG and semantic search at no cost. There is no per-query or per-vector billing, so cost stays predictable as your AI feature scales.

| | PostgreSQL (Rivestack) | Pinecone | Weaviate | Vector add-on |
|---|---|---|---|---|
| Data model | Vectors + relational rows in one DB | Vectors only | Vectors + limited metadata | Vectors bolted onto shared Postgres |
| Filters & joins | Native SQL (WHERE, JOIN, RLS) | Metadata filters only | GraphQL filters | SQL, but on shared storage |
| Sync overhead | None, single source of truth | Dual-write to keep in sync | Dual-write to keep in sync | None |
| Storage | Dedicated local NVMe | Managed cloud | Managed cloud | Shared cloud block storage |
| Pricing model | Fixed per node, from $29/mo | Usage-metered | Usage-metered | Plan + compute add-ons |
| Best fit | RAG, search & recs with relational context | Pure vector search at huge scale | Vector-native apps | Prototypes outgrowing the free tier |

PostgreSQL for AI: one database for data and vectors.

Stop overpaying for pgvector you don't control.
