Build RAG, semantic search, and recommendations on PostgreSQL with pgvector — embeddings stored next to your relational data, queried in plain SQL. Dedicated NVMe keeps retrieval under 4ms p50, with backups, monitoring, and HA included. No separate vector database to run or keep in sync.
The database your app already trusts, now doing vector search.
Keep embeddings next to the rows they describe. Filters, joins, and tenant scoping happen in plain SQL — no second system to keep in sync, no dual-write consistency bugs, no extra bill. Your AI features and your relational data share one transactional store.
Why managed Postgres for AI →
HNSW search is bound by random-read latency. Local NVMe keeps 1M-vector queries under 4ms p50 in our benchmarks, and pgvector ships pre-installed and tuned, so retrieval for RAG and semantic search stays fast as the index outgrows memory.
NVMe vs cloud SSD benchmarks →
Daily backups with point-in-time recovery, optional Patroni HA with automatic failover, and a metrics dashboard per database. The infrastructure an AI product needs in production: included, not sold back to you as add-ons.
PostgreSQL high availability →
Where keeping vectors in Postgres wins, and where a specialized store still fits.
| | PostgreSQL (Rivestack) | Pinecone | Weaviate | Vector add-on |
|---|---|---|---|---|
| Data model | Vectors + relational rows in one DB | Vectors only | Vectors + limited metadata | Vectors bolted onto shared Postgres |
| Filters & joins | Native SQL (WHERE, JOIN, RLS) | Metadata filters only | GraphQL filters | SQL, but on shared storage |
| Sync overhead | None — single source of truth | Dual-write to keep in sync | Dual-write to keep in sync | None |
| Storage | Dedicated local NVMe | Managed cloud | Managed cloud | Shared cloud block storage |
| Pricing model | Fixed per node, from $15/mo | Usage-metered | Usage-metered | Plan + compute add-ons |
| Best fit | RAG, search & recs with relational context | Pure vector search at huge scale | Vector-native apps | Prototypes outgrowing the free tier |
Pick a region (EU or US-East) and a node size. pgvector is already enabled, so CREATE EXTENSION IF NOT EXISTS vector is a no-op. PostgreSQL 18 with the supported pgvector 0.8.x line ships pre-tuned.
Add a vector column to the table that already holds the content. Insert embeddings from OpenAI, Cohere, or your own model — no separate vector store, no ETL pipeline.
Build an HNSW index, then query nearest neighbours with ORDER BY embedding <=> $1 — combined with the same WHERE clauses and joins your app already uses. Your existing client library, your existing SQL.
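The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not Rivestack's own setup script: the `documents` table, its `id` and `title` columns, the index name, and the helper names are hypothetical, and the driver is assumed to be any DB-API client that uses `%s` placeholders (psycopg does) rather than `$1`.

```python
from typing import Sequence

# Hypothetical schema: a `documents` table that already holds the content.
# vector(1536) assumes an embedding model with 1536 output dimensions.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
ALTER TABLE documents ADD COLUMN IF NOT EXISTS embedding vector(1536);
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""

# <=> is pgvector's cosine distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, title, embedding <=> %s::vector AS distance
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values: Sequence[float]) -> str:
    """Serialize floats into pgvector's text form, e.g. '[0.1,0.2,3.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def nearest_documents(cursor, query_embedding: Sequence[float], limit: int = 5):
    """Run the nearest-neighbour query on a DB-API cursor and return the rows."""
    literal = to_vector_literal(query_embedding)
    cursor.execute(SEARCH_SQL, (literal, literal, limit))
    return cursor.fetchall()
```

`SETUP_SQL` would be run once, for example from a migration. `nearest_documents` takes the embedding of the user's query, produced by the same model that embedded the stored rows.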
One price per node. No per-query or per-vector billing.
For prototyping AI features. Shared PostgreSQL with pgvector enabled.
Your own dedicated VM for small production AI apps. Never deleted.
Production dedicated PostgreSQL with HA-ready architecture.
More compute and storage for larger embedding sets.
High-performance PostgreSQL for demanding AI workloads.
What teams ask before building AI features on Postgres.
Yes. With the pgvector extension, PostgreSQL stores and searches high-dimensional embeddings using HNSW and IVFFlat indexes — the core of retrieval-augmented generation (RAG), semantic search, and recommendation systems. Because the vectors live next to your relational data, you filter, join, and scope by tenant in standard SQL without running a separate vector database.
For most teams PostgreSQL is the simpler and cheaper choice. RAG retrieval is nearest-neighbour search with filters, which pgvector handles well into the tens of millions of vectors. A dedicated vector database earns its keep at very large scale (hundreds of millions to billions of vectors) or when the workload is pure vector search with no relational context. If your AI feature also needs users, documents, permissions, or metadata, keeping it all in Postgres removes an entire system from your stack.
A well-sized node handles tens of millions of 1536-dimension vectors comfortably; hundreds of millions are reachable with partitioning. The practical limits are memory for the HNSW index, storage IOPS under random reads, and your latency target. As a rough guide on Rivestack: a 4 GB node serves ~1M vectors with sub-10ms p95, and a 16 GB node serves ~20M.
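A rough way to reason about the memory side of this is to estimate the raw size of the vectors themselves. The sketch below uses the per-vector figure from the pgvector documentation (4 bytes per dimension plus 8 bytes of overhead). It is a lower bound only: HNSW graph links, row headers, and other columns are not counted, and the function names are illustrative.

```python
BYTES_PER_DIMENSION = 4   # pgvector stores single-precision floats
VECTOR_OVERHEAD_BYTES = 8  # fixed per-vector overhead


def raw_vector_bytes(num_vectors: int, dimensions: int) -> int:
    """Bytes needed for the vector values alone, excluding any index."""
    return num_vectors * (BYTES_PER_DIMENSION * dimensions + VECTOR_OVERHEAD_BYTES)


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes."""
    return num_bytes / 2**30


if __name__ == "__main__":
    for count in (1_000_000, 20_000_000):
        size = raw_vector_bytes(count, 1536)
        print(f"{count:>12,} vectors: {to_gib(size):6.1f} GiB of raw vector data")
```

Under these assumptions, 1M vectors at 1536 dimensions come to roughly 5.7 GiB before any index is built, so the working set can exceed RAM on smaller nodes. That is the regime where storage random-read latency decides query time.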
Any of them. pgvector stores a plain array of floats, so embeddings from OpenAI, Cohere, Voyage, Google, or your own open-source model all work — you just match the vector column dimension to the model output (for example vector(1536) for OpenAI text-embedding-3-small). You can keep multiple embedding columns in one table if you run more than one model.
You do not have to. That is the main advantage of PostgreSQL for AI: the embedding is a column on the same row as the content, written in the same transaction. There is no second datastore to dual-write, no eventual-consistency window, and no reconciliation job. When you delete or update a record, its vector goes with it.
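As a minimal sketch of what "written in the same transaction" can look like, the upsert below writes the content and its embedding in one statement, so the two can never be committed separately. The `documents` table, its columns, and the function name are hypothetical; `id` is assumed to be the primary key, and the cursor is assumed to come from a DB-API driver with `%s` placeholders.

```python
from typing import Sequence

# One statement updates the text and its vector together; PostgreSQL applies
# a single statement atomically, so readers never see one without the other.
UPSERT_SQL = """
INSERT INTO documents (id, body, embedding)
VALUES (%s, %s, %s::vector)
ON CONFLICT (id) DO UPDATE
SET body = EXCLUDED.body,
    embedding = EXCLUDED.embedding;
"""


def upsert_document(cursor, doc_id: int, body: str, embedding: Sequence[float]) -> None:
    """Insert or update a document row together with its embedding."""
    # pgvector accepts the text form '[v1,v2,...]' cast to vector.
    literal = "[" + ",".join(repr(float(v)) for v in embedding) + "]"
    cursor.execute(UPSERT_SQL, (doc_id, body, literal))
```

Deleting the row with an ordinary `DELETE FROM documents WHERE id = ...` removes the vector with it, since the embedding is just another column.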
Yes, and more cleanly than vector-only databases. Because filtering is just a SQL WHERE clause, you combine vector similarity with B-tree indexes on the columns you filter on (tenant_id, language, document type, recency) in one query plan. The planner weighs "filter then search" against "search then filter" for you, and the iterative index scans added in pgvector 0.8 can keep a highly selective filter from returning too few rows.
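A filtered similarity query of this kind might be assembled as in the sketch below. The `documents` table, the allowlisted column names, and the function name are assumptions for illustration; the query vector is passed in pgvector's text form (for example `'[0.1,0.2]'`), and placeholders follow the `%s` style of DB-API drivers.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical filterable columns. An allowlist keeps column names out of
# user control, since identifiers cannot be passed as query parameters.
FILTERABLE_COLUMNS = {"tenant_id", "language", "doc_type"}


def build_filtered_search(
    query_vector: str, filters: Dict[str, Any], limit: int = 10
) -> Tuple[str, List[Any]]:
    """Return SQL and parameters for a nearest-neighbour query with filters."""
    unknown = set(filters) - FILTERABLE_COLUMNS
    if unknown:
        raise ValueError(f"unsupported filter columns: {sorted(unknown)}")

    # Equality predicates are ANDed in the order the caller supplied them.
    where = " AND ".join(f"{column} = %s" for column in filters)
    sql = "SELECT id, title FROM documents"
    if where:
        sql += f" WHERE {where}"
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"

    params = list(filters.values()) + [query_vector, limit]
    return sql, params
```

For example, `build_filtered_search("[0.1,0.2]", {"tenant_id": 42, "language": "en"}, limit=5)` yields a single statement whose WHERE clause and ORDER BY are planned together, which the caller then passes to `cursor.execute(sql, params)`.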
HNSW graph traversal is dominated by random reads once the index no longer fits in RAM. Local NVMe delivers far lower random-read latency than general-purpose cloud block storage, which is exactly what tail latency on vector search depends on. On Rivestack we measure sub-4ms p50 on a 1M × 1536-dim HNSW path; the full methodology is in the NVMe vs cloud SSD benchmark.
Yes. Supabase and Neon are standard PostgreSQL with pgvector, so migration is pg_dump / pg_restore for smaller databases or logical replication for always-on workloads — typically 30–60 minutes with no application changes. Pinecone is not Postgres, so you re-insert your embeddings into a vector column once; we help map your metadata filters back to SQL columns before you start.
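For the smaller-database path, the dump and restore can be scripted roughly as below. This is a sketch, not a supported migration tool: the connection strings and the dump file name are placeholders you supply, the function names are illustrative, and it assumes the `pg_dump` and `pg_restore` client binaries are on the PATH with a version at least as new as the source server.

```python
import subprocess
from typing import List


def dump_command(source_url: str, dump_path: str) -> List[str]:
    """Build a pg_dump call that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",   # compressed archive readable by pg_restore
        "--no-owner",        # skip ownership, roles differ between providers
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def restore_command(target_url: str, dump_path: str) -> List[str]:
    """Build the matching pg_restore call for the target database."""
    return ["pg_restore", "--no-owner", f"--dbname={target_url}", dump_path]


def migrate(source_url: str, target_url: str, dump_path: str = "migration.dump") -> None:
    """Dump the source database, then restore it into the target."""
    subprocess.run(dump_command(source_url, dump_path), check=True)
    subprocess.run(restore_command(target_url, dump_path), check=True)
```

Because vector columns and their indexes are ordinary database objects, they travel in the archive along with the rest of the schema. Writes that reach the source after the dump starts are not captured, which is the case logical replication covers.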
Yes. Both frameworks ship first-class PGVector vector stores — point them at your Rivestack connection string and they handle inserts and similarity queries for you. Because it is plain PostgreSQL underneath, you can also drop to raw SQL whenever you need joins or filters the framework abstraction does not expose.
A dedicated Rivestack VM with pgvector, NVMe, and daily backups starts at $15/month (Solo) — flat, not metered by query volume. The free tier is enough to prototype RAG and semantic search at no cost. There is no per-query or per-vector billing, so cost stays predictable as your AI feature scales.
Deeper reads on building AI features on Postgres.
The case for one transactional database over a separate vector store, and what to size for in production.
A practical walkthrough: embeddings, an HNSW index, and retrieval with metadata filters in plain SQL.
When PostgreSQL is the right home for your vectors, and when a dedicated vector database still wins.
Looking for hosted vector search specifically? See managed pgvector, managed PostgreSQL, and the pgvector guide.