June 27, 2026 · Rivestack Team · 11 min read

pgvector vs Qdrant: Which Should You Use in 2026?

The decision usually shows up the same way. You are building something with embeddings — a RAG pipeline, semantic search, a recommendation engine — and you need to store and query vectors. Someone suggests Qdrant, because it is fast, open source, and purpose-built for exactly this. Someone else points out that the PostgreSQL you already run has pgvector. The debate begins.

This is an honest attempt to settle it, or at least to give you enough to settle it for your own situation. And Qdrant deserves a fairer fight than Pinecone usually gets, because the easy arguments against a hosted vector database — proprietary, locked-in, opaque — mostly do not apply to it.

The short version: pgvector is the right choice for most teams building AI applications today. Qdrant is the right choice for a specific set of use cases around very large scale, heavy filtered search, and aggressive memory optimization. The longer version requires looking at the tradeoffs honestly.

Quick Comparison

Dimension	pgvector	Qdrant
Type	PostgreSQL extension	Dedicated vector database (Rust)
License	PostgreSQL license (open source)	Apache 2.0 (open source)
Hosting	Self-managed or managed (e.g. Rivestack)	Self-hosted or Qdrant Cloud
Max practical scale	~1M vectors per node, more with partitioning	Hundreds of millions, sharded
Joins with relational data	Yes, native SQL	No
Filtered search	Good; iterative scans in 0.8	Excellent; filter-aware HNSW traversal
Quantization	halfvec (fp16), binary (bit)	Scalar, product, binary with rescoring
Hybrid search	Yes, with tsvector / full-text	Yes; dense + sparse, BM25, SPLADE++
Pricing model	Fixed per instance	Free tier + usage-based
Index types	HNSW, IVFFlat	HNSW

What pgvector Is and When It Shines

pgvector is a PostgreSQL extension that adds a vector data type along with distance operators and index support. You install it, create a column of type vector(1536), build an HNSW or IVFFlat index on it, and you have a vector search engine running inside your existing database.

That last part is the thing people underestimate. It is not a separate vector database you query alongside PostgreSQL. It is PostgreSQL. Your vectors live in the same rows as your user IDs, timestamps, metadata, and application data. Which means you can do things like:

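-- <=> is pgvector's cosine distance operator; $1 is the query embedding.
SELECT content, embedding <=> $1 AS distance
FROM documents
-- Ordinary relational predicates scope the search to one user and the last 30 days.
WHERE user_id = $2 AND created_at > now() - interval '30 days'
ORDER BY distance  -- nearest first
LIMIT 10;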

That query finds the semantically closest documents to a given embedding, scoped to a specific user, filtered to the last 30 days, in a single round trip. With Qdrant you would denormalize user_id and created_at into the payload, keep that copy in sync, run the filtered search, and then make a second round trip to Postgres to fetch any relational fields you did not duplicate. pgvector skips all of that because the data never left.

pgvector genuinely shines when your data has structure beyond the vectors themselves. If you are building document search where results need to respect access controls, team membership, subscription tiers, or document status, pgvector lets you express all of it in one query, inside one transaction. Delete a user and their chunks cascade in the same commit — no orphaned vectors, no reconciliation job, no "our chatbot just cited a document the customer deleted last week."

The HNSW index in pgvector is also fast. On NVMe storage, a well-tuned index handles roughly 1,000 to 2,500 queries per second per node at single-digit-millisecond p50 latency for datasets that fit in RAM. On one $15 Solo node (2 vCPU / 4 GB), a 250k × 1536-dim index serves ~1,000 QPS at recall@10 0.93 with a p50 of 3.7 ms, measured same-region. That is competitive with dedicated vector engines at this scale, and you can reproduce it yourself.

What Qdrant Is and When It Shines

Qdrant is a purpose-built vector search engine written in Rust, released under the permissive Apache 2.0 license. You can run it yourself with a single container, or use Qdrant Cloud, the managed version. It stores vectors with a JSON "payload" attached, indexes them with HNSW, and exposes both REST and gRPC APIs. It is genuinely good software, and it is only fair to say so.

Qdrant does three things better than pgvector, and the advantages are real.

Scale past one node. Qdrant is built for sharding and replication — a collection splits across nodes, with a replication factor (at least 2 for production) for availability. pgvector's HNSW index is built and served on a single node, and the build is memory-bound: past roughly a million 1536-dim vectors per node you are into partitioning or bigger hardware. At hundreds of millions of vectors, Qdrant's distributed design is not a nice-to-have; it is the only way the workload exists.
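
To make the single-node ceiling concrete, here is a rough sketch of the arithmetic behind it. It only counts the raw float32 vectors (4 bytes per dimension) and deliberately ignores HNSW graph links, build-time working memory, and everything else Postgres keeps in RAM, so the numbers are a lower bound rather than a sizing guide. The function names are illustrative, not part of pgvector or Qdrant.

```python
# Back-of-the-envelope size of the raw float32 vectors behind an HNSW index.
# Assumption: 4 bytes per dimension, no per-row or graph overhead counted.

BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_vectors: int, dims: int) -> int:
    """Bytes needed to hold num_vectors float32 vectors of width dims."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    for label, count in [("250k", 250_000), ("1M", 1_000_000), ("100M", 100_000_000)]:
        size_gb = to_gb(raw_vector_bytes(count, 1536))
        print(f"{label:>5} x 1536 dims: {size_gb:,.1f} GB of raw vectors")
```

At 1536 dimensions, a million vectors is already about 6 GB before any index structure, and a hundred million is in the hundreds of gigabytes, which is why the workload stops fitting on one machine.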

Filtered search at scale. This is where Qdrant is genuinely clever and where the usual "bolted-on metadata" criticism of dedicated engines does not land. Qdrant's payload index extends the HNSW graph so that filters are applied during traversal, in a single pass, rather than pre-filtering (slow) or post-filtering (wrecks recall). pgvector 0.8 added iterative index scans, which fixed the worst "my filter ate the result set" cases — but for highly selective filters over very large collections, Qdrant's filter-aware traversal is ahead, and it is good at moderate scale too, not only at billions.
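
The "post-filtering wrecks recall" point is easy to see with a toy example. The sketch below uses brute-force search over one-dimensional positions as a stand-in for embeddings; it is not how Qdrant or pgvector implement anything, and `Doc`, `post_filter`, and `filter_then_search` are hypothetical names chosen for illustration. It only shows what goes wrong when the filter is applied after the top-k cut.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: int
    position: float  # 1-D stand-in for an embedding
    tenant: str      # the metadata field we filter on


def nearest(docs, query, k):
    """Exact top-k by absolute distance (brute force, no index)."""
    return sorted(docs, key=lambda d: abs(d.position - query))[:k]


def post_filter(docs, query, k, tenant):
    """Take the global top-k first, then drop rows that fail the filter."""
    return [d for d in nearest(docs, query, k) if d.tenant == tenant]


def filter_then_search(docs, query, k, tenant):
    """Restrict to matching rows first, then take the top-k among them."""
    return nearest([d for d in docs if d.tenant == tenant], query, k)


# 100 documents; only every 10th one belongs to tenant "acme".
docs = [Doc(i, float(i), "acme" if i % 10 == 0 else "other") for i in range(100)]

post = post_filter(docs, query=42.0, k=5, tenant="acme")
exact = filter_then_search(docs, query=42.0, k=5, tenant="acme")

print("post-filter:       ", [d.doc_id for d in post])
print("filter-then-search:", [d.doc_id for d in exact])
```

With a filter that matches 10% of rows, the post-filtered query returns a single document instead of five. Iterative index scans address this by continuing to scan until enough matching rows are found, and filter-aware traversal addresses it by applying the filter while walking the graph.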

Memory optimization. Qdrant ships scalar, product, and binary quantization with rescoring. Qdrant reports that binary quantization can cut memory use by up to 32x and speed searches by up to 40x on compatible embeddings (those are Qdrant's figures, not measured here). pgvector is not quantization-free — it has halfvec for fp16 and bit for binary — but Qdrant's options are more advanced and more tunable. If your corpus is large and your RAM budget is the binding constraint, that matters.
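
The "up to 32x" memory figure follows directly from bit widths: a float32 dimension takes 32 bits and a binary-quantized one takes 1. The sketch below works that out per vector for the representations mentioned above. It assumes scalar quantization stores 8 bits per dimension, leaves out product quantization because its ratio depends on configuration, and ignores the original vectors that rescoring still needs to read from somewhere. The dictionary labels and function names are illustrative.

```python
# Bits stored per dimension for each representation (assumed widths).
BITS_PER_DIM = {
    "float32 (vector)": 32,
    "float16 (halfvec)": 16,
    "8-bit scalar": 8,
    "binary (bit)": 1,
}


def bytes_per_vector(dims: int, bits_per_dim: int) -> int:
    """Storage for one vector, rounded up to whole bytes."""
    return (dims * bits_per_dim + 7) // 8


def compression_ratio(bits_per_dim: int) -> float:
    """How many times smaller than float32 this representation is."""
    return 32 / bits_per_dim


if __name__ == "__main__":
    dims = 1536
    for name, bits in BITS_PER_DIM.items():
        size = bytes_per_vector(dims, bits)
        print(f"{name:<18} {size:>5} bytes/vector  ({compression_ratio(bits):.0f}x)")
```

For a 1536-dimension embedding that is 6,144 bytes at float32, 3,072 at fp16, and 192 as a bit vector. Whether recall survives the smaller formats depends on the embedding model, which is why rescoring against the original vectors matters.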

Qdrant also has a strong hybrid search story: dense plus sparse vectors in one query, with built-in support for BM25, SPLADE++, and fusion strategies like Reciprocal Rank Fusion. If your relevance quality depends on blending keyword and semantic signals at scale, it is a well-designed system for that.
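
For readers who have not met Reciprocal Rank Fusion, a minimal sketch of the idea follows. Each result list contributes `1 / (k + rank)` to a document's score, and the fused order sorts by the summed score. The constant `k = 60` is a commonly used default, assumed here for illustration; this is a generic implementation of the technique, not Qdrant's code, and the function name is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; larger values flatten the gap between ranks.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # e.g. from vector similarity
sparse_hits = ["c", "a", "d"]  # e.g. from keyword / BM25 scoring

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists ("a" and "c") rise above those that appear in only one, without needing the two scoring scales to be comparable.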

The honest downside is not lock-in — it is open source, so that argument is weak. The downside is that it is still a second system. Your vectors live in Qdrant and your application data lives in Postgres, and now you own the seam between them.

pgvector vs Qdrant: Performance at Real Workloads

This is where the conversation usually drowns in theoretical benchmarks. A more useful frame:

For datasets under about a million vectors on modern NVMe hardware, pgvector with HNSW is fast enough that the vector search layer will almost certainly never be your bottleneck. A 16 GB Scale node builds and serves a 1M × 1536 index hot at ~2,500 QPS (recall@10 0.75) with a p50 of ~5.9 ms. Note that this recall is lower than the 0.93 quoted for the 250k index, so the two throughput figures are not at the same operating point and should not be compared directly. For most applications at this scale, the network round trip dominates the query time; the actual vector search is not what is slow.
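
Since the throughput figures are quoted alongside recall@10, it helps to be precise about what that metric measures: of the true 10 nearest neighbours (from an exact scan), what fraction did the approximate index return in its top 10? A minimal sketch, with a hypothetical function name and made-up ids:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must contain at least one id")
    found = set(approx_ids[:k]) & exact_top
    return len(found) / len(exact_top)


# Ground truth from an exact (brute-force) scan, best first.
exact = list(range(10))
# What an approximate index returned: 7 of the true neighbours, 3 misses.
approx = [0, 1, 2, 3, 4, 5, 6, 20, 21, 22]

print(recall_at_k(approx, exact))
```

An HNSW index trades recall for speed through its search parameters, so a QPS number only means something next to the recall it was measured at.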

Qdrant's latency in this range is similar. Where it pulls ahead is past the single-node ceiling: tens to hundreds of millions of vectors, sustained high concurrency, or heavily filtered queries over very large collections. A single Postgres instance at 100M vectors requires real engineering — partitioning, careful tuning, and read routing you have to build yourself. Qdrant handles that horizontally by design. If that is your workload, this is not a close call, and pretending otherwise would not be honest.

One thing worth being explicit about for pgvector: the storage layer matters enormously. On standard cloud SSDs like AWS gp3, HNSW traversal is bottlenecked by IOPS; on NVMe that bottleneck disappears. A managed Postgres service on NVMe changes the pgvector performance story significantly, and the full methodology plus the head-to-head against a dedicated engine live in our Postgres vs dedicated vector databases post.

pgvector vs Qdrant: Cost Comparison

This is more nuanced than the Pinecone comparison, because Qdrant has a real free tier and an open-source self-host path.

Qdrant Cloud has a genuinely generous permanent free tier: a single node with 0.5 vCPU, 1 GB RAM, and 4 GB disk, no credit card required, which Qdrant says serves roughly a million 768-dimension vectors for prototyping. Beyond it, the Standard tier is usage-based, billed hourly on the compute, memory, and storage you consume. There is no flat monthly figure to quote, though a small always-on cluster runs in the low tens of dollars a month and climbs with your index's RAM footprint. Self-host the open-source build and you pay only for servers and operational time.

pgvector on a managed service like Rivestack is a flat rate regardless of query volume. A dedicated Solo VM (2 vCPU, 4 GB RAM, 55 GB NVMe) is $15/month and comfortably serves a few hundred thousand 1536-dim vectors. The 1-million-vector index above needs roughly 16 GB of RAM to build and stay hot, which lands on the Scale plan at $99/month — still flat, whether you run 100,000 queries or 10 million.
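
What "flat" means in practice is that the effective unit cost falls as volume rises. The sketch below works that out for the $99/month figure at the two query volumes mentioned above. It covers only the flat-rate side and says nothing about usage-based pricing, which is billed on resources rather than queries; the function name is illustrative.

```python
def cost_per_thousand_queries(monthly_price_usd: float, queries_per_month: int) -> float:
    """Effective cost of 1,000 queries on a flat monthly plan."""
    if queries_per_month <= 0:
        raise ValueError("queries_per_month must be positive")
    return monthly_price_usd / queries_per_month * 1000


if __name__ == "__main__":
    for volume in (100_000, 10_000_000):
        unit = cost_per_thousand_queries(99.0, volume)
        print(f"{volume:>10,} queries/month -> ${unit:.4f} per 1,000 queries")
```

The monthly bill is the same in both cases; only the per-query figure moves, from about a dollar per thousand queries to about a cent.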

But the cost the sticker price hides is the second system. With Qdrant you run two stores, which means a sync pipeline (dual-write or CDC), two backup stories, two monitoring stacks, and a re-embedding migration that has to hit both systems and cut over atomically, which you cannot strictly do across two systems. None of these is fatal; together they are something like an engineer-month a year, indefinitely. That is a fine price if Qdrant buys you scale Postgres cannot reach, and a poor one for 400k vectors that fit in a single node. To be fair, the tax is smaller if you self-host and your team already lives in Kubernetes, but it is never zero.

Developer Experience

pgvector has a steeper initial learning curve and a gentler long-term one. You need to know PostgreSQL well enough to create tables, build indexes, and write queries, which is no additional overhead if you already use Postgres. The payoff is that your vectors are queryable with psql, introspectable with EXPLAIN ANALYZE, backed up alongside the rest of your data, and managed with the same migrations, ORMs, and monitoring you already have. At month six, when the schema has changed three times and you are debugging a slow query at 11pm, decades of Postgres tooling are available to you.

Qdrant's initial experience is smoother and genuinely pleasant. Run the container, point a well-designed client at it, insert vectors, query them. The Python and TypeScript SDKs are good, the docs are clear, and the API maps cleanly onto vector concepts without any SQL — for a pure vector workload it is hard to beat for time-to-first-query. The gap reverses over time the way it does with any dedicated store: vectors and application data in different systems means synchronization code, more failure modes, and two things to monitor when retrieval p95 spikes. Qdrant gives you good observability for the vector side; it cannot tell you the real problem was a lock on the Postgres table feeding it.

When to Choose pgvector with Rivestack

Choose pgvector when your dataset fits comfortably within a node or two — call it up to a few million vectors with partitioning — and your application has any relational structure at all. That covers the large majority of real applications.

If you are building a RAG pipeline that scopes results by user, organization, document status, or anything else you already store in Postgres, pgvector is the natural choice. You get vector search and relational filtering in one system, one connection, one backup strategy, one bill. And you can blend it with full-text search natively — see hybrid search with pgvector and Postgres for how that comes together without a second engine.

If you want the managed pgvector story without operating the database yourself, Rivestack handles backups, high availability, NVMe storage provisioning, and pgvector configuration. You get NVMe-backed HNSW performance without managing any of it, at a flat monthly price. pgvector hosting is the whole product, so the setup is straightforward and the support team knows the AI workload you are running. The old argument against pgvector, that self-hosting Postgres is painful, no longer holds when someone else runs it.

When to Choose Qdrant

Choose Qdrant when you are operating at genuinely large scale — tens to hundreds of millions of vectors — where a single Postgres node stops and horizontal sharding becomes the only sane architecture. Choose it when filtered search is the heart of your product and your filters are highly selective over very large collections, because Qdrant's filter-aware HNSW traversal is a real, designed-in advantage there, not marketing.

Choose it, too, when memory is your binding constraint and you want aggressive quantization with rescoring to keep a huge index resident, or when your workload is purely vectors with little relational context so you lose almost nothing by not having SQL joins — especially if your team already runs distributed infrastructure. If that describes you, Qdrant is excellent, and self-hosting the open-source build keeps you off any vendor's meter. It was built for this problem in a way that single-node Postgres simply was not.

The Bottom Line

For most developers building AI applications in 2026, pgvector is the better choice. It lives in your existing database, handles joins and transactions with relational data natively, costs a flat predictable rate, and performs well on any dataset you are likely to have for the first few years of a product.

The case for Qdrant is real and, unlike the Pinecone case, it does not rest on scale alone: Qdrant's filtering and quantization are genuinely strong engineering. But the dividing line is the same: a dedicated engine wins on scale, advanced filtering, and memory efficiency, while Postgres wins on correctness, simplicity, and cost, the things every product needs from day one.

The practical test: look at a real query your application needs to run. If it touches both vectors and relational data, pgvector wins. If it is pure vector search over a very large, heavily filtered corpus, Qdrant earns its place. And if you are coming the other direction — already on a dedicated engine, fighting metadata sync, under a million vectors — that migration runs in the direction the marketing does not mention. See the Postgres vs dedicated vector databases breakdown, or the pgvector vs Pinecone comparison, for where that line actually sits.

Try Rivestack for pgvector in Production

If pgvector is the right choice for your stack, Rivestack removes the only real friction point: managing the database itself. You get a PostgreSQL instance on NVMe storage with pgvector pre-configured, automated backups, and high availability. Setup takes minutes and the pricing starts at $15/month.

For benchmarks, plans, and migration paths, see managed pgvector hosting on Rivestack.
