Self-Hosted vs Managed pgvector: The Real Total Cost of Ownership
Rivestack Team · 11 min read


Tags: pgvector, cost, TCO, PostgreSQL, managed database, AI

The decision usually starts as a spreadsheet. Someone prices a cloud VM with enough RAM to hold the HNSW index, sees a number smaller than any managed plan, and concludes that self-hosting pgvector is obviously cheaper. They are not wrong about the VM. They are wrong about the bill.

This is an honest attempt to lay out the real total cost of ownership for pgvector — self-hosted versus managed — without fabricating competitor prices or pretending the answer is the same for everyone. The short version: raw compute is genuinely cheaper when you run it yourself, especially on budget hosts. But pgvector cost is dominated by engineer-hours — setup, HNSW tuning, connection pooling, backups, monitoring, failover, upgrades, and on-call — and for a small team those hours usually dwarf the VM. Managed wins on time-to-value and reliability. Self-hosting wins when you already have spare DBA capacity or a hard control requirement.

The longer version requires actually adding up both columns.


Quick Comparison

All non-Rivestack figures below are rough estimates — they swing a lot by cloud provider and team. The only firm prices here are Rivestack's flat tiers.

Dimension | Self-hosted pgvector | Managed pgvector (e.g. Rivestack)
Raw compute (1M × 1536 index, ~16 GB RAM) | ~$25–90/mo, depends heavily on host | $99/mo flat (Scale)
Smaller workloads | A small VM is a few dollars | $0 Shared / $15 Solo (~250k × 1536)
Setup time | Hours to days (install, tune, harden) | Minutes
HNSW parameter tuning | Yours to learn and own | Sane defaults, guidance included
Connection pooling gotchas | Yours to discover | Configured and documented
Backups + PITR | Build and restore-test it yourself | Included
HA / failover | Standby + promotion you build and test | Included on HA-ready tiers
Monitoring + on-call | Yours, 24/7 | Provider's
Upgrades (minor + major) | Your project, on your calendar | Handled
Billing model | Mostly your time, variable | Flat, predictable
Best when | You have spare DBA capacity or strict control needs | You want time-to-value and reliability

What "Self-Hosted pgvector" Actually Means

The seductive part of self-hosting pgvector is that the compute really can be cheap. pgvector is a PostgreSQL extension, not a separate product with a separate license. You install Postgres, run CREATE EXTENSION vector, build an HNSW index, and you have a vector search engine. On a budget host, a VM with enough RAM to keep that index resident can cost less than a managed plan covering the same workload. That part of the spreadsheet is honest.
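The install-and-index path described above can be sketched roughly as follows. This is an illustration rather than a production setup: the table name `items`, the index name, and the choice of the cosine operator class are assumptions made for the example, while `m = 16` and `ef_construction = 64` are pgvector's documented defaults. The statements are kept as plain strings so they can be run with whatever Postgres client you already use.

```python
# Minimal sketch of the pgvector setup path.
# Table, column, and index names are hypothetical.
from typing import List


def setup_statements(dims: int = 1536, m: int = 16, ef_construction: int = 64) -> List[str]:
    """Return the SQL needed to enable pgvector and build an HNSW index."""
    return [
        "CREATE EXTENSION IF NOT EXISTS vector",
        "CREATE TABLE IF NOT EXISTS items "
        f"(id bigserial PRIMARY KEY, embedding vector({int(dims)}))",
        "CREATE INDEX IF NOT EXISTS items_embedding_hnsw ON items "
        "USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})",
    ]


# Nearest-neighbour query by cosine distance (<=>).
# %s is a client-side placeholder for the query vector.
NEAREST_SQL = "SELECT id FROM items ORDER BY embedding <=> %s::vector LIMIT 10"

if __name__ == "__main__":
    for statement in setup_statements():
        print(statement + ";")
```

Three statements really are the whole install, which is why the VM line on the spreadsheet looks so convincing.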

The cost the spreadsheet hides is everything around the VM. Most of it is the same checklist that applies to any production Postgres: restore-tested backups, a tested failover path, patching, monitoring, disk management, on-call. It is real and ongoing, and it fails silently when nobody owns it. That accounting has its own honest writeup, so it isn't repeated here: "managed PostgreSQL vs self-hosted" walks through the full bill of materials and lands on roughly 4–6 hours a month of attention in the steady state for a single instance run properly, with the real damage coming from variance, meaning the one incident that eats a week.

What's worth spending words on here is the part of the bill that is specific to pgvector, because that's the work a generic "I can run Postgres" instinct underestimates.

Your VM is sized to your index, not your traffic. The HNSW graph has to live in RAM to serve queries at low latency. That means your hardware requirement is a function of vector count times dimensions, not query volume. A 1M × 1536-dim index needs roughly 16 GB of RAM to build and stay hot; 250k × 1536 fits comfortably in a 4 GB node. Get this wrong and the symptom isn't an error. It's an index that no longer fits in memory, reads cold pages from disk during graph traversal, and serves p99 latencies that quietly wreck your retrieval. Sizing pgvector is its own skill, and it's the one that ties cost directly to your corpus.
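A back-of-the-envelope version of that sizing rule might look like the sketch below. The per-vector size (4 bytes per dimension plus 8 bytes) comes from pgvector's documented storage format; the `graph_overhead` multiplier for neighbor links and page slack is an assumption chosen for illustration, not a measured constant, so treat the result as a floor to check against a real build.

```python
# Rough HNSW index size estimate. graph_overhead is an assumed fudge factor.

def hnsw_index_gib(n_vectors: int, dims: int, graph_overhead: float = 1.3) -> float:
    """Estimate the in-memory size of an HNSW index in GiB."""
    bytes_per_vector = 4 * dims + 8  # pgvector: 4 bytes per dimension + 8 bytes
    return n_vectors * bytes_per_vector * graph_overhead / 2**30


if __name__ == "__main__":
    for count in (250_000, 1_000_000):
        print(f"{count:>9,} x 1536: ~{hnsw_index_gib(count, 1536):.1f} GiB index")
```

Under these assumptions the two workloads come out at roughly 1.9 GiB and 7.4 GiB for the index alone. The gap up to the 4 GB and 16 GB nodes is headroom for the table itself, the OS, and build-time memory.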

HNSW builds are memory-bound and have a cliff. Building the index isn't a steady background task — it wants memory proportional to the index, and when the index doesn't fit, builds don't fail loudly, they grind. A 500k-vector build that takes minutes on a right-sized node can run for hours on one that's a hair too small. Choosing m and ef_construction, deciding whether to rebuild after a Postgres or pgvector upgrade to pick up improvements, and trading recall against build time — that's recurring judgment, not a one-time setup. The HNSW vs IVFFlat tradeoffs are a whole decision on their own.
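One way to make the build-memory decision explicit is to generate the session settings together with the index statement, as in this sketch. `maintenance_work_mem` and `max_parallel_maintenance_workers` are standard PostgreSQL settings that pgvector's documentation points to for faster HNSW builds; the table, column, and index names and the specific values are assumptions for the example.

```python
# Sketch of a tuned HNSW (re)build. Object names and values are hypothetical.
from typing import List


def build_statements(
    maintenance_work_mem_gb: int,
    parallel_workers: int,
    m: int = 16,
    ef_construction: int = 64,
    index_name: str = "items_embedding_hnsw",
) -> List[str]:
    """Return session settings plus the CREATE INDEX statement for one build."""
    return [
        # Builds are much faster while the graph fits in maintenance_work_mem.
        f"SET maintenance_work_mem = '{int(maintenance_work_mem_gb)}GB'",
        f"SET max_parallel_maintenance_workers = {int(parallel_workers)}",
        # CONCURRENTLY avoids blocking writes; it cannot run inside a
        # transaction block, so execute it in autocommit mode.
        f"CREATE INDEX CONCURRENTLY {index_name} ON items "
        "USING hnsw (embedding vector_cosine_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})",
    ]


if __name__ == "__main__":
    for statement in build_statements(maintenance_work_mem_gb=8, parallel_workers=4):
        print(statement + ";")
```

Raising `m` or `ef_construction` generally improves recall at the cost of build time and index size, which is the recurring trade the paragraph above describes.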

Recall tuning never quite ends. ef_search is the dial between recall and latency, and the right value depends on your data, your filters, and your latency budget. It's not set-and-forget; it drifts as your corpus grows and your query mix changes.
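Tuning that dial needs a recall number to tune against. A minimal sketch, assuming you can fetch exact results for a sample of queries (for example by disabling index scans for one transaction) and compare them with the indexed results at each `ef_search` value. The helper names and the sample figures are illustrative placeholders, not measurements.

```python
# Recall measurement helpers. The numbers in the demo are placeholders.
from typing import Dict, Iterable, Optional


def recall_at_k(approx_ids: Iterable[int], exact_ids: Iterable[int]) -> float:
    """Fraction of the exact top-k that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 1.0
    return len(exact & set(approx_ids)) / len(exact)


def smallest_ef_meeting(target: float, measured: Dict[int, float]) -> Optional[int]:
    """Pick the lowest ef_search whose measured recall reaches the target."""
    for ef_search in sorted(measured):
        if measured[ef_search] >= target:
            return ef_search
    return None


if __name__ == "__main__":
    print(recall_at_k([1, 2, 3, 9], [1, 2, 3, 4]))
    # Placeholder recall values per ef_search setting.
    print(smallest_ef_meeting(0.95, {40: 0.90, 100: 0.96, 200: 0.99}))
```

Re-running a check like this as the corpus grows is what "never quite ends" means in practice: the value that cleared the target last quarter may not clear it now.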

Connection pooling will bite you in a way that's pure pgvector tax. Under PgBouncer transaction pooling, a session-level SET hnsw.ef_search = 100; is not guaranteed to reach your next query: the pooler can hand that query to a different server connection, so the setting is effectively lost (or lands on some other client's queries). Your recall tuning does nothing, and nothing errors. The fix is SET LOCAL inside the query's transaction, or a default set with ALTER DATABASE. This is exactly the kind of sharp edge a managed provider configures and documents up front, while a DIY setup hands it to you at the worst time.
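Both fixes can be sketched as plain SQL. `SET LOCAL` and `ALTER DATABASE ... SET` are standard PostgreSQL, and `hnsw.ef_search` is pgvector's setting; the table name `items`, the database name, and the helper functions are assumptions for the example.

```python
# Two pooling-safe ways to apply hnsw.ef_search. Object names are hypothetical.
from typing import List


def scoped_search(ef_search: int = 100, k: int = 10) -> List[str]:
    """Statements for one query whose ef_search is scoped to its transaction."""
    return [
        "BEGIN",
        # SET LOCAL lasts only until COMMIT, so it travels with the
        # transaction that a transaction-mode pooler keeps on one connection.
        f"SET LOCAL hnsw.ef_search = {int(ef_search)}",
        f"SELECT id FROM items ORDER BY embedding <=> %s::vector LIMIT {int(k)}",
        "COMMIT",
    ]


def database_default(database: str, ef_search: int = 100) -> str:
    """Statement that makes ef_search the default for new sessions."""
    quoted = database.replace('"', '""')
    return f'ALTER DATABASE "{quoted}" SET hnsw.ef_search = {int(ef_search)}'


if __name__ == "__main__":
    print(";\n".join(scoped_search(100, 10)) + ";")
    print(database_default("app") + ";")
```

The database default only applies to sessions opened after the change, so existing pooled server connections need to be recycled before it takes effect.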

Re-embedding is a migration, not a config change. The day you switch embedding models, every vector has to be regenerated, the column re-typed if dimensions changed, and the index rebuilt — a planned project with a memory budget and a maintenance window you own.
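One possible shape for that project is sketched below. It uses an add-a-column-and-swap approach rather than re-typing the existing column in place. That is a choice made for this example so the old index keeps serving during the backfill, not something the steps above prescribe. All table, column, and index names are hypothetical, and the backfill itself (calling the new embedding model) is left to the application.

```python
# Sketch of a re-embedding migration plan. All object names are hypothetical.
from typing import Iterator, List, Sequence


def migration_statements(new_dims: int) -> List[str]:
    """DDL around the backfill: new column, new index, then swap names."""
    return [
        f"ALTER TABLE items ADD COLUMN embedding_v2 vector({int(new_dims)})",
        # The application backfills embedding_v2 in batches at this point.
        "CREATE INDEX CONCURRENTLY items_embedding_v2_hnsw ON items "
        "USING hnsw (embedding_v2 vector_cosine_ops)",
        "ALTER TABLE items RENAME COLUMN embedding TO embedding_old",
        "ALTER TABLE items RENAME COLUMN embedding_v2 TO embedding",
    ]


def batches(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield id slices so the backfill can commit in small chunks."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    for statement in migration_statements(3072):
        print(statement + ";")
    print(list(batches([1, 2, 3, 4, 5], 2)))
```

The trade-off is memory and disk: both columns and both indexes exist at once until the old ones are dropped, which is part of the memory budget the migration has to plan for.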

None of these is hard for someone who's done it. The point is that "I'll just run a VM" prices in none of them.


What "Managed pgvector" Actually Means

Managed pgvector is the same Postgres and the same vector extension — the difference is who owns the checklist. On a service like Rivestack, the node ships on NVMe with pgvector pre-configured, automated backups with point-in-time recovery, failover machinery on the HA-ready tiers, and pooling already configured and documented, so the ef_search trap above doesn't catch you cold. You get an endpoint and a connection string; the sizing guidance, the upgrade cadence, and the 2 AM disk alert are someone else's job.

The pricing is flat and tied to honest, stated working-set ceilings rather than query volume:

  • Shared — $0. A free shared tier for prototyping and small experiments.
  • Solo — $15/mo. A dedicated 2 vCPU / 4 GB NVMe node that comfortably serves around 250k × 1536-dim vectors.
  • Scale — $99/mo. Enough memory (~16 GB) to build and serve a ~1M × 1536-dim index hot.

Flat matters more than it looks. Usage-based vector services punish exactly the success you're hoping for — more traffic, bigger bill. A flat tier means the number you budget in month one is the number you pay in month twelve, whether you run 100,000 queries or 10 million.

To be fair about the limits: on Rivestack today, added nodes are streaming-replication standbys for automatic failover, not read replicas — every query, reads included, routes to the primary, and read-serving replicas are roadmap, not reality. A provider that won't tell you that is hiding the part you most need to know. Managed pgvector buys you back the checklist and the sizing risk, not infinite horizontal scale.


The Real Tradeoff: Compute Is Cheap, Hours Are Not

Here's the cost comparison done honestly, for a representative production workload of around a million 1536-dim vectors that needs to be reliable.

The self-hosted column, done properly:

Line item | DIY, rough estimate
Primary VM (~16 GB RAM, NVMe) | ~$25–90/mo, host-dependent
Standby VM for failover | ~$25–90/mo
Object storage for backups + WAL | ~$5/mo
Monitoring (small VM or hosted) | ~$10/mo
Your time, 4–6 h/mo at $80–120/h | ~$320–720/mo
Total | ~$385–915/mo, mostly time

Managed, same workload: the Scale tier at $99/mo flat, with backups, PITR, NVMe, and pooling included.
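The arithmetic behind the self-hosted column is simple enough to check directly. This sketch only re-adds the rough estimates from the table above; the function name and argument layout are illustrative, and the inputs are the article's estimates, not quotes from any provider.

```python
# Re-adds the self-hosted line items from the table above (rough estimates).

def monthly_total(primary: float, standby: float, storage: float,
                  monitoring: float, hours: float, hourly_rate: float) -> float:
    """Monthly cost in dollars: infrastructure plus engineer time."""
    return primary + standby + storage + monitoring + hours * hourly_rate


MANAGED_SCALE = 99  # flat Scale tier, $/mo

low = monthly_total(25, 25, 5, 10, hours=4, hourly_rate=80)
high = monthly_total(90, 90, 5, 10, hours=6, hourly_rate=120)

if __name__ == "__main__":
    print(f"self-hosted: ${low:.0f} to ${high:.0f} per month")
    print(f"labor share: {4 * 80 / low:.0%} to {6 * 120 / high:.0%}")
    print(f"managed:     ${MANAGED_SCALE} per month")
```

The totals come out at $385 and $915, with engineer time making up roughly four fifths of either figure.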

Two honest observations. First, the compute line really can favor self-hosting: on a budget or bare-metal host a 16 GB NVMe VM can run well under the $99 tier, and if you skip the standby because you don't need HA, the raw infrastructure is cheap. The spreadsheet that started this whole debate is correct as far as it goes. Second, the compute line is not where the money is. The engineer-hours dominate every other row combined, and they don't show up until you've signed up for them. At a loaded $80–120/hour, even a conservative four hours a month is more than the entire managed bill.

And the steady state isn't the part that hurts. The variance is. A botched re-embedding migration, an index that fell out of RAM after the corpus grew, a backup chain that had been writing zero-byte files since a path changed in an OS upgrade — any one of those eats a week and burns the trust of whoever was waiting on the feature you didn't ship. Self-hosting pgvector doesn't cost four hours a month; it costs four hours a month most months.

There is a real crossover, and it's worth being precise about it: self-hosting gets cheaper when the labor amortizes across many databases, not within one. A platform team already running a fleet of Postgres pays near-zero marginal cost for one more pgvector instance. A solo founder pays the whole checklist for exactly one. The math flips on headcount you already have, not on the size of any single index. If you want to pressure-test where your own crossover sits, estimate your own numbers with your real vector count, hourly rate, and whether you actually need HA — and if you'd rather verify the performance side, the pgvector-bench harness reproduces the throughput and recall figures on your own hardware rather than trusting a marketing number.
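The amortization argument can be turned into a rough break-even estimate. The model below is a deliberate simplification and an assumption of this sketch: it treats the monthly ops labor as fully shared across the fleet and each database's infrastructure cost as fixed, then asks how many databases it takes before the per-database cost drops to the managed price.

```python
# Simplified break-even model. Full labor sharing is an assumption.
import math
from typing import Optional


def break_even_fleet_size(infra_per_db: float, shared_labor: float,
                          managed_price: float) -> Optional[int]:
    """Smallest fleet size where infra + labor/n is at or below managed_price."""
    margin = managed_price - infra_per_db
    if margin <= 0:
        return None  # infrastructure alone already costs as much as managed
    return max(1, math.ceil(shared_labor / margin))


if __name__ == "__main__":
    # Low-end estimates from the cost table: $65/mo infra, 4 h at $80/h.
    print(break_even_fleet_size(65, 320, 99))
    # High-end infra ($195/mo) never undercuts the $99 tier on its own.
    print(break_even_fleet_size(195, 720, 99))
```

With the low-end estimates the crossover lands at about ten databases, which matches the point above: the math flips on fleet size and existing headcount, not on the size of any single index.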


When Self-Hosting pgvector Wins

You already employ the team. If you have platform engineers or DBAs running Postgres at scale, the checklist above is already someone's staffed job. The marginal cost of one more pgvector instance is close to zero, and a good in-house team will beat any vendor's defaults for your specific workload. Don't pay twice for ops you already own.

You have a hard control requirement. Some regulatory or contractual regimes genuinely require the data to live on hardware you control. If your auditor says the vectors cannot leave your own cloud account or on-prem hardware, that ends the discussion. Be precise about what's actually required, though — "data must stay in the EU" is a hosting-region requirement, not a self-hosting one.

Your corpus is huge and your RAM budget is the binding constraint. If you're past what a single node holds and you're already building partitioning, quantization, and sharding, you're doing real database engineering either way, and owning the box can be the right call. The principle holds: at genuine scale, the managed convenience premium shrinks relative to the engineering you're doing regardless.

You're learning. Running your own pgvector on low-stakes data is the best possible education. Fill the disk on purpose, blow up an HNSW build, restore from backup with a timer running. Six months of that and you'll understand exactly what you're paying any vendor to do — which makes you a far sharper customer.

What's not on this list: saving money at small scale with one or two databases. That's the case the VM spreadsheet makes, and it's the one the engineer-hours quietly demolish.


When Managed pgvector Wins

You're a small team and attention is your scarce resource. A database is the worst possible place to spend it, because database work is invisible when it succeeds and catastrophic when it fails. If shipping product is the goal, managed pgvector removes the only real friction without removing any of the capability — it's the same Postgres, the same SQL, the same extension.

You want predictable cost. Flat tiers mean no surprise invoice when your app gets traffic and no usage meter rewarding you for staying small. You can budget pgvector cost in advance and be right twelve months later.

You want reliability you didn't have to build. Restore-tested PITR, NVMe-backed HNSW performance, configured pooling, and automatic failover are included rather than assembled. The pgvector hosting guide lays out what "production-ready" actually means and what to demand from any provider, including this one.

You want time-to-value measured in minutes. Setup is a connection string, not a weekend of installing, tuning, hardening, and writing your first restore drill. For most teams that gap — minutes versus days, then hours-per-month forever — is the whole decision.


The Bottom Line

For most teams building AI applications in 2026, managed pgvector is the better economic choice — not because the VM is expensive, but because the VM was never the cost. The cost is the HNSW tuning, the pooling gotchas, the backups you have to restore-test, the failover you have to build, the upgrades, and the on-call, and for a team running one or two databases those hours dwarf any infrastructure bill. Managed wins on time-to-value and on reliability you didn't have to assemble.

Self-hosting wins, and wins cleanly, in two situations: you already staff the ops checklist across a fleet, or a hard control requirement takes the decision out of your hands. In both, the marginal cost of running pgvector yourself is genuinely low, and the raw-compute advantage the spreadsheet promised actually materializes.

The honest test is the same one that settles most of these debates: count both columns, not one. If your self-hosted estimate is only the VM, you haven't finished the estimate. Put your real numbers — vector count, dimensions, hourly rate, whether you need HA — into a cost model and let the two totals sit side by side. The compute line will favor DIY. The total usually won't, until you have the team to make it.

Try Managed pgvector on Rivestack

If the math comes out the way it does for most small teams, Rivestack removes the only real friction point: owning the database. You get PostgreSQL on NVMe with pgvector pre-configured, automated backups with point-in-time recovery, and pooling configured so the sharp edges above are documented up front rather than discovered in production — flat-priced, starting at $0 for the Shared tier and $15/month for a dedicated Solo node. For benchmarks, plans, sizing ceilings, and migration paths, see managed pgvector hosting on Rivestack.
