Every RAG pipeline needs a vector database. It's where your embeddings live, and where similarity search happens when a user asks a question. Pick the wrong one and you're either overpaying for infrastructure you don't need, or locked into a vendor that'll raise prices once you're dependent.
I've deployed production RAG systems on four of these. Here's what I actually think.
What a Vector Database Does
Quick primer if you're new to this. A vector database stores high-dimensional vectors (arrays of numbers, typically 384-1536 dimensions) and performs similarity search — finding the vectors closest to a query vector. If you want the full breakdown, read our interactive guide to how RAG works.
The core operation is simple: "Given this query vector, find me the top 5 most similar vectors in the database." The differences between vector databases come down to:
- Performance — How fast is that search at scale?
- Cost — What does it cost to store and query?
- Operations — How much babysitting does it need?
- Integration — Does it fit your existing stack?
- Lock-in — Can you leave without rewriting everything?
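That core "find the top 5" operation is easy to make concrete in plain Python. This is a toy exact search over made-up 3-dimensional vectors (real embeddings have hundreds of dimensions), not any database's actual implementation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of magnitudes.
    1.0 means identical direction, 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 5) -> list[tuple[str, float]]:
    """Exact (brute-force) nearest-neighbour search: score every vector,
    sort, keep the best k. Everything a vector database does is an
    optimisation of this loop."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical document embeddings, 3 dimensions for readability.
corpus = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-info": [0.1, 0.9, 0.1],
    "returns-faq":   [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], corpus, k=2))
```

Every product below is, at heart, a faster and more durable version of that loop.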
The Contenders
pgvector — The Pragmatist's Choice
What it is: A PostgreSQL extension that adds vector storage and similarity search to your existing Postgres database.
Cost: Free. It's open source. Your only cost is the PostgreSQL instance you're probably already running.
Performance: Handles 1-5 million vectors comfortably with HNSW indexing. Sub-100ms queries for most workloads. At 10M+ vectors, you'll need to tune carefully — but if you're past 10M vectors, you already know what you're doing.
The case for it: Your business data already lives in PostgreSQL. Adding pgvector means your vectors live in the same database as your users, permissions, documents, and metadata. One connection pool. One backup strategy. One set of credentials. Joins between vector results and business tables are just SQL.
The case against it: If you need to search across 100M+ vectors with sub-10ms latency, pgvector will hit a ceiling. It's also not a managed service — you manage Postgres yourself (or use a managed Postgres like Supabase/Neon/RDS).
Watch out for: IVFFlat indexing is a trap for small datasets. If you have fewer than 10,000 vectors, use exact (brute-force) search — it's actually faster. HNSW is the right index type for everything else. Also, pgvector 0.8.x has a known query planner issue with JOINs on vector-sorted results — use a subquery pattern to avoid truncated results.
Verdict: Default choice for 90% of business RAG systems. We use it in production. Our live demo runs on it.
Pinecone — The Managed Option
What it is: Fully managed vector database as a service. You upload vectors via API, query via API. No infrastructure to manage.
Cost: Free tier (100K vectors, 1 index). Starter: $70/month. Standard: starts at ~$120/month and scales with usage. Enterprise: custom pricing (read: expensive).
Performance: Excellent. Purpose-built for vector search. Sub-50ms queries at scale. Handles billions of vectors. Automatic scaling, replication, backups.
The case for it: If you genuinely need to search across hundreds of millions of vectors and don't want to manage infrastructure, Pinecone is the best managed option. The API is clean. The dashboard is good. It just works.
The case against it: Vendor lock-in. Your vectors live in Pinecone's infrastructure. If they raise prices (and they've restructured pricing multiple times), your options are "pay more" or "migrate everything." Also, your vector data is separated from your business data — every query requires a round-trip to Pinecone plus a join against your own database for metadata.
Watch out for: The free tier is generous for prototyping, but don't build production on it. The jump from free to paid is significant, and you'll hit the limits faster than you expect with real data.
Verdict: Good for large-scale, well-funded teams who value managed ops over control. Overkill for most SMBs.
Chroma — The Prototyper
What it is: Lightweight, open-source embedding database. Python-native. Designed for developer experience.
Cost: Free (open source). They also offer a hosted version.
Performance: Fine for small datasets (under 500K vectors). Persists to SQLite under the hood (earlier releases used DuckDB + Parquet). Not designed for high-concurrency production workloads.
The case for it: pip install chromadb and you're running in 30 seconds. Brilliant for prototyping, hackathons, local development, and proof-of-concept work. The API is the simplest of any option here.
The case against it: It's not a production database. No replication. No built-in backup beyond file copies. Concurrency handling is limited. The team is working on a distributed version, but as of early 2026, it's still primarily a single-node embedded database.
Watch out for: Don't prototype on Chroma and then try to "just swap in Pinecone/pgvector later." The APIs are different enough that you'll rewrite your retrieval layer anyway. If you know you're going to production, start with your production database.
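One way to soften that migration cost is to hide the store behind a thin retrieval interface from day one, so only one module ever touches a vendor SDK. A minimal sketch, with a toy in-memory backend standing in for whichever database you pick:

```python
from typing import Protocol

class VectorStore(Protocol):
    """The two operations most RAG retrieval layers actually need.
    Implement once per backend (Chroma, pgvector, ...) and the rest of
    the pipeline never imports a vendor SDK directly."""
    def add(self, doc_id: str, vector: list[float], text: str) -> None: ...
    def search(self, query: list[float], k: int) -> list[str]: ...

class InMemoryStore:
    """Toy backend for tests and prototypes: exact search, no persistence."""
    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float], str]] = []

    def add(self, doc_id: str, vector: list[float], text: str) -> None:
        self._rows.append((doc_id, vector, text))

    def search(self, query: list[float], k: int) -> list[str]:
        def dot(a, b):  # assumes normalised embeddings, so dot == cosine
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self._rows, key=lambda r: dot(query, r[1]), reverse=True)
        return [text for _, _, text in ranked[:k]]

store: VectorStore = InMemoryStore()
store.add("a", [1.0, 0.0], "refund policy")
store.add("b", [0.0, 1.0], "shipping info")
print(store.search([0.9, 0.1], k=1))
```

This doesn't make the swap free — you still rewrite one adapter — but it keeps the rewrite out of your application logic.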
Verdict: Great for learning and prototyping. Not for production. Build your PoC here if you want speed, but plan the migration from day one.
Weaviate — The Feature-Rich Option
What it is: Open-source vector database with built-in modules for vectorization, generative search, and hybrid (keyword + vector) search.
Cost: Free (self-hosted). Weaviate Cloud: starts at ~$25/month for serverless, ~$145/month for dedicated instances.
Performance: Strong. HNSW indexing, filtering, multi-tenancy. Handles millions of vectors well. Good concurrent query performance.
The case for it: If you need hybrid search (combining traditional keyword search with vector similarity), Weaviate does it natively and does it well. The built-in vectorization modules mean you can skip the embedding step — Weaviate handles it internally. GraphQL API is powerful if you like that kind of thing.
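Weaviate does this natively, but the underlying idea is simple enough to show. Here's a generic reciprocal-rank-fusion sketch (a standard way to merge a keyword ranking with a vector ranking, not Weaviate's actual implementation):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists: each document scores 1/(k + rank) per
    list it appears in; highest combined score wins. k=60 is the constant
    from the original RRF paper."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: BM25 keyword hits and vector-similarity hits.
keyword_hits = ["invoice-2024", "pricing-faq", "refund-policy"]
vector_hits = ["refund-policy", "invoice-2024", "returns-guide"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that rank well in both lists float to the top, which is exactly why hybrid search beats either method alone on queries containing exact terms (invoice numbers, product codes) plus fuzzy intent.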
The case against it: It's a separate service to deploy and manage. More moving parts than pgvector. The learning curve is steeper — it has its own schema system, its own query language, its own concepts. If you just need "store vectors, query vectors," Weaviate is over-engineered for the job.
Watch out for: Resource consumption. Weaviate is a Go application that maintains HNSW indexes in memory. For large datasets, memory requirements can surprise you. Plan your VM sizing carefully.
Verdict: Good choice if hybrid search is a core requirement. Otherwise, you're adding complexity without proportional benefit.
Qdrant — The Performance Play
What it is: Vector database written in Rust, focused on performance and advanced filtering.
Cost: Free (self-hosted). Qdrant Cloud: free tier (1GB), then ~$30/month and up.
Performance: Among the fastest. Rust implementation gives it strong single-node performance. Excellent filtered search (searching vectors where metadata matches certain conditions).
The case for it: If you need vector search with complex metadata filtering (e.g., "find similar documents, but only from department X, created after 2024, with security clearance Y"), Qdrant handles this more efficiently than most alternatives. The REST and gRPC APIs are well-designed.
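Qdrant expresses those conditions as a structured filter object attached to the search request. Roughly, the example above looks like this — field names (`department`, `created_at`, `clearance`) are hypothetical, but the `must` / `key` / `match` / `range` shape follows Qdrant's filtering API:

```python
# Hedged sketch of a Qdrant filter payload (REST/JSON form) for:
# "only department X, created after 2024, with clearance Y".
query_filter = {
    "must": [
        {"key": "department", "match": {"value": "X"}},
        {"key": "created_at", "range": {"gte": "2024-01-01"}},
        {"key": "clearance", "match": {"value": "Y"}},
    ]
}
print(query_filter["must"][0]["key"])
```

The point isn't the syntax; it's that Qdrant evaluates these conditions during the index traversal rather than filtering results after the fact, which is where the efficiency comes from.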
The case against it: It's another service to run. Smaller community than Postgres or Pinecone. If your filtering needs are simple (most are), you're not benefiting from Qdrant's main advantage.
Verdict: Impressive engineering. Worth considering if you have complex multi-dimensional filtering requirements. For straightforward RAG, pgvector does the same job with less infrastructure.
Milvus — The Enterprise Scale
What it is: Distributed vector database designed for massive scale. Backed by Zilliz.
Cost: Free (self-hosted, but requires etcd + MinIO + the Milvus cluster). Zilliz Cloud: free tier, then usage-based pricing.
Performance: Built for billion-scale datasets. GPU-accelerated indexing available. Handles multi-tenancy and sharding natively.
The case for it: If you're building a vector search system at genuine internet scale — hundreds of millions to billions of vectors, high QPS, multi-region — Milvus is designed for that. It's the Elasticsearch of vector search.
The case against it: Operationally complex. A production Milvus deployment involves multiple services (proxy, data node, query node, index node, etcd, object storage). For a team of 20-200 employees with thousands of documents, this is like buying a semi truck to get groceries.
Verdict: Right tool if you're building a product that serves millions of users with vector search. Wrong tool for your company's internal knowledge base.
The Decision Matrix
Cut through the noise:
- You already use PostgreSQL → pgvector. No question. You're adding an extension, not a new system.
- You have <100K documents → pgvector. It'll handle this without breaking a sweat.
- You need managed + massive scale → Pinecone. You're paying for ops you don't want to do.
- You need hybrid keyword + vector search → Weaviate. Hybrid search is its best-in-class feature.
- You need complex metadata filtering → Qdrant. Built specifically for this.
- You're prototyping this weekend → Chroma. Ship the demo, plan the migration.
- You're building at billion-vector scale → Milvus. But you already knew that.
- You're not sure → pgvector. It's the safest default with the lowest regret.
What We Use and Why
At Deep Conduit, we default to pgvector for client projects. Here's why:
- Every client already has PostgreSQL (or gets one). Zero new infrastructure.
- Vectors live next to business data. Permissions, audit logs, metadata — all in one place.
- No vendor lock-in. pgvector is open source and part of the Postgres ecosystem.
- For the document volumes most businesses deal with (1K-500K documents, producing 10K-5M chunks), pgvector with HNSW indexing delivers sub-100ms queries reliably.
- Our clients own their data. Not a third-party cloud service. Not a SaaS provider's infrastructure. Their Postgres instance.
We've built production systems with pgvector handling real workloads. Our live demo runs on PostgreSQL 16 + pgvector 0.8, using 384-dimensional embeddings from a local sentence-transformers model. It searches across 60 chunks from 9 documents in under 50ms. Scale that to 100K chunks and you're still under 100ms with HNSW.
Bottom Line
The vector database choice matters less than people think. What actually matters is your chunking strategy, embedding model selection, retrieval quality, and prompt engineering. Get those right and any of these databases will serve you well.
But if you're asking "which should I pick?" — pgvector. It's the boring, correct answer. And in infrastructure, boring is a compliment.
Want to see a pgvector-powered RAG system in action? Try our live demo, read the full pipeline breakdown, or get in touch to discuss building one for your business.