Search has mostly run on keywords. You type a word, the system finds the documents with that word in them. It’s fast, it’s everywhere, and it falls apart the moment the words don’t match.
Someone searches your help docs for “cancel my plan” and the page that answers it says “end your subscription.” Same meaning, no shared words. Keyword search misses it completely. That gap is the whole reason vector databases exist.
What it actually is
A vector database stores meaning instead of text.
I wrote about embeddings before. You run a piece of text through a model and get back a list of numbers that captures what it’s about. Two pieces of text that mean the same thing end up with similar numbers, even when they share no words. “Cancel my plan” and “end your subscription” land right next to each other.
A vector database is built to hold millions of those number-lists and answer one question fast: what’s closest to this? You hand it the numbers for a search, it hands back the nearest matches by meaning. That’s the whole job. It’s a search engine for similarity instead of keywords.
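Here’s a minimal sketch of that one question in Python, with no database involved. The three-number vectors are made up by hand for illustration (real embeddings come from a model and have hundreds of dimensions), and the function names are mine. “Closest” here means cosine similarity, which is a common choice but not the only one.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by angle: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, stored, k=2):
    """Compare the query against every stored vector and keep the top k."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(stored)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hand-made stand-ins for embeddings (assumption: not produced by a real model).
texts = ["end your subscription", "reset your password", "update billing address"]
stored = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.2], [0.3, 0.1, 0.9]]
query = [0.8, 0.2, 0.1]  # stands in for the embedding of "cancel my plan"

results = nearest(query, stored, k=2)
best_text = texts[results[0][0]]
```

With these numbers the top match is “end your subscription”, even though the query shares no words with it. Note that nearest looks at every stored vector, which is fine for three and painful for millions.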
Why it needs to be its own thing
You might wonder why a normal database can’t just do this. It can, technically. It just can’t do it fast once the data gets big.
Finding the exact nearest matches means comparing your search against everything in the store. Do that across millions of records, on every single query, and a regular database crawls. Vector databases are built around one trick, usually called approximate nearest neighbor search. They organize the vectors into an index ahead of time so they can find the closest ones without checking every last one. They give up a sliver of accuracy for a huge jump in speed, and for this job that’s the right trade.
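To make the trick concrete, here’s a toy version in Python. It’s a sketch under assumptions, not how any particular product works: it groups vectors into buckets around a few randomly picked centroids ahead of time, then at query time only looks inside the buckets closest to the query. The names BucketIndex and n_probe are mine, and the random vectors stand in for real embeddings.

```python
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class BucketIndex:
    """Toy index: group vectors by their nearest centroid ahead of time."""

    def __init__(self, vectors, n_buckets, seed=0):
        self.vectors = vectors
        rng = random.Random(seed)
        # Assumption: centroids are randomly sampled vectors, not trained (e.g. with k-means).
        self.centroids = rng.sample(vectors, n_buckets)
        self.buckets = [[] for _ in range(n_buckets)]
        for idx, vec in enumerate(vectors):
            self.buckets[self._closest_centroids(vec, 1)[0]].append(idx)

    def _closest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda c: cosine_similarity(vec, self.centroids[c]),
                       reverse=True)
        return order[:n]

    def search(self, query, k=3, n_probe=2):
        """Return (top-k (index, similarity) pairs, number of stored vectors compared)."""
        candidates = [i for c in self._closest_centroids(query, n_probe)
                      for i in self.buckets[c]]
        scored = sorted(((i, cosine_similarity(query, self.vectors[i])) for i in candidates),
                        key=lambda pair: pair[1], reverse=True)
        return scored[:k], len(candidates)

# Random stand-ins for embeddings.
rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
query = [rng.gauss(0, 1) for _ in range(8)]

index = BucketIndex(vectors, n_buckets=20)
approx, checked = index.search(query, k=3, n_probe=3)       # looks in 3 of 20 buckets
exact, checked_all = index.search(query, k=3, n_probe=20)   # looks in every bucket
```

Probing 3 buckets compares the query against only a fraction of the 1,000 vectors, at the risk of missing a true neighbor that landed in a bucket it skipped. Probing all 20 is just the slow exact search again. That dial between the two is the accuracy-for-speed trade.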
Where it fits
If you read the RAG piece, this is the retrieval half of it.
The flow goes like this. Chunk your documents, turn each chunk into an embedding, store them in a vector database. A question comes in, you embed the question, ask the database for the closest chunks, and hand those to the model to answer from. The vector database is the part that decides what the model gets to see. Get that wrong and nothing downstream can save you.
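The flow above can be sketched end to end in a few lines of Python. Big caveat: the embedder here is a toy that just counts words from the documents, so unlike a real embedding model it only matches on shared words. It’s there so the example runs without a model. Every function name is made up for illustration, and the final step only builds the prompt text; it doesn’t call any model.

```python
import math
import re

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def chunk(document):
    """Split a document into sentence-sized chunks."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def make_toy_embedder(chunks):
    """Assumption: a word-count stand-in for a real embedding model."""
    vocab = sorted({w for c in chunks for w in tokenize(c)})
    def embed(text):
        words = tokenize(text)
        return [float(words.count(w)) for w in vocab]
    return embed

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, vectors, embed, k=1):
    """Embed the question and return the k closest chunks."""
    q = embed(question)
    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine_similarity(q, vectors[i]),
                    reverse=True)
    return [chunks[i] for i in ranked[:k]]

def build_prompt(question, context_chunks):
    """Assemble what the model would get to see."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

document = (
    "To end your subscription, open Billing and choose Cancel. "
    "To reset your password, open Security and choose Reset. "
    "To change your email, open Profile and choose Edit."
)

chunks = chunk(document)                   # 1. chunk the documents
embed = make_toy_embedder(chunks)
vectors = [embed(c) for c in chunks]       # 2. embed and store each chunk
question = "How do I reset my password?"
top_chunks = retrieve(question, chunks, vectors, embed, k=1)  # 3. closest chunks
prompt = build_prompt(question, top_chunks)                   # 4. hand them to the model
```

The list called vectors is playing the role of the vector database here. Whatever retrieve returns is all the model ever sees, which is why a bad match at this step can’t be fixed later.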
Do you actually need one
Maybe not, and this is the part the hype skips.
If you’ve got a few hundred documents, you don’t need a dedicated vector database. You can keep the vectors in memory, or add the capability to the database you already run. Postgres does it with an extension called pgvector. Plenty of real systems never outgrow that.
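For a feel of the pgvector route, here’s a rough sketch of the SQL involved, wrapped in Python so the pieces sit in one place. Treat the details as assumptions: the chunks table and its columns are hypothetical, the 3-dimension column is only for readability, and the %s placeholders assume a driver like psycopg. Nothing here connects to a database. The `<=>` operator is pgvector’s cosine distance.

```python
def to_pgvector_literal(vector):
    """Format a list of numbers the way pgvector accepts them as text, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"

# One-time setup (hypothetical table and column names).
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE chunks (
    id bigserial PRIMARY KEY,
    body text NOT NULL,
    embedding vector(3)  -- match this to your embedding model's dimension
);
"""

# Store a chunk alongside its embedding.
INSERT_SQL = "INSERT INTO chunks (body, embedding) VALUES (%s, %s::vector);"

# Ask for the closest chunks by cosine distance (smaller is closer).
SEARCH_SQL = """
SELECT id, body
FROM chunks
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

# Parameters you would pass to the driver along with the statements above.
insert_params = ("end your subscription", to_pgvector_literal([0.9, 0.1, 0.0]))
search_params = (to_pgvector_literal([0.8, 0.2, 0.1]), 5)
```

The appeal is that the vectors live next to the rest of your data, in a database you already back up and know how to run.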
You reach for a dedicated one, something like Pinecone, Weaviate, or Qdrant, when the numbers get big. Millions of chunks, queries that have to come back fast, and a store that stays fast as you keep adding to it. That’s a real problem to have. It’s just not the problem most people actually have yet.
Start with the simplest thing that works. Move up when something forces you to, not because a tool’s landing page told you to.