What are Vector Databases

Nnaemezue Obi-Eyisi
3 min read · Jul 14, 2023

Vector databases are specialized databases designed to store and retrieve vector representations of data. In the context of natural language processing and language models like ChatGPT, vector databases can be useful for various tasks, including semantic search, recommendation systems, and similarity matching.

Let’s break it down in simple terms.

Imagine you have a bunch of documents, like articles or blog posts, and you want to find similar ones quickly. A vector database can help you with that.

But what are vectors? Well, think of a vector as a fingerprint for each document. It's a list of numbers that represents the essence of the text, and documents with similar meaning end up with similar lists of numbers.
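To make the fingerprint idea concrete, here is a minimal sketch in Python. It is only an illustration: the `VOCAB` list and the `toy_embed` function are made up for this example, and the "fingerprint" is just a word count. Real systems typically get their vectors from a trained embedding model with hundreds of dimensions.

```python
from collections import Counter

# Hypothetical tiny vocabulary, chosen only for this illustration.
VOCAB = ["cat", "dog", "pet", "stock", "market", "price"]


def toy_embed(text: str) -> list[float]:
    """Turn text into a vector by counting how often each vocabulary word appears."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


# Each position in the vector lines up with one word in VOCAB.
print(toy_embed("My cat and my dog are each a great pet"))
# → [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
```

The point is simply that text goes in and a fixed-length list of numbers comes out.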

Now, a vector database is like a special storage system that keeps all these fingerprints organized. It knows how to compare the fingerprints of different documents and find the ones that are most similar.
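How might two fingerprints be compared? One common measure is cosine similarity, which looks at how closely two vectors point in the same direction. The sketch below assumes cosine similarity and uses made-up example vectors; a given vector database may use a different measure, such as Euclidean distance or dot product.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return a score near 1.0 for vectors pointing the same way, 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Made-up fingerprints: two documents about pets, one about finance.
pets_a = [1.0, 1.0, 1.0, 0.0]
pets_b = [1.0, 0.0, 1.0, 0.0]
finance = [0.0, 0.0, 0.0, 1.0]

print(round(cosine_similarity(pets_a, pets_b), 3))  # → 0.816
print(cosine_similarity(pets_a, finance))           # → 0.0
```

The two pet documents score high, while the finance document scores zero against them.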

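So, when you have a new document and you want to find similar ones, you give it to the vector database. A fingerprint is calculated for the new document (usually by an embedding model, which some vector databases run for you) and compared with the fingerprints already stored. Most vector databases use an index to narrow down the comparison instead of checking every stored fingerprint one at a time. The database then tells you which documents are the closest matches based on their fingerprints.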
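Putting the pieces together, the lookup described above can be sketched as a tiny in-memory store. Everything here is hypothetical and for illustration only: `ToyVectorStore`, `toy_embed`, and the word-count vocabulary are not from any real product. The sketch also compares the query against every stored vector, which is fine for three documents, whereas production vector databases generally rely on approximate nearest-neighbor indexes to stay fast at scale.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary; a real system would use an embedding model.
VOCAB = ["cat", "dog", "pet", "stock", "market", "price"]


def toy_embed(text: str) -> list[float]:
    """Word-count fingerprint over VOCAB (illustration only)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (doc_id, vector) pairs and answers 'what is most similar?' queries."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        # Store the fingerprint alongside an identifier for the document.
        self._items.append((doc_id, toy_embed(text)))

    def search(self, text: str, k: int = 2) -> list[tuple[str, float]]:
        # Fingerprint the query, score it against every stored vector,
        # and return the k best matches (brute force, no index).
        query = toy_embed(text)
        scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = ToyVectorStore()
store.add("pets-101", "how to care for a cat or dog as a pet")
store.add("markets", "stock market price moves explained")
store.add("dog-training", "train your dog to be a calm pet")

results = store.search("choosing a pet dog", k=2)
print([doc_id for doc_id, _ in results])
# → ['dog-training', 'pets-101']
```

The finance article never shows up in the results because its fingerprint shares nothing with the query's.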

This can be super helpful because it saves you a lot of time searching through all the documents one by…


Nnaemezue Obi-Eyisi

I am passionate about empowering, educating, and encouraging individuals pursuing a career in data engineering. I am currently a Senior Data Engineer at Capgemini.