A vector database stores pieces of information as vectors. Vector databases cluster related items together, enabling similarity searches and the construction of powerful AI models.
A vector database is a collection of data stored as mathematical representations. Vector databases make it easier for machine learning models to remember previous inputs, allowing machine learning to be used to power search, recommendations, and text generation use-cases. Data can be identified based on similarity metrics instead of exact matches, making it possible for a computer model to understand data contextually.
When someone visits a shoe store, a salesperson may suggest shoes similar to the pair they like. Likewise, an ecommerce store may suggest similar items under a header like "Customers also bought..." Vector databases enable machine learning models to identify similar objects, just as the salesperson can find comparable shoes and the ecommerce store can suggest related products. (In fact, the ecommerce store may use such a machine learning model to do so.)
To summarize, vector databases make it possible for computer programs to draw comparisons, identify relationships, and understand context. This enables the creation of advanced artificial intelligence (AI) programs like large language models (LLMs).
Figure: In this simple vector database, the documents in the upper right are likely similar to each other.
A vector is an array of numerical values, typically floating-point numbers, that expresses a location along several dimensions.
In more everyday language, a vector is a list of numbers, like: {12, 13, 19, 8, 9}. These numbers indicate a location within a space, just as a row and column number indicates a certain cell in a spreadsheet (e.g. "B7").
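To make that concrete, here is a minimal TypeScript sketch (the two vectors and the distance function are illustrative, not drawn from any particular database) that treats two such lists of numbers as points and measures how far apart they are:

```typescript
// Two vectors: each is just an ordered list of numbers, i.e. a point in a
// five-dimensional space (these particular values are made up for illustration).
const a: number[] = [12, 13, 19, 8, 9];
const b: number[] = [11, 14, 20, 7, 10];

// Euclidean distance between the two points: the smaller the distance,
// the closer (and, in a vector database, the more similar) the two items are.
function euclideanDistance(x: number[], y: number[]): number {
  return Math.sqrt(x.reduce((sum, xi, i) => sum + (xi - y[i]) ** 2, 0));
}

console.log(euclideanDistance(a, b)); // ≈ 2.24, so these two points sit close together
```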
Each vector in a vector database corresponds to an object or item, whether that is a word, an image, a video, a movie, a document, or any other piece of data. These vectors are likely to be lengthy and complex, expressing the location of each object along dozens or even hundreds of dimensions.
For example, a vector database of movies may locate movies along dimensions like running time, genre, year released, parental guidance rating, number of actors in common, number of viewers in common, and so on. If these vectors are created accurately, then similar movies are likely to end up clustered together in the vector database.
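As a sketch of how that clustering can be measured, the snippet below compares hypothetical movie vectors (the dimension values are made up and already normalized, as real systems would also do) using cosine similarity, one common similarity metric in vector databases:

```typescript
// Hypothetical, already-normalized scores along a few dimensions
// (say, running time, recency, genre, and audience overlap).
const movies: Record<string, number[]> = {
  "Movie A": [0.8, 0.9, 0.1, 0.7],
  "Movie B": [0.75, 0.85, 0.15, 0.65],
  "Movie C": [0.2, 0.1, 0.9, 0.3],
};

// Cosine similarity: values near 1 mean the vectors point in nearly the same
// direction, i.e. the movies are similar along these dimensions.
function cosineSimilarity(x: number[], y: number[]): number {
  const dot = x.reduce((sum, xi, i) => sum + xi * y[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, vi) => sum + vi * vi, 0));
  return dot / (norm(x) * norm(y));
}

console.log(cosineSimilarity(movies["Movie A"], movies["Movie B"])); // ≈ 0.999: A and B cluster together
console.log(cosineSimilarity(movies["Movie A"], movies["Movie C"])); // ≈ 0.40: C sits far away from A
```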
Embeddings are vectors generated by neural networks. A typical vector database for a deep learning model is composed of embeddings. Once a neural network is properly fine-tuned, it can generate embeddings on its own so that they do not have to be created manually. These embeddings can then be used for similarity searches, contextual analysis, generative AI, and so on, as described above.
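As a rough sketch, with a hypothetical embedText() function standing in for whatever embedding model is in use, an embedding is nothing more than such a vector returned by the model:

```typescript
// Hypothetical embedding model call: text goes in, a fixed-length vector of
// floating-point numbers comes out.
declare function embedText(text: string): Promise<number[]>;

const a = await embedText("How do I return a pair of shoes?");
const b = await embedText("What is your policy on sending items back?");
// a and b are just vectors (for many models, a few hundred numbers each).
// Because the two sentences mean similar things, a well-trained model places
// them close together, so a similarity search would surface one given the other.
```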
Querying a machine learning model on its own, without a vector database, is neither fast nor cost-effective. Machine learning models cannot remember anything beyond what they were trained on. They have to be given the relevant context every single time (which is how many simple chatbots work).
Passing the context along with a query to the model every time is slow, because the context is likely to be a lot of data, and expensive, because that data has to move around and computing power has to be spent repeatedly having the model parse the same data. And in practice, most machine learning APIs are likely constrained in how much data they can accept at once anyway.
This is where a vector database comes in handy: a dataset goes through the model only once (or periodically as it changes), and the model's embeddings of that data are stored in a vector database.
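Sketched out with hypothetical helpers (an embedText() model call and a vectorDb client, stand-ins for whatever embedding model and vector database are in use), the indexing step might look like this:

```typescript
// Hypothetical stand-ins for a real embedding model and vector database client.
declare function embedText(text: string): Promise<number[]>;
declare const vectorDb: {
  insert(record: { id: string; values: number[] }): Promise<void>;
};

interface Doc {
  id: string;
  text: string;
}

// Each document goes through the model once; its embedding is stored in the
// vector database under the document's ID for all future queries.
async function indexDocuments(docs: Doc[]): Promise<void> {
  for (const doc of docs) {
    const embedding = await embedText(doc.text);
    await vectorDb.insert({ id: doc.id, values: embedding });
  }
}
```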
This saves a tremendous amount of processing time. It makes building user-facing applications around semantic search, classification, and anomaly detection possible, because results come back within tens of milliseconds, without waiting for the model to crunch through the whole data set.
To run a query, developers ask the machine learning model for a representation (embedding) of just that query. The embedding can then be passed to the vector database, which returns similar embeddings that have already been run through the model. Those embeddings can then be mapped back to their original content: a URL for a page, a link to an image, or a product SKU.
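Continuing the same hypothetical sketch, the query path only runs the model on the short query text and lets the vector database do the similarity search:

```typescript
// Hypothetical stand-ins, as in the indexing sketch above.
declare function embedText(text: string): Promise<number[]>;
declare const vectorDb: {
  query(values: number[], options: { topK: number }): Promise<{ id: string; score: number }[]>;
};

async function search(query: string): Promise<{ id: string; score: number }[]> {
  // The only model call at query time: embed the query itself.
  const queryEmbedding = await embedText(query);
  // The vector database compares it against embeddings stored at index time.
  const matches = await vectorDb.query(queryEmbedding, { topK: 5 });
  // Each match's id maps back to the original content: a page URL, an image, a product SKU.
  return matches;
}
```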
To summarize: Vector databases work at scale, work quickly, and are more cost-effective than querying machine learning models without them.
Vectorize is a globally distributed vector database offered by Cloudflare. Applications built on Cloudflare Workers can use Vectorize to query documents stored in Workers KV, images stored in R2, or user profiles stored in D1. Just as Workers allows developers to build applications without spinning up any backend infrastructure, Vectorize allows developers to build AI capabilities into their applications without constructing their own vector database infrastructure. And for creating embeddings, Cloudflare offers Workers AI.
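As a rough illustration of how those pieces can fit together (the binding names, the embedding model, and the response handling below are assumptions for the sketch, not a definitive setup), a Worker might embed a query with Workers AI, look up similar documents in Vectorize, and fetch the originals from Workers KV:

```typescript
interface Env {
  AI: any;            // Workers AI binding (assumed name)
  VECTOR_INDEX: any;  // Vectorize index binding (assumed name)
  DOCS: any;          // Workers KV namespace holding the original documents (assumed name)
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const query = new URL(request.url).searchParams.get("q") ?? "";

    // 1. Turn the query into an embedding with a Workers AI text embedding model.
    const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });

    // 2. Ask Vectorize for the stored embeddings most similar to the query embedding.
    const results = await env.VECTOR_INDEX.query(data[0], { topK: 3 });

    // 3. Map the matched IDs back to the original documents stored in Workers KV.
    const docs = await Promise.all(
      results.matches.map((match: { id: string }) => env.DOCS.get(match.id))
    );

    return Response.json(docs);
  },
};
```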
Learn about building AI-driven applications on Cloudflare.