
What are Embeddings?

Numerical vector representations of text that encode semantic meaning.

Definition

Embeddings are dense numerical vectors produced by machine learning models that capture the semantic meaning of text (or images, audio, etc.). Words, sentences, or documents with similar meanings produce similar vectors — meaning their vectors are close in high-dimensional space. Embeddings are the prerequisite for semantic search and RAG: you first embed your documents, store the vectors, then embed a query to find semantically similar documents.
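The embed-store-query pipeline above can be sketched with plain NumPy. The vectors below are tiny hand-made stand-ins (real embedding models output hundreds of dimensions from a neural network), and cosine similarity is a common but not the only choice of distance:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" (hand-made for illustration; a real system
# would produce these by calling an embedding model on each document).
documents = {
    "the cat sat on the mat":     np.array([0.9, 0.1, 0.0, 0.3]),
    "a feline rested on the rug": np.array([0.8, 0.2, 0.1, 0.4]),
    "stock prices fell sharply":  np.array([0.0, 0.9, 0.8, 0.1]),
}

# Pretend embedding of the query "where is the cat?".
query_vector = np.array([0.85, 0.15, 0.05, 0.35])

# Rank stored documents by similarity to the query embedding.
ranked = sorted(documents.items(),
                key=lambda kv: cosine_similarity(query_vector, kv[1]),
                reverse=True)
```

With these vectors, both cat sentences score far above the unrelated finance sentence, which is the behaviour semantic search relies on.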

Example

The sentences 'the cat sat on the mat' and 'a feline rested on the rug' produce similar embeddings — even though they share no words. A search for either would find the other.

Embeddings vs. vector databases: What's the difference?

Embeddings

Numerical vector representations of text that encode semantic meaning.

Vector database

Embeddings are the vectors themselves. A vector database is where they're stored and searched.
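That division of labour can be made concrete with a minimal in-memory sketch: the embeddings are just arrays, and the "database" is whatever stores them and answers nearest-neighbour queries. The class below is an illustration only; real vector databases add persistence, approximate-nearest-neighbour indexes, and metadata filtering on top of this idea:

```python
import numpy as np

class InMemoryVectorStore:
    """Toy stand-in for a vector database: holds texts with their
    embedding vectors and returns the most similar ones to a query."""

    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str, embedding: np.ndarray) -> None:
        # The embedding comes from a model; the store just keeps it,
        # normalized so that a dot product equals cosine similarity.
        self.texts.append(text)
        self.vectors.append(embedding / np.linalg.norm(embedding))

    def search(self, query_embedding: np.ndarray, k: int = 1) -> list[str]:
        q = query_embedding / np.linalg.norm(query_embedding)
        scores = [float(np.dot(q, v)) for v in self.vectors]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        return [self.texts[i] for i in top]
```

The store never computes an embedding itself; it only indexes and searches vectors produced elsewhere, which is exactly the separation described above.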
