Scaleup Infotech
AI & ML · 9 min read

Vector Databases Explained: pgvector, Pinecone and Embeddings

Scaleup Infotech

Software & Marketing Agency

Jun 08, 2026
Tags: Vector Database, Embeddings, pgvector, AI

Vector databases power semantic search and retrieval-augmented generation (RAG). Instead of matching keywords, they find items by *meaning*, represented as high-dimensional vectors. Here's the concept and how to choose one.

Embeddings: Meaning as Numbers

An embedding model turns text (or images) into a vector — a list of numbers where similar meanings sit close together in space. 'Dog' and 'puppy' land near each other; 'dog' and 'invoice' do not.
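To make "close together in space" concrete, here is a minimal sketch using hand-made three-dimensional vectors. The numbers are invented for illustration and do not come from any real embedding model, which would typically output hundreds or thousands of dimensions. `cosine_similarity` is a helper defined here, not a library call.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": made-up values, not output from a real model.
toy_embeddings = {
    "dog":     [0.90, 0.80, 0.10],
    "puppy":   [0.85, 0.90, 0.15],
    "invoice": [0.10, 0.05, 0.95],
}

dog_puppy = cosine_similarity(toy_embeddings["dog"], toy_embeddings["puppy"])
dog_invoice = cosine_similarity(toy_embeddings["dog"], toy_embeddings["invoice"])
print(f"dog vs puppy:   {dog_puppy:.3f}")
print(f"dog vs invoice: {dog_invoice:.3f}")
```

With these toy values the first score comes out close to 1 and the second much lower, which is the pattern a real model is trained to produce for related and unrelated meanings.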

Similarity Search

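To answer a query, you embed it with the same model and find the stored vectors nearest to it, typically by cosine similarity. Comparing the query against every vector is too slow at scale, so vector databases use Approximate Nearest Neighbor (ANN) indexes such as HNSW and IVF, which trade a little accuracy for fast search across millions of vectors.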
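For intuition, the exact (brute-force) version of this search can be sketched in a few lines. This is the baseline that ANN indexes approximate, not how HNSW or IVF work internally. The vectors, labels, and the `nearest` helper are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(query, items, k=2):
    """Exact top-k search: score every stored vector, then sort (O(n) work per query)."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in items]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]

# Made-up 3-d vectors standing in for stored document embeddings.
stored = [
    ("dog care guide", [0.90, 0.80, 0.10]),
    ("puppy training", [0.85, 0.90, 0.15]),
    ("invoice template", [0.10, 0.05, 0.95]),
]
query_vec = [0.88, 0.85, 0.12]  # pretend this is the embedded query
top = nearest(query_vec, stored, k=2)
print(top)
```

Every query here touches every stored vector. That linear cost is what an ANN index avoids, at the price of occasionally missing a true nearest neighbor.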

pgvector vs Dedicated Stores

  • pgvector — a Postgres extension. Perfect if your data already lives in Postgres; one database, transactional, no new infra.
  • Pinecone / Weaviate / Qdrant — purpose-built for vectors at massive scale, with managed hosting and advanced filtering.
  • Start with pgvector for most apps; graduate to a dedicated store when you hit tens of millions of vectors or need specialized features.
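```sql
-- pgvector: store and query embeddings in Postgres
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (
  id serial PRIMARY KEY,
  content text,
  embedding vector(1536)  -- must match the embedding model's output dimension
);

-- Nearest neighbors by cosine distance (<=> is pgvector's cosine distance operator).
-- $1 is the query embedding, passed as a vector parameter.
SELECT content FROM docs ORDER BY embedding <=> $1 LIMIT 5;
```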
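From application code, the query embedding has to be passed as the `$1` parameter. pgvector accepts vectors in a bracketed text form such as `[0.1,0.2,0.3]`, so one simple approach is to format the list as a string and cast it to `vector`. The sketch below only builds the parameter and does not connect to a database. `to_pgvector_literal` is a hypothetical helper, and the `%s` placeholder style is an assumption that depends on your driver.

```python
def to_pgvector_literal(vec):
    """Format a sequence of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

# The placeholder style (%s here) varies by database driver; adjust as needed.
QUERY = "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT 5"

query_embedding = [0.1, 0.2, 0.3]  # toy value; a real one would have 1536 entries
param = to_pgvector_literal(query_embedding)
print(QUERY)
print(param)
```

Passing the vector as a bound parameter rather than splicing it into the SQL string keeps the query safe to reuse.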

Match Dimensions

Your column dimension must match your embedding model's output (e.g. 1536). Vectors from different models are not comparable, even when their dimensions happen to match, so keep one model per index and re-embed your data if you switch models.
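A cheap guard is to check the vector length in application code before writing to the database. The sketch below assumes a 1536-dimension column as in the example above. `check_dimension` and `EXPECTED_DIM` are illustrative names, and 768 is just an example of a different model's output size.

```python
EXPECTED_DIM = 1536  # must equal the N in the column type vector(N)

def check_dimension(embedding, expected=EXPECTED_DIM):
    """Raise ValueError if the embedding length differs from the column dimension."""
    if len(embedding) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(embedding)}")
    return embedding

ok = check_dimension([0.0] * 1536)  # passes through unchanged

try:
    check_dimension([0.0] * 768)  # e.g. a vector from a different model
    mismatch_caught = False
except ValueError as exc:
    mismatch_caught = True
    print(exc)
```

A length check only catches dimension mismatches. It cannot tell you that two same-sized vectors came from different models, so it helps to record the model name alongside each embedding.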
