Master vector search with pgvector in PostgreSQL. Covers embedding fundamentals, similarity metrics, IVFFlat vs HNSW indexes, database selection, and complete Bun/TypeScript implementation for AI-powered semantic search applications.
Written by: Chia1104 CC BY-NC-SA 4.0

Vector search is essentially the process of finding similar points in multidimensional space. It converts complex data (text, images, audio) into mathematical vectors and determines similarity by calculating the distance between vectors.
Embeddings are normalized unit vectors typically generated by AI models, with each value ranging between -1 and 1. For example, OpenAI's text-embedding-3-small model converts a piece of text into a vector array of 1536 dimensions. These vectors capture semantic information, making "Taipei is the capital" and "Taiwan's administrative center" close together in vector space.
Common methods for vector search include:
Cosine similarity is a commonly used metric for measuring the directional similarity between two vectors. The formula is as follows: milvus

Cosine Similarity Formula
Cosine Distance Formula
Cosine distance is the complement of cosine similarity, used to measure the distance between two vectors:
Use Cases: Text similarity comparison, recommendation systems, RAG (Retrieval-Augmented Generation).
Euclidean distance measures the straight-line distance between two points in multidimensional space. It's the most intuitive distance metric, suitable for scenarios requiring consideration of absolute vector magnitude.

Euclidean Distance Formula
Range: 0 to ∞, smaller values indicate greater similarity. Use Cases: Image feature matching, numerical data comparison, anomaly detection.
When selecting a vector database, pgvector as a PostgreSQL extension offers these core advantages:
| Advantage | Description |
|---|---|
| Seamless Integration | Integrates seamlessly with existing relational databases without introducing new database systems |
| Native SQL Support | Uses familiar SQL syntax for vector queries with low learning curve |
| Hybrid Queries | Supports both vector similarity search and traditional SQL conditional filtering |
| Mature Ecosystem | Built on PostgreSQL's stability and rich toolchain |
| Cost-Effective | Open-source and free, no additional vector database subscription fees |
Index Dimension Limit: pgvector's indexing functionality supports up to 2000-dimensional vectors. This is due to PostgreSQL's default page size (8KB) limitation. While higher-dimensional vectors can be stored, they cannot be indexed, leading to significant query performance degradation.
Performance: In ultra-large-scale (billion-level) vector retrieval scenarios, performance is not as good as dedicated vector databases like Milvus.
If you need to use vector models exceeding 2000 dimensions (such as OpenAI's text-embedding-3-large), consider these solutions:
Small to Medium Projects: Vector quantities within tens of millions, without extreme latency requirements. Hybrid Query Needs: Requires combining structured data (like user IDs, timestamps) with vector similarity for complex queries. Rapid Prototype Development: Teams already familiar with PostgreSQL wanting to quickly validate AI functionality. Budget Constraints: Cannot afford the operational costs of dedicated vector databases.
Install pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;You can directly use the pgvector image, then execute this method after startup.
Create table with vector column
Choose an OpenAI Embedding model based on your needs:
| Model | Dimensions | Adjustable | Use Case |
|---|---|---|---|
| text-embedding-3-small | 1536 (adjustable to 512) | Yes | General text retrieval, best cost-effectiveness |
| text-embedding-3-large | 3072 (adjustable to 256) | Yes | High-precision needs, complex semantics |
| text-embedding-ada-002 | 1536 | No | Legacy model, not recommended for new projects |
Important Note: text-embedding-ada-002 was marked as deprecated in January 2025 and is expected to be officially retired after June 2025. New projects should directly use text-embedding-3-small or text-embedding-3-large for higher accuracy.
Note that pgvector's indexing functionality supports up to 2000 dimensions. If using text-embedding-3-large's default 3072 dimensions, it's recommended to adjust to 1536 or lower via the API's dimensions parameter.
CREATE TABLE items (
id serial PRIMARY KEY,
embedding vector(1536) -- Assuming each vector has 1536 dimensions
);Insert vector data
INSERT INTO items (embedding) VALUES
('[0.1, 0.2, ..., 0.3]'),
('[0.4, 0.5, ..., 0.6]'),
...;Execute vector search
Use cosine similarity to query the most similar vectors:
SELECT id, embedding
FROM items
ORDER BY embedding <=> '[query_vector]' -- pgvector's built-in cosine similarity operator
LIMIT 10;For Euclidean distance calculations, you can use <-> instead
pgvector supports two index types:
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);Index performance comparison (based on 1 million vector benchmark): neon
| Metric | IVFFlat | HNSW |
|---|---|---|
| Build Time (seconds) | 128 | 4,065 |
| Index Size (MB) | 257 | 729 |
| Query Speed (QPS) | 2.6 | 40.5 |
| Recall Stability | Medium | High |
IVFFlat recall rate decreases as data volume grows, requiring periodic index rebuilding.
-- Recommended production environment settings
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (
m = 16, -- Default value, suitable for most scenarios
ef_construction = 64 -- Default value, balances build speed and quality
);
-- Dynamically adjust precision during queries (no index rebuild needed)
SET hnsw.ef_search = 100; -- Range 80-120, higher values = higher recall but slower speedHere we use OpenAI's model to calculate embeddings, then use pgvector to calculate similarity.
We'll use text-embedding-ada-002 as the embedding calculation model.
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY ?? "st-SECRETKEY",
});
// text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large
const MODEL = "text-embedding-ada-002";
export const generateEmbedding = async (value: string) => {
const input = value.replaceAll("\n", " ");
const { data } = await openai.embeddings.create({
model: MODEL,
input,
// Set output dimensions here, but only `text-embedding-3` supports adjustment
dimensions: 1536,
});
return data[0]?.embedding;
};Here we define database initialization commands, then query for Taiwan's capital using cosine similarity.
import { SQL } from "bun";
import { faker } from "@faker-js/faker";
import { generateEmbedding } from "./embeddings";
import pgvector from "pgvector";
const sql = new SQL({ url: process.env.DATABASE_URL });
const initDb = async () => {
/**
* Create extension if not exists
*/
await sql`CREATE EXTENSION IF NOT EXISTS vector;`;
/**
* Create table if not exists
*/
await sql`CREATE TABLE IF NOT EXISTS documents (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
content TEXT NOT NULL,
embedding VECTOR(1536) NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);`;
};
const seedDb = async () => {
const documents = await Promise.all(
Array.from({ length: 3 }, async () => {
const content = faker.lorem.paragraph();
return {
title: faker.lorem.sentence(),
content,
embedding: pgvector.toSql(await generateEmbedding(content)),
};
})
);
await sql`INSERT INTO documents ${sql(documents)}`;
console.log("Database seeded successfully");
};
const seedDbWithCapitals = async () => {
const documents = await Promise.all([
{
title: "Taiwan",
content: "The capital of Taiwan is Taipei",
embedding: pgvector.toSql(await generateEmbedding("The capital of Taiwan is Taipei")),
},
{
title: "Japan",
content: "The capital of Japan is Tokyo",
embedding: pgvector.toSql(await generateEmbedding("The capital of Japan is Tokyo")),
},
{
title: "United States",
content: "The capital of United States is Washington, D.C.",
embedding: pgvector.toSql(await generateEmbedding("The capital of United States is Washington, D.C.")),
},
]);
await sql`INSERT INTO documents ${sql(documents)}`;
console.log("Database seeded successfully");
};
const searchDb = async (query: string) => {
const embedding = await generateEmbedding(query);
if (!embedding) {
throw new Error("Embedding generation failed");
}
const sqlEmbedding = pgvector.toSql(embedding);
const results = await sql`SELECT * FROM documents ORDER BY embedding <=> ${sqlEmbedding} LIMIT 3`.values();
return results;
};
const demo = async (search = "What is the capital of Taiwan?", seed?: true | "capitals") => {
await initDb();
if (seed) {
if (seed === "capitals") {
await seedDbWithCapitals();
} else {
await seedDb();
}
}
const results = await searchDb(search);
console.log(results);
};
const Script = {
init: async (seed?: true | "capitals") => {
await initDb();
if (seed) {
if (seed === "capitals") {
await seedDbWithCapitals();
} else {
await seedDb();
}
}
},
initDb,
seedDb,
searchDb,
demo,
};
export default Script;Finally, execute:
import Script from "./src/scripts";
await Script.demo(
"What is the capital of Taiwan?",
// seed the database or not
"capitals"
);You can see the results:
[
[ 4, "Taiwan", "The capital of Taiwan is Taipei", "[0.000814143,-0.019345103,-0.0027284368,...,-0.02137615]",
2025-05-08T08:29:22.591Z
], [ 7, "Japan", "The capital of Japan is Tokyo", "[0.000814143,-0.019345103,-0.0027284368,...,-0.02137615]",
2025-05-08T08:29:48.303Z
], [ 6, "United States", "The capital of United States is Washington, D.C.", "[0.0072439546,-0.01591172,-0.016960844,...,-0.0050426666]",
2025-05-08T08:29:22.591Z
], count: 3, command: "SELECT"
]In addition to pgvector mentioned above, there are two other databases specifically designed for vector data processing: Milvus and Weaviate. They each have different strengths and can be used for different scenarios.
As mentioned earlier, pgvector's biggest advantage is its better integration with existing relational data, and it allows text search using regular SQL syntax with a relatively simple learning curve. However, its overall functionality is not as comprehensive as Milvus or Weaviate, and its maximum index dimension only supports 2000, making optimization difficult with more data.
Milvus is an open-source dedicated vector database designed for large-scale high-dimensional vector similarity search and AI applications, supporting billion-level vectors.
Advantages
Disadvantages
Weaviate is an open-source vector database supporting semantic search and hybrid search (vector + keyword), featuring a GraphQL interface.
import { dataType, type WeaviateClient } from "weaviate-client";
import { vectorizer } from "weaviate-client";
export class Script {
constructor(private client: WeaviateClient) {}
/**
* Directly use OpenAI for real-time vector transformation
*/
private async createCollection() {
await this.client.collections.create({
name: "Documents",
vectorizers: vectorizer.text2VecOpenAI(),
properties: [
{ name: "title", dataType: dataType.TEXT },
{ name: "content", dataType: dataType.TEXT },
],
});
}
async getDocuments() {
return this.client.collections.get("Documents");
}
private generateData() {
return [
{
title: "Taiwan",
content: "The capital of Taiwan is Taipei.",
},
{
title: "Japan",
content: "The capital of Japan is Tokyo.",
},
{
title: "United States",
content: "The capital of the United States is Washington, D.C.",
},
];
}
public async seedCollection() {
const collection = await this.getDocuments();
const data = this.generateData();
await collection.data.insertMany(data);
console.log(`Inserted ${data.length} documents`);
}
private async init() {
await this.createCollection();
await this.seedCollection();
console.log("Collection initialized");
}
/**
* Execute vector search
*/
public async search(query: string) {
const collection = await this.getDocuments();
const results = await collection.query.nearText(query, {
limit: 3,
returnMetadata: ["distance"],
});
console.log(results.objects.map((obj) => obj.properties));
return results;
}
}Advantages
HNSW algorithm, supporting full CRUD operations.Disadvantages
Weaviate is more suitable for applications requiring semantic search, multimodal data processing, and deep integration with AI models, particularly for building recommendation or knowledge retrieval systems.
| Feature | pgvector | Milvus | Weaviate |
|---|---|---|---|
| Deployment | PostgreSQL Extension | Cloud + On-Premises | Cloud + On-Premises |
| Open Source License | PostgreSQL License | Apache 2.0 | BSD 3-Clause |
| Vector Dimension Limit | 2,000 (index limit) | 32,768 | 65,535 |
| Index Types | IVFFlat, HNSW | HNSW, DiskANN, GPU-accelerated | HNSW |
| Hybrid Queries | ✅ Native SQL Support | ⚠️ Via API | ✅ GraphQL Support |
| Multimodal Support | ❌ | ✅ Text, Images, Audio | ✅ Text, Images |
| Scalability | Medium (vertical scaling) | Extremely Strong (distributed architecture) | Strong (automatic sharding) |
| Learning Curve | Low (SQL only) | High (requires specialized API) | Medium (GraphQL) |
| Query Performance (QPS) | 2.6-40.5 (depends on index) | > 100 (GPU-accelerated) | 50-80 |
| Best Use Case | Small-medium projects, hybrid queries | Large-scale AI applications | Semantic search, RAG |
| If You Need... | Recommended Solution |
|---|---|
| Integration with existing PostgreSQL | pgvector |
| Billion-level vectors + GPU acceleration | Milvus |
| Multimodal + real-time vectorization | Weaviate |
| Lowest learning cost | pgvector |
| Highest query performance | Milvus |
| Best semantic search | Weaviate |
| Data volume < 1 million | pgvector |
| Data volume > 10 million | Milvus |
| Need hybrid search (keyword + vector) | Weaviate |