User-managed pgvector

Powabase projects ship with pgvector preloaded. The vector extension is created at project provision time and used internally by ai.embeddings. You can use it for your own tables too: store embeddings in public.* and run vector similarity queries via PostgREST or direct SQL. For most users, the typed Sources and Knowledge Bases surface is the right path. It manages chunking, embedding, indexing, RLS, billing, and retrieval reranking for you. Rolling your own pgvector tables is for cases where the typed surface doesn’t fit: custom embedding models, specific schema constraints, integration with non-Powabase pipelines. This page covers the user-managed approach so you can choose the right tool for the job.

When to use which

You want…	Use
RAG over uploaded documents	Sources + KB (`/api/sources`, `/api/knowledge-bases`)
Hybrid search (vector + BM25 + rerank)	KB search (`/api/knowledge-bases/{id}/search`)
Custom embedding model / non-text data (image embeddings, audio)	User-managed pgvector
Schema-coupled embeddings (“each row has its own embedding column”)	User-managed pgvector
Integration with an external indexing pipeline	User-managed pgvector
Per-user RLS that the typed KB surface doesn’t yet support	User-managed pgvector

The typed surface charges credits per indexing operation and per search; user-managed pgvector only charges credits for vector_search calls (see Billing model). For high-volume retrieval against a fixed corpus, user-managed can be cheaper, at the cost of more setup.

The basic shape

A user-managed embeddings table is just a regular table with a vector(N) column where N is your embedding dimension.

CREATE TABLE public.product_embeddings (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  product_id uuid NOT NULL REFERENCES public.products(id) ON DELETE CASCADE,
  embedding vector(1536) NOT NULL,  -- OpenAI text-embedding-3-small dimension
  meta jsonb DEFAULT '{}'::jsonb,
  created_at timestamptz NOT NULL DEFAULT now()
);

-- Enable RLS like any other table
ALTER TABLE public.product_embeddings ENABLE ROW LEVEL SECURITY;

CREATE POLICY read_product_embeddings ON public.product_embeddings
  FOR SELECT TO authenticated
  USING (true);  -- adapt to your ownership model

The dimension (1536 above) must match what your embedding model produces. Common choices:

OpenAI text-embedding-3-small: 1536
OpenAI text-embedding-3-large: 3072
Cohere embed-english-v3.0: 1024
Voyage voyage-3-large: 1024
BGE M3: 1024

Get the dimension wrong and inserts will fail with expected N dimensions, not M.

Indexing for fast search

A vector column without an index does a sequential scan for every search: fine for thousands of rows, slow above tens of thousands. Two index types: HNSW and IVFFlat. HNSW (Hierarchical Navigable Small World) is what Powabase uses internally and what most production setups want. It gives faster searches, slower builds, and no parameter tuning at query time:

CREATE INDEX product_embeddings_hnsw_idx
ON public.product_embeddings
USING hnsw ((embedding::vector(1536)) vector_cosine_ops);

The double cast embedding::vector(1536) is what pgvector wants for HNSW with explicit dimension typing. Without it the index may not be picked up at query time. vector_cosine_ops is the distance operator class:

vector_cosine_ops: cosine distance (most common for text embeddings)
vector_l2_ops: Euclidean distance
vector_ip_ops: inner product (negative dot product)

Match the distance to whatever your embedding model recommends. Most text embedding models are L2-normalized, so cosine and inner-product give the same ranking; for non-normalized embeddings the choice matters. IVFFlat is an older index type: faster to build, slower to query, needs lists tuning:

CREATE INDEX product_embeddings_ivfflat_idx
ON public.product_embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

lists is roughly rows / 1000 for the rule-of-thumb default. Use HNSW unless you have a specific reason; it’s strictly better for most workloads.

Inserting embeddings

You compute the embedding in your application code (calling OpenAI, Cohere, etc.) and insert via PostgREST or SQL:

const embedding = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: productDescription,
});

await fetch(`${BASE_URL}/rest/v1/product_embeddings`, {
  method: "POST",
  headers: {
    apikey: ANON_KEY,
    Authorization: `Bearer ${accessToken}`,
    "Content-Type": "application/json",
    Prefer: "return=representation",
  },
  body: JSON.stringify({
    product_id: productId,
    embedding: embedding.data[0].embedding,  // number[1536]
    meta: { source: "product_description" },
  }),
});

The embedding column expects a JSON number array; PostgREST converts it to pgvector’s internal format on insert.

Searching

Vector search uses the distance operators (<=> for cosine, <-> for L2, <#> for negative inner product):

SELECT id, product_id, 1 - (embedding <=> $1::vector) AS similarity
FROM public.product_embeddings
ORDER BY embedding <=> $1::vector
LIMIT 10;

The 1 - (embedding <=> $1) converts cosine distance (lower = more similar) to similarity (higher = more similar) for client-friendly results. Via PostgREST RPC:

CREATE OR REPLACE FUNCTION public.search_products(
  query_embedding vector(1536),
  match_count int DEFAULT 10
)
RETURNS TABLE (id uuid, product_id uuid, similarity float)
LANGUAGE sql STABLE
AS $$
  SELECT id, product_id, 1 - (embedding <=> query_embedding) AS similarity
  FROM public.product_embeddings
  ORDER BY embedding <=> query_embedding
  LIMIT match_count;
$$;

GRANT EXECUTE ON FUNCTION public.search_products(vector, int) TO authenticated;

Call from your app:

POST /rest/v1/rpc/search_products
{
  "query_embedding": [0.123, 0.456, ...],
  "match_count": 10
}

The RPC pattern keeps the query logic in the database and lets you index/optimize it independently of your app code.

Hybrid search (vector + BM25)

For RAG-quality retrieval, combine vector similarity with BM25 keyword scoring. You’ll need a tsvector column and matching GIN index:

ALTER TABLE public.product_embeddings
  ADD COLUMN content_tsv tsvector
  GENERATED ALWAYS AS (
    to_tsvector('english', coalesce((meta->>'content'), ''))
  ) STORED;

CREATE INDEX product_embeddings_content_tsv_idx
  ON public.product_embeddings
  USING gin (content_tsv);

Then combine in an RPC:

CREATE OR REPLACE FUNCTION public.hybrid_search_products(
  query_embedding vector(1536),
  query_text text,
  vector_weight float DEFAULT 0.7,
  match_count int DEFAULT 10
)
RETURNS TABLE (id uuid, product_id uuid, score float)
LANGUAGE sql STABLE
AS $$
  WITH vec AS (
    SELECT id, product_id,
           1 - (embedding <=> query_embedding) AS vec_score
    FROM public.product_embeddings
    ORDER BY embedding <=> query_embedding
    LIMIT match_count * 4
  ),
  bm25 AS (
    SELECT id, product_id,
           ts_rank(content_tsv, plainto_tsquery('english', query_text)) AS bm25_score
    FROM public.product_embeddings
    WHERE content_tsv @@ plainto_tsquery('english', query_text)
    ORDER BY bm25_score DESC
    LIMIT match_count * 4
  )
  SELECT
    coalesce(vec.id, bm25.id) AS id,
    coalesce(vec.product_id, bm25.product_id) AS product_id,
    (vector_weight * coalesce(vec.vec_score, 0)) +
    ((1 - vector_weight) * coalesce(bm25.bm25_score, 0)) AS score
  FROM vec
  FULL OUTER JOIN bm25 USING (id)
  ORDER BY score DESC
  LIMIT match_count;
$$;

This is a basic weighted-sum hybrid; the typed KB surface uses reciprocal rank fusion (RRF), which holds up better across query types. For real RAG quality, the typed surface is still the right answer.

When pgvector isn’t enough

For very large corpora (10M+ embeddings), even HNSW gets slow. Three options:

Use the typed KB surface. Powabase’s KB indexing strategies handle large corpora with techniques like PageIndex and GraphIndex that go beyond flat vector search.
Specialized vector DB: Pinecone, Weaviate, Qdrant. These add cost but are sometimes the right shape for billion-scale workloads.
Quantization or dimensionality reduction. Reduce the embedding size to 256 or 512 dimensions; faster but lower recall.

For most apps under 1M embeddings, user-managed pgvector is fine.

Next steps

Sources reference

The typed surface for document ingestion if you decide pgvector-direct is too much work.

Knowledge Bases reference

The typed search surface with hybrid retrieval and reranking.

Extensions

What else is preloaded alongside pgvector.

Direct Postgres

For the SQL patterns this guide builds on.

Architecture

Database

PostgREST

Row Level Security

Auth

Storage

Realtime

Operations

AI + BaaS Cookbook

Admin proxies

Reference

User-managed pgvector

When to use which

The basic shape

Indexing for fast search

Inserting embeddings

Searching

Hybrid search (vector + BM25)

When pgvector isn’t enough

Next steps

Sources reference

Knowledge Bases reference

Extensions

Direct Postgres

​When to use which

​The basic shape

​Indexing for fast search

​Inserting embeddings

​Searching

​Hybrid search (vector + BM25)

​When pgvector isn’t enough

​Next steps

Sources reference

Knowledge Bases reference

Extensions

Direct Postgres

When to use which

The basic shape

Indexing for fast search

Inserting embeddings

Searching

Hybrid search (vector + BM25)

When pgvector isn’t enough

Next steps