1. Create Table and Prepare Data

First, we need to create a table with vector fields and insert test data.

Create Table Structure

CREATE TABLE test_tbl (
  id serial primary key, 
  embedding vector(3), 
  ts TIMESTAMP
);

COMMENT ON COLUMN test_tbl.id IS 'Primary key, auto-increment sequence';
COMMENT ON COLUMN test_tbl.embedding IS '3-dimensional vector field';
COMMENT ON COLUMN test_tbl.ts IS 'Timestamp field';

Insert Test Data

INSERT INTO test_tbl (embedding, ts)
SELECT
  ARRAY[random(), random(), random()]::real[ ]::vector,
  now() - '1 days'::interval * (random() * 10)::int
FROM generate_series(1, 1000);
This statement will insert 1000 records, each containing:
  • Randomly generated 3-dimensional vectors
  • Random timestamps within the past 10 days

2. Create Vector Index

To improve vector search performance, we need to create an HNSW index.
CREATE INDEX test_vector_idx 
ON test_tbl 
USING vectors (embedding vector_l2_ops);
Index Parameter Descriptions:
  • USING vectors: Use vector index type
  • vector_l2_ops: Operator class that calculates squared Euclidean distance (L2 distance)
Find the 5 most similar vectors to the target vector [3,1,2]:
SELECT id, embedding <-> '[3,1,2]' as dist
FROM test_tbl 
ORDER BY dist 
LIMIT 5;
Operator Description:
  • <->: Calculate L2 distance between two vectors
  • Smaller distance indicates higher similarity
Query data from the past 5 days with similarity scores above a threshold:
SELECT 
  b.* 
FROM (
  SELECT 
    a.id, 
    a.dist * 10 AS similarity 
  FROM (
    SELECT 
      id, 
      embedding <-> '[3, 1, 2]' AS dist
    FROM test_tbl 
    WHERE ts > now() - '5 days'::interval
    ORDER BY dist ASC 
    LIMIT 100
  ) as a
) as b
WHERE b.similarity > 50
OFFSET 0 LIMIT 20;
Query Logic:
  1. Inner query: Filter data from the past 5 days, sort by distance and take top 100 records
  2. Middle query: Convert distance to similarity score (multiply by 10)
  3. Outer query: Filter records with similarity greater than 50, return 20 results