Relyt ONE ships with pg_duckdb, a DuckDB-based columnar execution engine. It accelerates analytical SQL workloads on your operational data with no schema changes or ETL. This hands-on guide walks you through enabling pg_duckdb, running starter queries, and stress-testing a larger dataset so you can feel the speedup directly in psql or your favorite client.

Step 1. Create a sample table and seed data

The following SQL creates a transactional orders table and inserts a few rows for testing.
-- Create a standard PostgreSQL table
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    product_name TEXT,
    amount DOUBLE PRECISION,
    order_date DATE
);

-- Insert sample rows
INSERT INTO orders (product_name, amount, order_date)
VALUES 
    ('Laptop', 1200.00, '2024-07-01'),
    ('Keyboard', 75.50, '2024-07-01'),
    ('Mouse', 25.00, '2024-07-02');
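A quick check confirms the three seed rows are in place:
SELECT * FROM orders ORDER BY order_id;  -- should list the Laptop, Keyboard, and Mouse rows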

Step 2. Route queries through DuckDB

Enable the DuckDB execution engine for the current session. With this single GUC, all supported queries automatically run on the vectorized engine.
SET duckdb.force_execution = true;
To make DuckDB the default executor for every new connection, set the GUC at the role or database level (changes apply only to sessions started after the ALTER runs):
ALTER ROLE <role> SET duckdb.force_execution = true;

ALTER DATABASE <database> SET duckdb.force_execution = true;
No query rewrites are required—keep using PostgreSQL syntax.
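You can verify which value is active in your session with a standard SHOW:
SHOW duckdb.force_execution;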

Step 3. Run analytical queries

Aggregate revenue by product

SELECT
    product_name,
    SUM(amount) AS total_sales,
    COUNT(*) AS order_count
FROM orders
GROUP BY product_name
ORDER BY total_sales DESC;
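With the three seed rows from Step 1, the result looks like this (exact formatting depends on your client):
 product_name | total_sales | order_count
--------------+-------------+-------------
 Laptop       |        1200 |           1
 Keyboard     |        75.5 |           1
 Mouse        |          25 |           1
(3 rows)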

Aggregate revenue by day

SELECT
    order_date,
    SUM(amount) AS daily_sales,
    COUNT(*) AS order_count
FROM orders
GROUP BY order_date
ORDER BY order_date;
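On the same seed data this returns one row per day:
 order_date | daily_sales | order_count
------------+-------------+-------------
 2024-07-01 |      1275.5 |           2
 2024-07-02 |          25 |           1
(2 rows)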
These queries now execute on DuckDB’s columnar engine, delivering lower latency and more predictable performance for aggregation-heavy workloads.

Real analytical workload example

For a more realistic benchmark, seed a sales_log fact table with three million synthetic transactions and then route a complex window query through DuckDB.

Create the transactional table

CREATE TABLE sales_log (
    sale_id SERIAL PRIMARY KEY,
    sale_date DATE NOT NULL,
    amount NUMERIC(10, 2),
    user_id INT,
    region TEXT,
    category TEXT,
    comment TEXT
);

Insert 3 million sample rows

INSERT INTO sales_log (
    sale_date, amount, user_id, region, category, comment
)
SELECT 
    '2020-01-01'::DATE + (random() * 1500)::INT, 
    (random() * 1000)::NUMERIC(10, 2),
    (random() * 10000)::INT,
    CASE WHEN random() > 0.5 THEN 'North America' ELSE 'Asia Pacific' END,
    'Category-' || (random() * 50)::INT,
    md5(random()::TEXT) || md5(random()::TEXT)
FROM generate_series(1, 3000000);
Refresh planner statistics so both executors have accurate metadata:
ANALYZE sales_log;
Optionally confirm the on-disk size:
SELECT pg_size_pretty(pg_total_relation_size('sales_log')) AS table_size;
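You can also verify that the load completed by counting the rows:
SELECT count(*) FROM sales_log;  -- expect 3000000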

Benchmark the window-heavy query

  1. Run with PostgreSQL’s executor (default)
    Leave duckdb.force_execution unset (or explicitly SET duckdb.force_execution = false;) and execute the analytical query below. Record the elapsed time from EXPLAIN ANALYZE.
  2. Enable DuckDB and rerun
    Turn on DuckDB for the current session and immediately rerun the same SQL so you can compare elapsed times:
    SET duckdb.force_execution = true;
    
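If you are benchmarking from psql, the \timing meta-command prints client-side elapsed time for each statement, which complements the numbers reported by EXPLAIN ANALYZE:
\timing on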
The query operates directly on the sales_log row-store table, partitioning three million rows across roughly 10,000 users, computing a 50-row moving average per user, filtering for sales more than five times the user's recent average, and returning the 100 largest such amounts:
EXPLAIN ANALYZE
WITH user_history AS (
    SELECT 
        sale_id,
        user_id,
        sale_date,
        amount,
        AVG(amount) OVER (
            PARTITION BY user_id 
            ORDER BY sale_date 
            ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING
        ) AS user_recent_avg
    FROM sales_log
)
SELECT *
FROM user_history
WHERE amount > (user_recent_avg * 5)
ORDER BY amount DESC
LIMIT 100;
Executor          duckdb.force_execution    Runtime observed
PostgreSQL        false (default)           ~5.8 seconds
DuckDB engine     true                      ~1.5 seconds
Your mileage may vary based on hardware and background workload, but the pattern—run once with PostgreSQL, flip the GUC, rerun with DuckDB—makes it easy to validate the speedup yourself.
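When you finish comparing, a standard RESET returns the session to the server default:
RESET duckdb.force_execution;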

Key benefits

  • Seamless integration: Works with existing PostgreSQL SQL and tooling.
  • High performance: Vectorized execution with automatic parallelism for scans and aggregations.
  • Zero friction: A single session parameter (duckdb.force_execution) controls the engine.

Operational tips

Automatic fallback

If a statement includes features not supported by DuckDB, Relyt automatically falls back to the default PostgreSQL executor. No manual intervention is required.

Inspect the execution engine

Use EXPLAIN to verify whether DuckDB handled the query:
EXPLAIN
SELECT
    product_name,
    SUM(amount) AS total_sales
FROM orders
GROUP BY product_name;
Look for DuckDB nodes in the plan to confirm the handoff. If you only see PostgreSQL plan nodes, the fallback path was used.
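To see the handoff explicitly, you can capture both plans in one session by toggling the GUC around the same EXPLAIN:
-- Plan produced by the PostgreSQL executor
SET duckdb.force_execution = false;
EXPLAIN
SELECT product_name, SUM(amount) AS total_sales
FROM orders
GROUP BY product_name;

-- Plan produced with the DuckDB engine enabled
SET duckdb.force_execution = true;
EXPLAIN
SELECT product_name, SUM(amount) AS total_sales
FROM orders
GROUP BY product_name;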