Filtering on DENSE_RANK() optimized as pushed-down DISTINCT in YugabyteDB
SQL analytic functions (also known as window functions) are powerful, and a common use is filtering for the first row in each window. However, the analytic function must be evaluated in a subquery and filtered afterward, which can prevent some data-access optimizations. I will show how the same result can be obtained more efficiently by filtering during the Index Scan.
Here is a simple table with observation type, name, timestamp, and text:
```sql
drop table observations;
create extension if not exists orafce;
create table observations (
  primary key ( observation_type asc, observation_name asc, observation_date desc ),
  observation_type text,
  observation_name text,
  observation_date timestamptz,
  observation_text text
);
insert into observations
select dbms_random.string('U',1), dbms_random.string('U',2),
       clock_timestamp(), dbms_random.string('L',100)
from generate_series(1,1000);
\watch 0.01
```
I want the last value for each observation in a specific type.
Here is the execution plan when using DENSE_RANK() and filtering on the rank number 1:
```sql
explain (analyze, dist, costs off)
select o."observation_type", o."observation_name", o."observation_text"
from (
  select *, dense_rank() over (
    partition by o."observation_type", o."observation_name"
    order by o."observation_date" desc
  ) as rank_number
  from observations o
  where o."observation_type" = 'Z'
) as o
where "o".rank_number = 1;
```

```
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Subquery Scan on o (actual time=3.176..466.641 rows=143 loops=1)
   Filter: (o.rank_number = 1)
   Rows Removed by Filter: 162857
   ->  WindowAgg (actual time=3.174..456.372 rows=163000 loops=1)
         ->  Index Scan using observations_pkey on observations o_1 (actual time=3.156..342.562 rows=163000 loops=1)
               Index Cond: (observation_type = 'Z'::text)
               Storage Index Read Requests: 160
               Storage Index Execution Time: 228.001 ms
 Planning Time: 0.103 ms
 Execution Time: 466.741 ms
 Storage Read Requests: 160
 Storage Write Requests: 0
 Storage Execution Time: 228.001 ms
 Peak Memory Usage: 491 kB
(14 rows)
```
This read `rows=163000` rows from the distributed storage with `Storage Index Read Requests: 160` network calls, and then the postgres backend discarded `Rows Removed by Filter: 162857` of those rows to keep only the `rows=143` that satisfy the filter.
This could be optimized by reading only the required rows, which limits the network calls, the data transfer, and the single-process filtering afterward. This is possible on this index thanks to the YugabyteDB hybrid scan but, at least in this version (YB-2.17), the filter on the window function is not pushed down to the index scan.
To allow the hybrid scan optimization, I'll use DISTINCT in a subquery to replace the PARTITION BY of the analytic window. Then, for each window, I'll get the last value with a scalar subquery doing the ORDER BY and LIMIT to get the first row, which is equivalent to DENSE_RANK()=1:
```sql
explain (analyze, dist, costs off)
select d."observation_type", d."observation_name",
  ( select observation_text from observations
    where ( observation_type, observation_name )
        = ( d."observation_type", d."observation_name" )
    order by observation_date desc limit 1 )
from (
  -- get all distinct values with push down optimization
  select distinct o."observation_type", o."observation_name"
  from observations o
  -- use range condition to force an index scan
  where o."observation_type" between 'Z' and 'Z'
) d;
```

```
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Subquery Scan on d (actual time=1.757..37.104 rows=143 loops=1)
   ->  Unique (actual time=1.392..1.695 rows=143 loops=1)
         ->  Index Scan using observations_pkey on observations o_1 (actual time=1.391..1.607 rows=143 loops=1)
               Index Cond: ((observation_type >= 'Z'::text) AND (observation_type <= 'Z'::text))
               Storage Index Read Requests: 1
               Storage Index Execution Time: 0.000 ms
   SubPlan 1
     ->  Limit (actual time=0.246..0.246 rows=1 loops=143)
           ->  Index Scan using observations_pkey on observations (actual time=0.237..0.237 rows=1 loops=143)
                 Index Cond: ((observation_type = d.observation_type) AND (observation_name = d.observation_name))
                 Storage Index Read Requests: 1
                 Storage Index Execution Time: 0.224 ms
 Planning Time: 1.614 ms
 Execution Time: 37.187 ms
 Storage Read Requests: 144
 Storage Write Requests: 0
 Storage Execution Time: 32.000 ms
 Peak Memory Usage: 56 kB
(18 rows)
```
I have added `between 'Z' and 'Z'` because, in this YugabyteDB version, an Index Scan would not be used with `= 'Z'`. It is always important to check the execution plan when doing this kind of optimization, and to add a comment in the query.
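Beyond reading the plan, it is worth checking that the rewrite really returns the same rows as the DENSE_RANK() query. Here is a minimal sketch of such a check (my own illustration, not from the article): it uses SQLite as a lightweight stand-in for YugabyteDB, with a tiny hand-made dataset, and asserts that filtering on `dense_rank() = 1` and the DISTINCT + correlated `ORDER BY ... LIMIT 1` form agree.

```python
# Sanity check (illustrative, not from the article): compare the DENSE_RANK()=1
# filter with the DISTINCT + correlated LIMIT 1 rewrite on a tiny dataset.
# SQLite stands in for YugabyteDB here; the optimization itself is YugabyteDB-specific.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table observations (
  observation_type text,
  observation_name text,
  observation_date text,
  observation_text text,
  primary key (observation_type, observation_name, observation_date)
);
insert into observations values
  ('Z','AA','2024-04-01','old'),
  ('Z','AA','2024-04-02','new'),
  ('Z','AB','2024-04-01','only'),
  ('Y','AA','2024-04-01','other type');
""")

dense_rank_query = """
select observation_type, observation_name, observation_text
from ( select *,
       dense_rank() over ( partition by observation_type, observation_name
                           order by observation_date desc ) as rank_number
       from observations where observation_type = 'Z' ) o
where rank_number = 1
"""

distinct_query = """
select d.observation_type, d.observation_name,
  ( select observation_text from observations
    where observation_type = d.observation_type
      and observation_name = d.observation_name
    order by observation_date desc limit 1 )
from ( select distinct observation_type, observation_name
       from observations where observation_type = 'Z' ) d
"""

a = sorted(conn.execute(dense_rank_query).fetchall())
b = sorted(conn.execute(distinct_query).fetchall())
# Both forms keep only the latest row per (type, name) window.
assert a == b == [('Z', 'AA', 'new'), ('Z', 'AB', 'only')]
print(a)
```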
When you need more than one column, it is better to use a LATERAL join rather than a scalar subquery:
```sql
explain (analyze, dist, costs off)
select o.*
from (
  -- get all distinct values with push down optimization
  select distinct o."observation_type", o."observation_name"
  from observations o
  -- use range condition to force an index scan
  where o."observation_type" between 'Z' and 'Z'
) d,
lateral (
  -- get additional information for each distinct value
  select * from observations
  where ( observation_type, observation_name )
      = ( d."observation_type", d."observation_name" )
  order by observation_date desc
  limit 1
) o;
```
The execution plan is similar. What matters most is to verify the absence of `Rows Removed by Filter`, an `Index Scan` or `Index Only Scan` for both subqueries, a small `Storage Index Read Requests` for the `Unique` branch, and a total `Storage Read Requests` matching the `rows=` of the `Unique` branch.
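That checklist can be partly automated when you run such plans often. Below is a hypothetical helper (the function name and logic are my own, not YugabyteDB tooling): it scans the EXPLAIN ANALYZE text for the two strongest signals discussed above, namely the absence of `Rows Removed by Filter` and the presence of index scans.

```python
# Hypothetical helper (illustrative only): inspect EXPLAIN (ANALYZE, DIST)
# output for the signs discussed in the article. A plan "looks pushed down"
# when no rows were discarded after the scan and the access paths are
# Index Scans or Index Only Scans.
def plan_looks_pushed_down(plan_text: str) -> bool:
    lines = plan_text.splitlines()
    if any("Rows Removed by Filter" in line for line in lines):
        return False  # the backend discarded rows that storage already sent
    return any("Index Scan" in line or "Index Only Scan" in line
               for line in lines)

# Abbreviated fragments of the two plans shown earlier in the article:
slow_plan = """Subquery Scan on o (actual rows=143 loops=1)
  Filter: (o.rank_number = 1)
  Rows Removed by Filter: 162857
  ->  Index Scan using observations_pkey on observations o_1"""

fast_plan = """Subquery Scan on d (actual rows=143 loops=1)
  ->  Unique
        ->  Index Scan using observations_pkey on observations o_1"""

print(plan_looks_pushed_down(slow_plan))  # False
print(plan_looks_pushed_down(fast_plan))  # True
```

This is only a string-level heuristic; it does not replace reading the plan, but it can flag a regression when the optimizer stops using the pushed-down path.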
Original Link: https://dev.to/yugabyte/filtering-on-denserank-optimized-as-pushed-down-distinct-in-yugabytedb-5mp