A simple quiery rewrite example using materialised views and dimension metadata to control how the optimizer can tune SQL statements. The example has been adapted from the Ask Tom forum.
SQL> alter session set query_rewrite_enabled = true
Session altered.
SQL> alter session set query_rewrite_integrity = stale_tolerated
Session altered.
Create the fact table sales
SQL> create table sales
( trans_date date, cust_id int, sales_amount number )
Table created.
Insert some data into the fact table
SQL> insert /*+ APPEND */ into sales
select trunc( sysdate, 'year' ) + mod( rownum, 366 ) as trans_date
, mod( rownum, 100 ) as cust_id
, abs( dbms_random.random ) / 100 as sales_amount
from all_objects
211433 rows created.
SQL> commit
Commit complete.
SQL> analyze table sales compute statistics
Table analyzed.
We want to bulk out the size of this table so increase the number of records fourfold
SQL> begin
for i in 1 .. 4
loop
insert /*+ APPEND */ into sales
select trans_date, cust_id, abs( dbms_random.random ) / 100
from sales;
commit;
end loop;
end;
PL/SQL procedure successfully completed.
SQL> commit
Commit complete.
SQL> analyze table sales compute statistics
Table analyzed.
SQL> select count(*) from sales
COUNT(*)
----------
3382928
1 row selected.
We now have a large fact table. Now create a time hierarchy dimension table
SQL> create table time_hierarchy
( day, mmyyyy, mon_yyyy, qtr_yyyy, yyyy )
as
select distinct trans_date day
, cast( to_char( trans_date, 'mmyyyy' ) as number ) mmyyyy
, to_char( trans_date, 'mon-yyyy' ) as mon_yyyy
, 'Q' || ceil( to_char( trans_date, 'mm' ) / 3 ) || ' FY' || to_char( trans_date, 'yyyy' ) qtr_yyyy
, cast( to_char( trans_date, 'yyyy' ) as number ) yyyy
from sales
Table created.
SQL> commit
Commit complete.
Generated:
1. MMYYYY: The month including the year
2. MON_YYYY: Same as above but we 'spelled' out the month
3. QTR_YYYY: The quarter of the year including the year
4. YYYY: The year itself
SQL> analyze table time_hierarchy compute statistics
Table analyzed.
Create an MV with query rewrite enabled
SQL> create materialized view sales_mv
build immediate
refresh on demand
enable query rewrite
as
select sales.cust_id, sum(sales.sales_amount) sales_amount
, time_hierarchy.mmyyyy
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by
sales.cust_id
, time_hierarchy.mmyyyy
Materialized View created.
Now let’s use this MV in a query rewrite example. The following SQL is at the same aggregation as the MV. I.e. we are aggregating the data up to the month (MMYYYY) time level.
SQL> set timing on
SQL> set autotrace on
SQL> select time_hierarchy.mmyyyy
, sum( sales_amount )
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by
time_hierarchy.mmyyyy
MMYYYY SUM(SALES_AMOUNT)
---------- -----------------
32013 3.0801E+12
42013 2.9816E+12
12014 9.9197E+10
82013 3.0755E+12
102013 3.0675E+12
112013 2.9687E+12
22013 2.7786E+12
52013 3.0828E+12
72013 3.0764E+12
12013 3.0817E+12
62013 2.9745E+12
92013 2.9795E+12
122013 3.0780E+12
13 rows selected.
Elapsed: 00:00:00.15
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=5 Card=1 K Bytes=31 K)
1 0 HASH GROUP BY (Cost=5 Card=1 K Bytes=31 K)
2 1 MAT_VIEW REWRITE ACCESS FULL A315250.SALES_MV (Cost=4 Card=1 K Bytes=31 K)
Statistics
----------------------------------------------------------
0 user rollbacks
0 global enqueue releases
0 physical read requests optimized
0 physical write total multi block requests
0 hot buffers moved to head of LRU
0 commit wait performed
0 global undo segment hints helped
0 global undo segment hints were stale
0 IMU commits
0 IMU Flushes
13 rows processed
As you can see in the execution plan Oracle has automatically rewrote the SQL to use the SALES_MV and the overall cost is 5.
Now let’s look at an SQL where the aggregation is at a different level to the SALES_MV. This time we’re rolling up to quarter year (QTR_YYYY) time hierarchy.
SQL> set timing on
SQL> set autotrace on
SQL> select time_hierarchy.qtr_yyyy
, sum( sales_amount )
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by
time_hierarchy.qtr_yyyy
QTR_YYYY SUM(SALES_AMOUNT)
------------------------------------------------ -----------------
Q2 FY2013 9.0389E+12
Q4 FY2013 9.1142E+12
Q1 FY2013 8.9403E+12
Q3 FY2013 9.1314E+12
Q1 FY2014 9.9197E+10
5 rows selected.
Elapsed: 00:00:02.81
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=2254 Card=5 Bytes=180)
1 0 HASH GROUP BY (Cost=2254 Card=5 Bytes=180)
2 1 HASH JOIN (Cost=2253 Card=366 Bytes=12 K)
3 2 VIEW SYS.VW_GBC_5 (Cost=2248 Card=366 Bytes=7 K)
4 3 HASH GROUP BY (Cost=2248 Card=366 Bytes=4 K)
5 4 TABLE ACCESS FULL A315250.SALES (Cost=2063 Card=3 M Bytes=41 M)
6 2 TABLE ACCESS FULL A315250.TIME_HIERARCHY (Cost=4 Card=366 Bytes=5 K)
Statistics
----------------------------------------------------------
0 user rollbacks
0 global enqueue releases
0 physical read requests optimized
0 physical write total multi block requests
0 hot buffers moved to head of LRU
0 commit wait performed
0 global undo segment hints helped
0 global undo segment hints were stale
0 IMU commits
0 IMU Flushes
5 rows processed
As you can see the execution plan is not clever enough know it can still use the MV. Query rewrite hasn’t happened and the fact and dimension tables are queried and the overall cost is 2254.
With the use of a dimension we can tell Oracle how the aggregated rollup of the time hierarchy works. Oracle will then use this dimension in the query rewrite when deciding if the MV is still applicable.
SQL> create dimension time_hierarchy_dim
level day is time_hierarchy.day
level mmyyyy is time_hierarchy.mmyyyy
level qtr_yyyy is time_hierarchy.qtr_yyyy
level yyyy is time_hierarchy.yyyy
hierarchy time_rollup
(
day child of
mmyyyy child of
qtr_yyyy child of
yyyy
)
attribute mmyyyy
determines mon_yyyy
Dimension created.
The dimension is stored as metadata and details how the time levels rollup through the child to parent hierarchy.
Now if we re-run the same SQL the optimizer should know to use the MV in the query rewrite.
SQL> set autotrace on
SQL> select time_hierarchy.qtr_yyyy
, sum( sales_amount )
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by
time_hierarchy.qtr_yyyy
QTR_YYYY SUM(SALES_AMOUNT)
------------------------------------------------ -----------------
Q2 FY2013 9.0389E+12
Q4 FY2013 9.1142E+12
Q1 FY2013 8.9403E+12
Q3 FY2013 9.1314E+12
Q1 FY2014 9.9197E+10
5 rows selected.
Elapsed: 00:00:00.17
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=11 Card=5 Bytes=195)
1 0 HASH GROUP BY (Cost=11 Card=5 Bytes=195)
2 1 HASH JOIN (Cost=10 Card=4 K Bytes=168 K)
3 2 VIEW (Cost=5 Card=46 Bytes=598)
4 3 HASH UNIQUE (Cost=5 Card=46 Bytes=598)
5 4 TABLE ACCESS FULL A315250.TIME_HIERARCHY (Cost=4 Card=366 Bytes=4 K)
6 2 MAT_VIEW REWRITE ACCESS FULL A315250.SALES_MV (Cost=4 Card=1 K Bytes=31 K)
Statistics
----------------------------------------------------------
0 user rollbacks
0 global enqueue releases
0 physical read requests optimized
0 physical write total multi block requests
0 hot buffers moved to head of LRU
0 commit wait performed
0 global undo segment hints helped
0 global undo segment hints were stale
0 IMU commits
0 IMU Flushes
5 rows processed
As you can see the optimizer has correctly rewrote the query to use the SALES_MV and the overall cost has reduced to 11.
We can amend the SQL again to rollup to the year (YYYY) level and see if the optimizer uses the SALES_MV still.
SQL> set autotrace on
SQL> select time_hierarchy.yyyy
, sum( sales_amount )
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by
time_hierarchy.yyyy
YYYY SUM(SALES_AMOUNT)
---------- -----------------
2013 3.6225E+13
2014 9.9197E+10
2 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer Mode=ALL_ROWS (Cost=11 Card=2 Bytes=66)
1 0 HASH GROUP BY (Cost=11 Card=2 Bytes=66)
2 1 HASH JOIN (Cost=10 Card=1 K Bytes=58 K)
3 2 VIEW (Cost=5 Card=19 Bytes=133)
4 3 HASH UNIQUE (Cost=5 Card=19 Bytes=133)
5 4 TABLE ACCESS FULL A315250.TIME_HIERARCHY (Cost=4 Card=366 Bytes=2 K)
6 2 MAT_VIEW REWRITE ACCESS FULL A315250.SALES_MV (Cost=4 Card=1 K Bytes=31 K)
Statistics
----------------------------------------------------------
0 user rollbacks
0 global enqueue releases
0 physical read requests optimized
0 physical write total multi block requests
0 hot buffers moved to head of LRU
0 commit wait performed
0 global undo segment hints helped
0 global undo segment hints were stale
0 IMU commits
0 IMU Flushes
2 rows processed
Sure enough the dimension metadata means that Oracle knew to rewrite the query to use the MV and the overall cost is 11.
Drop the objects.
drop dimension time_hierarchy_dim;
drop table sales cascade constraints;
drop table time_hierarchy cascade constraints;
drop materialized view sales_mv;