performance - top-N query doing too much work in spite of STOPKEY optimization -
this going long, here's quick summary draw in: top-n query count stopkey , order stopkey in plan still slow no reason.
now, details. starts slow function. in real life involves string manipulations regexps. demonstration purposes, here's intentionally stupid recursive fibonacci algorithm. find pretty fast inputs 25, slow around 30, , ridiculous @ 35.
-- repeat: please no advice on how fibonacci correctly. -- slow on purpose! create or replace function tmp_fib ( n integer ) return integer begin if n = 0 or n = 1 return 1; end if; return tmp_fib(n-2) + tmp_fib(n-1); end; / now input: list of names , numbers.
create table tmp_table ( name varchar2(20) unique not null, num number(2,0) ); insert tmp_table (name,num) select 'alpha', 10 dual union select 'bravo', 11 dual union select 'charlie', 33 dual; here's example of slow query: use slow fibonacci function select rows num generates fibonacci number doubled digit.
select p.name, p.num tmp_table p regexp_like(tmp_fib(p.num), '(.)\1') order p.name; this true 11 , 33, bravo , charlie in output. takes 5 seconds run, of slow calculation of tmp_fib(33). want faster version of slow query converting top-n query. n=1, looks this:
select * ( select p.name, p.num tmp_table p regexp_like(tmp_fib(p.num), '(.)\1') order p.name ) rownum <= 1; and returns top result, bravo. still takes 5 seconds run! explanation it's still calculating tmp_fib(33), though result of calculation irrelevant result. should have been able decide bravo going output, there's no need test condition rest of table.
i've thought maybe optimizer needs told tmp_fib expensive. tried tell that, this:
associate statistics functions tmp_fib default cost (999999999,0,0); that alters of cost numbers in plan, doesn't make query run faster.
output of select * v$version in case version-dependent:
oracle database 11g enterprise edition release 11.2.0.2.0 - 64bit production pl/sql release 11.2.0.2.0 - production core 11.2.0.2.0 production tns 64-bit windows: version 11.2.0.2.0 - production nlsrtl version 11.2.0.2.0 - production and here's autotrace of top-1 query. appears claiming query took 1 second, that's not true. ran 5 seconds.
name num -------------------- ---------- bravo 11 execution plan ---------------------------------------------------------- plan hash value: 548796432 ------------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | ------------------------------------------------------------------------------------- | 0 | select statement | | 1 | 55 | 4 (25)| 00:00:01 | |* 1 | count stopkey | | | | | | | 2 | view | | 1 | 55 | 4 (25)| 00:00:01 | |* 3 | sort order stopkey| | 1 | 55 | 4 (25)| 00:00:01 | |* 4 | table access full | tmp_table | 1 | 55 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------- predicate information (identified operation id): --------------------------------------------------- 1 - filter(rownum<=1) 3 - filter(rownum<=1) 4 - filter( regexp_like (to_char("tmp_fib"("p"."num")),'(.)\1')) note ----- - dynamic sampling used statement (level=2) statistics ---------------------------------------------------------- 27 recursive calls 0 db block gets 25 consistent gets 0 physical reads 0 redo size 593 bytes sent via sql*net client 524 bytes received via sql*net client 2 sql*net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed update: mentioned in comments, index hint helps query lot. enough accepted correct answer, though doesn't translate real-world scenario. , in ironic twist, oracle seems have learned experience, , chooses index plan default; have tell no_index reproduce original slow behavior.
in real-world scenario i've applied more complex solution, rewriting query pl/sql function. here's how technique looks, applied fib problem:
create or replace package tmp_package type t_namenum table of tmp_table%rowtype; function get_interesting_names (howmany integer) return t_namenum pipelined; end; / create or replace package body tmp_package function get_interesting_names (howmany integer) return t_namenum pipelined cursor c select name, num tmp_table order name; rec c%rowtype; outcount integer; begin open c; outcount := 0; while outcount < howmany loop fetch c rec; exit when c%notfound; if regexp_like(tmp_fib(rec.num), '(.)\1') pipe row(rec); outcount := outcount + 1; end if; end loop; end; end; / select * table(tmp_package.get_interesting_names(1)); thanks responders read question , ran tests , helped me understand execution plans, , dispose of question suggest.
follow-up comment big. running under 11.2.0.3 (oel), query:
select * ( select p.name, p.num tmp_table p regexp_like(tmp_fib(p.num), '(.)\1') order p.name ) rownum <= 1; name num -------------------- ---------- bravo 11 elapsed: 00:00:00.094 plan hash value: 1058933870 ---------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | ---------------------------------------------------------------------------------- | 0 | select statement | | 1 | 25 | 4 (25)| 00:00:01 | |* 1 | count stopkey | | | | | | |* 2 | view | | 3 | 75 | 4 (25)| 00:00:01 | | 3 | sort order | | 3 | 75 | 4 (25)| 00:00:01 | | 4 | table access full| tmp_table | 3 | 75 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------- predicate information (identified operation id): --------------------------------------------------- 1 - filter(rownum<=1) 2 - filter( regexp_like (to_char("tmp_fib"("num")),'(.)\1')) note ----- - dynamic sampling used statement (level=2) note variation in sort order by saw, , corresponding rows values. moving order-by sub-select looks more yours:
select * ( select * ( select p.name, p.num tmp_table p order p.name ) regexp_like(tmp_fib(num), '(.)\1') ) rownum <= 1; name num -------------------- ---------- bravo 11 elapsed: 00:00:07.894 plan hash value: 548796432 ------------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | ------------------------------------------------------------------------------------- | 0 | select statement | | 1 | 25 | 171 (99)| 00:00:03 | |* 1 | count stopkey | | | | | | | 2 | view | | 1 | 25 | 171 (99)| 00:00:03 | |* 3 | sort order stopkey| | 1 | 25 | 171 (99)| 00:00:03 | |* 4 | table access full | tmp_table | 1 | 25 | 170 (99)| 00:00:03 | ------------------------------------------------------------------------------------- predicate information (identified operation id): --------------------------------------------------- 1 - filter(rownum<=1) 3 - filter(rownum<=1) 4 - filter( regexp_like (to_char("tmp_fib"("p"."num")),'(.)\1')) note ----- - dynamic sampling used statement (level=2) don't know how helpful or practical in real scenario, in case (in environment, anyway), adding index across fetched columns - full index scan instead of full table scan - seems change behaviour:
create index tmp_index on tmp_table(name, num); index tmp_index created. select * ( select p.name, p.num tmp_table p regexp_like(tmp_fib(p.num), '(.)\1') order p.name ) rownum <= 1; name num -------------------- ---------- bravo 11 elapsed: 00:00:00.093 plan hash value: 1841475998 ------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | ------------------------------------------------------------------------------- | 0 | select statement | | 1 | 25 | 1 (0)| 00:00:01 | |* 1 | count stopkey | | | | | | |* 2 | view | | 3 | 75 | 1 (0)| 00:00:01 | | 3 | index full scan| tmp_index | 3 | 75 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------- predicate information (identified operation id): --------------------------------------------------- 1 - filter(rownum<=1) 2 - filter( regexp_like (to_char("tmp_fib"("num")),'(.)\1')) note ----- - dynamic sampling used statement (level=2) select * ( select * ( select p.name, p.num tmp_table p order p.name ) regexp_like(tmp_fib(num), '(.)\1') ) rownum <= 1; name num -------------------- ---------- bravo 11 elapsed: 00:00:00.093 plan hash value: 1841475998 ------------------------------------------------------------------------------- | id | operation | name | rows | bytes | cost (%cpu)| time | ------------------------------------------------------------------------------- | 0 | select statement | | 1 | 25 | 1 (0)| 00:00:01 | |* 1 | count stopkey | | | | | | | 2 | view | | 1 | 25 | 1 (0)| 00:00:01 | |* 3 | index full scan| tmp_index | 1 | 25 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------- predicate information (identified operation id): --------------------------------------------------- 1 - filter(rownum<=1) 3 - filter( regexp_like (to_char("tmp_fib"("p"."num")),'(.)\1')) note ----- - dynamic sampling used statement (level=2) incidentally, aftr i've run threw several times of rownum variants start getting ora-01000: maximum open cursors exceeded errors. i'm dropping objects @ end of each run staying connected. think suggests bug somewhere, though not related you're seeing since happens index scan.
Comments
Post a Comment