postgresql - How can I speed up a window query with string prefix matching? -

June 15, 2011

i have table 7.5million records , trying implement autocomplete form based on said table, performance pretty bad.

the schema (irrelevant fields omitted) follows:

companies --------- sid (integer primary key) world_hq_sid (integer) name (varchar(64)) marketing_alias (varchar(64)) address_country_code (char(4)) address_state (varchar(64)) sort_order integer search_weight integer annual_sales integer

the fields passed in optional country_code , state, along search term. want search term match (case insensitive) beginning of either name or marketing_alias. want top ten results, results match country , state @ top, country only, no state/country match. after that, want results sorted sort_order.

also, want 1 match per world_hq_sid. finally, when have top match per world_hq_sid, want final results sorted search_weight.

i'm using window query achieve world_hq_sid part. here query:

select * (     select row_number() on (partition world_hq_sid order case when address_country_code = 'us' , address_state = 'ca' 2 when address_country_code = 'us' 1 else 0 end desc, sort_order asc) r,     companies.*     companies     ((upper(name) upper('co%')) or (upper(marketing_alias) upper('co%')))   ) x   x.r = 1   order case when address_country_code = 'us' , address_state = 'ca' 2 when address_state = 'ca' 1 else 0 end desc, search_weight asc, annual_sales desc   limit 10;

i have normal btree indexes on address_state, address_country_code, world_hq_sid, sort_order, , search_weight.

i have following indexes on name , marketing_alias fields:

create index companies_alias_pattern_upper_idx on companies(upper(marketing_alias) varchar_pattern_ops); create index companies_name_pattern_upper_idx on companies(upper(name) varchar_pattern_ops)

and here explain analyze when pass ca state , 'co' search term

limit  (cost=676523.01..676523.03 rows=10 width=939) (actual time=18695.686..18695.687 rows=10 loops=1)  ->  sort  (cost=676523.01..676526.67 rows=1466 width=939) (actual time=18695.686..18695.687 rows=10 loops=1)      sort key: x.search_weight, x.annual_sales      sort method: top-n heapsort  memory: 30kb      ->  subquery scan on x  (cost=665492.58..676491.33 rows=1466 width=939) (actual time=18344.715..18546.830 rows=151527 loops=1)            filter: (x.r = 1)            rows removed filter: 20672            ->  windowagg  (cost=665492.58..672825.08 rows=293300 width=931) (actual time=18344.710..18511.625 rows=172199 loops=1)                  ->  sort  (cost=665492.58..666225.83 rows=293300 width=931) (actual time=18344.702..18359.145 rows=172199 loops=1)                        sort key: companies.world_hq_sid, (case when ((companies.address_state)::text = 'ca'::text) 1 else 0 end), companies.sort_order                        sort method: quicksort  memory: 108613kb                        ->  bitmap heap scan on companies  (cost=17236.64..518555.98 rows=293300 width=931) (actual time=1861.665..17999.806 rows=172199 loops=1)                              recheck cond: ((upper((name)::text) ~~ 'co%'::text) or (upper((marketing_alias)::text) ~~ 'co%'::text))                              filter: ((upper((name)::text) ~~ 'co%'::text) or (upper((marketing_alias)::text) ~~ 'co%'::text))                              ->  bitmapor  (cost=17236.64..17236.64 rows=196219 width=0) (actual time=1829.061..1829.061 rows=0 loops=1)                                    ->  bitmap index scan on companies_name_pattern_upper_idx  (cost=0.00..8987.98 rows=97772 width=0) (actual time=971.331..971.331 rows=169390 loops=1)                                          index cond: ((upper((name)::text) ~>=~ 'co'::text) , (upper((name)::text) ~<~ 'cp'::text))                                    ->  bitmap index scan on companies_alias_pattern_upper_idx  (cost=0.00..8102.02 rows=98447 width=0) (actual time=857.728..857.728 rows=170616 loops=1)                                          index cond: ((upper((marketing_alias)::text) ~>=~ 'co'::text) , (upper((marketing_alias)::text) ~<~ 'cp'::text))

i've bumped work_mem , shared_buffers 100m.

as can see, query returns in 18 seconds. odd results on board different starting characters, 400ms (acceptable) 30 seconds (very not acceptable). postgres gurus, question is, expecting of postgresql perform such query consistently? there way can speed up?

select * (     select distinct on (world_hq_sid)         world_hq_sid,         (address_country_code = 'us')::int + (address_state = 'ca')::int address_weight,         sort_order,         search_weight, annual_sales,         sid, name, marketing_alias,         address_country_code, address_state     companies             upper(name) upper('co%')         or upper(marketing_alias) upper('co%')     order 1, 2 desc, 3 ) s order     address_weight desc,     search_weight,     annual_sales desc limit 10

Search This Blog

KHS

postgresql - How can I speed up a window query with string prefix matching? -

Comments

Post a Comment

Popular posts from this blog

user interface - Python attempting to create a simple gui, getting "AttributeError: 'MainMenu' object has no attribute 'intro_screen'" -

jquery - Common JavaScript snippet to share files on Google Drive, Dropbox, Box.net or SkyDrive -

Android Gson.fromJson error -