performance - PostgreSQL: speed up SELECT query in table with millions of rows -


i have table > 4.5 million rows , select query far slow needs.

the table created with:

create table all_legs (                 carrier text,                 dep_hub text,                 arr_hub text,                 dep_dt timestamp time zone,                 arr_dt timestamp time zone,                 price_ct integer,                 ... 5 more cols ...,                 primary key (carrier, dep_hub, arr_hub, dep_dt, arr_dt, ...3 other cols...)                 ) 

when want select rows date, query slow; takes between 12sec , 20 seconds. aim takes max 1 sec. expect query return between 0.1% , 1% of rows contained in table.

the query quite simple:

select * all_legs dep_dt::date = '2017-08-15' order price_ct asc 

explain analyze returns:

sort  (cost=197154.69..197212.14 rows=22982 width=696) (actual time=14857.300..14890.565 rows=31074 loops=1)   sort key: price_ct   sort method: external merge  disk: 5256kb   ->  seq scan on all_legs  (cost=0.00..188419.85 rows=22982 width=696) (actual time=196.738..14581.143 rows=31074 loops=1)         filter: ((dep_dt)::date = '2017-08-15'::date)         rows removed filter: 4565249 planning time: 0.572 ms execution time: 14908.274 ms 

note: learned yesterday command, still not able understand returned.

i have tried using index-only scans, suggested here, running command: create index idx_all_legs on all_legs(dep_dt); did not notice difference in running time. tried creating index columns, want columns return.

another thought sorting rows dep_dt, search of rows fulfilling condition should faster not scattered. unfortunately, don't know how implement this.

is there way make fast aiming to?


solution

as suggested in laurenz' answer, adding index create index if not exists idx_dep_dt_price on all_legs(dep_dt, price_ct); , adapting condition in select where dep_dt >= '2017-08-15 00:00:00' , dep_dt < '2017-08-16 00:00:00' has reduced running time 1/4. if improvement, means running times between 2 , 6 seconds.

any additional idea reduce running time further appreciated.

the index won't help.

two solutions:

  1. you chould either change query to:

    where dep_dt >= '2017-08-15 00:00:00' , dep_dt < '2017-08-16 00:00:00' 

    then index can used.

  2. create index on expression:

    create index on all_legs(((dep_dt @ time zone 'utc')::date)); 

    (or different time zone) , change query to

    where (dep_dt @ time zone 'utc')::date = '2017-08-16' 

    the at time zone necessary because otherwise result of cast depend on current timezone setting.

the first solution simpler, second has advantage can add price_ct index this:

create index on all_legs(((dep_dt @ time zone 'utc')::date), price_ct); 

then don't need sort more, , query fast can theoretically get.


Comments

Popular posts from this blog

python - Selenium remoteWebDriver (& SauceLabs) Firefox moseMoveTo action exception -

html - How to custom Bootstrap grid height? -

transpose - Maple isnt executing function but prints function term -