sql - mySQL - Query to count rows and total percentage too slow -
table named 'log', has 50 million rows:
| id | domainip | | foo | 158.132.34.5 | | bob | 128.12.244.3 | | bob | 128.12.244.3 | | bob | 19.152.134.4 | | bob | 168.152.34.9 | | alice | 178.132.64.10 | | alice | 188.152.214.200 | | peter | 208.162.36.153 | | peter | 208.162.36.153 | | peter | 208.162.36.153 | | peter | 198.168.94.201 |
i have following query, number of times id
used each 'domainip', , percentage of each:
select `log`.`id`, `log`.`domainip`, count(`log`.`domainip`) "times", totalstable.totals, (count(`log`.`domainip`)/totalstable.totals)*100 "percentage" `log` join ( select `id`, count(`domainip`) totals `log` group `id` ) totalstable on (`log`.`id` = totalstable.`id`) group `log`.`domainip` order `log`.`id` asc, "percentage" desc
it returns:
| id | domainip | times | totals | percentage | foo | 158.132.34.5 | 1 | 1 | 100 | bob | 128.12.244.3 | 2 | 4 | 50 | bob | 19.152.134.4 | 1 | 4 | 25 | bob | 168.152.34.9 | 1 | 4 | 25 | alice | 178.132.64.10 | 1 | 2 | 50 | alice | 188.152.214.200 | 1 | 2 | 50 | peter | 208.162.36.153 | 3 | 4 | 75 | peter | 198.168.94.201 | 1 | 4 | 25
the result need, it's unusable slow (takes several minutes).
here's table structure exported phpmyadmin.
create table `log` ( `id` varchar(150) collate utf8_unicode_ci default null, `edate` datetime default null, `domainip` varchar(150) collate utf8_unicode_ci default null, `event` varchar(150) collate utf8_unicode_ci default null ) engine=innodb default charset=utf8 collate=utf8_unicode_ci; alter table `log` add unique key `logunique` (`id`,`edate`,`event`), add key `edate` (`edate`), add key `id` (`id`,`edate`), add key `event` (`id`,`edate`,`event`);
results of explain query on smaller version of table:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | 1 | primary | <derived2> | | null | null | null | null | 100 | using where; using temporary; using filesort 1 | primary | log | ref | logunique,id,event | logunique | 453 | totalstable.id | 1 | 2 | derived | log | index | null | id | 459 | null | 100 |
i need formulate query returns same thing usable (returns results in manner of seconds, not minutes), don't know how
note: adding index domainip improves response of small size sample, full table still takes more 10 minutes return result.
the table created other purposes, , i'd prefer modify it's structure least possible if @ all.
you may find bit faster. start version:
select l.id, l.domainip, count(*) times, (select count(*) log l2 l2.id = l.id) total log l group l.id, l.domainip order l.id asc;
your existing index starting id
should sufficient.
actually, can remove correlated subquery measure performance of group by
. if not enough, know cannot improve more complicated query. need try other method, such using triggers maintain total counts.
Comments
Post a Comment