sql - mySQL - Query to count rows and total percentage too slow -


table named 'log', has 50 million rows:

| id     | domainip        | | foo    | 158.132.34.5    | | bob    | 128.12.244.3    | | bob    | 128.12.244.3    | | bob    | 19.152.134.4    | | bob    | 168.152.34.9    | | alice  | 178.132.64.10   | | alice  | 188.152.214.200 | | peter  | 208.162.36.153  | | peter  | 208.162.36.153  | | peter  | 208.162.36.153  | | peter  | 198.168.94.201  | 

i have following query, number of times id used each 'domainip', , percentage of each:

select     `log`.`id`,     `log`.`domainip`,     count(`log`.`domainip`) "times",     totalstable.totals,     (count(`log`.`domainip`)/totalstable.totals)*100 "percentage" `log` join     (     select         `id`,         count(`domainip`) totals     `log` group `id`     ) totalstable  on (`log`.`id` = totalstable.`id`)  group `log`.`domainip` order `log`.`id` asc, "percentage"  desc 

it returns:

| id     | domainip        | times | totals | percentage | foo    | 158.132.34.5    | 1     | 1      | 100 | bob    | 128.12.244.3    | 2     | 4      | 50 | bob    | 19.152.134.4    | 1     | 4      | 25 | bob    | 168.152.34.9    | 1     | 4      | 25 | alice  | 178.132.64.10   | 1     | 2      | 50 | alice  | 188.152.214.200 | 1     | 2      | 50 | peter  | 208.162.36.153  | 3     | 4      | 75 | peter  | 198.168.94.201  | 1     | 4      | 25 

the result need, it's unusable slow (takes several minutes).

here's table structure exported phpmyadmin.

create table `log` (   `id` varchar(150) collate utf8_unicode_ci default null,   `edate` datetime default null,   `domainip` varchar(150) collate utf8_unicode_ci default null,   `event` varchar(150) collate utf8_unicode_ci default null ) engine=innodb default charset=utf8 collate=utf8_unicode_ci;  alter table `log`   add unique key `logunique` (`id`,`edate`,`event`),   add key `edate` (`edate`),   add key `id` (`id`,`edate`),   add key `event` (`id`,`edate`,`event`); 

results of explain query on smaller version of table:

id | select_type | table | type  | possible_keys      | key       | key_len | ref            | rows  | 1 | primary | <derived2> |   | null               | null      | null    | null           | 100   | using where; using temporary; using filesort  1 | primary | log        | ref   | logunique,id,event | logunique | 453     | totalstable.id | 1     |   2 | derived | log        | index | null               | id        | 459     | null           | 100   | 

i need formulate query returns same thing usable (returns results in manner of seconds, not minutes), don't know how

note: adding index domainip improves response of small size sample, full table still takes more 10 minutes return result.

the table created other purposes, , i'd prefer modify it's structure least possible if @ all.

you may find bit faster. start version:

select l.id, l.domainip, count(*) times,        (select count(*) log l2 l2.id = l.id) total log l group l.id, l.domainip order l.id asc; 

your existing index starting id should sufficient.

actually, can remove correlated subquery measure performance of group by. if not enough, know cannot improve more complicated query. need try other method, such using triggers maintain total counts.


Comments

Popular posts from this blog

node.js - Node js - Trying to send POST request, but it is not loading javascript content -

javascript - Replicate keyboard event with html button -

javascript - Web audio api 5.1 surround example not working in firefox -