sql - efficently get nearest id based on counting a column in one-to-many table -
i have relational table (one-to-many) , need efficiently similarities between ids giving associated items. table this:
id item 1 a2231 1 a2134 2 a2134 2 b2313 ...
what need how many rows common between ids:
a_id b_id count_items 1 2 1 1 3 0 2 1 1 ...
i have made query, o(n2), , doesn't work because spool space.
select a.id a_id, b.id b_id, count(b.item) count_items tab left join tab b --same table on (a.item = b.item) group a.id, b.id
edit:
n_rows ~ 50mm n_items ~ 100k n_ids ~ 170k combinations id/item unique
it'a there way efficiently accomplish this? in advance!
i start using inner join:
select a.id, b.id, count(*) count_items tab left join tab b --same table on a.item = b.item group a.id, b.id;
next, if table has duplicates, might work:
with t ( select distinct id, item tab ) select a.id, b.id, count(*) t join t b on a.item = b.item group a.id, b.id;
and finally, if want pairs of items, then:
with t ( select distinct id, item tab ) select i1.id, i2.id, count(b.id) (select distinct id tab) i1 cross join (select distinct id tab) i2 left join t on t.id = i1.id left join t b on b.id = i2.id , a.item = b.item group i1.id, i2.id;
Comments
Post a Comment