sql - efficently get nearest id based on counting a column in one-to-many table -

March 15, 2011

i have relational table (one-to-many) , need efficiently similarities between ids giving associated items. table this:

id  item  1   a2231 1   a2134 2   a2134 2   b2313 ...

what need how many rows common between ids:

a_id  b_id  count_items 1     2     1  1     3     0 2     1     1  ...

i have made query, o(n2), , doesn't work because spool space.

select a.id a_id, b.id b_id, count(b.item) count_items tab left join  tab b --same table on (a.item = b.item) group a.id, b.id

edit:

n_rows ~ 50mm  n_items ~ 100k  n_ids ~ 170k combinations id/item unique

it'a there way efficiently accomplish this? in advance!

i start using inner join:

select a.id, b.id, count(*) count_items tab left join      tab b --same table      on a.item = b.item group a.id, b.id;

next, if table has duplicates, might work:

with t (       select distinct id, item       tab      ) select a.id, b.id, count(*) t join      t b      on a.item = b.item group a.id, b.id;

and finally, if want pairs of items, then:

with t (       select distinct id, item       tab      ) select i1.id, i2.id, count(b.id) (select distinct id tab) i1 cross join      (select distinct id tab) i2 left join      t      on t.id = i1.id left join      t b      on b.id = i2.id , a.item = b.item group i1.id, i2.id;

Search This Blog

RT

sql - efficently get nearest id based on counting a column in one-to-many table -

Comments

Post a Comment

Popular posts from this blog

Ansible warning on jinja2 braces on when -

Parsing a protocol message from Go by Java -

node.js - Node js - Trying to send POST request, but it is not loading javascript content -