Multilabel classification - Multi-label grouping with SQL query
I want to be able to go through a list of multi-labeled entries so that I can find data on common groups of labels. I have a table structured as follows:
---------------------------------------------------------------------
|gameid | title     |label1 |label2 |label3 |label4 |...    |labeln |
|-------|-----------|-------|-------|-------|-------|-------|-------|
|1      | a         | 1     | 0     | 1     | 0     | ...   | 1     |
|2      | b         | 1     | 1     | 0     | 1     | ...   | 0     |
|3      | c         | 0     | 0     | 1     | 1     | ...   | 0     |
|4      | d         | 1     | 0     | 0     | 0     | ...   | 1     |
|...    | ...       | ...   | ...   | ...   | ...   | ...   | ...   |
---------------------------------------------------------------------
If an entry has a 1 under a label, the entry is associated with that label. Otherwise, it is not associated with that label. For example, game a has the labels "label1", "label3", ..., and "labeln" associated with it.
Now, take this example SQL query:
    SELECT gameid, title
    FROM gametagsbinary
    WHERE "gun customization" = 1 AND "zombies" = 1
This query would return the following table:
-------------------------------------
|gameid | title                     |
|-------|---------------------------|
|263070 | blockstorm                |
|209870 | blacklight: retribution   |
|436520 | line of sight             |
-------------------------------------
What I would like to have is a query that goes through every pair of columns, label1 through labeln, and prints out the number of games that have both labels in the pair:
-------------------------------------------------
|combination                | numberofgames     |
|---------------------------|-------------------|
|label1 + label2            | 5                 |
|label1 + label3            | 11                |
|label1 + label4            | 9                 |
|...                        | ...               |
|gun customization + zombies| 3                 |
|...                        | ...               |
|labeln + label(n-1)        | 7                 |
-------------------------------------------------
Try the logic below (replace your_table_name with your actual table name).
You may need to create a stored procedure, or run this as a SQL script, to get the result.
By the way, this code is for SQL Server; if you use another database, some of the code will be different.
Code for finding the distinct combinations of columns:
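    -- Build every ordered pair of label columns.
    -- gameid and title are excluded because they are not label flags.
    select result.*
    into #temp
    from (
        select row_number() over (order by a.name) as id,
               a.name as a_name,
               b.name as b_name
        from (select name from sys.columns
              where object_id = object_id('your_table_name')
                and name not in ('gameid', 'title')) a
        cross join (select name from sys.columns
                    where object_id = object_id('your_table_name')
                      and name not in ('gameid', 'title')) b
        where a.name <> b.name
    ) result

    -- Keep only one row per unordered pair: drop (x, y) when (y, x) has a smaller id.
    select a_name, b_name
    into #combination
    from #temp temp1
    where not exists (select 1
                      from #temp temp2
                      where temp1.a_name = temp2.b_name
                        and temp1.b_name = temp2.a_name
                        and temp1.id > temp2.id)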
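A shorter way to get each unordered pair exactly once is to join the column list to itself and keep only the rows where the first name sorts before the second. The following is a minimal sketch rather than a drop-in replacement: it assumes SQL Server, assumes that gameid and title are the only non-label columns (as in the question's table), and still uses the placeholder your_table_name.

```sql
-- Sketch: one row per unordered pair of label columns.
-- Assumes gameid and title are the only non-label columns in the table.
select a.name as a_name,
       b.name as b_name
from sys.columns a
join sys.columns b
    on b.object_id = a.object_id
   and a.name < b.name              -- keeps (x, y) and drops (y, x) and (x, x)
where a.object_id = object_id('your_table_name')
  and a.name not in ('gameid', 'title')
  and b.name not in ('gameid', 'title')
order by a.name, b.name;
```

For n label columns this should return n * (n - 1) / 2 rows, which is a quick sanity check on the pair list.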
Use a cursor to loop through the combinations and insert each count into a temp table:
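    create table #result (
        combination nvarchar(300),
        numberofgames int
    )

    declare @a_name sysname;
    declare @b_name sysname;
    declare @combination_string nvarchar(300);
    declare @count int;
    declare @count_sql nvarchar(1000);   -- sp_executesql requires nvarchar

    declare combo_cursor cursor for
        select a_name, b_name from #combination

    open combo_cursor
    fetch next from combo_cursor into @a_name, @b_name;

    while @@fetch_status = 0
    begin
        set @combination_string = @a_name + ' + ' + @b_name
        -- quotename() brackets the column names so labels with spaces
        -- (for example "gun customization") work in the dynamic SQL.
        set @count_sql = N'select @count = count(*) from your_table_name where '
            + quotename(@a_name) + N' = 1 and ' + quotename(@b_name) + N' = 1'
        -- Pass @count as an output parameter so the result comes back to this scope.
        exec sp_executesql @count_sql, N'@count int output', @count = @count output
        insert into #result (combination, numberofgames)
        values (@combination_string, @count)
        fetch next from combo_cursor into @a_name, @b_name;
    end

    close combo_cursor;
    deallocate combo_cursor;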
Your final result:
    select * from #result
Drop the temp tables after execution:
    drop table #temp
    drop table #combination
    drop table #result
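When the label columns are known in advance, the same pair counts can also be produced in a single set-based query without a cursor or dynamic SQL. The sketch below is only an illustration under stated assumptions: #games is a hypothetical sample table built from the four example rows in the question, and only label1 through label4 are written out, so a real table would need one line per label column in the values list.

```sql
-- Hypothetical sample table mirroring the question's layout (four label columns only).
create table #games (
    gameid int primary key,
    title  varchar(50),
    label1 bit, label2 bit, label3 bit, label4 bit
);

insert into #games (gameid, title, label1, label2, label3, label4) values
    (1, 'a', 1, 0, 1, 0),
    (2, 'b', 1, 1, 0, 1),
    (3, 'c', 0, 0, 1, 1),
    (4, 'd', 1, 0, 0, 0);

-- Step 1: unpivot to one row per (game, label) where the flag is set.
-- Step 2: self-join on gameid to pair up labels that share a game.
with game_label as (
    select g.gameid, v.label
    from #games g
    cross apply (values
        ('label1', g.label1),
        ('label2', g.label2),
        ('label3', g.label3),
        ('label4', g.label4)
    ) v(label, flag)
    where v.flag = 1
)
select x.label + ' + ' + y.label as combination,
       count(*)                  as numberofgames
from game_label x
join game_label y
    on y.gameid = x.gameid
   and x.label < y.label         -- count each unordered pair once
group by x.label, y.label
order by x.label, y.label;

drop table #games;
```

One difference from the cursor version: a pair that no game shares produces no row here, whereas the cursor loop inserts a row with a count of 0 for it.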