tf-idf - Solr: Find "Significant Terms" on a Subset of Documents
I'm trying to find "significant terms" for a subset of documents in Solr. This may or may not be the best way, but I'm attempting to use Solr's TF-IDF functionality, since I already have the data stored in Solr and it's lightning fast. I want to restrict the "df" count to a subset of documents, selected through a search or a filter. I tried this, where I'm searching for "apple" in the name field:
And of course, this gives me the documents that have "apple" in the name, but the document frequency gives me counts for the entire dataset, which doesn't seem to be what I want. I think Solr can do this, but maybe not. I'm open to suggestions.
Thanks, Adrian
It is one of the items I have in my backlog [1].
What you need is the document frequency in the foreground set (your subset of docs) and the document frequency in the background set (your whole corpus). Solr won't do this out of the box, but you can build it yourself. Elasticsearch has a module you can take inspiration from [2].
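To make the foreground/background idea concrete, here is a minimal sketch in plain Python. It does not call Solr at all: the tokenized documents, the function names `document_frequencies` and `significant_terms`, and the scoring formula are all assumptions chosen for illustration. The score multiplies the absolute and the relative change in document-frequency rate, which is one possible heuristic, not necessarily what any particular search engine uses.

```python
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def significant_terms(foreground_docs, background_docs, top_n=5):
    """Rank terms that are more common in the foreground than in the background.

    Hypothetical scoring (an assumption, not Solr's behaviour):
    (fg_rate - bg_rate) * (fg_rate / bg_rate).
    The background is expected to contain the foreground documents.
    """
    fg_df = document_frequencies(foreground_docs)
    bg_df = document_frequencies(background_docs)
    fg_size = len(foreground_docs)
    bg_size = len(background_docs)

    scores = {}
    for term, fg_count in fg_df.items():
        fg_rate = fg_count / fg_size
        bg_rate = bg_df[term] / bg_size
        # Skip terms that are no more frequent in the subset than overall.
        if bg_rate == 0 or fg_rate <= bg_rate:
            continue
        scores[term] = (fg_rate - bg_rate) * (fg_rate / bg_rate)

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy corpus (made-up data): each document is a list of tokens.
corpus = [
    ["apple", "pie", "recipe"],
    ["apple", "iphone", "review"],
    ["apple", "iphone", "price"],
    ["banana", "bread", "recipe"],
    ["samsung", "phone", "review"],
    ["orange", "juice", "price"],
]

# Foreground set: the documents matching the "apple" search.
foreground = [doc for doc in corpus if "apple" in doc]

result = significant_terms(foreground, corpus)
for term, score in result:
    print(f"{term}: {score:.3f}")
```

The query term itself naturally ranks first, so in practice you would probably drop it from the output. With real data, the two sets of counts would have to come from the index, once restricted to the subset and once for the whole collection.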