lda - gensim.interfaces.TransformedCorpus - How to use?

I'm relatively new to the world of Latent Dirichlet Allocation. I was able to generate an LDA model following the Wikipedia tutorial, and I'm able to generate an LDA model with my own documents. My next step is to understand how I can use a previously generated model to classify unseen documents. I'm saving my "lda_wiki_model" with:

    import gensim

    # Load the word-id mapping and the TF-IDF corpus produced by the Wikipedia tutorial
    id2word = gensim.corpora.Dictionary.load_from_text('ptwiki_wordids.txt.bz2')
    mm = gensim.corpora.MmCorpus('ptwiki_tfidf.mm')
    # Train the model and save it to disk
    lda = gensim.models.ldamodel.LdaModel(corpus=mm, id2word=id2word, num_topics=100, update_every=1, chunksize=10000, passes=1)
    lda.save('lda_wiki_model.lda')

And I'm loading the same model with:

    new_lda = gensim.models.LdaModel.load(path + 'lda_wiki_model.lda')  # load the model

I have a "new_doc.txt". I turn my document into an id<->term dictionary and convert the tokenized document into a "document-term matrix".

But when I run new_topics = new_lda[corpus] I receive a 'gensim.interfaces.TransformedCorpus object at 0x7f0ecfa69d50'.

How can I extract the topics from that?

I already tried

    lsa = models.LdaModel(new_topics, id2word=dictionary, num_topics=1, passes=2)
    corpus_lda = lsa[new_topics]
    print(lsa.print_topics(num_topics=1, num_words=7))

and

    print(corpus_lda.print_topics(num_topics=1, num_words=7))

but both return topics that are not related to my new document. Where is my mistake? Am I misunderstanding something?

If I train a new model using the dictionary and corpus created above, I receive the correct topics. My point is: how do I re-use my model? Is it correct to re-use that wiki_model?

Thank you.

I was facing the same problem. This code will solve your problem:

    new_topics = new_lda[corpus]
    for topic in new_topics:
        print(topic)

For each document in the corpus, this will give you a list of tuples of the form (topic number, probability).
