gensim - Using a Word2Vec model pre-trained on Wikipedia
I need to use gensim to get vector representations of words, and I figure the best thing to use would be a word2vec model that's pre-trained on the English Wikipedia corpus. Does anyone know where to download it, how to install it, and how to use gensim to create the vectors?
@imanzabet provided some useful links to pre-trained vectors, but if you want to train the models yourself using gensim, then you need to do two things:
Acquire the Wikipedia data, which you can access here. It looks like the most recent snapshot of English Wikipedia was taken on the 20th, and it can be found here. I believe the other English-language "wikis", e.g. quotes, are captured separately, so if you want to include them you'll need to download them as well.
Load the data and use it to generate the models. That's a fairly broad question, so I'll just link you to the excellent gensim documentation and the word2vec tutorial.
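To make the two steps a bit more concrete, here is a minimal sketch, not a definitive recipe. It assumes gensim 4.x (where the dimensionality parameter is `vector_size`; older releases called it `size`) and a downloaded dump in the usual `.xml.bz2` format. The function names, the `LineCorpus` helper, and the hyperparameter values are my own illustrative choices, not anything prescribed by gensim.

```python
def dump_to_text(dump_path, text_path):
    """Step 1: flatten a Wikipedia dump into one tokenized article per line."""
    # Imported lazily so the plain-Python helper below works without gensim.
    from gensim.corpora import WikiCorpus

    # Passing an empty dict skips building a vocabulary Dictionary,
    # which would otherwise require an extra full pass over the dump.
    wiki = WikiCorpus(str(dump_path), dictionary={})
    count = 0
    with open(text_path, "w", encoding="utf-8") as out:
        for tokens in wiki.get_texts():
            out.write(" ".join(tokens) + "\n")
            count += 1
    return count


class LineCorpus:
    """Restartable iterable over a file with one whitespace-tokenized text per line.

    Word2Vec makes several passes over its input, so a one-shot generator
    is not enough. gensim ships LineSentence for the same purpose; this
    hypothetical stand-in only makes the idea explicit.
    """

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                if tokens:  # skip blank lines
                    yield tokens


def train(text_path, model_path):
    """Step 2: train a Word2Vec model on the flattened text and save it."""
    from gensim.models import Word2Vec

    model = Word2Vec(
        sentences=LineCorpus(text_path),
        vector_size=100,  # assumed value; `size` in gensim < 4.0
        window=5,
        min_count=5,      # ignore words seen fewer than 5 times
        workers=4,
    )
    model.save(str(model_path))
    return model
```

Calling `dump_to_text("enwiki-latest-pages-articles.xml.bz2", "wiki.txt")` and then `train("wiki.txt", "wiki.word2vec.model")` would run both steps (both file names are assumptions), after which `model.wv["computer"]` returns the vector for a word in the vocabulary. Expect the full English dump to take hours to process. On Windows, put the calls under an `if __name__ == "__main__":` guard, since `WikiCorpus` uses multiprocessing.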
Finally, I'll point out that there seems to be a blog post describing precisely your use case.