NLTK, Python: collocation strength of hashtags in tweets?
I am wondering if there is a way to adapt NLTK's collocation package to measure the collocation strength of two hashtags in a collection of tweets.
Suppose a tweet can contain zero or multiple hashtags, and I extract the list of hashtags for each tweet in the collection. I then want to calculate, e.g., the PMI of each pair of tags, based on how frequently the pair co-occurs in the same tweet.
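To make the calculation concrete, here is a minimal sketch of the per-tweet PMI described above, written in plain Python. The function name `hashtag_pmi` and the input format (one list of hashtags per tweet) are assumptions made for illustration. Probabilities are estimated as the fraction of tweets containing a tag or a pair of tags, and each tag is counted at most once per tweet.

```python
import math
from collections import Counter
from itertools import combinations


def hashtag_pmi(tweets_tags):
    """Return {(tag_a, tag_b): PMI} for every pair that co-occurs at least once.

    tweets_tags: list with one list of hashtags per tweet (may be empty).
    p(x) is estimated as (tweets containing x) / (all tweets).
    """
    n_tweets = len(tweets_tags)
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in tweets_tags:
        unique = sorted(set(tags))  # count each tag once per tweet
        tag_counts.update(unique)
        # Unordered pairs; sorting above gives each pair one canonical key.
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI = log2( p(a, b) / (p(a) * p(b)) )
        scores[(a, b)] = math.log2(n_ab * n_tweets / (tag_counts[a] * tag_counts[b]))
    return scores


# Hypothetical toy data, not taken from a real collection.
tweets = [["#nlp", "#python"], ["#nlp", "#python", "#nltk"], ["#python"], [], ["#nlp"]]
for pair, score in sorted(hashtag_pmi(tweets).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

PMI tends to favour rare pairs, so in practice a minimum co-occurrence count would probably be applied before ranking.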
I can certainly write my own program, but I wonder if there's a Python package that can be adapted for this?
For example, the NLTK collocation package is the closest I know of, but it uses collocation to extract n-grams, and I do not know if it can be customised: http://www.nltk.org/howto/collocations.html
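One possible way to customise it, sketched under assumptions: `BigramCollocationFinder` can be constructed directly from two frequency distributions instead of from a token sequence, so the tag counts and co-occurrence counts can be filled in by hand. The helper name `hashtag_finder` and the toy data are hypothetical. Note that NLTK takes the total `N` from the sum of the tag counts rather than from the number of tweets, so the PMI values are shifted by a constant compared with a per-tweet estimate; the ranking of pairs is unaffected.

```python
from itertools import combinations

from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures
from nltk.probability import FreqDist


def hashtag_finder(tweets_tags):
    """Build a BigramCollocationFinder from per-tweet hashtag lists."""
    tag_fd = FreqDist()   # number of tweets containing each tag
    pair_fd = FreqDist()  # number of tweets containing each unordered pair
    for tags in tweets_tags:
        unique = sorted(set(tags))  # one count per tag per tweet
        tag_fd.update(unique)
        pair_fd.update(combinations(unique, 2))
    # Passing the distributions directly skips NLTK's adjacent-word windowing.
    return BigramCollocationFinder(tag_fd, pair_fd)


# Hypothetical toy data, not taken from a real collection.
tweets = [["#nlp", "#python"], ["#nlp", "#python", "#nltk"], ["#python"], [], ["#nlp"]]
finder = hashtag_finder(tweets)
for pair, score in finder.score_ngrams(BigramAssocMeasures.pmi):
    print(pair, round(score, 3))
```

The other measures in `BigramAssocMeasures` (for example `likelihood_ratio`) can be passed to `score_ngrams` in the same way, and `finder.apply_freq_filter(min_count)` drops pairs that co-occur fewer than `min_count` times.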