TransWikia.com

How to map topic to a document after topic modeling is done with LDA

Data Science Asked by user1845926 on December 5, 2020

Is there a way to map the topics generated by LDA back to my list of documents and identify which topic each document belongs to?
I am interested in clustering documents with unsupervised learning and assigning each one to the appropriate cluster.
Any link, code example, or paper would be greatly appreciated.

One Answer

After training your LDA topic model, you can pass documents to the model and it will assign each one a probability distribution over the predefined number of topics. In gensim (Python), this looks something like this:

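# Convert the tokenized document (a list of word strings) to bag-of-words form
ques_vec = dictionary.doc2bow(tokenized_document)
# Infer the topic distribution of that document with the trained model
topic_vec = ldamodel[ques_vec]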
  • dictionary is the gensim Dictionary you created for training.
  • ldamodel is the model that you trained.
  • topic_vec is a list of (topic number, probability) pairs: one entry for each topic the document is likely to belong to, rather than a single class.
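
To turn that output into one cluster per document, one option is to keep the topic with the highest probability. The sketch below is a minimal illustration, assuming each topic_vec is a list of (topic number, probability) pairs as described above. The helper names dominant_topic and group_by_topic and the sample numbers are made up for illustration and are not part of gensim.

```python
from collections import defaultdict


def dominant_topic(topic_vec):
    """Return the (topic_id, probability) pair with the highest probability."""
    return max(topic_vec, key=lambda pair: pair[1])


def group_by_topic(topic_vecs):
    """Map each topic id to the indices of documents whose dominant topic it is."""
    clusters = defaultdict(list)
    for doc_index, topic_vec in enumerate(topic_vecs):
        topic_id, _ = dominant_topic(topic_vec)
        clusters[topic_id].append(doc_index)
    return dict(clusters)


# Made-up topic vectors for three documents, standing in for
# [ldamodel[dictionary.doc2bow(doc)] for doc in tokenized_documents].
example_topic_vecs = [
    [(0, 0.85), (1, 0.15)],
    [(0, 0.30), (1, 0.70)],
    [(0, 0.60), (1, 0.40)],
]
clusters = group_by_topic(example_topic_vecs)
```

With a real model, ldamodel[ques_vec] may omit topics whose probability is very low. If you need a value for every topic, calling ldamodel.get_document_topics(ques_vec, minimum_probability=0.0) should return the full distribution.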

At this point you will not know what each topic (class) means, because the topics come from unsupervised learning. To find out what each topic represents, look at the trained parameters like this:

words = ldamodel.show_topic(topic_number, topn = 200)

If you print that, you'll see the 200 highest-weighted words for that topic number, each paired with its probability. Based on the meaning of those words, you can give each topic an appropriate name.
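
Reading 200 words per topic can be tedious, so it may help to look at a shorter summary first. The sketch below assumes words is a list of (word, probability) pairs, which is the shape show_topic returns. The helper top_words, the sample words, and the "sports" label are invented for illustration.

```python
def top_words(word_probs, n=10):
    """Return the n highest-probability words from a list of (word, probability) pairs."""
    ordered = sorted(word_probs, key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ordered[:n]]


# Made-up output standing in for ldamodel.show_topic(topic_number, topn=200).
example_words = [("game", 0.04), ("team", 0.03), ("season", 0.02), ("player", 0.02)]

summary = ", ".join(top_words(example_words, n=3))

# After reading the summary, record a hand-chosen name for the topic.
topic_labels = {0: "sports"}
```

Once every topic has a name in topic_labels, the topic number found for a document can be looked up there to get a readable cluster label.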

Answered by Sid on December 5, 2020
