Topic modeling can be seen as one way of clustering a document collection. In this article, by the term "clustering" I mean popular clustering algorithms such as K-means and fuzzy K-means.
So the difference lies in how the two mechanisms are implemented. Even though both return a similar type of outcome, the actual data and knowledge embedded in that outcome can be different.
In topic modeling, each document is represented as a distribution over topics, and each topic is essentially a probability distribution over words. In document clustering, by contrast, a cluster is composed of a collection of documents (not topics).
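To make the contrast concrete, here is a minimal sketch in plain Python that runs both approaches on the same toy corpus and compares the shape of what comes back. Everything in it is my own assumption for illustration: the four made-up documents, the function names `kmeans` and `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the choice of LDA with collapsed Gibbs sampling as the topic model (the text above does not name a specific one). Treat it as a rough illustration, not a reference implementation.

```python
import random
from collections import Counter

# Toy corpus (made up for illustration only).
docs = [
    "cat dog pet cat",
    "dog pet vet dog",
    "stock market trade stock",
    "market trade bank market",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for words in tokenized for w in words})
word_id = {w: i for i, w in enumerate(vocab)}


def bag_of_words(words):
    counts = Counter(words)
    return [counts[w] for w in vocab]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_centroids, iters=10):
    """Plain K-means: every document ends up in exactly one cluster."""
    centroids = [list(c) for c in init_centroids]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(v, centroids[k]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def lda_gibbs(tokenized, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA (hypothetical helper, toy scale)."""
    rng = random.Random(seed)
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in tokenized]      # topic counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic for every token.
    for d, words in enumerate(tokenized):
        zs = []
        for w in words:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, words in enumerate(tokenized):
            for i, w in enumerate(words):
                wi, z = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wi] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wi] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wi] += 1
                topic_total[z] += 1
    # theta: one topic distribution per document; phi: one word distribution per topic.
    theta = [[(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
             for row in doc_topic]
    phi = [[(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return theta, phi


# Clustering output: cluster id -> list of document indices.
vectors = [bag_of_words(words) for words in tokenized]
labels = kmeans(vectors, init_centroids=[vectors[0], vectors[2]])
clusters = {}
for doc_index, lab in enumerate(labels):
    clusters.setdefault(lab, []).append(doc_index)
print(clusters)  # {0: [0, 1], 1: [2, 3]}

# Topic model output: two tables of probabilities.
theta, phi = lda_gibbs(tokenized)
for d, row in enumerate(theta):
    print("doc", d, [round(p, 2) for p in row])
for k, row in enumerate(phi):
    print("topic", k, sorted(zip(row, vocab), reverse=True)[:3])
```

The K-means result is a partition of the documents themselves. The topic model instead returns `theta` (each row is a document's distribution over topics) and `phi` (each row is a topic's distribution over words), so a document can belong to several topics to different degrees.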
.. to be continued!