Information Retrieval

Topic Model with Nonnegative Matrix Factorization

We present a topic model that partitions documents into their associated topics using nonnegative matrix factorization (NMF). We apply both the classical approach and Bayesian inference, using the Kullback-Leibler divergence as the error measure for approximating the matrix decomposition. As a baseline, we compare our model against a topic model based on latent Dirichlet allocation (LDA). We evaluate the proposed method with domain-specific metrics, namely topic-uniqueness and overall topic-uniqueness. Our topic modeling approach processes large collections efficiently while preserving the essential statistical relationships.
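
To make the classical variant concrete, the following is a minimal sketch of NMF under the Kullback-Leibler error measure. It assumes the standard multiplicative-update rules of Lee and Seung, which may differ from the scheme actually used in this work, and it does not cover the Bayesian variant. The function names (matmul, kl_divergence, nmf_kl), the toy count matrix, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def kl_divergence(v, approx, eps=1e-12):
    """Generalized KL divergence D(V || approx), summed over all entries."""
    total = 0.0
    for v_row, a_row in zip(v, approx):
        for x, y in zip(v_row, a_row):
            if x > 0:
                total += x * math.log(x / (y + eps))
            total += y - x
    return total


def nmf_kl(v, n_topics, n_iter=200, seed=0, eps=1e-12):
    """Factor V (documents x terms) into W (documents x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    # Strictly positive random initialization (an assumption of this sketch).
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]

    def ratio_matrix():
        # Element-wise V / (WH), the quantity both update rules share.
        wh = matmul(w, h)
        return [[v[i][j] / (wh[i][j] + eps) for j in range(n_terms)]
                for i in range(n_docs)]

    for _ in range(n_iter):
        # H_kj <- H_kj * (sum_i W_ik * V_ij / (WH)_ij) / (sum_i W_ik)
        ratio = ratio_matrix()
        for k in range(n_topics):
            col_sum = sum(w[i][k] for i in range(n_docs)) + eps
            for j in range(n_terms):
                num = sum(w[i][k] * ratio[i][j] for i in range(n_docs))
                h[k][j] *= num / col_sum
        # W_ik <- W_ik * (sum_j H_kj * V_ij / (WH)_ij) / (sum_j H_kj)
        ratio = ratio_matrix()
        for k in range(n_topics):
            row_sum = sum(h[k]) + eps
            for i in range(n_docs):
                num = sum(h[k][j] * ratio[i][j] for j in range(n_terms))
                w[i][k] *= num / row_sum
    return w, h


# Toy document-term count matrix: 4 documents, 6 terms (hypothetical data).
counts = [
    [3, 2, 4, 0, 0, 0],
    [2, 3, 3, 0, 0, 0],
    [0, 0, 0, 4, 2, 3],
    [0, 0, 1, 3, 3, 2],
]

doc_topic, topic_term = nmf_kl(counts, n_topics=2)
# Assign each document to the topic with the largest weight in its row of W.
assignments = [row.index(max(row)) for row in doc_topic]
print("topic per document:", assignments)
print("KL error:", kl_divergence(counts, matmul(doc_topic, topic_term)))
```

In this sketch the rows of W give each document's topic weights and the rows of H give each topic's term weights, so the partition into topics is read off W. The pure-Python loops are kept only for self-containment; for large collections a vectorized or sparse implementation would be needed.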