Business analysis and intelligence powered by Big Data Topic Modeling with Mahout
Topic modeling is a form of text mining, a way of identifying patterns in a corpus: the corpus is run through a tool that groups words across the corpus into ‘topics’. Topic modeling for big data offers data-driven businesses a key opportunity to deliver genuine value to business users by simplifying search and summarization across vast amounts of information. The approach captures the evolution of topics in a sequentially organized corpus of documents in two main phases, a mapping phase and a reducing phase. In the mapping phase, the probability of each word in the collected documents is calculated using the collapsed space of latent variables and parameters, in order to summarize the words in each topic. The reducing phase then uses the results of the mapping phase to predict a new topic model from the given trained models.
Keywords: Big Data, Topic modeling, Mahout, Business Intelligence
Citation: *, (2018), Business analysis and intelligence powered by Big Data Topic Modeling with Mahout. Scientific Transactions in Environment and Technovation Journal (STET), 12(1): 12-16
Received: 04/20/2017; Accepted: 07/02/2018