November 1, 2011

DH-TRIE frequent pattern mining on Hadoop using JPA

Abstract

FP-growth is a well-known frequent pattern mining algorithm for high-dimensional, large-scale data sets. It is also known for its high memory consumption, which results from its recursive processing. In general, FP-growth cannot handle a large-scale data set unless the whole data set is divided into small blocks. Based on Hadoop, an open cloud computing model, a distributed DH-TRIE frequent pattern algorithm using JPA is proposed, which solves three problems (globalization, random write, and duration). Comparisons with the Mahout project show that the algorithm has good flexibility and scalability. Applied to the virtualization platform Vega Cloud, the algorithm will be usable in a wide range of situations.
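
The abstract does not describe the DH-TRIE structure itself, so the following is only a minimal single-machine sketch of the general idea behind trie-based frequent pattern mining: transactions are filtered to frequent items, sorted by descending support, and inserted into a counted prefix trie so that shared prefixes are stored once. The names `TrieNode`, `build_prefix_trie`, and `path_count` are hypothetical, and the sketch involves neither Hadoop nor JPA; it is not the paper's algorithm.

```python
from collections import Counter


class TrieNode:
    """One node of a counted prefix trie (hypothetical helper, not DH-TRIE)."""

    def __init__(self):
        self.count = 0      # number of transactions passing through this node
        self.children = {}  # item -> TrieNode


def build_prefix_trie(transactions, min_support):
    """Build a counted prefix trie from the frequent items of each transaction."""
    # First pass: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    root = TrieNode()
    # Second pass: insert each transaction with its items in a fixed global order
    # (descending support, ties broken by item name) so prefixes are shared.
    for t in transactions:
        items = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-support[item], item),
        )
        node = root
        for item in items:
            node = node.children.setdefault(item, TrieNode())
            node.count += 1
    return root, support


def path_count(root, path):
    """Return the count stored at the end of `path`, or 0 if the path is absent."""
    node = root
    for item in path:
        node = node.children.get(item)
        if node is None:
            return 0
    return node.count


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
root, support = build_prefix_trie(transactions, min_support=2)
```

In this toy input the item `d` occurs only once and is dropped, while the prefix `a` is shared by three transactions and is stored as a single node with a count of 3. A structure like this must fit in memory on one machine, which is the limitation that motivates splitting the data set into blocks or distributing the work.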