What does this research mean for the field?

The CPAR (Classification based on Predictive Association Rules) algorithm combines associative and traditional rule-based classification techniques to achieve high accuracy while avoiding high processing overhead and overfitting.

May 1, 2003
Open Access

CPAR: Classification based on Predictive Association Rules

Abstract

Recent studies in data mining have proposed a new classification approach, called associative classification, which, according to several reports, such as [7, 6], achieves higher classification accuracy than traditional classification approaches such as C4.5. However, the approach also suffers from two major deficiencies: (1) it generates a very large number of association rules, which leads to high processing overhead; and (2) its confidence-based rule evaluation measure may lead to overfitting. In comparison with associative classification, traditional rule-based classifiers, such as C4.5, FOIL and RIPPER, are substantially faster but their accuracy, in most cases, may not be as high. In this paper, we propose a new classification approach, CPAR (Classification based on Predictive Association Rules), which combines the advantages of both associative classification and traditional rule-based classification. Instead of generating a large number of candidate rules as in associative classification, CPAR adopts a greedy algorithm to generate rules directly from training data. Moreover, CPAR generates and tests more rules than traditional rule-based classifiers to avoid missing important rules. To avoid overfitting, CPAR uses expected accuracy to evaluate each rule and uses the best k rules in prediction.
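
The last step of the abstract, scoring rules by expected accuracy and predicting with the best k rules, can be illustrated with a minimal Python sketch. The abstract does not define the expected-accuracy measure, so this sketch assumes a Laplace estimate, (n_correct + 1) / (n_covered + n_classes). The rule representation as a tuple, the function names laplace_accuracy and predict_best_k, and the default k=5 are hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def laplace_accuracy(n_correct, n_covered, n_classes):
    """Laplace-smoothed accuracy estimate for one rule (assumed measure).

    n_correct: training examples the rule covers and labels correctly
    n_covered: training examples the rule covers at all
    n_classes: number of class labels in the dataset
    """
    return (n_correct + 1) / (n_covered + n_classes)


def predict_best_k(rules, example, n_classes, k=5):
    """Predict a label by averaging the k best matching rules per class.

    rules: iterable of (body, label, n_correct, n_covered), where body is a
           frozenset of attribute=value strings (hypothetical representation)
    example: frozenset of attribute=value strings describing one instance
    Returns the winning label, or None if no rule matches the example.
    """
    scores = defaultdict(list)
    for body, label, n_correct, n_covered in rules:
        if body <= example:  # the rule fires only if all its conditions hold
            scores[label].append(
                laplace_accuracy(n_correct, n_covered, n_classes)
            )
    if not scores:
        return None
    # Average the top-k expected accuracies within each class.
    averages = {}
    for label, values in scores.items():
        top = sorted(values, reverse=True)[:k]
        averages[label] = sum(top) / len(top)
    return max(averages, key=averages.get)
```

Averaging several rules per class, rather than trusting the single highest-scoring rule, is what lets a class supported by one consistently good rule outvote a class whose best rule is strong but whose other matching rules are weak.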
