August 26, 2001

A streaming ensemble algorithm (SEA) for large-scale classification

W. Nick Street (University of Iowa), YongSeog Kim (Utah State University)


Abstract

Ensemble methods have recently garnered a great deal of attention in the machine learning community. Techniques such as Boosting and Bagging have proven to be highly effective but require repeated resampling of the training data, making them inappropriate in a data mining context. The methods presented in this paper take advantage of plentiful data, building separate classifiers on sequential chunks of training points. These classifiers are combined into a fixed-size ensemble using a heuristic replacement strategy. The result is a fast algorithm for large-scale or streaming data that classifies as well as a single decision tree built on all the data, requires approximately constant memory, and adjusts quickly to concept drift.
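The chunk-and-replace idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Stump`, `StreamingEnsemble`, `make_chunk`, and `max_size` are hypothetical. A one-feature threshold rule stands in for the decision trees, and the replacement heuristic is an assumed simplification: a classifier trained on the previous chunk is scored on the newest chunk and replaces the weakest ensemble member only if it scores strictly higher. The paper's own quality measure may differ.

```python
from collections import Counter


class Stump:
    """Hypothetical base learner: a one-feature threshold rule (stand-in for a tree)."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                left = [lab for row, lab in zip(X, y) if row[j] <= t]
                right = [lab for row, lab in zip(X, y) if row[j] > t]
                left_label = Counter(left).most_common(1)[0][0]
                right_label = Counter(right).most_common(1)[0][0] if right else left_label
                errors = sum(lab != left_label for lab in left)
                errors += sum(lab != right_label for lab in right)
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label, right_label)
        _, self.feature, self.threshold, self.left_label, self.right_label = best
        return self

    def predict_one(self, row):
        return self.left_label if row[self.feature] <= self.threshold else self.right_label


def accuracy(model, X, y):
    """Fraction of rows in the chunk that the model labels correctly."""
    return sum(model.predict_one(row) == lab for row, lab in zip(X, y)) / len(y)


class StreamingEnsemble:
    """Fixed-size ensemble updated one chunk at a time (assumed heuristic, see lead-in)."""

    def __init__(self, max_size=3):
        self.max_size = max_size
        self.members = []
        self.pending = None  # classifier trained on the previous chunk

    def update(self, X, y):
        candidate = self.pending
        if candidate is not None:
            if len(self.members) < self.max_size:
                self.members.append(candidate)
            else:
                # Score every member on the newest chunk and find the weakest one.
                scores = [accuracy(m, X, y) for m in self.members]
                worst = min(range(len(scores)), key=scores.__getitem__)
                if accuracy(candidate, X, y) > scores[worst]:
                    self.members[worst] = candidate
        # Train on the new chunk; it becomes a candidate when the next chunk arrives.
        self.pending = Stump().fit(X, y)

    def predict_one(self, row):
        voters = self.members or [self.pending]
        votes = Counter(m.predict_one(row) for m in voters)
        return votes.most_common(1)[0][0]


def make_chunk(flipped):
    """Toy chunk: label is 1 when x > 0.45, or the opposite after a concept flip."""
    X = [[i / 10] for i in range(10)]
    y = [int((row[0] > 0.45) != flipped) for row in X]
    return X, y


ensemble = StreamingEnsemble(max_size=3)
for _ in range(4):                      # original concept
    ensemble.update(*make_chunk(flipped=False))
before_drift = ensemble.predict_one([0.9])
for _ in range(4):                      # concept drift: labels are flipped
    ensemble.update(*make_chunk(flipped=True))
after_drift = ensemble.predict_one([0.9])
```

Memory stays bounded because at most `max_size` members plus one pending classifier are retained, and each chunk is discarded after one training pass and one scoring pass. In this toy stream the stale members are replaced one chunk at a time, so the majority vote follows the new concept a few chunks after the flip.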
