December 1, 2009

Evaluation of Approaches for Dimensionality Reduction Applied with Naive Bayes Anti-Spam Filters

Abstract

Several approaches can automatically detect e-mail spam messages, and the best-known ones are based on Bayesian decision theory. However, most of these approaches share the same difficulty: the high dimensionality of the feature space. Many term selection methods have been proposed in the literature. Nevertheless, it is still unclear how the performance of naive Bayes anti-spam filters depends on the method applied to reduce the dimensionality of the feature space. In this paper, we compare the performance of the most popular term selection techniques (document frequency, information gain, mutual information, χ² statistic, and odds ratio) when used to reduce the dimensionality of the term space for four well-known versions of the naive Bayes spam filter.
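
To make the idea of term selection concrete, the following is a minimal Python sketch of three of the scores named in the abstract (document frequency, information gain, and the χ² statistic), computed for binary term presence against a spam/ham label. It uses common textbook definitions, which may differ in detail from the formulations evaluated in the paper. The corpus, the function names, and the `select_terms` helper are all hypothetical and exist only for illustration; mutual information and odds ratio are omitted for brevity.

```python
import math

# Toy corpus invented for illustration: (set of terms, label) pairs.
DOCS = [
    ({"free", "money", "now"}, "spam"),
    ({"free", "offer"}, "spam"),
    ({"meeting", "now"}, "ham"),
    ({"project", "meeting"}, "ham"),
]


def contingency(term, docs):
    """Return (n11, n10, n01, n00): spam/ham counts with and without the term."""
    n11 = sum(1 for terms, y in docs if term in terms and y == "spam")
    n10 = sum(1 for terms, y in docs if term in terms and y == "ham")
    n01 = sum(1 for terms, y in docs if term not in terms and y == "spam")
    n00 = sum(1 for terms, y in docs if term not in terms and y == "ham")
    return n11, n10, n01, n00


def document_frequency(term, docs):
    """Number of documents that contain the term, regardless of class."""
    n11, n10, _, _ = contingency(term, docs)
    return n11 + n10


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(term, docs):
    """Class entropy minus the class entropy conditioned on term presence."""
    n11, n10, n01, n00 = contingency(term, docs)
    n = n11 + n10 + n01 + n00
    gain = entropy([n11 + n01, n10 + n00])
    for spam, ham in ((n11, n10), (n01, n00)):
        if spam + ham:  # skip an empty partition
            gain -= (spam + ham) / n * entropy([spam, ham])
    return gain


def chi_square(term, docs):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    n11, n10, n01, n00 = contingency(term, docs)
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / denom if denom else 0.0


def select_terms(docs, score, k):
    """Keep the k highest-scoring terms; ties keep alphabetical order."""
    vocab = sorted(set().union(*(terms for terms, _ in docs)))
    return sorted(vocab, key=lambda t: score(t, docs), reverse=True)[:k]
```

On this toy corpus, `select_terms(DOCS, information_gain, 2)` keeps "free" and "meeting", the two terms that separate the classes perfectly, while "now" scores zero because it appears equally often in both classes. A filter would then be trained only on the retained terms.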

Cite This Study

Almeida et al. studied this question.

synapsesocial.com/papers/6a1c1a6400ee29383e9d74ed https://doi.org/10.1109/icmla.2009.22
