Key points are not available for this paper at this time.
Standard SVM training has O(m3) time and O(m2) space complexities, where m is the training set size. It is thus computationally infeasible on very large data sets. By observing that practical SVM implementations only approximate the optimal solution by an iterative strategy, we scale up kernel methods by exploiting such "approximateness" in this paper. We first show that many kernel methods can be equivalently formulated as minimum enclosing ball (MEB) problems in computational geometry. Then, by adopting an efficient approximate MEB algorithm, we obtain provably approximately optimal solutions with the idea of core sets. Our proposed Core Vector Machine (CVM) algorithm can be used with nonlinear kernels and has a time complexity that is linear in m and a space complexity that is independent of m. Experiments on large toy and real-world data sets demonstrate that the CVM is as accurate as existing SVM implementations, but is much faster and can handle much larger data sets than existing scale-up methods. For example, CVM with the Gaussian kernel produces superior results on the KDDCUP-99 intrusion detection data, which has about five million training patterns, in only 1.4 seconds on a 3.2GHz Pentium-4 PC.
Building similarity graph...
Analyzing shared references across papers
Loading...
Ivor W. Tsang
Agency for Science, Technology and Research
James T. Kwok
Hong Kong University of Science and Technology
Pak-Ming Cheung
Tuen Mun Hospital
Journal of Machine Learning Research
Hong Kong University of Science and Technology
Building similarity graph...
Analyzing shared references across papers
Loading...
Tsang et al. (Thu,) studied this question.
synapsesocial.com/papers/6a11d0a7cc504890b2563b93 — DOI: https://doi.org/10.5555/1046920.1058114
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: