Using dedicated hardware to do machine learning typically ends up in disaster because of cost, obsolescence, and poor software. The popularization of graphics processing units (GPUs), which are now available on every PC, provides an attractive alternative. We propose a generic 2-layer fully connected neural network GPU implementation which yields over 3× speedup for both training and testing with respect to a 3 GHz P4 CPU.
Steinkraus et al. studied this question.
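
For readers unfamiliar with the model named in the abstract, the following is a minimal CPU sketch of a 2-layer fully connected network with one training step. It is not the authors' GPU implementation. The tanh hidden activation, the linear output layer, the squared-error loss, plain stochastic gradient descent, and all function names (`matvec`, `forward`, `train_step`, `init_layer`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, b1, W2, b2, x):
    """Forward pass: tanh hidden layer, then a linear output layer (both assumed)."""
    h = [math.tanh(a + b) for a, b in zip(matvec(W1, x), b1)]
    y = [a + b for a, b in zip(matvec(W2, h), b2)]
    return h, y


def train_step(W1, b1, W2, b2, x, target, lr=0.1):
    """One in-place SGD step on the assumed loss 0.5 * ||y - target||^2.

    Returns the loss measured before the update.
    """
    h, y = forward(W1, b1, W2, b2, x)
    dy = [yi - ti for yi, ti in zip(y, target)]  # gradient w.r.t. the output

    # Backpropagate through W2 before W2 is modified.
    dh = [
        (1.0 - h[j] ** 2) * sum(dy[k] * W2[k][j] for k in range(len(dy)))
        for j in range(len(h))
    ]

    # Update the output layer.
    for k in range(len(dy)):
        for j in range(len(h)):
            W2[k][j] -= lr * dy[k] * h[j]
        b2[k] -= lr * dy[k]

    # Update the hidden layer.
    for j in range(len(dh)):
        for i in range(len(x)):
            W1[j][i] -= lr * dh[j] * x[i]
        b1[j] -= lr * dh[j]

    return 0.5 * sum(d * d for d in dy)


def init_layer(n_out, n_in, rng):
    """Small random weights for a layer with n_in inputs and n_out outputs."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
```

Calling `train_step` repeatedly on the same `(x, target)` pair should drive the returned loss down. Both passes consist mostly of matrix-vector products and element-wise operations, which are data-parallel computations of the kind a GPU can execute across many elements at once.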