Deep neural networks (DNNs) have recently achieved excellent results on acoustic modeling benchmarks for speech recognition. Dropout, a strategy that randomly discards network units, improves the performance of DNNs by reducing over-fitting. However, random dropout treats all units indiscriminately and may therefore lose information about the distribution of unit outputs. In this paper, we improve the dropout strategy by treating units differently according to their outputs. The method requires only minor changes to an existing neural network system, yet it yields a significant improvement. Phone recognition experiments on TIMIT show that fine-tuning with sparse dropout gives a significant performance improvement.
Zheng et al. studied this question.
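
The contrast between random dropout and a dropout that treats units according to their outputs can be illustrated with a minimal NumPy sketch. The abstract does not specify the selection rule used by sparse dropout, so `output_dependent_dropout`, its `threshold` parameter, and the choice to drop only low-magnitude units are hypothetical assumptions made for illustration, not the authors' method. `standard_dropout` is the usual inverted-dropout formulation.

```python
import numpy as np


def standard_dropout(h, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop,
    then rescale the survivors by 1 / (1 - p_drop)."""
    keep = rng.random(h.shape) >= p_drop
    return h * keep / (1.0 - p_drop)


def output_dependent_dropout(h, p_drop, threshold, rng):
    """Hypothetical variant (an assumption, not the paper's rule):
    only units whose absolute output is below `threshold` may be dropped,
    each with probability p_drop. Other units pass through unchanged.
    No rescaling is applied in this sketch."""
    candidates = np.abs(h) < threshold
    drop = candidates & (rng.random(h.shape) < p_drop)
    return np.where(drop, 0.0, h)


# Example hidden-layer outputs (made-up values for illustration).
rng = np.random.default_rng(0)
h = np.array([0.05, -2.0, 0.3, 1.5, -0.01])

print(standard_dropout(h, p_drop=0.5, rng=rng))
print(output_dependent_dropout(h, p_drop=0.5, threshold=0.5, rng=rng))
```

In this sketch the random version may discard any unit, including those with large outputs, whereas the output-dependent version leaves units at or above the threshold untouched. How the actual sparse dropout method selects units should be taken from the full paper.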