Los puntos clave no están disponibles para este artículo en este momento.
The exponential rise of daily emails raises concerns about spam, which can be intrusive and harmful to user data. Effective email classification is crucial to address this issue. This study proposes a system using the DistilBERT model to identify spam and non-spam (ham) emails. We leverage distributed training with Hugging Face's Accelerate library to significantly reduce training time. Compared to a non-distributed approach, this method achieves a 46.39% reduction in training time while maintaining 96% accuracy. We recommend exploring multi-GPU training in future work for further efficiency gains.
Padilla et al. (Fri,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: