January 1, 2023 | Open Access

MIL-Decoding: Detoxifying Language Models at Token-Level via Multiple Instance Learning

Abstract

Despite advances in large pre-trained neural language models, they remain prone to generating toxic language, which poses security risks for their applications. We introduce MIL-Decoding, which detoxifies a language model at the token level by interpolating it with a trained multiple instance learning (MIL) network. The MIL model is trained on a corpus that carries one toxicity label per text, and it learns to predict both the overall toxicity of a text and the toxicity of each token in its context. Intuitively, the MIL network computes a toxicity distribution over next tokens given the generated context, and this distribution supplements the original language model so that it avoids toxic continuations. We evaluate MIL-Decoding with automatic metrics and human evaluation; it outperforms other baselines in detoxification while reducing generation fluency only slightly.
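The abstract does not state the interpolation formula, so the following is only a minimal sketch of the general idea of combining a language model's next-token distribution with per-token toxicity scores. The function name `detoxify_next_token_probs`, the strength parameter `alpha`, the log-linear penalty, and the toy vocabulary are all assumptions made for illustration; they are not taken from the paper, and the toxicity scores here are hand-written stand-ins for the output of a trained MIL network.

```python
import math


def detoxify_next_token_probs(lm_probs, toxicity, alpha=1.0):
    """Reweight next-token probabilities using per-token toxicity scores.

    lm_probs: dict mapping token -> probability from the language model.
    toxicity: dict mapping token -> toxicity score in [0, 1]
              (a stand-in for what a MIL network would predict).
    alpha:    hypothetical strength knob; 0 leaves the distribution unchanged.
    """
    # Assumed combination rule: subtract a toxicity penalty from each
    # token's log-probability. The paper may use a different rule.
    scores = {
        tok: math.log(p) - alpha * toxicity.get(tok, 0.0)
        for tok, p in lm_probs.items()
        if p > 0.0
    }
    # Renormalise with a numerically stable softmax.
    max_score = max(scores.values())
    exp_scores = {tok: math.exp(s - max_score) for tok, s in scores.items()}
    total = sum(exp_scores.values())
    return {tok: v / total for tok, v in exp_scores.items()}


# Toy example with made-up numbers.
lm_probs = {"kind": 0.5, "rude": 0.3, "vile": 0.2}
toxicity = {"kind": 0.0, "rude": 0.6, "vile": 0.9}
adjusted = detoxify_next_token_probs(lm_probs, toxicity, alpha=5.0)
```

In this sketch, a larger `alpha` shifts more probability mass away from tokens scored as toxic, which mirrors the trade-off the abstract describes between detoxification and fluency.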

Cite This Study

Zhang et al. (2023) studied this question.

synapsesocial.com/papers/6a10a5ef10ed65f1d0fd2474
https://doi.org/10.18653/v1/2023.acl-long.11
