Previous work has shown that human evaluations in NLP are notoriously under-powered.
Howcroft et al. studied this question.