January 1, 2021 | Open Access

What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think

Previous work has shown that human evaluations in NLP are notoriously under-powered.

Howcroft et al. studied this question.
