Los puntos clave no están disponibles para este artículo en este momento.
We describe an approach to crowdsource the evaluation of TTS systems by preference tests and report on lessons learnt from running 127 real-life crowdsourced tests. We show that at least one type of cheating becomes more prevalent over time if left unchecked and develop metrics to exclude cheaters. We demonstrate that their exclusion improves test outcomes. Index Terms: TTS, speech synthesis, listening test, preference test, crowdsourcing, cheating
Buchholz et al. (Sat,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: