Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is much left to be done. We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs. In this crowdsourced, adversarially-created, 96k-question benchmark, a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than what was necessary for prior datasets. We apply state-of-the-art methods from both the reading comprehension and semantic parsing literature on this dataset and show that the best systems achieve 32.7% F1 on our generalized accuracy metric, while expert human performance is 96.0%. We additionally present a new model that combines reading comprehension methods with simple numerical reasoning to achieve 47.0% F1.
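To make the kind of discrete operation concrete, the following is a minimal sketch of addition, counting, and sorting applied to numbers mentioned in a passage. It is an illustration only and not the model or pipeline from the paper: the passage is a made-up toy example, and `extract_numbers` is a hypothetical helper that simply pulls numerals out of the text with a regular expression.

```python
import re


def extract_numbers(passage):
    """Hypothetical helper: return every integer or decimal in the passage, in order."""
    return [float(tok) for tok in re.findall(r"\d+(?:\.\d+)?", passage)]


# Toy passage written for this sketch; it is not taken from the dataset.
passage = (
    "The Bears kicked a 22-yard field goal, then a 45-yard field goal, "
    "and later a 31-yard field goal."
)

yards = extract_numbers(passage)

# "How many total yards of field goals were kicked?" -> addition
total_yards = sum(yards)

# "How many field goals were kicked?" -> counting
num_goals = len(yards)

# "Which field goals were longest?" -> sorting, longest first
longest_first = sorted(yards, reverse=True)

print(total_yards, num_goals, longest_first)
```

A real system also has to decide which mentions in the passage a question refers to. The sketch skips that step by treating every number in the passage as relevant.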
Dua et al. studied this question.