Recently, large-scale pretrained language models have demonstrated impressive performance on several commonsense-reasoning benchmark datasets. However, building machines with the commonsense needed to compose realistically plausible sentences remains challenging. In this paper, we present COMMONGEN, a constrained text generation task with an associated benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning. Given a set of common concepts (e.g., dog, frisbee, catch, throw), the task is to generate a coherent sentence describing an everyday scenario using these concepts (e.g., "a man throws a frisbee and his dog catches it").
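To make the constraint concrete, the following is a minimal sketch of a concept-coverage check on the example above. The function name `missing_concepts` and the prefix-matching rule are illustrative assumptions, not the paper's evaluation procedure; prefix matching is only a crude stand-in for handling inflected forms such as "throws" or "catches".

```python
import re


def missing_concepts(concepts, sentence):
    """Return the concepts that no word in the sentence appears to use.

    Hypothetical helper: a concept counts as covered when some lowercase
    word in the sentence starts with it. This is a rough approximation of
    matching inflected forms, not a real lemmatizer.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [
        concept
        for concept in concepts
        if not any(token.startswith(concept.lower()) for token in tokens)
    ]


concepts = ["dog", "frisbee", "catch", "throw"]
sentence = "a man throws a frisbee and his dog catches it"
print(missing_concepts(concepts, sentence))  # []
```

A check like this only verifies that the concepts are mentioned. It says nothing about whether the sentence is coherent or describes a plausible everyday scenario, which is the harder part of the task.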
Lin et al. studied this question.