SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

Zellers, Rowan; Bisk, Yonatan; Schwartz, Roy; Choi, Yejin

arXiv:1808.05326 (cs)

[Submitted on 16 Aug 2018]

Abstract: Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning.
We present SWAG, a new dataset with 113k multiple choice questions about a rich spectrum of grounded situations. To address the recurring challenges of the annotation artifacts and human biases found in many existing datasets, we propose Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data. To account for the aggressive adversarial filtering, we use state-of-the-art language models to massively oversample a diverse set of potential counterfactuals. Empirical results demonstrate that while humans can solve the resulting inference problems with high accuracy (88%), various competitive models struggle on our task. We provide comprehensive analysis that indicates significant opportunities for future research.
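
The filtering loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract only states that classifiers are trained iteratively and used to filter the data. Everything else here is an assumption made for illustration, including the numeric feature vectors standing in for endings, the centroid-based scorer standing in for a stylistic classifier, the random half/half split, and the hypothetical names `train_scorer` and `adversarial_filter`. The abstract mentions an ensemble of classifiers; a single scorer is used below for brevity.

```python
import random


def train_scorer(real, fake):
    """Fit a toy linear scorer (an assumed stand-in for a stylistic classifier).

    The weight vector points from the centroid of the fake endings to the
    centroid of the real endings, so a higher score means "looks more real".
    """
    dim = len(real[0])
    mean_real = [sum(v[d] for v in real) / len(real) for d in range(dim)]
    mean_fake = [sum(v[d] for v in fake) / len(fake) for d in range(dim)]
    weights = [r - f for r, f in zip(mean_real, mean_fake)]
    midpoint = [(r + f) / 2 for r, f in zip(mean_real, mean_fake)]

    def score(v):
        return sum(w * (x - m) for w, x, m in zip(weights, v, midpoint))

    return score


def adversarial_filter(examples, k=2, rounds=10, seed=0):
    """Pick k hard negatives per example from an oversampled candidate pool.

    examples: list of dicts with keys
      "real": feature vector of the true ending
      "pool": list of feature vectors for generated candidate endings
    Returns, for each example, the pool indices of its k selected negatives.
    """
    rng = random.Random(seed)
    # Start from the first k candidates of every pool.
    chosen = [list(range(k)) for _ in examples]
    for _ in range(rounds):
        order = list(range(len(examples)))
        rng.shuffle(order)
        half = len(order) // 2
        train, test = order[:half], order[half:]
        # Train on one split, using the negatives currently assigned there.
        real = [examples[i]["real"] for i in train]
        fake = [examples[i]["pool"][j] for i in train for j in chosen[i]]
        score = train_scorer(real, fake)
        # On the held-out split, keep the k candidates that fool the scorer most.
        for i in test:
            pool = examples[i]["pool"]
            ranked = sorted(range(len(pool)), key=lambda j: score(pool[j]), reverse=True)
            chosen[i] = ranked[:k]
    return chosen


# Toy data: feature 0 is a stylistic artifact (0.0 for real endings),
# feature 1 is shared content that carries no signal.
examples = [
    {
        "real": [0.0, 0.1 * i],
        "pool": [[3.0, 0.1 * i], [2.5, 0.1 * i], [0.2, 0.1 * i], [0.1, 0.1 * i]],
    }
    for i in range(20)
]
selected = adversarial_filter(examples, k=2, rounds=10)
print(selected[:3])
```

In this toy setting the negatives that are easy to detect (a large value on the artifact feature) are swapped for candidates that sit close to the real endings, which is why the candidate pool needs to be oversampled before filtering.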

Comments:	EMNLP 2018
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1808.05326 [cs.CL]
	(or arXiv:1808.05326v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1808.05326

Submission history

From: Rowan Zellers
[v1] Thu, 16 Aug 2018 02:21:01 UTC (5,782 KB)
