The Natural Stories Corpus

Futrell, Richard; Gibson, Edward; Tily, Hal; Blank, Idan; Vishnevetsky, Anastasia; Piantadosi, Steven T.; Fedorenko, Evelina

arXiv:1708.05763 (cs)

[Submitted on 18 Aug 2017]

Abstract: It is now a common practice to compare models of human language processing by predicting participant reactions (such as reading times) to corpora consisting of rich naturalistic linguistic materials. However, many of the corpora used in these studies are based on naturalistic text and thus do not contain many of the low-frequency syntactic constructions that are often required to distinguish processing theories. Here we describe a new corpus consisting of English texts edited to contain many low-frequency syntactic constructions while still sounding fluent to native speakers. The corpus is annotated with hand-corrected parse trees and includes self-paced reading time data. Here we give an overview of the content of the corpus and release the data.

Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1708.05763 [cs.CL]
(or arXiv:1708.05763v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.1708.05763

Submission history

From: Richard Futrell
[v1] Fri, 18 Aug 2017 21:27:34 UTC (393 KB)
