Mining knowledge for natural language inference from Wikipedia categories

M Chen, Z Chu, K Stratos, K Gimpel - arXiv preprint arXiv:2010.01239, 2020 - arxiv.org

Accurate lexical entailment (LE) and natural language inference (NLI) often require large quantities of costly annotations. To alleviate the need for labeled data, we introduce WikiNLI: a resource for improving model performance on NLI and LE tasks. It contains 428,899 pairs of phrases constructed from naturally annotated category hierarchies in Wikipedia. We show that we can improve strong baselines such as BERT and RoBERTa by pretraining them on WikiNLI and transferring the models to downstream tasks. We conduct systematic comparisons with phrases extracted from other knowledge bases such as WordNet and Wikidata and find that pretraining on WikiNLI gives the best performance. In addition, we construct WikiNLI in other languages and show that pretraining on these versions improves performance on NLI tasks in the corresponding languages.
