Why rankings of biomedical image analysis competitions should be interpreted with care

Maier-Hein, Lena; Eisenmann, Matthias; Reinke, Annika; Onogur, Sinan; Stankovic, Marko; Scholz, Patrick; Arbel, Tal; Bogunovic, Hrvoje; Bradley, Andrew P.; Carass, Aaron; Feldmann, Carolin; Frangi, Alejandro F.; Full, Peter M.; van Ginneken, Bram; Hanbury, Allan; Honauer, Katrin; Kozubek, Michal; Landman, Bennett A.; März, Keno; Maier, Oskar; Maier-Hein, Klaus; Menze, Bjoern H.; Müller, Henning; Neher, Peter F.; Niessen, Wiro; Rajpoot, Nasir; Sharp, Gregory C.; Sirinukunwattana, Korsuk; Speidel, Stefanie; Stock, Christian; Stoyanov, Danail; Taha, Abdel Aziz; van der Sommen, Fons; Wang, Ching-Wei; Weber, Marc-André; Zheng, Guoyan; Jannin, Pierre; Kopp-Schneider, Annette

doi:10.1038/s41467-018-07619-7

arXiv:1806.02051 (cs)

[Submitted on 6 Jun 2018 (v1), last revised 18 Sep 2019 (this version, v2)]

Abstract: International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results are often hampered, as only a fraction of the relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables, such as the test data used for validation, the ranking scheme applied, and the observers who make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.

Comments:	Article published in Nature Communications
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1806.02051 [cs.CV]
	(or arXiv:1806.02051v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1806.02051
Journal reference:	Nature Communications 9.1 (2018): 5217
Related DOI:	https://doi.org/10.1038/s41467-018-07619-7

Submission history

From: Lena Maier-Hein
[v1] Wed, 6 Jun 2018 08:13:27 UTC (1,206 KB)
[v2] Wed, 18 Sep 2019 11:32:07 UTC (1,206 KB)
