DOI: 10.1093/jamia/ocad159

Evaluation of crowdsourced mortality prediction models as a framework for assessing artificial intelligence in medicine

Timothy Bergquist, Thomas Schaffter, Yao Yan, Thomas Yu, Justin Prosser, Jifan Gao, Guanhua Chen, Łukasz Charzewski, Zofia Nawalany, Ivan Brugere, Renata Retkute, Alidivinas Prusokas, Augustinas Prusokas, Yonghwa Choi, Sanghoon Lee, Junseok Choe, Inggeol Lee, Sunkyu Kim, Jaewoo Kang, Sean D Mooney, Justin Guinney, Aaron Lee, Ali Salehzadeh-Yazdi, Alidivinas Prusokas, Anand Basu, Anas Belouali, Ann-Kristin Becker, Ariel Israel, Augustinas Prusokas, B Winter, Carlos Vega Moreno, Christoph Kurz, Dagmar Waltemath, Darius Schweinoch, Enrico Glaab, Gang Luo, Guanhua Chen, Helena U Zacharias, Hezhe Qiao, Inggeol Lee, Ivan Brugere, Jaewoo Kang, Jifan Gao, Julia Truthmann, JunSeok Choe, Kari A Stephens, Lars Kaderali, Lav R Varshney, Marcus Vollmer, Maria-Theodora Pandi, Martin L Gunn, Meliha Yetisgen, Neetika Nath, Noah Hammarlund, Oliver Müller-Stricker, Panagiotis Togias, Patrick J Heagerty, Peter Muir, Peter Banda, Renata Retkute, Ron Henkel, Sagar Madgi, Samir Gupta, Sanghoon Lee, Sean Mooney, Shabeeb Kannattikuni, Shamim Sarhadi, Shikhar Omar, Shuo Wang, Soumyabrata Ghosh, Stefan Neumann, Stefan Simm, Subha Madhavan, Sunkyu Kim, Thomas Von Yu, Venkata Satagopam, Vikas Pejaver, Yachee Gupta, Yonghwa Choi, Zofia Nawalany, Łukasz Charzewski
  • Health Informatics

Abstract

Objective

Applications of machine learning in healthcare are of high interest and have the potential to improve patient care. Yet, the real-world accuracy of these models in clinical practice and on different patient subpopulations remains unclear. To address these important questions, we hosted a community challenge to evaluate methods that predict healthcare outcomes. We focused on the prediction of all-cause mortality as the community challenge question.

Materials and methods

Using a Model-to-Data framework, 345 registered participants from 10 countries across 3 continents formed 25 independent teams and generated 25 accurate models. All models were trained on a dataset of over 1.1 million patients and evaluated on patients prospectively collected over a 1-year observation period at a large health system.

Results

The top-performing team achieved a final area under the receiver operating characteristic curve of 0.947 (95% CI, 0.942-0.951) and an area under the precision-recall curve of 0.487 (95% CI, 0.458-0.499) on a prospectively collected patient cohort.
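For readers less familiar with these two metrics, the following is a minimal Python sketch of how an area under the receiver operating characteristic curve, an area under the precision-recall curve, and a 95% confidence interval could be computed from predicted risks and observed outcomes. It is an illustration only: the abstract does not state how the challenge computed its intervals, so the percentile bootstrap used here is an assumption, the precision-recall area is approximated by average precision, and the names `auroc`, `average_precision`, `bootstrap_ci`, `toy_labels`, and `toy_scores` are hypothetical. The data are made up and unrelated to the challenge cohort.

```python
import random

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive.

    Tied scores are ranked in input order here, which is a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    true_pos, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos

def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (an assumed method, not the challenge's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return stats[lo_idx], stats[hi_idx]

# Made-up outcomes (1 = died) and predicted risks, for illustration only.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9]

print("AUROC:", auroc(toy_labels, toy_scores))
print("AUPRC (average precision):", average_precision(toy_labels, toy_scores))
print("95% CI for AUROC:", bootstrap_ci(auroc, toy_labels, toy_scores))
```

Because the precision-recall area depends on how rare the outcome is, a model can pair a high area under the receiver operating characteristic curve with a much lower precision-recall area when events are uncommon, which is one reason both figures are reported.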

Discussion

Post hoc analysis after the challenge revealed that models differ in accuracy on subpopulations, delineated by race or gender, even when they are trained on the same data.
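A subgroup comparison of this kind can be sketched as follows, assuming each patient record carries a predicted risk, an observed outcome, and a demographic attribute. This is a hedged illustration rather than the analysis performed in the challenge: the function names `auroc` and `auroc_by_group`, the group labels "A" and "B", and all values are hypothetical.

```python
from collections import defaultdict

def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately within each subgroup.

    Groups that contain only one outcome class get None, because
    AUROC is undefined for them.
    """
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    result = {}
    for g, (ys, ss) in buckets.items():
        result[g] = auroc(ys, ss) if len(set(ys)) == 2 else None
    return result

# Made-up data: the same model ranks patients perfectly in group A
# but makes one ranking error in group B.
toy_labels = [0, 1, 0, 1, 0, 1, 0, 1]
toy_scores = [0.2, 0.8, 0.3, 0.9, 0.6, 0.5, 0.2, 0.7]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

for group, value in sorted(auroc_by_group(toy_labels, toy_scores, toy_groups).items()):
    print(group, value)
```

Reporting the metric per subgroup, rather than only on the pooled cohort, is what exposes gaps like the one between the two toy groups here.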

Conclusion

This is the largest community challenge focused on the evaluation of state-of-the-art machine learning methods in a healthcare system performed to date, revealing both opportunities and pitfalls of clinical AI.
