Research output: Contribution to journal › Article › peer-review
Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge. / Pati, Sarthak; Linardos, Akis; Edwards, Brandon et al.
In: Nature Communications, Vol. 16, No. 1, 6274, 08.07.2025.
TY - JOUR
T1 - Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge
AU - Pati, Sarthak
AU - Linardos, Akis
AU - Edwards, Brandon
AU - Sheller, Micah
AU - Foley, Patrick
AU - Aristizabal, Alejandro
AU - Zimmerer, David
AU - Gruzdev, Alexey
AU - Martin, Jason
AU - Shinohara, Russell T.
AU - Reinke, Annika
AU - Isensee, Fabian
AU - Parampottupadam, Santhosh
AU - Parekh, Kaushal
AU - Floca, Ralf
AU - Kassem, Hasan
AU - Baheti, Bhakti
AU - Thakur, Siddhesh
AU - Kushibar, Kaisar
AU - Lekadir, Karim
AU - Jiang, Meirui
AU - Yin, Youtan
AU - Yang, Hongzheng
AU - Liu, Quande
AU - Chen, Cheng
AU - Dou, Qi
AU - Heng, Pheng Ann
AU - Zhang, Xiaofan
AU - Zhang, Shaoting
AU - Khan, Muhammad Irfan
AU - Azeem, Mohammad Ayyaz
AU - Jafaritadi, Mojtaba
AU - Alhoniemi, Esa
AU - Kontio, Elina
AU - Khan, Suleiman A.
AU - Mächler, Leon
AU - Ezhov, Ivan
AU - Kofler, Florian
AU - Shit, Suprosanna
AU - Paetzold, Johannes C.
AU - Loehr, Timo
AU - Wiestler, Benedikt
AU - Peiris, Himashi
AU - Pawar, Kamlesh
AU - Zhong, Shenjun
AU - Chen, Zhaolin
AU - Hayat, Munawar
AU - Egan, Gary
AU - Harandi, Mehrtash
AU - Isik Polat, Ece
AU - Polat, Gorkem
AU - Kocyigit, Altan
AU - Temizel, Alptekin
AU - Tuladhar, Anup
AU - Tyagi, Lakshay
AU - Souza, Raissa
AU - Forkert, Nils D.
AU - Mouches, Pauline
AU - Wilms, Matthias
AU - Shambhat, Vishruth
AU - Maurya, Akansh
AU - Danannavar, Shubham Subhas
AU - Kalla, Rohit
AU - Anand, Vikas Kumar
AU - Krishnamurthi, Ganapathy
AU - Nalawade, Sahil
AU - Ganesh, Chandan
AU - Wagner, Ben
AU - Reddy, Divya
AU - Das, Yudhajit
AU - Yu, Fang F.
AU - Fei, Baowei
AU - Madhuranthakam, Ananth J.
AU - Maldjian, Joseph
AU - Singh, Gaurav
AU - Ren, Jianxun
AU - Zhang, Wei
AU - An, Ning
AU - Hu, Qingyu
AU - Zhang, Youjia
AU - Zhou, Ying
AU - Siomos, Vasilis
AU - Tarroni, Giacomo
AU - Passerrat-Palmbach, Jonathan
AU - Rawat, Ambrish
AU - Zizzo, Giulio
AU - Kadhe, Swanand Ravindra
AU - Epperlein, Jonathan P.
AU - Braghin, Stefano
AU - Tuchinov, Bair
A2 - Maier-Hein, Klaus
A2 - Bakas, Spyridon
N1 - We would like to thank Manuel Wiesenfarth and Paul F. Jäger (DKFZ) for helpful discussions. Research reported in this publication was partly funded by the Helmholtz Association (HA) within the project “Trustworthy Federated Data Analytics” (TFDA) (funding number ZT-I-OO1 4), and partly by the National Institutes of Health (NIH), under award numbers NCI:U01CA242871 (PI: S. Bakas) and NCI:U24CA279629 (PI: S. Bakas). K. Kushibar holds the Juan de la Cierva fellowship with reference number FJC2021-047659-I. This work was supported in part by Hong Kong Research Grants Council Project No. T45-401/22-N. Team HT-TUAS was partly funded by Business Finland under Grant 33961/31/2020. They also acknowledge the CSC-Puhti super-computer for its support and computational resources during FeTS 2021 and 2022. N. D. Forkert was supported by the Canadian Institutes of Health Research (CIHR Project Grant 462169). Jakub Nalepa was supported by the Silesian University of Technology funds through the Excellence Initiative–Research University program (Grant 02/080/SDU/10-21-01), and by the Silesian University of Technology funds through the grant for maintaining and developing research potential. Research reported in this publication was partly funded by R21EB030209, NIH/NIBIB (PI: Y. Yuan), UL1TR001433, NIH/NCATS, and a research grant from Varian Medical Systems (Palo Alto, CA, USA) (PI: Y. Yuan). Y. Yuan also acknowledges the generous support of Herbert and Florence Irving/the Irving Trust. Z. Jiang was supported by National Cancer Institute (UG3 CA236536). H. Mohy-ud-Din was supported by a grant from the Higher Education Commission of Pakistan as part of the National Center for Big Data and Cloud Computing and the Clinical and Translational Imaging Lab at LUMS. M. Kozubek was supported by the Ministry of Health of the Czech Republic (grant NU21-08-00359 and conceptual development of research organization FNBr-65269705) and Ministry of Education, Youth and Sports of the Czech Republic (Project LM2023050). Václav Vybíhal was supported by MH CZ - DRO (FNBr, 65269705). Y. Gusev was supported by CCSG Grant number: NCI P30 CA51008. P. Vollmuth was supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Project-ID 404521405, SFB 1389 - UNITE Glioblastoma, Work Package C02, and Priority Programme 2177 “Radiomics: Next Generation of Biomedical Imaging” (KI 2410/1-1 ∣ MA 6340/18-1). B. Landman was supported by NSF 2040462. A. Rao was supported by the NIH (R37CA214955-01A1). A. Falcão was supported by CNPq 304711/2023-3. P. Guevara was supported by the ANID-Basal projects AFB240002 (AC3E) and FB210017 (CENIA). Research reported in this publication was partly funded by the NSF Convergence Accelerator - Track D: ImagiQ: Asynchronous and Decentralized Federated Learning for Medical Imaging, Grant Number: 2040532, and R21CA270742 (Period of Funding: 09/15/20 - 05/31/21). Martin Vallières acknowledges funding from the Canada CIFAR AI Chairs Program. Stuart Currie receives salary support from Leeds Hospitals Charity (9R01/1403) and Cancer Research UK (C19942/A28832) grants. Kavi Fatania is a 4ward North Clinical PhD fellow funded by a Wellcome award (203914/Z/16/Z). Russell Frood is a Clinical Trials Fellow funded by CRUK (RCCCTF-Oct22/100002). This work was funded in part by National Institutes of Health R01CA233888 and the grant NCI:U24CA248265. The content of this publication is solely the responsibility of the authors and does not represent the official views of the HA or the NIH.
U. Baid, S. Pati, and S. Bakas conducted part of the work reported in this manuscript at their current affiliations, as well as while they were affiliated with the Center for Artificial Intelligence and Data Science for Integrated Diagnostics (AI2D) and the Center for Biomedical Image Computing and Analytics (CBICA) at the University of Pennsylvania, Philadelphia, PA, USA.
PY - 2025/7/8
Y1 - 2025/7/8
AB - Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this end, the Federated Tumor Segmentation (FeTS) Challenge represents the paradigm for real-world algorithmic performance evaluation. The FeTS challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms, across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, yielding benefits for adaptive weight aggregation, and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, albeit the worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.
UR - https://www.mendeley.com/catalogue/46270516-fb74-3b8b-a755-186bd89f038e/
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105010224653&origin=inward
U2 - 10.1038/s41467-025-60466-1
DO - 10.1038/s41467-025-60466-1
M3 - Article
VL - 16
JO - Nature Communications
JF - Nature Communications
SN - 2041-1723
IS - 1
M1 - 6274
ER -
ID: 68460939