Potential of Artificial Intelligence for Bone Age Assessment in Iranian Children and Adolescents: An Exploratory Study

Mehrzad Lotfi; Nahid Abolpour; Mohammadreza Ghasemi; Hajar Heydari; Reza Pourghayumi

doi:10.34172/aim.32070

Arch Iran Med. 2025;28(4): 198-206.
PMID: 40382691
PMCID: PMC12085795

Original Article

Mehrzad Lotfi ¹,², Nahid Abolpour ², Mohammadreza Ghasemi ³, Hajar Heydari ¹, Reza Pourghayumi ¹*

¹ Department of Radiology, Medical Imaging Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
² Department of Artificial Intelligence, Shiraz University of Medical Sciences, Shiraz, Iran
³ School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran

*Corresponding Author: Reza Pourghayumi, Email: rpurghayumi@gmail.com

Abstract

Background: Accurate assessment of skeletal maturity is crucial for diagnosing and treating various musculoskeletal disorders and is traditionally achieved through manual comparison with the Greulich-Pyle atlas. This process, however, is subjective and time-consuming. Recent advances in deep learning offer more efficient and consistent bone age (BA) evaluations. This study investigated whether the BA of Iranian children could be accurately assessed by an artificial intelligence (AI) system.

Methods: A total of 555 left-hand radiographs (220 boys and 335 girls) were collected from children aged 1–18 years who presented to a tertiary research hospital. The reference BA was determined via the Greulich and Pyle (GP) method by two radiologists in consensus. The BA was then estimated using a deep learning model specifically developed for this population. Model performance was evaluated using multiple metrics: mean squared error (MSE), mean absolute error (MAE), intra-class correlation coefficient (ICC), and 95% limits of agreement (LoA). Gender-specific results were analyzed separately.
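
The four agreement metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. The function name `agreement_metrics` and the example values are hypothetical. The abstract does not state which ICC form was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement). It also assumes the LoA are Bland-Altman limits, computed as the mean difference plus or minus 1.96 standard deviations.

```python
import numpy as np


def agreement_metrics(reference, predicted):
    """Agreement between reference and predicted bone ages, both in years."""
    ref = np.asarray(reference, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    diff = pred - ref  # positive values mean the model over-estimates

    mse = float(np.mean(diff ** 2))     # units: years squared
    mae = float(np.mean(np.abs(diff)))  # units: years

    # Bland-Altman 95% limits of agreement (assumed definition):
    # mean difference +/- 1.96 * sample SD of the differences.
    bias = float(np.mean(diff))
    sd = float(np.std(diff, ddof=1))
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)

    # ICC(2,1) from two-way ANOVA mean squares (assumed ICC form).
    ratings = np.column_stack([ref, pred])  # rows: children, columns: raters
    n, k = ratings.shape
    grand = ratings.mean()
    ss_rows = k * np.sum((ratings.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((ratings.mean(axis=0) - grand) ** 2)
    ss_total = np.sum((ratings - grand) ** 2)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    icc = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

    return {"mse": mse, "mae": mae, "bias": bias, "loa": loa, "icc": float(icc)}


# Illustrative values only; these are not data from the study.
gp_reference = [3.0, 6.5, 9.0, 12.5, 15.0]
model_output = [3.4, 6.1, 9.6, 12.0, 15.8]
print(agreement_metrics(gp_reference, model_output))
```

Running the function separately on the boys' and girls' subsets would yield sex-specific values of the kind reported in the Results.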

Results: The model demonstrated acceptable accuracy. For boys, MSE was 0.55 years², MAE was 0.59 years, ICC was 0.74, and the 95% LoA ranged from -0.8 to 1.2 years. For girls, MSE was 0.59 years², MAE was 0.61 years, ICC was 0.82, and the 95% LoA ranged from -0.6 to 1.0 years. The higher ICC and narrower LoA indicate stronger agreement for girls than for boys, although MSE and MAE were slightly lower for boys.

Conclusion: Our findings demonstrate that the proposed deep learning model achieves reasonable accuracy in BA assessment, with stronger performance in girls than in boys. However, the relatively wide 95% LoA, particularly for boys, and prediction errors at the extremes of the age range highlight the need for further refinement and validation. While the model shows potential as a supplementary tool for clinicians, future studies should focus on improving prediction accuracy, reducing variability, and validating the model on larger, more diverse datasets before considering widespread clinical implementation. Additionally, addressing edge cases and specific conditions that a human reviewer may detect but the model might overlook will be essential for enhancing its clinical reliability.

Keywords: Artificial intelligence, Bone age, Deep learning, Neural network

Cite this article as: Lotfi M, Abolpour N, Ghasemi M, Heydari H, Pourghayumi R. Potential of artificial intelligence for bone age assessment in Iranian children and adolescents: an exploratory study. Arch Iran Med. 2025;28(4):198-206. doi: 10.34172/aim.32070