Abstract
Accurate differentiation between skull fractures and sutures is challenging in young children. Traditional diagnostic modalities like computed tomography involve ionizing radiation, while sonography is safer but demands expertise. This study explores the application of artificial intelligence (AI) to improve diagnostic accuracy in this context. A retrospective study utilizing sonographic images of 86 children (mean age: 8.5 months) presenting with suspected skull fractures was performed. The AI approach included binary classification and object localization, with tenfold cross-validation applied to 385 images. The study compared AI performance against nine raters with varying expertise, with and without AI assistance. EfficientNet demonstrated superior classification metrics, with the B6 variant achieving the highest F1 score (0.841) and PR AUC (0.913). YOLOv11 models underperformed compared to EfficientNet in detecting fractures and sutures. Raters significantly benefited from AI-assisted diagnostics, with F1 scores improving from 0.749 (unassisted) to 0.833 (assisted). AI models consistently outperformed unassisted human raters. This study presents the first AI model differentiating skull fractures from sutures on pediatric sonographic images, highlighting AI’s potential to enhance diagnostic accuracy. Future efforts should focus on expanding datasets, validating AI models on independent cohorts, and exploring dynamic sonographic data to improve the diagnostic impact.
Introduction
Head trauma is one of the leading causes of morbidity and mortality in children, and skull fractures represent a significant proportion of injuries in younger age groups. The incidence of skull fractures in outpatient pediatric head trauma cases ranges from 2 to 20%, with infants being particularly vulnerable due to the relatively soft nature of their cranial bones1,2. In particular, more than half of the fractures diagnosed in infants are skull fractures, underscoring the critical need for accurate and timely diagnostic strategies to guide appropriate clinical interventions3,4.
Differentiating skull fractures from normal sutures in children poses a unique diagnostic challenge. Normal sutures, including their developmental variants, can mimic fractures in imaging, complicating the interpretation of diagnostic studies5. Current institutional protocols frequently rely on computed tomography (CT) as the gold standard for imaging in suspected skull fractures6. Although CT provides high-resolution imaging, it exposes young patients to ionizing radiation, raising concerns about cumulative radiation risks. Magnetic resonance imaging (MRI), an alternative modality, often requires sedation or even general anesthesia in this population, limiting its feasibility in urgent or outpatient settings7,8. Moreover, despite recent advances9, the accuracy of MRI in the detection of linear fractures in young children and fractures of aerated bone remains limited10.
Ultrasound has become a safe, accessible, and radiation-free diagnostic tool, particularly for bedside evaluations. Its application in differentiating cranial fractures from sutures has shown promise in pediatric trauma care11. However, accurate interpretation of sonographic images often necessitates substantial expertise in pediatric imaging, and lack of patient compliance may limit interpretability and prolong the examination. Variations in operator experience and the inherent complexity of sonographic image interpretation have hindered its broader adoption as a primary diagnostic modality12,13,14,15.
Artificial intelligence (AI) has demonstrated remarkable potential in enhancing diagnostic accuracy across various medical imaging applications. In adult trauma, AI models based on convolutional neural networks (CNNs) have shown superior sensitivity and specificity in fracture detection examining native radiographs or CT images16,17. Despite these advancements, pediatric applications of AI remain underexplored15. The anatomical variability of cranial sutures in children, combined with their developmental differences, presents additional complexities for AI-based diagnostic tools18,19,20.
To date, no AI model has been developed to differentiate skull fractures from sutures using sonographic images in pediatric patients. Given the unique challenges of this task, there is a critical need to explore AI’s capabilities in this context. An effective AI-based solution could significantly improve the diagnostic process, reducing reliance on CT and its associated risks, while improving diagnostic accessibility in resource-limited settings.
This study hypothesizes that AI models can reliably differentiate skull fractures from normal sutures in pediatric sonographic images and that these models will perform similarly to, or better than, human experts. The aims of this study are to evaluate the diagnostic accuracy of AI models in this context, compare their performance to human raters, and assess the potential of AI-assisted diagnostics to improve clinical decision-making in pediatric head trauma cases.
Methods
In this two-institutional retrospective study, ultrasound examinations of 213 children (94 female, 119 male) with clinical suspicion of acute traumatic brain injury and a request for a skull ultrasound evaluation were screened for eligibility. Center A was the Department of Pediatric and Adolescent Surgery of the Medical University of Graz, Austria, and center B was the Department of Pediatrics of the Hospital Hochsteiermark, Leoben, Austria. For neurologically unremarkable children with clinical suspicion of a skull fracture (such as swelling or hematoma), the institutional protocols include ultrasound and, in case of a fracture, inpatient admission for 48 h. A cranial CT is performed only if patients develop neurological signs suspicious for intracranial injury. These protocols are based on reports demonstrating high accuracy of ultrasound for the diagnosis of skull fractures in children21,22. Linear probes were used, and all calvarial sectors, with a special focus on the clinically suspicious areas, were examined in at least two imaging planes.
After the exclusion of improper studies, insufficient ultrasound documentation, and poor image quality (mainly motion-related blurring due to lack of patient compliance), 86 patients (mean age 8.5 ± 8.3 (SD) months) were included in the further analysis. All of these images showed a disruption of the tabula externa. All images and reports were retrospectively assessed by a pediatric radiologist with 12 years of experience, and on this basis the ground truth regarding the absence or presence of a fracture was defined. 50 of the 86 patients suffered a skull fracture (mean age 10.3 ± 10.8 months), while the remaining 36 patients demonstrated normal sutures (mean age 8.3 ± 7.2 months). Three of the 50 patients with a skull fracture developed neurological signs and received a cranial CT, which confirmed the skull fractures.
385 individual images showing a disruption of the tabula externa were available for training, validation, and testing. Due to the limited total number of images and the pilot nature of this publication, we used k-fold cross-validation (k = 10). Test results are reported on the held-out data of each fold; no separate test dataset was used. A flowchart depicting the study dataset is shown in Fig. 1.
Flowchart depicting the study dataset and the splits used during training, validation, and testing of the EfficientNet and YOLOv11 neural networks used in this study. The limited number of images available for this pilot study was compensated by tenfold cross-validation. Center A = Graz, Center B = Leoben.
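The tenfold cross-validation split described above can be sketched as follows. This is a generic illustration, not the study's actual code: the fracture/suture label counts are synthetic stand-ins chosen only to total 385 images, and the stratification logic is a standard pattern for keeping the class ratio similar across folds.

```python
# Sketch of a stratified tenfold cross-validation split (illustrative only).
import random

def stratified_kfold(labels, k=10, seed=42):
    """Partition sample indices into k folds, keeping the
    fracture/suture ratio roughly constant in every fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal shuffled indices round-robin so each fold gets an
        # almost equal share of every class.
        for i, idx in enumerate(members):
            folds[i % k].append(idx)
    return folds

# Hypothetical class counts, chosen only to total the 385 study images.
labels = ["fracture"] * 220 + ["suture"] * 165
folds = stratified_kfold(labels, k=10)
```

In each round, one fold serves as the held-out test set while the remaining nine form the training pool, matching the per-fold reporting described above.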
Image acquisition
Ultrasound examinations were conducted utilizing sonography equipment from a range of vendors. Ultrasound machines in use were Siemens Juniper (10 MHz linear probe), Siemens S2000 and S3000 (9 MHz linear probe and 18 MHz linear probe), GE Logiq E9 and GE Logiq E10 (15 MHz linear probes).
Cranial fractures were assessed in point-of-care studies employing high-frequency linear probes, with an abundant application of ultrasound gel. The resultant images were archived as Digital Imaging and Communications in Medicine (DICOM) files within the Picture Archiving and Communication Systems (PACS) of the participating institutions.
Image processing
DICOM studies were retrieved from the PACS, and the original DICOM images were converted to 8-bit RGB Portable Network Graphics (PNG) files and saved to disk for subsequent analysis. Ultrasound images were anonymized by removing digital embossings from the images. Throughout these steps, the images retained their original height and width. During model training, the processed images were dynamically re-scaled to the required input sizes.
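The rescaling from raw pixel data to 8-bit RGB can be sketched as below. The sketch assumes the DICOM pixel array has already been read (e.g., with pydicom, omitted here), and the linear min-max rescaling is an illustrative choice, not necessarily the exact transfer function used in the study.

```python
# Minimal sketch of converting a raw grayscale pixel array to an
# 8-bit RGB array suitable for saving as PNG (illustrative assumption:
# simple linear min-max rescaling).
import numpy as np

def to_8bit_rgb(pixels: np.ndarray) -> np.ndarray:
    """Linearly rescale arbitrary-range grayscale pixels to 0-255 and
    replicate the channel to RGB, preserving height and width."""
    pixels = pixels.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    scaled = np.zeros_like(pixels) if hi == lo else (pixels - lo) / (hi - lo) * 255.0
    gray = scaled.round().astype(np.uint8)
    return np.stack([gray] * 3, axis=-1)  # H x W x 3, as in an RGB PNG

# Example: a synthetic 12-bit ultrasound frame.
img = to_8bit_rgb(np.random.randint(0, 4096, size=(480, 640)))
```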
Task 1: classification using EfficientNet
For the binary classification task, the EfficientNet neural network23 was selected due to its proven performance in various medical imaging applications24,25,26 and our own prior experience with the architecture. To ensure the robustness of the experiment, all variants of EfficientNet (B0 to B7) were trained and tested, reducing the potential bias associated with image input resolution.
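The experiment grid over the eight variants can be sketched as follows. The input resolutions are the canonical values from the original EfficientNet publication; `plan_experiments` is a hypothetical helper illustrating the grid, not part of the study's FastAI pipeline.

```python
# Sketch of the experiment grid over EfficientNet variants B0-B7.
# Canonical input resolutions from the original EfficientNet paper;
# training each variant at its native resolution reduces the bias a
# single fixed input size would introduce into the comparison.
INPUT_SIZE = {
    "b0": 224, "b1": 240, "b2": 260, "b3": 300,
    "b4": 380, "b5": 456, "b6": 528, "b7": 600,
}

def plan_experiments(variants=INPUT_SIZE):
    """Return (variant, input_size) pairs, one training run each."""
    return [(name, size) for name, size in sorted(variants.items())]

runs = plan_experiments()
```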
Task 2: localization using YOLOv11
We used the Ultralytics YOLOv11 library version 8.3.5127 and all of its standard model variants "n", "s", "m", "l", and "x" to assess the detection performance for fractures and sutures within the images. Bounding boxes around the objects of interest served as ground truth, with object labels provided as text files for each image. The images were resized to 640 by 640 pixels before training.
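The per-image text-file labels mentioned above follow the standard YOLO convention, which can be sketched as below. The class-index assignment (0 = fracture, 1 = suture) is an assumed example, not taken from the study.

```python
# Sketch of the standard YOLO label format: one text file per image,
# one line per object, "cls x_center y_center width height" with all
# box values normalized to [0, 1] relative to the image size.
def parse_yolo_line(line: str):
    """Parse one label line into (cls, xc, yc, w, h)."""
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = map(float, parts[1:])
    return cls, xc, yc, w, h

def to_pixels(xc, yc, w, h, img_size=640):
    """Convert a normalized box to pixel corners (x1, y1, x2, y2),
    here for the 640 x 640 training resolution."""
    x1 = (xc - w / 2) * img_size
    y1 = (yc - h / 2) * img_size
    return x1, y1, x1 + w * img_size, y1 + h * img_size

# Assumed convention for illustration: class 0 = fracture, 1 = suture.
cls, *box = parse_yolo_line("0 0.5 0.25 0.2 0.1")
corners = to_pixels(*box)
```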
Task 3: human test set
A randomized selection comprising fifty percent of the dataset was enhanced with AI classification predictions distinguishing between sutures and fractures. The prediction result was burned into the image pixels in the lower left corner of the selected samples, serving as a hint to human raters. To facilitate a comparison of the AI model with human raters, the total dataset was independently assessed by a cohort of 9 raters. Raters 1 to 6 were pediatric surgeons: rater 1 (BM) with 8 years, rater 2 (EF) with 4 years, rater 3 (HE) with 9 years, rater 4 (HT) with 33 years, rater 5 (TP) with 25 years, and rater 6 (UW) with 12 years of experience in interpreting trauma imaging. Raters 7 to 9 were pediatric radiologists: rater 7 (ST) with 12 years, rater 8 (MS) with 2 years, and rater 9 (NS) with 4 years of experience in pediatric trauma imaging. Since performing ultrasound examinations is part of pediatric surgical training in our country, the initial sonographic evaluations in our department are performed by pediatric surgeons rather than exclusively by pediatric radiologists. Therefore, the pediatric surgeons are also experienced in interpreting skull sonographies.
The image data was provided on a private Linux server instance running the CVAT Labeling Platform v2.19.128 for binary annotation (fracture, suture). The generated labels were then linked to the ground truth, and the same test metrics (see below) were applied as for the AI neural networks.
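The burn-in of the AI hint into the image pixels can be sketched as below. For simplicity, this illustration paints a solid color-coded patch in the lower-left corner instead of rendering the prediction as text (as the study did); the color scheme is an assumption for the example.

```python
# Sketch of burning an AI prediction hint into the image pixels
# (illustrative: a solid color patch instead of rendered text).
import numpy as np

def burn_in_hint(img: np.ndarray, prediction: str, size: int = 24) -> np.ndarray:
    """Overwrite a size x size patch in the lower-left corner with a color
    encoding the prediction (assumed scheme: red = fracture, green = suture)."""
    out = img.copy()
    color = (255, 0, 0) if prediction == "fracture" else (0, 255, 0)
    out[-size:, :size] = color
    return out

hinted = burn_in_hint(np.zeros((480, 640, 3), dtype=np.uint8), "fracture")
```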
Model training
Model training was conducted on a Linux workstation equipped with two Nvidia RTX 4090 graphics cards, each with 24 GB of video memory. The system featured an Intel Core i7-13700K processor and 64 GB of RAM. The experiments were implemented using Python 3.10 on an Ubuntu 22.04 LTS platform. EfficientNet models were trained using the FastAI Python library (v2.7.15)29. For each fold, 20% of the training data was randomly split off as the validation set.
Test metrics
We calculated widely recognized and commonly utilized AI performance metrics based on the rate of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
In addition, precision-recall AUC (PR AUC) analyses including 95% confidence intervals (CI) were performed based on the ground truth values and the prediction probabilities. The scikit-learn package version 1.5.230 and matplotlib 3.9.231 were used to calculate PR AUC values and to plot the curves in Python 3.11.9. We used macro-averaging for all metrics to compensate for class imbalances32.
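Macro-averaging of the binary metrics can be sketched as follows. The label lists are synthetic examples; the pure-Python implementation mirrors what scikit-learn computes with `average='macro'`: per-class precision, recall, and F1 are derived from each class's TP/FP/FN counts and then averaged with equal weight, so the minority class contributes as much as the majority class.

```python
# Sketch of macro-averaged F1 from per-class confusion counts
# (synthetic labels; mirrors scikit-learn's average='macro').
def per_class_counts(y_true, y_pred, positive):
    """TP, FP, FN when `positive` is treated as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def macro_f1(y_true, y_pred, classes=("fracture", "suture")):
    """Average the per-class F1 scores with equal weight per class."""
    f1s = []
    for c in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Synthetic example: 6 fractures, 4 sutures; one fracture misclassified.
y_true = ["fracture"] * 6 + ["suture"] * 4
y_pred = ["fracture"] * 5 + ["suture"] * 5
score = macro_f1(y_true, y_pred)
```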
Statistical analyses
Statistical analyses were performed to compare the performance of the CNN models with that of nine human experts. The classification metrics described above were calculated for the AI model as well as for each expert. All data were checked for consistency and missing values to ensure the validity of the analyses. We used Wilcoxon signed-rank tests for paired samples to assess differences between the described groups, as normal distributions could not be ensured. A level of p < 0.05 was considered statistically significant. Statistical analyses were performed with SPSS version 27.0.1.0 statistics software (IBM, Armonk, NY, USA).
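The paired non-parametric comparison can be sketched with SciPy as below (the study used SPSS). The F1 values are made-up illustrations, not the study's rater scores.

```python
# Sketch of a paired Wilcoxon comparison of unassisted vs. AI-assisted
# rater scores (illustrative values, not the study data).
from scipy.stats import wilcoxon

unassisted = [0.60, 0.70, 0.72, 0.75, 0.76, 0.78, 0.80, 0.81, 0.82]
assisted   = [0.66, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92]

# Paired, non-parametric test on the per-rater differences.
stat, p = wilcoxon(unassisted, assisted)
significant = p < 0.05
```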
Ethical statement
The ethics committee of the Medical University of Graz (IRB00002556) gave an affirmative vote for the retrospective data analysis (Ref. No. 36-024 ex 23/24). Informed consent from the patient or legal representative was waived for the retrospective data analysis. All experiments were carried out according to local legal regulations and the Declaration of Helsinki.
Results
The classification between fractures and cranial sutures using EfficientNet yielded a precision of 0.849 ± 0.083, recall of 0.756 ± 0.027, an accuracy of 0.770 ± 0.048, and an F1 score of 0.796 ± 0.051 averaged across the eight variants (B0 to B7).
Task 1: EfficientNet variant metrics
EfficientNet-B6 demonstrated the highest area under the curve in correctly classifying between cranial sutures and fractures, with a PR AUC of 0.913 (95% CI 0.870–0.948), while B0 performed worst with a PR AUC of 0.696 (95% CI 0.615–0.771). The highest F1 score was achieved by the EfficientNet-B6 variant with 0.841, the lowest by EfficientNet-B0 with 0.668. The performance metrics achieved by EfficientNet are shown in Table 1 and Fig. 2.
Task 2: YOLOv11 metrics
The five YOLOv11 variants demonstrated lower performance characteristics in direct comparison to image classification. Precision was 0.720 ± 0.081, Recall 0.602 ± 0.057, mAP@50% was 0.642 ± 0.049, and mAP@50–95% was 0.233 ± 0.021. The detection results are graphically displayed in Fig. 3.
Task 3: human and AI-assisted performance
With the exception of one rater (rater 2, a pediatric surgeon with 4 years of experience), the performance metrics of human raters increased when aided by AI. All of these differences were statistically significant (Table 2). Without AI support, none of the human raters (F1 scores ranging between 0.604 and 0.818) exceeded the best-performing EfficientNet-B6 variant with an F1 score of 0.841. The average F1 score of all EfficientNet variants (0.796 ± 0.051) was higher than the average human rating without AI assistance (0.749 ± 0.064). In contrast, AI decision support helped seven out of nine human raters achieve F1 scores (ranging from 0.853 to 0.919) exceeding that of the best-performing EfficientNet variant B6 (0.841).
Discussion
This study aimed to determine the ability of AI to distinguish between cranial sutures and skull fractures on pediatric ultrasound examinations. A dataset of retrospectively collected ultrasound images from two trauma centers was used. The images were manually annotated before training the AI models using tenfold cross-validation. The prediction results exhibited variability across different models and input dimensions. The accuracy of human raters in detecting the presence or absence of skull fractures improved significantly when assisted by AI support.
The novelty of our study lies in the systematic processing of a dataset of varied pediatric skull ultrasound images to correctly predict whether cranial sutures or fractures are present in the examination. Some related approaches have been published, for example the detection of skull fractures in X-ray examinations using AI33. Unlike previous studies, we used standard 2D ultrasound in conjunction with AI for the first time to assist physicians in detecting pediatric skull fractures with sonography.
For the detection of skull fractures in adults, the support of AI algorithms has demonstrated enhanced sensitivity and specificity, often outperforming traditional diagnostic methods when trained on large, well-curated datasets19. AI models, particularly those employing segmentation approaches such as U-Net, have been effective in improving the accuracy of medical imaging interpretation. A recent study on binary classification systems for skull fracture detection using CT scans has shown that AI-assisted radiology led to faster diagnostic times and improved accuracy compared to traditional methods11.
However, pediatric skull fractures present additional complexities due to the anatomical variability of sutures in infants. The ability of AI to improve diagnostic accuracy in this area seems especially challenging and promising. Lu et al. demonstrated that AI could reduce diagnostic errors and assist in differentiating fractures from normal anatomical variations, such as growth sutures in infants, in a CT dataset11. Nonetheless, standard imaging modalities for detecting skull fractures, such as CT and MRI, either expose children to radiation or require anesthesia. In contrast, sonography offers a safer and readily available alternative that avoids both radiation exposure and the need for anesthesia34. In addition, it can be used at the bedside as point-of-care ultrasound (POCUS)35. Developing AI models to differentiate between skull fractures and normal sutures using sonography could provide a non-invasive, accurate solution for pediatric trauma diagnosis.
Our study introduces the first AI/ML algorithm specifically trained on sonographic images to differentiate between skull fractures and normal sutures in infants and young children (examples depicted in Fig. 4). This proof of concept addresses a critical gap in AI applications tailored to sonographic imaging in pediatric patients. The use of AI in this context could significantly enhance diagnostic accuracy, reduce the radiation exposure associated with CT scans, and lead to faster and more appropriate clinical interventions. Despite its promise, this study has several limitations. First, the dataset used for training the algorithm was limited in both size and diversity, which may impact the model’s generalizability across different pediatric populations. AI models are known to be data dependent, and expanding the dataset to include a wider range of age groups, ethnicities, and clinical settings will be crucial to improve its performance36. Another limitation of our study is that the number of included samples does not currently allow for a sufficient subgroup analysis regarding the type, depth, or exact location of the fractures.
It should also be noted that this work is a proof of concept and the model has not yet undergone external validation on independent datasets. Such validation reduces the risk of a model inheriting a dataset-specific bias, i.e., achieving high performance on one dataset that does not translate to others due to differences in feature distribution11. To reduce the risk of dataset bias, and given the limited amount of retrospective data, this study was designed as a two-center study. Nevertheless, the study highlights the promising performance of the model and establishes a foundation for further research, extended data collection, and preparation for future multicenter studies. As with other AI applications in radiology, the scalability of this AI solution is critical to ensure that it can account for anatomical and developmental variations between age groups, potentially improving its robustness and generalizability11.
In the event of further validation and academic acceptance, AI-driven decision support based on sonographic imaging could reduce the dependence on highly specialized radiologists. This would enable real-time diagnostic assistance in emergency settings, improving diagnostic consistency, particularly in under-resourced or remote areas where pediatric expertise is scarce36. With further refinement, this algorithm could become a valuable tool in pediatric radiology, complementing human expertise and improving patient outcomes. Nevertheless, personnel capable of producing high-quality ultrasound images that can be processed by the AI will still be needed in the future.
The most prominent distinction between image classification and object detection was the markedly inferior performance of the YOLOv11 models, which was not necessarily expected. This phenomenon can be attributed to the critical role of accompanying ultrasound findings, such as hematomas and soft tissue swelling, in the precise differentiation between fractures and sutures. The YOLOv11 algorithm uses bounding boxes for supervised computer vision training, which hinders the incorporation of these essential contextual cues, because the bounding boxes primarily enclose the fracture or suture itself. Consequently, these additional findings are scarcely considered during object detection, diminishing the algorithm’s performance metrics.
The AI algorithms and human evaluators only had access to a mixed pool of individual retrospective skull ultrasound images. This is particularly challenging because sonography is a dynamic diagnostic method that also depends on the handling of the ultrasound transducer; precision can potentially be improved by panning, tilting, and rotating the probe. This was not possible in the current study. The recognition rate of both AI and evaluators could be expected to increase with the use of video clips; however, a final clarification of this hypothesis requires further studies.
In conclusion, this study demonstrates for the first time the feasibility of AI in accurately differentiating pediatric skull fractures from normal sutures using sonographic images. AI models outperformed unassisted human raters and significantly improved clinician performance when used as a diagnostic aid. The average positive predictive value (precision) increased from 0.747 to 0.820 in assisted ratings, and sensitivity (recall) from 0.766 to 0.855. Although the object detection model variants tested were less effective, the findings underscore the potential of AI-assisted sonography to reduce the reliance on CT scans, minimize radiation exposure, and improve point-of-care diagnostics. Despite promising results, further validation on larger, diverse datasets and the inclusion of dynamic sonographic data are essential to enhance AI model generalizability and clinical impact. These advancements could revolutionize pediatric trauma diagnostics, especially in under-resourced settings.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
McGrath, A. & Taylor, R. S. in StatPearls (StatPearls Publishing Copyright © 2025, StatPearls Publishing LLC., 2025).
Rodà, D. et al. Epidemiology of fractures in children younger than 12 months. Pediatr. Emerg. Care 35, 256–260. https://doi.org/10.1097/pec.0000000000001157 (2019).
Wegmann, H. et al. The epidemiology of fractures in infants–Which accidents are preventable?. Injury 47, 188–191. https://doi.org/10.1016/j.injury.2015.08.037 (2016).
Deng, H. et al. Epidemiology of skeletal trauma and skull fractures in children younger than 1 year in Shenzhen: A retrospective study of 664 patients. BMC Musculoskelet. Disord. 22, 593. https://doi.org/10.1186/s12891-021-04438-8 (2021).
Idriz, S., Patel, J. H., Ameli Renani, S., Allan, R. & Vlahos, I. CT of normal developmental and variant anatomy of the pediatric skull: Distinguishing trauma from normality. Radiographics 35, 1585–1601. https://doi.org/10.1148/rg.2015140177 (2015).
Zulfiqar, M., Kim, S., Lai, J.-P. & Zhou, Y. The role of computed tomography in following up pediatric skull fractures. Am. J. Surgery 214, 483–488. https://doi.org/10.1016/j.amjsurg.2016.07.020 (2017).
Dremmen, M. H. G. et al. Does the addition of a “Black Bone” sequence to a fast multisequence trauma mr protocol allow MRI to replace CT after traumatic brain injury in children?. Am. J. Neuroradiol. 38, 2187–2192. https://doi.org/10.3174/ajnr.A5405 (2017).
Kralik, S. F. et al. Black bone MRI with 3D reconstruction for the detection of skull fractures in children with suspected abusive head trauma. Neuroradiology 61, 81–87. https://doi.org/10.1007/s00234-018-2127-9 (2019).
Eshraghi, B. P. et al. Deep-learning synthesized pseudo-CT for MR high-resolution pediatric cranial bone imaging (MR-HiPCB). Magn. Reson. Med. 88, 2285–2297. https://doi.org/10.1002/mrm.29356 (2022).
Dremmen, M. H. G. et al. Does the addition of a “Black Bone” sequence to a fast multisequence trauma MR protocol allow MRI to replace CT after traumatic brain injury in children?. AJNR Am. J. Neuroradiol. 38, 2187–2192. https://doi.org/10.3174/ajnr.A5405 (2017).
Lu, C.-Y. et al. Artificial intelligence application in skull bone fracture with segmentation approach. J. Imaging Inform. Med. https://doi.org/10.1007/s10278-024-01156-0 (2024).
Parri, N. et al. Point-of-care ultrasound for the diagnosis of skull fractures in children younger than two years of age. J. Pediatr. 196, 230-236.e232. https://doi.org/10.1016/j.jpeds.2017.12.057 (2018).
Alexandridis, G., Verschuuren, E. W., Rosendaal, A. V. & Kanhai, D. A. Evidence base for point-of-care ultrasound (POCUS) for diagnosis of skull fractures in children: A systematic review and meta-analysis. Emerg. Med. J. 39, 30–36. https://doi.org/10.1136/emermed-2020-209887 (2022).
Dehbozorgi, A. et al. Diagnosing skull fracture in children with closed head injury using point-of-care ultrasound vs. computed tomography scan. Eur. J. Pediatr. 180, 477–484. https://doi.org/10.1007/s00431-020-03851-w (2021).
Till, T., Tschauner, S., Singer, G., Lichtenegger, K. & Till, H. Development and optimization of AI algorithms for wrist fracture detection in children using a freely available dataset. Front. Pediatr. 11, 1291804. https://doi.org/10.3389/fped.2023.1291804 (2023).
Lu, C. Y. et al. Artificial intelligence application in skull bone fracture with segmentation approach. J. Imaging Inform. Med. 38, 31–46. https://doi.org/10.1007/s10278-024-01156-0 (2025).
Ramadanov, N. et al. Artificial intelligence-guided distal radius fracture detection on plain radiographs in comparison with human raters. J. Orthop. Surg. Res. 20, 468. https://doi.org/10.1186/s13018-025-05888-9 (2025).
Shelmerdine, S. C., White, R. D., Liu, H., Arthurs, O. J. & Sebire, N. J. Artificial intelligence for radiological paediatric fracture assessment: A systematic review. Insights Imaging 13, 94. https://doi.org/10.1186/s13244-022-01234-3 (2022).
Azad, Z., Aiman, U. & Shaheen, S. Artificial intelligence in paediatric head trauma: Enhancing diagnostic accuracy for skull fractures and brain haemorrhages. Neurosurg. Rev. 47, 641. https://doi.org/10.1007/s10143-024-02897-w (2024).
Kutbi, M. Artificial intelligence-based applications for bone fracture detection using medical images: A systematic review. Diagnostics 14, 1879 (2024).
Parri, N. et al. Ability of emergency ultrasonography to detect pediatric skull fractures: A prospective, observational study. J. Emerg Med. 44, 135–141. https://doi.org/10.1016/j.jemermed.2012.02.038 (2013).
Rabiner, J. E., Friedman, L. M., Khine, H., Avner, J. R. & Tsung, J. W. Accuracy of point-of-care ultrasound for diagnosis of skull fractures in children. Pediatrics 131, e1757-1764. https://doi.org/10.1542/peds.2012-3921 (2013).
Tan, M. & Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. (2019).
Raza, R. et al. Lung-EffNet: Lung cancer classification using EfficientNet from CT-scan images. Eng. Appl. Artif. Intell. 126, 106902. https://doi.org/10.1016/j.engappai.2023.106902 (2023).
Ali, K., Shaikh, Z. A., Khan, A. A. & Laghari, A. A. Multiclass skin cancer classification using EfficientNets—A first step towards preventing skin cancer. Neurosci. Inform. 2, 100034. https://doi.org/10.1016/j.neuri.2021.100034 (2022).
Latha, M. et al. Revolutionizing breast ultrasound diagnostics with EfficientNet-B7 and Explainable AI. BMC Med. Imaging 24, 230. https://doi.org/10.1186/s12880-024-01404-3 (2024).
Ultralytics YOLO (Version 8.0.0) [Computer software] (2023).
opencv/cvat: v1.1.0 (v1.1.0). Zenodo. (2020).
Howard, J. & Gugger, S. Fastai: A layered API for deep learning. Information 11, 108 (2020).
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Hunter, J. D. Matplotlib: A 2D Graphics environment. Comput. Sci. Eng. 9, 90–95. https://doi.org/10.1109/MCSE.2007.55 (2007).
Krawczyk, B. Learning from imbalanced data: Open challenges and future directions. Prog. Artif. Intell. 5, 221–232. https://doi.org/10.1007/s13748-016-0094-0 (2016).
Choi, J. Y. et al. Accuracy of bedside ultrasound for the diagnosis of skull fractures in children aged 0 to 4 years. Pediatr. Emerg. Care 36 (2020).
Parri, N. et al. Ability of emergency ultrasonography to detect pediatric skull fractures: A prospective, observational study. J. Emerg. Med. 44, 135–141. https://doi.org/10.1016/j.jemermed.2012.02.038 (2013).
Choi, J. W. et al. Deep learning-assisted diagnosis of pediatric skull fractures on plain radiographs. Korean J. Radiol. 23, 343–354 (2022).
Ciet, P. et al. The unintended consequences of artificial intelligence in paediatric radiology. Pediatr. Radiol. 54, 585–593. https://doi.org/10.1007/s00247-023-05746-y (2024).
Acknowledgements
We express our sincere gratitude to all colleagues who have supported and contributed to the completion of this paper, namely Hesham Elsayed, Elena Friehs, Barbara Mittl, Thomas Petnehazy, Mario Scherkl, and Claus-Uwe Weitzer for rating the sonographic images.
Funding
No financial or non-financial benefits have been received or will be received from any party related directly or indirectly to the subject of this article.
Author information
Contributions
Saskia Hankel: study conception and design, data collection, data analysis and interpretation, draft manuscript preparation. Holger Till: study conception and design, data analysis and interpretation, draft manuscript preparation. Gerolf Schweintzger: data collection, substantial revision of the manuscript. Christof Kraxner: data collection, substantial revision of the manuscript. Georg Singer: data collection, data analysis and interpretation, substantial revision of the manuscript. Nikolaus Stranger: data collection and analysis, substantial revision of the manuscript. Tristan Till: data analysis and interpretation, draft manuscript preparation. Sebastian Tschauner: study conception and design, data analysis and interpretation, draft manuscript preparation.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Hankel, S., Till, H., Schweintzger, G. et al. Artificial intelligence based sonographic differentiation between skull fractures and normal sutures in young children. Sci Rep 15, 37006 (2025). https://doi.org/10.1038/s41598-025-09994-w