Abstract
INTRODUCTION: Injuries are a significant concern in athletic populations, limiting performance, shortening careers, and negatively affecting health. There is growing evidence that intrinsic factors such as body composition and biomechanical characteristics are associated with injury risk, with metrics like fat and muscle distribution linked to injury likelihood across multiple sports. Moreover, recent research has demonstrated that predictive models incorporating physiological and biomechanical variables can effectively estimate injury risk, supporting the feasibility of data-driven injury risk prediction in sport science. PURPOSE: The purpose of this study was to assess whether body composition and biomechanical measurements can be used within a machine learning framework to predict short-term injury risk and provide individualized, body-region–specific injury risk estimates for athletes. METHODS: A retrospective dataset of 1,258 records from NCAA Division I athletes (males = 825; females = 433; height = 178.7 ± 10.37 cm; weight = 86.9 ± 23.32 kg) was analyzed, of which 246 cases involved an injury occurring within 180 days of assessment. Scans were collected between August 8, 2022, and December 9, 2025. Body composition (DXA) and biomechanical variables (DARI® Motion Analysis System; Y-Balance Test, YBT) were used as model inputs. A two-stage machine learning framework was implemented: a binary classifier to predict overall injury risk within 180 days, followed by a multiclass classifier to estimate body-region–specific injury risk among injured athletes using CatBoost (Categorical Boosting). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) for injury prediction and class-based performance metrics for body-region estimation, with a hold-out test set reserved for final evaluation.
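The control flow of the two-stage framework described in METHODS can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the study trained CatBoost models, whereas `toy_injury_model` and `toy_region_model` are hypothetical stand-in callables so the sketch runs without dependencies. The feature name `reach_asymmetry_cm`, the region labels, the coefficients, and the 0.5 threshold are all invented for the example, and gating the second stage on a probability threshold at inference time is itself an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]


def two_stage_predict(
    features: Features,
    injury_model: Callable[[Features], float],
    region_model: Callable[[Features], Dict[str, float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Tuple[float, Optional[List[Tuple[str, float]]]]:
    """Return (injury probability, ranked body regions or None)."""
    # Stage 1: binary estimate of injury within the prediction window.
    p_injury = injury_model(features)
    if p_injury < threshold:
        return p_injury, None
    # Stage 2: multiclass estimate, ranked from most to least likely region.
    region_probs = region_model(features)
    ranked = sorted(region_probs.items(), key=lambda kv: kv[1], reverse=True)
    return p_injury, ranked


def toy_injury_model(f: Features) -> float:
    # Hypothetical rule: risk rises with a made-up asymmetry feature.
    return min(1.0, max(0.0, 0.2 + 0.05 * f["reach_asymmetry_cm"]))


def toy_region_model(f: Features) -> Dict[str, float]:
    # Hypothetical fixed distribution over invented region labels.
    return {"ankle": 0.3, "knee": 0.5, "shoulder": 0.2}


high_risk = two_stage_predict(
    {"reach_asymmetry_cm": 8.0}, toy_injury_model, toy_region_model
)
low_risk = two_stage_predict(
    {"reach_asymmetry_cm": 2.0}, toy_injury_model, toy_region_model
)
```

In this sketch the high-asymmetry athlete passes the first stage and receives a ranked list of regions, while the low-asymmetry athlete receives only a probability. Returning a ranking rather than a single label matches the ranking-based evaluation reported in RESULTS.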
RESULTS: The injury risk prediction model achieved an AUC of 0.74, indicating good discrimination between injured and non-injured athletes. For injured athletes, the body-region prediction model achieved a top-1 accuracy of 50.0%, with performance improving to 62.5% and 77.1% when the true injury location was required to be within the top two and top three predicted regions, respectively. CONCLUSION: Machine learning models utilizing body composition and biomechanical data can reliably estimate short-term injury risk in athletes and provide body-region–specific risk profiles. Although precise prediction of a single injury location remains challenging, particularly in the presence of class imbalance and limited sample sizes, ranking-based body-region risk estimates show substantial promise for supporting injury prevention strategies and individualized athlete monitoring.
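The two kinds of metric reported above, AUC and top-k accuracy, can be computed as in the sketch below. It is illustrative only: the labels, scores, and region names are invented toy data, not study data, and the function names `roc_auc` and `top_k_accuracy` are hypothetical helpers written in plain Python rather than the tooling the authors used.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (injured, non-injured) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def top_k_accuracy(
    true_regions: Sequence[str],
    region_probs: Sequence[Dict[str, float]],
    k: int,
) -> float:
    """Share of cases whose true region is among the k highest-scored."""
    hits = 0
    for truth, probs in zip(true_regions, region_probs):
        top_k = sorted(probs, key=probs.get, reverse=True)[:k]
        hits += truth in top_k
    return hits / len(true_regions)


# Toy data (invented): 1 = injured within the window, 0 = not injured.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2]

toy_true = ["knee", "ankle", "shoulder", "knee"]
toy_probs = [
    {"knee": 0.6, "ankle": 0.3, "shoulder": 0.1},
    {"knee": 0.5, "ankle": 0.4, "shoulder": 0.1},
    {"knee": 0.5, "ankle": 0.3, "shoulder": 0.2},
    {"knee": 0.2, "ankle": 0.5, "shoulder": 0.3},
]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_top = [top_k_accuracy(toy_true, toy_probs, k) for k in (1, 2, 3)]
```

Top-k accuracy can only stay the same or rise as k grows, which is why the reported figures increase from top-1 to top-3. With few body regions, a high top-3 value is less informative than the same value would be over many classes.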
Recommended Citation
Strogalev, Nikita; Bernhardt, Vipa; Oldham, Michael; Kramarenko, Veronika; Jones, Brian; Banks, Bronwyn; Sabo, Dylan; Alonso, Fatima; Nate, Joshua; Pegueros, Karla; Bakcha, Ouays; and Ochoa, Stephanie Tapia (2026) "Predicting Injury Risk in NCAA Division I Athletes Using Biomechanical and Body Composition Screening Data," International Journal of Exercise Science: Conference Proceedings: Vol. 2: Iss. 18, Article 120.
Available at: https://digitalcommons.wku.edu/ijesab/vol2/iss18/120