TY - JOUR
T1 - Explainable Machine Learning Techniques to Predict Muscle Injuries in Professional Soccer Players through Biomechanical Analysis
AU - Calderón-Díaz, Mailyn
AU - Silvestre Aguirre, Rony
AU - Vásconez, Juan P.
AU - Yáñez, Roberto
AU - Roby, Matías
AU - Querales, Marvin
AU - Salas, Rodrigo
N1 - Publisher Copyright:
© 2023 by the authors.
PY - 2024/1
Y1 - 2024/1
N2 - There is a significant risk of injury in sports and intense competition due to the demanding physical and psychological requirements. Hamstring strain injuries (HSIs) are the most prevalent type of injury among professional soccer players and are the leading cause of missed days in the sport. These injuries stem from a combination of factors, making it challenging to pinpoint the most crucial risk factors and their interactions, let alone find effective prevention strategies. Recently, there has been growing recognition of the potential of tools provided by artificial intelligence (AI). However, current studies primarily concentrate on enhancing the performance of complex machine learning models, often overlooking their explanatory capabilities. Consequently, medical teams have difficulty interpreting these models and are hesitant to trust them fully. In light of this, there is an increasing need for advanced injury detection and prediction models that can aid doctors in diagnosing or detecting injuries earlier and with greater accuracy. Accordingly, this study aims to identify the biomarkers of muscle injuries in professional soccer players through biomechanical analysis, employing several ML algorithms such as decision tree (DT) methods, discriminant methods, logistic regression, naive Bayes, support vector machine (SVM), K-nearest neighbor (KNN), ensemble methods, boosted and bagged trees, artificial neural networks (ANNs), and XGBoost. In particular, XGBoost is also used to obtain the most important features. The findings highlight that the variables that most effectively differentiate the groups and could serve as reliable predictors for injury prevention are the maximum muscle strength of the hamstrings and the stiffness of the same muscle. With regard to the 35 techniques employed, a precision of up to 78% was achieved with XGBoost, indicating that by considering scientific evidence, suggestions based on various data sources, and expert opinions, it is possible to attain good precision, thus enhancing the reliability of the results for doctors and trainers. Furthermore, the obtained results strongly align with the existing literature, although further specific studies about this sport are necessary to draw a definitive conclusion.
AB - There is a significant risk of injury in sports and intense competition due to the demanding physical and psychological requirements. Hamstring strain injuries (HSIs) are the most prevalent type of injury among professional soccer players and are the leading cause of missed days in the sport. These injuries stem from a combination of factors, making it challenging to pinpoint the most crucial risk factors and their interactions, let alone find effective prevention strategies. Recently, there has been growing recognition of the potential of tools provided by artificial intelligence (AI). However, current studies primarily concentrate on enhancing the performance of complex machine learning models, often overlooking their explanatory capabilities. Consequently, medical teams have difficulty interpreting these models and are hesitant to trust them fully. In light of this, there is an increasing need for advanced injury detection and prediction models that can aid doctors in diagnosing or detecting injuries earlier and with greater accuracy. Accordingly, this study aims to identify the biomarkers of muscle injuries in professional soccer players through biomechanical analysis, employing several ML algorithms such as decision tree (DT) methods, discriminant methods, logistic regression, naive Bayes, support vector machine (SVM), K-nearest neighbor (KNN), ensemble methods, boosted and bagged trees, artificial neural networks (ANNs), and XGBoost. In particular, XGBoost is also used to obtain the most important features. The findings highlight that the variables that most effectively differentiate the groups and could serve as reliable predictors for injury prevention are the maximum muscle strength of the hamstrings and the stiffness of the same muscle. With regard to the 35 techniques employed, a precision of up to 78% was achieved with XGBoost, indicating that by considering scientific evidence, suggestions based on various data sources, and expert opinions, it is possible to attain good precision, thus enhancing the reliability of the results for doctors and trainers. Furthermore, the obtained results strongly align with the existing literature, although further specific studies about this sport are necessary to draw a definitive conclusion.
KW - hamstring injuries
KW - machine learning explainability
KW - soccer player
KW - sport medicine
KW - XGBoost
UR - http://www.scopus.com/inward/record.url?scp=85181934488&partnerID=8YFLogxK
U2 - 10.3390/s24010119
DO - 10.3390/s24010119
M3 - Article
C2 - 38202981
AN - SCOPUS:85181934488
SN - 1424-8220
VL - 24
JO - Sensors
JF - Sensors
IS - 1
M1 - 119
ER -