A new study used machine learning to improve detection, prevention, and management of arthritis linked to environmental factors, like exposure to particular heavy metals.
By Lana Pine | Published on August 9, 2024
A new study used machine learning to look for better ways to detect, prevent, and manage arthritis caused by environmental factors. The research showed that making these models easier to interpret could help doctors and researchers learn more about how the environment affects arthritis, according to findings published in Frontiers in Nutrition.1
Heavy metals like cadmium and lead have been gaining attention for possibly making arthritis worse. These metals may increase oxidative stress and lead to long-lasting inflammation, which is linked to rheumatoid arthritis (RA).2 However, more research is still needed to explore how these metals affect osteoarthritis (OA).
“Research into the correlation between heavy metals and arthritis remains nascent, and existing studies predominantly rely on traditional statistical methods,” wrote a group of investigators predominantly from the Shanghai Key Laboratory of Orthopedic Implants, Department of Orthopedic Surgery, Shanghai Ninth People’s Hospital, Shanghai Jiaotong University School of Medicine in Shanghai, China. “These conventional approaches often necessitate extensive data requirements, incorporate numerous presumptions, and are subject to stringent application constraints, which restricts their capacity to derive insights from voluminous datasets. However, the dawn of the big data era, coupled with the swift advancement of computational technologies, has paved the way for the burgeoning application of ML techniques across various domains, including medical research.”
Investigators used a phased ML strategy that combined a range of methods, including SHapley Additive exPlanations (SHAP) and least absolute shrinkage and selection operator (LASSO) regression, and integrated laboratory, demographic, and questionnaire data to examine the relationship between heavy metal exposure and RA and OA. Data were collected from the National Health and Nutrition Examination Survey (NHANES) between 2003 and 2020.
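For readers unfamiliar with LASSO, the following minimal Python sketch shows how an L1 penalty can act as a feature selector. It is not the investigators’ code. The data are synthetic, the feature roles (two informative, two pure noise), the penalty value `alpha`, and all function names are assumptions made for illustration, and it fits a plain linear regression rather than the models used in the study.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iters):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared length of feature j's column
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n)
    return w

# Synthetic data (assumption): only the first two features drive the outcome.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.2)
for name, weight in zip(["signal_1", "signal_2", "noise_1", "noise_2"], weights):
    print(f"{name}: {weight:.3f}")
```

In this toy setup the penalty shrinks the two informative coefficients slightly and sets the two noise coefficients to exactly zero, which is the property that makes LASSO useful for narrowing a long list of candidate predictors.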
A total of 13 ML models were employed across 7 methodologies to increase the predictability and interpretability of clinical outcomes. Each model phase was designed to refine the performance of the algorithm.
Ultimately, 14,319 patients were included in the analysis, of whom 49% were male. The average age of participants was 49.0 years, with a range of 34.0 to 63.0 years.
Results showed significant associations between specific heavy metals, including arsenic, arsenite, arsenic acid, dimethylarsinic acid, monomethylarsonic acid, barium, cadmium, lead, antimony, and tungsten, and an increased risk of arthritis. The phased ML approach allowed for the identification of key predictors and their contributions to disease outcomes.
Investigators noted the lack of longitudinal follow-up for the same cohort and the inability to access other datasets of a similar scale as limitations of the study. Inherent biases of cross-sectional studies, potential information bias because arthritis diagnoses were self-reported, and biases from missing data may also have affected the findings. Additionally, feature importance differed between SHAP and permutation shuffling: SHAP focuses on each feature’s contribution to individual predictions, whereas permutation shuffling provides a global explanation. This disparity adds to the complexity of model interpretation and could have hindered replicability.
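The difference between a global explanation (permutation shuffling) and a per-prediction explanation (Shapley values, the idea behind SHAP) can be illustrated with a small Python sketch. This is an illustration under stated assumptions, not the study’s pipeline: the `model` function is a hypothetical stand-in for a fitted model, the data are synthetic, and the Shapley values are computed exactly by enumerating feature subsets rather than with the SHAP library.

```python
import random
from itertools import combinations
from math import factorial

def model(x):
    """Hypothetical stand-in for a fitted model: uses features 0 and 1, ignores 2."""
    return 3.0 * x[0] + x[0] * x[1]

def mse(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, feature, rng):
    """Global view: how much does the error grow when one column is shuffled?"""
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, shuffled)]
    return mse(permuted, targets) - mse(rows, targets)

def shapley_values(x, background):
    """Local view: exact Shapley contribution of each feature to one prediction."""
    p = len(x)

    def value(subset):
        # Features in `subset` take the instance's values; the rest come from
        # each background row in turn, and the predictions are averaged.
        total = 0.0
        for b in background:
            total += model([x[k] if k in subset else b[k] for k in range(p)])
        return total / len(background)

    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        contribution = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {j}) - value(s))
        phi.append(contribution)
    return phi

# Synthetic data (assumption): three standard-normal features, noisy targets.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(300)]
targets = [model(r) + rng.gauss(0, 0.1) for r in rows]

global_importance = [permutation_importance(rows, targets, j, rng) for j in range(3)]
instance = rows[0]
local_contributions = shapley_values(instance, rows)
baseline = sum(model(r) for r in rows) / len(rows)

print("permutation importance (whole dataset):", global_importance)
print("Shapley contributions (one prediction):", local_contributions)
```

Permutation importance returns one number per feature for the whole dataset, while the Shapley contributions belong to a single prediction and sum to the gap between that prediction and the average prediction. Because the two methods answer different questions, their feature rankings need not agree.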
Aside from heavy metal exposure, age, gender, and ethnicity were also identified as significant factors, with certain groups, such as non-Hispanic White participants, demonstrating a higher association with arthritis.
“Employing SHAP enhanced our understanding of the predictive outcomes of these models, providing deep insights into the factors contributing to arthritis,” investigators concluded. “This approach combines advanced analytics with improved interpretability, overcoming the typical ‘black box’ issue in machine learning and enabling a more detailed exploration of the relationship between environmental exposures and health outcomes.”
An original version of this article was published on sister site HCPLive.
References