AccScience Publishing / AIH / Online First / DOI: 10.36922/AIH025070009
ORIGINAL RESEARCH ARTICLE

A bagging ensemble machine learning method for imbalanced data to predict anxiety disorders and analyze risk factors in older people: An observational study

Jinling Wang1*, Michaela Black1, Debbie Rankin1, Jonathan Wallace2, Catherine F. Hughes3, Leane Hoey3, Adrian Moore4, Joshua Tobin5, Mimi Zhang5, James Ng5, Geraldine Horigan3, Paul Carlin6, Kevin McCarroll7, Conal Cunningham7, Helene McNulty3, Anne M. Molloy8
1 School of Computing, Engineering and Intelligent Systems, Ulster University, Derry-Londonderry, United Kingdom
2 School of Computing, Ulster University, Jordanstown, United Kingdom
3 School of Biomedical Sciences, Nutrition Innovation Centre for Food and Health, Ulster University, Coleraine, United Kingdom
4 School of Geography and Environmental Sciences, Ulster University, Coleraine, United Kingdom
5 School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
6 School of Health, Wellbeing and Social Care, The Open University, Belfast, United Kingdom
7 Mercers Institute for Research on Ageing, St James’s Hospital, Dublin, Ireland
8 School of Medicine, Trinity College Dublin, Dublin, Ireland
Received: 12 February 2025 | Revised: 7 July 2025 | Accepted: 14 July 2025 | Published online: 8 September 2025
© 2025 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution 4.0 International License ( https://creativecommons.org/licenses/by/4.0/ )
Abstract

Anxiety disorders (ADs) are among the most prevalent mental health problems, especially in older people. Their high risk and prevalence underscore the need for effective mental health care. Artificial intelligence is increasingly used in the diagnosis and prediction of medical conditions and diseases, including mental health problems. In this study, we developed an adapted bagging ensemble machine learning system for the diagnosis and prediction of ADs that addresses the challenges posed by extremely imbalanced data from the Trinity-Ulster-Department of Agriculture (TUDA) study. Statistical techniques were used to identify risk factors for ADs, and feature selection and feature engineering were conducted based on the analysis of biomarker risk factors. Five machine learning methods were used within the system to build weak learner submodels, yielding promising prediction results. Some risk factors for ADs were identified. These findings will support the early prediction of ADs in our future studies.
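
To illustrate the general idea of bagging on imbalanced data, the following is a minimal sketch and not the system developed in this study. It assumes binary labels (1 for the minority class, 0 for the majority class), builds each bag from all minority samples plus an equally sized random draw from the majority class, and combines the submodels by majority vote. The nearest-centroid weak learner and the function names `train_centroid`, `predict_centroid`, `balanced_bagging_fit`, and `balanced_bagging_predict` are hypothetical stand-ins chosen for brevity; the abstract does not specify the five learners or the exact sampling scheme.

```python
import random
from collections import Counter
from statistics import mean


def train_centroid(X, y):
    """Hypothetical weak learner: store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Predict the class whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def balanced_bagging_fit(X, y, n_bags=11, seed=0):
    """Train one weak learner per balanced bag.

    Assumption: label 1 is the minority class and label 0 the majority class.
    Each bag keeps every minority sample and adds an equal number of
    majority samples drawn at random without replacement.
    """
    rng = random.Random(seed)
    minority = [i for i, t in enumerate(y) if t == 1]
    majority = [i for i, t in enumerate(y) if t == 0]
    if not minority or len(majority) < len(minority):
        raise ValueError("expected a non-empty minority class smaller than the majority class")
    models = []
    for _ in range(n_bags):
        bag = minority + rng.sample(majority, len(minority))
        models.append(train_centroid([X[i] for i in bag], [y[i] for i in bag]))
    return models


def balanced_bagging_predict(models, x):
    """Combine the submodels by majority vote (use an odd n_bags to avoid ties)."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Synthetic demo data (illustrative only): 190 majority and 10 minority samples.
demo_rng = random.Random(1)
X_demo = [[demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)] for _ in range(190)]
X_demo += [[demo_rng.gauss(3, 1), demo_rng.gauss(3, 1)] for _ in range(10)]
y_demo = [0] * 190 + [1] * 10

demo_models = balanced_bagging_fit(X_demo, y_demo, n_bags=11, seed=0)
print(balanced_bagging_predict(demo_models, [3.0, 3.0]))
```

Balancing each bag keeps any single submodel from being dominated by the majority class, while voting across many bags lets most majority samples contribute to at least one submodel.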

Keywords
Anxiety disorder
Bagging ensemble machine learning
Risk factor analysis
Diagnosis
Imbalanced data
Aging
Funding
The TUDA study was supported by government funding from the Irish Department of Agriculture, Food and the Marine and the Health Research Board (under the Food Institutional Research Measure), as well as from the Northern Ireland Department for Employment and Learning (under its Strengthening the All-Island Research Base Initiative). The AIM4HEALTH project gratefully acknowledges the support of the Higher Education Authority; the Department of Further and Higher Education, Research, Innovation and Science; the Shared Island Fund; and SFI grant 21/RC/10295_P2.
Conflict of interest
The authors declare they have no competing interests.
Artificial Intelligence in Health, Electronic ISSN: 3029-2387 Print ISSN: 3041-0894, Published by AccScience Publishing