Exploring online public survey lifestyle datasets with statistical analysis, machine learning and semantic ontology
Chatterjee, Ayan; Riegler, Michael A.; Johnson, Miriam Sinkerud; Das, Jishnu; Pahari, Nibedita; Ramachandra, Raghavendra; Ghosh, Bikramaditya; Saha, Arpan; Bajpai, Ram
Authors
Ayan Chatterjee
Michael A. Riegler
Miriam Sinkerud Johnson
Jishnu Das
Nibedita Pahari
Raghavendra Ramachandra
Bikramaditya Ghosh
Arpan Saha
Dr Ram Bajpai (r.bajpai@keele.ac.uk)
Abstract
Lifestyle diseases significantly contribute to the global health burden, with lifestyle factors playing a crucial role in the development of depression. The COVID-19 pandemic has intensified many determinants of depression. This study aimed to identify lifestyle and demographic factors associated with depression symptoms among Indians during the pandemic, focusing on a sample from Kolkata, India. An online public survey was conducted, gathering data from 1,834 participants (with 1,767 retained post-cleaning) over three months via social media and email. The survey consisted of 44 questions and was distributed anonymously to ensure privacy. Data were analyzed using statistical methods and machine learning, with principal component analysis (PCA) and analysis of variance (ANOVA) employed for feature selection. K-means clustering divided the pre-processed dataset into five clusters, and a support vector machine (SVM) with a linear kernel achieved 96% accuracy in a multi-class classification problem. The Local Interpretable Model-agnostic Explanations (LIME) algorithm provided local explanations for the SVM model predictions. Additionally, an OWL (Web Ontology Language) ontology facilitated the semantic representation and reasoning of the survey data. The study highlighted a pipeline for collecting, analyzing, and representing data from online public surveys during the pandemic. The identified factors were correlated with depressive symptoms, illustrating the significant influence of lifestyle and demographic variables on mental health. The online survey method proved advantageous for data collection, visualization, and cost-effectiveness while maintaining anonymity and reducing bias. Challenges included reaching the target population, addressing language barriers, ensuring digital literacy, and mitigating dishonest responses and sampling errors. In conclusion, lifestyle and demographic factors significantly impact depression during the COVID-19 pandemic. The study’s methodology offers valuable insights into addressing mental health challenges through scalable online surveys, aiding in the understanding and mitigation of depression risk factors.
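As an illustration only, the clustering and classification steps named in the abstract (PCA, K-means with five clusters, a linear-kernel SVM) might be sketched as follows. This is not the authors' code: it assumes scikit-learn, uses synthetic data from `make_blobs` in place of the survey responses, and the function name `run_pipeline` and all parameter values are hypothetical. The ANOVA feature-selection step, the LIME explanations and the OWL ontology are left out.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_pipeline(n_rows=500, n_features=12, n_clusters=5, seed=0):
    """Cluster synthetic data, then train a linear SVM to predict the clusters."""
    # Synthetic stand-in for numerically encoded survey answers
    # (an assumption; NOT the study's dataset).
    X, _ = make_blobs(n_samples=n_rows, n_features=n_features,
                      centers=n_clusters, random_state=seed)

    # Standardize the features, then reduce dimensionality with PCA.
    X_scaled = StandardScaler().fit_transform(X)
    X_reduced = PCA(n_components=5, random_state=seed).fit_transform(X_scaled)

    # Unsupervised step: K-means assigns every row to one of five clusters.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(X_reduced)

    # Supervised step: a linear-kernel SVM learns to predict the cluster
    # label, which makes this a multi-class classification problem.
    X_train, X_test, y_train, y_test = train_test_split(
        X_reduced, labels, test_size=0.25, random_state=seed, stratify=labels)
    clf = SVC(kernel="linear").fit(X_train, y_train)

    # Held-out accuracy of the SVM on the cluster labels.
    return labels, clf.score(X_test, y_test)


if __name__ == "__main__":
    labels, accuracy = run_pipeline()
    print(f"clusters found: {len(set(labels))}, test accuracy: {accuracy:.2f}")
```

Because the class labels here come from K-means itself, the accuracy measures how well the SVM reproduces the cluster assignments on held-out rows. The number printed for this synthetic data says nothing about the 96% reported in the abstract.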
Citation
Chatterjee, A., Riegler, M. A., Johnson, M. S., Das, J., Pahari, N., Ramachandra, R., …Bajpai, R. (2024). Exploring online public survey lifestyle datasets with statistical analysis, machine learning and semantic ontology. Scientific Reports, 14(1), 1-24. https://doi.org/10.1038/s41598-024-74539-6
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 26, 2024 |
Online Publication Date | Oct 15, 2024 |
Publication Date | Oct 15, 2024 |
Deposit Date | Oct 25, 2024 |
Publicly Available Date | Oct 25, 2024 |
Journal | Scientific Reports |
Electronic ISSN | 2045-2322 |
Publisher | Nature Publishing Group |
Peer Reviewed | Peer Reviewed |
Volume | 14 |
Issue | 1 |
Article Number | 24190 |
Pages | 1-24 |
DOI | https://doi.org/10.1038/s41598-024-74539-6 |
Keywords | survey, datasets, COVID-19, depression, machine learning, semantics, LIME |
Public URL | https://keele-repository.worktribe.com/output/952920 |
Licence
https://creativecommons.org/licenses/by-nc-nd/4.0/
Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.