Fine semantic segmentation of hands for human functional pattern recognition in IADLs using deep learning
Eyitomilayo Yemisi Babatope, Jesus Alejandro Acosta‐Franco, Mireya Saraí García‐Vázquez, Alejandro Álvaro Ramírez‐Acosta
Abstract
Background
Alzheimer’s Disease (AD) is an irreversible degenerative brain disease and the most common cause of dementia in the elderly population (age 65 and above), although younger people may also develop the disease (Clemmensen, 2020). AD is characterized by impairments in behavior and cognition that eventually obstruct the Instrumental Activities of Daily Living (IADL) (Kumar, 2022). Researchers have predicted functional capacity by assessing IADL, as they are strongly associated with cognitive impairment. With the availability of big data and public repositories of multimedia data such as images and videos, artificial intelligence algorithms have been used to classify and make informed decisions that could help in disease prediction and monitoring. In the context of early diagnosis of Alzheimer’s disease, researchers have applied machine learning techniques to neuroimaging for AD classification (Lama, 2021). Researchers and practitioners often use Activities of Daily Living to monitor independent‐living senior citizens’ self‐care ability, wellness status, and disease progression (Zhu, 2021). Automating the assessment of IADL will ease the evaluation of functional and cognitive status and strengthen the prediction of AD.
Method
Our work approaches the timely detection and diagnosis of AD by examining deficits in hand dexterity and visuospatial skills while performing IADL, captured in egocentric videos. Therefore, in this work, we propose a fine semantic segmentation of hands using a deep learning method that optimizes predictions at the level of the relationships between pixels. We trained the optimized RefineNet‐Pix convolutional neural network, obtaining higher accuracy in the segmentation of semantic hand maps.
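The pixel‐level prediction step described above can be illustrated with a minimal sketch in plain Python. The score map and class indices below are hypothetical, and the RefineNet‐Pix architecture itself is not reproduced; the sketch only shows how a per‐pixel class‐score map is decoded into a segmentation label map via a per‐pixel argmax.

```python
# Minimal sketch: decoding per-pixel class scores into a segmentation map.
# `scores[y][x]` holds hypothetical per-class scores (e.g. background vs. hand)
# as a segmentation network such as RefineNet-Pix might produce; the network
# itself is not reproduced here.

def decode_segmentation(scores):
    """Turn an H x W x C score map into an H x W label map via per-pixel argmax."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# Toy 2x2 score map with two classes: 0 = background, 1 = hand.
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.3, 0.7], [0.6, 0.4]],
]
labels = decode_segmentation(scores)
print(labels)  # → [[0, 1], [1, 0]]
```

In practice the scores come from the network's final convolutional layer, and the argmax is computed on the GPU, but the decoding logic is the same.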
Results
With the optimization of RefineNet‐Pix, we obtained an accuracy in the semantic segmentation of hands that surpasses the 87.9% reported by Urooj (2018). By integrating the RefineNet‐Pix results into the base model for human functional pattern recognition, we obtain an accuracy exceeding the 74% of the base model (Yemisi, 2022).
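Segmentation accuracy of the kind quoted above is commonly computed as the fraction of pixels whose predicted label matches the ground truth. A minimal sketch with illustrative label maps (not the paper's data):

```python
# Minimal sketch: pixel accuracy for semantic segmentation.
# `pred` and `truth` are H x W label maps; the values below are illustrative.

def pixel_accuracy(pred, truth):
    """Fraction of pixels where the predicted label equals the ground-truth label."""
    total = correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += (p == t)
    return correct / total

pred  = [[0, 1], [1, 0]]
truth = [[0, 1], [1, 1]]
print(pixel_accuracy(pred, truth))  # → 0.75
```

Other common segmentation metrics, such as mean intersection-over-union, follow the same per-pixel comparison idea but normalize per class.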
Conclusion
The optimized RefineNet‐Pix model performs pixel‐wise segmentation of hand regions in egocentric videos, providing hand motion information that can be used to classify impairment based on human functional patterns. Optimizing such models is a way forward in building a technological device to support the diagnosis of Alzheimer’s disease.