Hybrid vision-IMU deep learning framework with graph convolutional networks and attention for personalized yoga posture identification

AI Summary

Hybrid Vision-IMU architecture utilises synthetic IMU data derived from skeletal kinematics and biomechanical models for personalised yoga posture detection.
LSTM encodes IMU signals, Graph Convolutional Network extracts visual features, and attention-based fusion yields user-specific classification representations.
Hybrid model outperforms vision-only and IMU-only approaches in accuracy and false alarm rate, enabling real-time personalised yoga posture identification for medical applications.
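
The attention-based fusion step mentioned above can be illustrated with a minimal NumPy sketch. The record does not specify the attention mechanism, so the scaled dot-product scoring, the `query` vector (standing in for a learned, possibly user-specific parameter), and the function names below are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(vision_feat, imu_feat, query):
    """Fuse two modality embeddings with attention weights.

    vision_feat, imu_feat: (d,) embeddings, e.g. from a GCN and an LSTM.
    query: (d,) vector; hypothetical stand-in for a learned parameter.
    Returns the fused (d,) embedding and the two modality weights.
    """
    feats = np.stack([vision_feat, imu_feat])          # shape (2, d)
    scores = feats @ query / np.sqrt(feats.shape[1])   # scaled dot products, shape (2,)
    weights = softmax(scores)                          # non-negative, sums to 1
    fused = weights @ feats                            # weighted sum, shape (d,)
    return fused, weights

# Usage with random placeholder embeddings (not real model outputs).
rng = np.random.default_rng(0)
fused, weights = attention_fuse(rng.normal(size=8), rng.normal(size=8), rng.normal(size=8))
```

A weighted sum keeps the fused vector the same size as each input, so a downstream classifier is unaffected by which modality dominates for a given sample.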

Sci Rep. 2026 May 12. doi: 10.1038/s41598-026-49970-6. Online ahead of print.

ABSTRACT

Asanas (yoga postures) are important for maintaining physical and mental health. Digital training platforms, fitness monitors, and medical applications depend on accurate yoga posture detection. Conventional vision-based posture identification algorithms struggle with occlusion, background clutter, varying camera angles, and individual variation in body proportions. To address these concerns, this work proposes a hybrid Vision-IMU deep learning architecture for personalized yoga posture detection. Kinematic stance trajectories and the Yoga-82 vision dataset are used to construct a synthetic IMU dataset. The IMU signals are generated algorithmically from skeletal joint motion, using biomechanical motion models to reproduce accelerometer and gyroscope readings under controlled conditions. An LSTM network encodes the IMU data, while a Graph Convolutional Network (GCN) extracts visual features. The two feature streams are then combined using an attention-based technique to produce user-specific posture classification representations. Experimental results show that the hybrid model handles pose orientation fluctuations and user-specific features better than vision-only or IMU-only models in terms of classification accuracy and false alarm rate. This work demonstrates that hybrid deep learning algorithms can be used for real-time, personalized yoga posture identification in medical applications.
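
To make the synthetic IMU idea concrete, the following is a minimal sketch of how accelerometer-like and gyroscope-like signals could be derived from skeletal joint trajectories by finite differences. The abstract does not describe the biomechanical model in detail, so everything here is an assumption for illustration: the z-up world frame, the function names, the omission of sensor orientation and noise, and the use of the cross product of a bone's unit direction with its time derivative as the angular velocity (which captures rotation perpendicular to the bone only).

```python
import numpy as np

# World-frame gravity in m/s^2; a z-up frame is assumed here.
GRAVITY = np.array([0.0, 0.0, -9.81])

def synthetic_accelerometer(positions, dt):
    """Approximate accelerometer output for one joint.

    positions: (T, 3) world-frame joint positions in metres.
    dt: sampling interval in seconds.
    Returns (T, 3) specific force (acceleration minus gravity) in the
    world frame; rotation into a sensor frame is not modelled.
    """
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    return acceleration - GRAVITY

def synthetic_gyroscope(bone_dirs, dt):
    """Approximate angular velocity of a bone segment.

    bone_dirs: (T, 3) unit vectors along the bone (parent to child joint).
    Returns (T, 3) angular velocity in rad/s, computed as d x d_dot.
    """
    d_dot = np.gradient(bone_dirs, dt, axis=0)
    return np.cross(bone_dirs, d_dot)

# Usage: a forearm direction rotating about the z-axis at 1 rad/s, sampled at 100 Hz.
dt = 0.01
t = np.arange(0.0, 1.0, dt)
dirs = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
gyro = synthetic_gyroscope(dirs, dt)
```

Finite differences amplify noise in the pose estimates, so a practical pipeline would likely smooth the trajectories first; that step is left out of this sketch.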

PMID:42120558 | DOI:10.1038/s41598-026-49970-6
