Objective: To identify lifting actions and count the number of lifts performed in videos based on robust class prediction and a streamlined process for reliable real-time monitoring of lifting tasks. Background: Traditional methods for recognizing lifting actions often rely on deep learning classifiers applied to human motion data collected from wearable sensors. Despite their high performance, these methods can be difficult to implement on systems with limited hardware resources. Method: The proposed method follows a five-stage process: (1) BlazePose, a real-time pose estimation model, detects key joints of the human body. (2) These joints are preprocessed by smoothing, centering, and scaling techniques. (3) Kinematic features are extracted from the preprocessed joints. (4) Video frames are classified as lifting or nonlifting using rank-altered kinematic feature pairs. (5) A lifting counting algorithm counts the number of lifts based on the class predictions. Results: Nine rank-altered kinematic feature pairs are identified as key pairs. These pairs were used to construct an ensemble classifier, which achieved 0.89 or above in classification metrics, including accuracy, precision, recall, and F1 score. This classifier showed an accuracy of 0.90 in lifting counting and a latency of 0.06 ms, which is at least 12.5 times faster than baseline classifiers. Conclusion: This study demonstrates that computer vision-based kinematic features could be adopted to effectively and efficiently recognize lifting actions. Application: The proposed method could be deployed on various platforms, including mobile devices and embedded systems, to monitor lifting tasks in real-time for the proactive prevention of work-related low-back injuries.
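The abstract does not say how the lifting counting algorithm in stage (5) works. As a rough illustration only, the sketch below assumes that per-frame predictions arrive as 0 (nonlifting) or 1 (lifting) and counts one lift per run of consecutive lifting frames. The function name `count_lifts`, the run-based rule, and the `min_run` threshold are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def count_lifts(predictions: Sequence[int], min_run: int = 3) -> int:
    """Count lifts as runs of consecutive 'lifting' frames (label 1).

    A run is counted only if it lasts at least `min_run` frames, which
    suppresses isolated misclassified frames. Both the run-based rule and
    the `min_run` default are illustrative assumptions, not the paper's
    algorithm.
    """
    lifts = 0
    run_length = 0
    for label in predictions:
        if label == 1:
            run_length += 1
        else:
            # A lifting run just ended; count it if it was long enough.
            if run_length >= min_run:
                lifts += 1
            run_length = 0
    # Close a run that extends to the final frame.
    if run_length >= min_run:
        lifts += 1
    return lifts


if __name__ == "__main__":
    frames = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1]
    print(count_lifts(frames))
```

A minimum run length is one simple way to keep a single flickering frame from being counted as a lift; a real system would likely tune it to the video frame rate.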
This content will become publicly available on May 1, 2026
Toward Real-Time Posture Classification: Reality Check
Fall prevention has always been a crucial topic in injury prevention. Research shows that real-time posture monitoring and subsequent fall prevention are important for preventing fall-related injuries. In this research, we determine a real-time posture classifier by comparing classical and deep machine learning classifiers in terms of their accuracy and robustness for posture classification. For this, multiple classical machine learning classifiers were used, including support vector machine, random forest, neural network, and AdaBoost methods. Deep learning methods, including LSTM and transformer, were also used for posture classification. In the experiment, joint data were obtained using an RGBD camera. The results show that the classical machine learning posture classifiers achieved accuracies between 75% and 99%, demonstrating that classical machine learning classification alone is sufficient for real-time posture classification, even with missing joints or added noise. The deep learning method LSTM also classified the postures with high accuracy, but it incurred a significant computational overhead that compromised real-time posture classification performance. The research thus shows that classical machine learning methods deserve attention, at least as candidates for reuse or reinvention, especially for real-time posture classification tasks. This research also offers insight into using a classical posture classifier for large-scale human posture classification.
- Award ID(s):
- 2306285
- PAR ID:
- 10616030
- Publisher / Repository:
- MDPI
- Date Published:
- Journal Name:
- Electronics
- Volume:
- 14
- Issue:
- 9
- ISSN:
- 2079-9292
- Page Range / eLocation ID:
- 1876
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract: The Local Climate Zone (LCZ) classification is already widely used in urban heat island and other climate studies. The current classification method does not incorporate crucial urban auxiliary GIS data on building height and imperviousness that could significantly improve urban-type LCZ classification utility as well as accuracy. This study utilized a hybrid GIS- and remote sensing imagery-based framework to systematically compare and evaluate different machine and deep learning methods. The Convolution Neural Network (CNN) classifier outperforms the other classifiers in terms of accuracy, but it requires multi-pixel input, which reduces the output’s spatial resolution and creates a tradeoff between accuracy and spatial resolution. The Random Forest (RF) classifier performs best among the single-pixel classifiers. This study also shows that incorporating a building height dataset improves the accuracy of the high- and mid-rise classes in the RF classifiers, whereas an imperviousness dataset improves the low-rise classes. The single-pass forward permutation test reveals that both auxiliary datasets dominate the classification accuracy in the RF classifier, while near-infrared and thermal infrared are the dominating features in the CNN classifier. These findings show that the conventional LCZ classification framework used in the World Urban Database and Access Portal Tools (WUDAPT) can be improved by adopting building height and imperviousness information. This framework can be easily applied to different cities to generate LCZ maps for urban models.
-
A large variety of sound sources in the ocean, including biological, geophysical, and man-made, can be simultaneously monitored over instantaneous continental-shelf scale regions via the passive ocean acoustic waveguide remote sensing (POAWRS) technique by employing a large-aperture densely-populated coherent hydrophone array system. Millions of acoustic signals received on the POAWRS system per day can make it challenging to identify individual sound sources. An automated classification system is necessary to enable sound sources to be recognized. Here, the objectives are to (i) gather a large training and test data set of fin whale vocalization and other acoustic signal detections; (ii) build multiple fin whale vocalization classifiers, including a logistic regression, support vector machine (SVM), decision tree, convolutional neural network (CNN), and long short-term memory (LSTM) network; (iii) evaluate and compare performance of these classifiers using multiple metrics including accuracy, precision, recall and F1-score; and (iv) integrate one of the classifiers into the existing POAWRS array and signal processing software. The findings presented here will (1) provide an automatic classifier for near real-time fin whale vocalization detection and recognition, useful in marine mammal monitoring applications; and (2) lay the foundation for building an automatic classifier applied for near real-time detection and recognition of a wide variety of biological, geophysical, and man-made sound sources typically detected by the POAWRS system in the ocean.
-
Artificial intelligence (AI) and machine learning models are being increasingly deployed in real-world applications. In many of these applications, there is strong motivation to develop hybrid systems in which humans and AI algorithms can work together, leveraging their complementary strengths and weaknesses. We develop a Bayesian framework for combining the predictions and different types of confidence scores from humans and machines. The framework allows us to investigate the factors that influence complementarity, where a hybrid combination of human and machine predictions leads to better performance than combinations of human or machine predictions alone. We apply this framework to a large-scale dataset where humans and a variety of convolutional neural networks perform the same challenging image classification task. We show empirically and theoretically that complementarity can be achieved even if the human and machine classifiers perform at different accuracy levels as long as these accuracy differences fall within a bound determined by the latent correlation between human and machine classifier confidence scores. In addition, we demonstrate that hybrid human–machine performance can be improved by differentiating between the errors that humans and machine classifiers make across different class labels. Finally, our results show that eliciting and including human confidence ratings improve hybrid performance in the Bayesian combination model. Our approach is applicable to a wide variety of classification problems involving human and machine algorithms.
-
Mancuso, Renato (Ed.) Deep learning–based classifiers are widely used for perception in autonomous Cyber-Physical Systems (CPS’s). However, such classifiers rarely offer guarantees of perfect accuracy while being optimized for efficiency. To support safety-critical perception, ensembles of multiple different classifiers working in concert are typically used. Since CPS’s interact with the physical world continuously, it is not unreasonable to expect dependencies among successive inputs in a stream of sensor data. Prior work introduced a classification technique that leverages these inter-input dependencies to reduce the average time to successful classification using classifier ensembles. In this paper, we propose generalizations to this classification technique, both in the improved generation of classifier cascades and the modeling of temporal dependencies. We demonstrate, through theoretical analysis and numerical evaluation, that our approach achieves further reductions in average classification latency compared to the prior methods.
