


Creators/Authors contains: "Liu, Mengjing"


  1. Human activity recognition provides insights into physical and mental well-being by monitoring patterns of movement and behavior, facilitating personalized interventions and proactive health management. Radio Frequency (RF)-based human activity recognition (HAR) is gaining attention because it exposes less private information and requires no contact. However, it suffers from data scarcity and is sensitive to environmental changes. Collecting and labeling such data is labor-intensive and time-consuming. The limited training data makes generalization challenging when the sensor is deployed at a very different relative view in the real world. Synthetic data generation from abundant videos has the potential to address data scarcity, yet the domain gaps between synthetic and real data constrain its benefit. In this paper, we first share our investigations and insights on the intrinsic limitations of existing video-based data synthesis methods. We then present M4X, a method that uses metric learning to extract effective view-independent features from the more abundant synthetic data despite their domain gaps, thus enhancing cross-view generalizability. We explore two main design issues: different mining strategies for constructing contrastive pairs/triplets, and different forms of loss functions. We find that the best choices are offline triplet mining with real data as anchors, balanced triplets, and a triplet loss function without hard negative mining for higher discriminative power. Comprehensive experiments show that M4X consistently outperforms baseline methods in cross-view generalizability. In the most challenging case, with the least amount of real training data, M4X outperforms three baselines by 7.9-16.5% on all views, and by 18.9-25.6% on a view with only synthetic but no real data during training. This demonstrates its effectiveness in extracting view-independent features from synthetic data despite their domain gaps. 
We also observe that, given limited sensor deployments, a participant-facing viewpoint and another at a large angle (e.g., 60°) tend to produce much better performance. 
    Free, publicly-accessible full text available June 19, 2025
  2. Free, publicly-accessible full text available June 19, 2025
  3. In Activities of Daily Living (ADL) research, which has gained prominence due to the growing aging population, acquiring sufficient ground truth data for model training is a significant bottleneck. This obstacle motivates a shift towards unsupervised representation learning methods, which do not require large labeled datasets. Existing research examined the tradeoff between a fully supervised model and an unsupervised pre-trained model and found that the unsupervised version performed better in most cases. However, that investigation did not use sufficiently large Human Activity Recognition (HAR) datasets; both of its datasets had only 3 dimensions. This poster extends the investigation by employing a large multivariate time series HAR dataset and experimenting with different combinations of critical training parameters, such as batch size and learning rate, to observe the performance tradeoff. Our findings reveal that, on a larger multivariate time series HAR dataset, the pre-trained model is comparable to fully supervised classification. This discovery underscores the potential of unsupervised representation learning in ADL extraction and highlights the importance of model configuration in optimizing performance. 
    Free, publicly-accessible full text available June 19, 2025
  4. Free, publicly-accessible full text available June 19, 2025
  5. A data collection infrastructure is vital for generating the amount and diversity of data necessary for developing algorithms in home-based health monitoring. However, the manageability (deployment and operation effort) of such an infrastructure has long been overlooked. Even a small deployment of a dozen homes may impose enormous manual effort on the research team, including installing, configuring and updating sensors and edge devices; continuously monitoring for faults and errors to prevent data loss; and integrating new sensing modalities. In this paper, we present Proteus, an easily managed infrastructure designed to automate much of the work in deploying and operating such systems. Proteus includes: i) scalable, continuous deployment and update of devices with automatic bootstrapping; ii) automatic fault and error monitoring and recovery with watchdogs and LED feedback, and complementary edge and cloud storage backups; and iii) an easy-to-use data-agnostic pipeline for integrating new modalities. We demonstrate our system’s robustness through different sets of experiments: 3 sensor nodes running for 24 days sending data (17.3 Mbps aggregate rate), and 16 emulated sensors (92.8 Mbps aggregate rate). All such experiments have data loss rates of less than 1%. Further, we reduce human effort by 25-fold and the code required to add a new data modality by 25-fold. Our results show that Proteus is a promising solution for enabling research teams to effectively manage home-based health monitoring at small to medium scales. 
  6. Fully decentralized model training for on-road vehicles can leverage crowdsourced data without depending on central servers, infrastructure or Internet coverage. However, under unreliable wireless communication and short contact durations, model sharing among peer vehicles may suffer severe losses and thus fail frequently. To address these challenges, we propose “RoADTrain”, a route-assisted decentralized peer model training approach that carefully chooses vehicles with high chances of successful model sharing. It bounds the per-round communication time yet retains model performance under vehicle mobility and unreliable communication. Based on shared route information, a connected cluster of vehicles can estimate the link reliability and contact duration and embed this information into the communication topology. We decompose the topology into subgraphs supporting parallel communication, and identify the subset of them with the highest algebraic connectivity, which maximizes the speed of information flow in the cluster while keeping model sharing success high, thus accelerating model training in the cluster. We conduct extensive evaluation on driving decision making models using the popular CARLA simulator. RoADTrain achieves comparable driving success rates and 1.2−4.5× faster convergence than representative decentralized learning methods that assume model sharing always succeeds (e.g., SGP), and significantly outperforms other benchmarks that consider losses by 17−27% in the hardest driving conditions. These results demonstrate that route sharing enables shrewd selection of vehicles for model sharing, and thus better model performance and faster convergence under wireless losses and mobility. 
  7. Home-based health monitoring systems are important for many conditions (e.g., aging, chronic diseases). The absence of suitable data collection infrastructure is a fundamental barrier to the development of related algorithms and systems. In this poster, we present Proteus, a robust, extensible and scalable data collection infrastructure that enables small research teams to manage large deployments. We identify the desired features and achieve them by combining mature technologies and new components: i) extensibility to new, diverse sensor types and data formats with only a few lines of code (LOC); ii) scalability in managing sensor/edge devices by automating many deployment and management tasks; iii) resilience to system failures and network outages. Experiments on a prototype show zero data loss or system errors for one sensor node running for 10 days, 99.95% of data received for 32 emulated sensors sending data at 200 Mbps, and 20-fold and 100-fold reductions in node setup effort and LOC for new sensor types, respectively. The preliminary results show that Proteus is promising for large-scale longitudinal deployment of home-based health monitoring. 
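
The M4X record above reports that a triplet loss with real data as anchors and no hard negative mining worked best. The abstract does not give the formulation, so the following is only a minimal sketch of the standard margin-based triplet loss, not the authors' implementation. The Euclidean distance, the margin value of 1.0, the function names, and the toy embedding vectors are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 1.0) -> float:
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (a hypothetical value here).
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def mean_triplet_loss(triplets: Sequence[Tuple[Vector, Vector, Vector]],
                      margin: float = 1.0) -> float:
    """Average the loss over a precomputed (offline) list of triplets.

    Every triplet contributes equally; no hard negative selection is done.
    """
    return sum(triplet_loss(a, p, n, margin) for a, p, n in triplets) / len(triplets)


# Toy example: in the abstract's setup the anchor would be a real-data
# embedding; the positive and negative could come from synthetic data.
real_anchor = [0.0, 0.0]
same_activity = [3.0, 4.0]       # distance 5 from the anchor
other_activity_far = [6.0, 8.0]  # distance 10: margin already satisfied
other_activity_near = [0.0, 1.0] # distance 1: margin violated

print(triplet_loss(real_anchor, same_activity, other_activity_far))
print(triplet_loss(real_anchor, same_activity, other_activity_near))
```

Averaging over all precomputed triplets, rather than keeping only the hardest negatives, is one simple way to read "without hard negative mining"; the actual mining and balancing procedure is described in the paper itself.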