NSF PAR Search | NSF Public Access Repository


Observation-Transformed Distributed Detection in Sensor Networks

https://doi.org/10.1109/CISS64860.2025.10944735

Cao, Lei; Viswanathan, Ramanarayanan (March 2025, IEEE)

Free, publicly-accessible full text available March 19, 2026
The nucleation and migration of $\{10\bar{1}1\}$ twins in hcp materials

https://doi.org/10.1007/s10853-025-10844-3

Ombogo, Jamie; Vitral, Eduardo; Zahiri, Amir; Cao, Lei (April 2025, Journal of Materials Science)

Free, publicly-accessible full text available April 1, 2026
DATAMORPHER: Automatic Data Transformation using LLM-Based Zero-Shot Code Generation

https://doi.org/10.1109/ICDE65448.2025.00346

Sharma, Ankita; Tandel, Jaykumar; Li, Xuanmao; Wang, Lanjun; Fariha, Anna; Zhang, Liang; Naqvi, Syed_Arsalan_Ahmed; Riaz, Irbaz_Bin; Cao, Lei; Zou, Jia (May 2025, 2025 IEEE 41st International Conference on Data Engineering (ICDE))

Free, publicly-accessible full text available May 7, 2026
Optimal Sensor Decision Rules for Quantized-but-Uncoded Distributed Detection

https://doi.org/10.1109/LSP.2024.3514798

Cao, Lei; Viswanathan, Ramanarayanan (January 2025, IEEE Signal Processing Letters)

Free, publicly-accessible full text available January 1, 2026
Deep-neural-network molecular dynamics investigation of phonon thermal transport in polyether ether ketone

https://doi.org/10.1016/j.commatsci.2024.113641

Cui, Haoran; Hua, Weijian; Cao, Lei; Jin, Yifei; Wang, Yan (February 2025, Computational Materials Science)

Free, publicly-accessible full text available February 1, 2026
Agree to Disagree: Robust Anomaly Detection with Noisy Labels

https://doi.org/10.1145/3709657

Hofmann, Dennis M; VanNostrand, Peter M; Ma, Lei; Zhang, Huayi; DeOliveira, Joshua C; Cao, Lei; Rundensteiner, Elke A (February 2025, Proceedings of the ACM on Management of Data)

Due to the scarcity of reliable anomaly labels, recent anomaly detection methods leveraging noisy auto-generated labels either select clean samples or refurbish noisy labels. However, both approaches struggle due to the unique properties of anomalies. Sample selection often fails to separate sufficiently many clean anomaly samples from noisy ones, while label refurbishment erroneously refurbishes marginal clean samples. To overcome these limitations, we design Unity, the first learning from noisy labels (LNL) approach for anomaly detection that elegantly leverages the merits of both sample selection and label refurbishment to iteratively prepare a diverse clean sample set for network training. Unity uses a pair of deep anomaly networks to collaboratively select samples with clean labels based on prediction agreement, followed by a disagreement resolution mechanism to capture marginal samples with clean labels. Thereafter, Unity utilizes unique properties of anomalies to design an anomaly-centric contrastive learning strategy that accurately refurbishes the remaining noisy labels. The resulting set, composed of selected and refurbished clean samples, will be used to train the anomaly networks in the next training round. Our experimental study on 10 real-world benchmark datasets demonstrates that Unity consistently outperforms state-of-the-art LNL techniques by up to 0.31 in F-1 Score (0.52 → 0.83).
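The agreement-based selection step in the abstract above can be illustrated with a minimal sketch. This is not Unity's actual criterion, which the abstract does not specify. It assumes that a sample counts as "clean" when two detectors' thresholded anomaly scores both match its noisy label. The function name `select_by_agreement`, the score inputs, and the `threshold` parameter are all hypothetical.

```python
def select_by_agreement(scores_a, scores_b, noisy_labels, threshold=0.5):
    """Split sample indices into agreed-clean and disputed sets.

    Assumption (not from the paper): a sample is treated as clean when both
    detectors' thresholded predictions equal its noisy label (1 = anomaly).
    Everything else is left for a later resolution or refurbishment step.
    """
    selected, disputed = [], []
    for idx, (score_a, score_b, label) in enumerate(
        zip(scores_a, scores_b, noisy_labels)
    ):
        pred_a = int(score_a >= threshold)
        pred_b = int(score_b >= threshold)
        if pred_a == pred_b == label:
            selected.append(idx)
        else:
            disputed.append(idx)
    return selected, disputed


# Hypothetical anomaly scores from two detectors and noisy labels.
selected, disputed = select_by_agreement(
    scores_a=[0.9, 0.2, 0.8, 0.1],
    scores_b=[0.8, 0.1, 0.3, 0.2],
    noisy_labels=[1, 0, 1, 1],
)
```

In this toy input, samples 0 and 1 are selected because both detectors agree with the label, while samples 2 and 3 are set aside as disputed.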
Free, publicly-accessible full text available February 10, 2026
Pluto: Sample Selection for Robust Anomaly Detection on Polluted Log Data

https://doi.org/10.1145/3677139

Ma, Lei; Cao, Lei; VanNostrand, Peter M; Hofmann, Dennis M; Su, Yao; Rundensteiner, Elke A (October 2024, Proceedings of the ACM on Management of Data)

Log anomaly detection, critical in identifying system failures and preempting security breaches, finds irregular patterns within large volumes of log data. Modern log anomaly detectors rely on training deep learning models on clean anomaly-free log data. However, such clean log data requires expensive and tedious human labeling. In this paper, we thus propose a robust log anomaly detection framework, Pluto, that automatically selects a clean representative sample subset of the polluted log sequence data to train a Transformer-based anomaly detection model. Pluto features three innovations. First, due to localized concentrations of anomalies inherent in the embedding space of log data, Pluto partitions the sequence embedding space generated by the model into regions that then allow it to identify and discard regions that are highly polluted by our pollution level estimation scheme, based on our pollution quantification via Gaussian mixture modeling. Second, for the remaining more slightly polluted regions, we select samples that maximally purify the eigenvector spectrum, which can be transformed into the NP-hard facility location problem, allowing us to leverage its greedy solution with a (1-(1/e)) approximation guarantee in optimality. Third, by iteratively alternating between the above subset selection, a model re-training on the latest subset, and a subset filtering using dynamic training artifacts generated by the latest model, the data selected is progressively refined. The final sample set is used to retrain the final anomaly detection model. Our experiments on four real-world log benchmark datasets demonstrate that by retaining 77.7% (BGL) to 96.6% (ThunderBird) of the normal sequences while effectively removing 90.3% (BGL) to 100.0% (ThunderBird, HDFS) of the anomalies, Pluto provides a significant absolute F-1 improvement up to 68.86% (2.16% → 71.02%) compared to the state-of-the-art sample selection methods.
The implementation of this work is available at https://github.com/LeiMa0324/Pluto-SIGMOD25.
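The greedy solution to the facility location problem mentioned in the abstract can be sketched generically as follows. This is the textbook greedy algorithm for a monotone submodular objective, not code from the Pluto repository. The similarity matrix, the toy 1-D positions, and the function names are illustrative assumptions.

```python
def facility_location_value(sim, selected):
    """f(S) = sum over all points i of the best similarity to a selected point."""
    if not selected:
        return 0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_facility_location(sim, k):
    """Pick up to k indices by repeatedly adding the largest marginal gain.

    sim[i][j] is a non-negative similarity between point i and candidate j.
    Ties are broken in favour of the lowest index.
    """
    n = len(sim)
    selected = []
    best_cover = [0] * n  # best similarity of each point to the current selection
    for _ in range(min(k, n)):
        best_gain, best_j = -1, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding j: total improvement in coverage.
            gain = sum(max(sim[i][j] - best_cover[i], 0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected


# Toy data (hypothetical): two clusters of 1-D points, integer similarities.
positions = [0, 1, 2, 10, 11, 12]
sim = [[max(0, 5 - abs(a - b)) for b in positions] for a in positions]
chosen = greedy_facility_location(sim, 2)
value = facility_location_value(sim, chosen)
```

On this toy input the greedy pass picks the central point of each cluster (indices 1 and 4). Because the objective is monotone and submodular, the greedy value is guaranteed to be at least (1-(1/e)) of the optimum for the same budget k.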
Full Text Available
Pluto: Sample Selection for Robust Anomaly Detection on Polluted Log Data

Ma, Lei; Cao, Lei; VanNostrand, Peter M; Hofmann, Dennis M; Su, Yao; Rundensteiner, Elke A (September 2024, Proceedings of the ACM on Management of Data)

Log anomaly detection, critical in identifying system failures and preempting security breaches, finds irregular patterns within large volumes of log data. Modern log anomaly detectors rely on training deep learning models on clean anomaly-free log data. However, such clean log data requires expensive and tedious human labeling. In this paper, we thus propose a robust log anomaly detection framework, Pluto, that automatically selects a clean representative sample subset of the polluted log sequence data to train a Transformer-based anomaly detection model. Pluto features three innovations. First, due to localized concentrations of anomalies inherent in the embedding space of log data, Pluto partitions the sequence embedding space generated by the model into regions that then allow it to identify and discard regions that are highly polluted by our pollution level estimation scheme, based on our pollution quantification via Gaussian mixture modeling. Second, for the remaining more slightly polluted regions, we select samples that maximally purify the eigenvector spectrum, which can be transformed into the NP-hard facility location problem, allowing us to leverage its greedy solution with a (1-(1/e)) approximation guarantee in optimality. Third, by iteratively alternating between the above subset selection, a model re-training on the latest subset, and a subset filtering using dynamic training artifacts generated by the latest model, the data selected is progressively refined. The final sample set is used to retrain the final anomaly detection model. Our experiments on four real-world log benchmark datasets demonstrate that by retaining 77.7% (BGL) to 96.6% (ThunderBird) of the normal sequences while effectively removing 90.3% (BGL) to 100.0% (ThunderBird, HDFS) of the anomalies, Pluto provides a significant absolute F-1 improvement up to 68.86% (2.16% → 71.02%) compared to the state-of-the-art sample selection methods.
The implementation of this work is available at https://github.com/LeiMa0324/Pluto-SIGMOD25.
Full Text Available
Analysis of SRAM PUF Integrity Under Ionizing Radiation: Effects of Stored Data and Technology Node

https://doi.org/10.1109/TNS.2023.3340949

Surendranathan, Umeshwarnath; Wilson, Horace; Cao, Lei R; Milenkovic, Aleksandar; Ray, Biswajit (April 2024, IEEE Transactions on Nuclear Science)

Full Text Available
The anisotropy of deformation twinning in bcc materials: Mechanical loading, temperature effect, and twin–twin interaction

https://doi.org/10.1016/j.actamat.2024.119681

Zahiri, Amir Hassan; Lotfpour, Mehrab; Ombogo, Jamie; Vitral, Eduardo; Cao, Lei (March 2024, Acta Materialia)

Full Text Available
