NSF PAR Search | NSF Public Access Repository

A holistic approach for single-cell data trajectory inference using chromosome physical location and ensemble random walk

Cardoza-Aguilar, J.; Milbourn, C.; Zhang, Y.; Yang, L.; Dascalu, S.; Harris, F. (April 2024, 21st International Conference on Information Technology: New Generations)

Feature Collusion Attack on PMU Data-driven Event Classification

https://doi.org/10.1109/ISGT59692.2024.10454151

Ghasemkhani, Amir; Haridas, Rutuja Sanjeev; Sajjadi Mohammadabadi, Seyed Mahmoud; Yang, Lei (February 2024, IEEE)

TopoCommit: A Topological Commit Protocol for Cross-Ledger Transactions in Scientific Computing

https://doi.org/10.1109/CLUSTER52292.2023.00038

Tawose, Olamide Timothy; Yang, Lei; Zhao, Dongfang (October 2023, IEEE)
Towards Distributed Learning of PMU Data: A Federated Learning based Event Classification Approach

https://doi.org/10.1109/PESGM52003.2023.10252920

Mohammadabadi, Seyed Mahmoud; Liu, Yunchuan; Canafe, Abraham; Yang, Lei (July 2023, IEEE)
Toward Efficient Homomorphic Encryption for Outsourced Databases through Parallel Caching

https://doi.org/10.1145/3588920

Tawose, Olamide Timothy; Dai, Jun; Yang, Lei; Zhao, Dongfang (May 2023, Proceedings of the ACM on Management of Data)

Many applications deployed to public clouds, such as financial services and electronic patient records, are concerned about the confidentiality of their outsourced data. A plausible solution to this problem is homomorphic encryption (HE), which supports certain algebraic operations directly over the ciphertexts. The downside of HE schemes is their significant, if not prohibitive, performance overhead for the data-intensive workloads that are very common in outsourced databases, or database-as-a-service in cloud computing. The objective of this work is to mitigate the performance overhead incurred by the HE module in outsourced databases. To that end, this paper proposes a radix-based parallel caching optimization for accelerating homomorphic encryption of outsourced databases in cloud computing. The key insight of the proposed optimization is caching selected radix-ciphertexts in parallel without violating the existing security guarantees of the primitive/base HE scheme. We design the radix HE algorithm and apply it to both batch- and incremental-HE schemes; we demonstrate the security of those radix-based HE schemes by showing that the problem of breaking them can be reduced to the problem of breaking their base HE schemes, which are known to be IND-CPA secure (i.e., Indistinguishability under Chosen-Plaintext Attack). We implement the radix-based schemes as middleware of a 10-node Cassandra cluster on CloudLab; experiments on six workloads show that the proposed caching can boost state-of-the-art HE schemes, such as Paillier and Symmetria, by up to five orders of magnitude.
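The idea of building ciphertexts from cached radix ciphertexts can be illustrated with a toy additive scheme. The sketch below is not the paper's algorithm: it assumes textbook Paillier with tiny, insecure primes, and the cache layout and the helper names `build_radix_cache` and `encrypt_from_cache` are hypothetical. It only shows why multiplying cached ciphertexts of digit values yields a valid encryption of the full number.

```python
import math
import secrets

# Toy Paillier parameters: tiny primes for illustration only (NOT secure).
P, Q = 1789, 1861
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)  # modular inverse; this shortcut is valid because g = n + 1


def encrypt(m: int) -> int:
    """Paillier encryption: c = g^m * r^n mod n^2 for a random r coprime to n."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Paillier decryption: m = L(c^lambda mod n^2) * mu mod n, with L(x) = (x-1)/n."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def build_radix_cache(radix: int, num_digits: int) -> dict:
    """Hypothetical cache: one ciphertext of d * radix^i per digit d and position i."""
    return {
        (i, d): encrypt(d * radix**i)
        for i in range(num_digits)
        for d in range(1, radix)
    }


def encrypt_from_cache(m: int, radix: int, cache: dict) -> int:
    """Assemble Enc(m) by homomorphically adding cached radix ciphertexts."""
    c = encrypt(0)  # fresh encryption of zero re-randomizes the result
    i = 0
    while m > 0:
        m, d = divmod(m, radix)
        if d:
            c = (c * cache[(i, d)]) % N2  # ciphertext product = plaintext sum
        i += 1
    return c
```

In this sketch, a cache built with `build_radix_cache(10, 6)` covers plaintexts below one million, and each encryption then costs one fresh `encrypt(0)` plus a few modular multiplications instead of a large modular exponentiation of the message.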
Drifting Streaming Peaks-over-Threshold-Enhanced Self-Evolving Neural Networks for Short-Term Wind Farm Generation Forecast

https://doi.org/10.3390/fi15010017

Liu, Yunchuan; Ghasemkhani, Amir; Yang, Lei (January 2023, Future Internet)

This paper investigates the short-term wind farm generation forecast. It is observed from the real wind farm generation measurements that wind farm generation exhibits distinct features, such as the non-stationarity and the heterogeneous dynamics of ramp and non-ramp events across different classes of wind turbines. To account for the distinct features of wind farm generation, we propose a Drifting Streaming Peaks-over-Threshold (DSPOT)-enhanced self-evolving neural networks-based short-term wind farm generation forecast. Using DSPOT, the proposed method first classifies the wind farm generation data into ramp and non-ramp datasets, where time-varying dynamics are taken into account by utilizing dynamic ramp thresholds to separate the ramp and non-ramp events. We then train different neural networks based on each dataset to learn the different dynamics of wind farm generation by the NeuroEvolution of Augmenting Topologies (NEAT), which can obtain the best network topology and weighting parameters. As the efficacy of the neural networks relies on the quality of the training datasets (i.e., the classification accuracy of the ramp and non-ramp events), a Bayesian optimization-based approach is developed to optimize the parameters of DSPOT to enhance the quality of the training datasets and the corresponding performance of the neural networks. Based on the developed self-evolving neural networks, both distributional and point forecasts are developed. The experimental results show that compared with other forecast approaches, the proposed forecast approach can substantially improve the forecast accuracy, especially for ramp events. The experiment results indicate that the accuracy improvement in a 60 min horizon forecast in terms of the mean absolute error (MAE) is at least 33.6% for the whole year data and at least 37% for the ramp events. 
Moreover, the distributional forecast in terms of the continuous rank probability score (CRPS) is improved by at least 35.8% for the whole year data and at least 35.2% for the ramp events.
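The split into ramp and non-ramp data with a time-varying threshold can be sketched as follows. This is a simplified stand-in, not DSPOT itself: it replaces the extreme-value threshold estimation of peaks-over-threshold methods with an empirical quantile over a sliding window, and the function name `label_ramps` and its parameters `window` and `quantile` are assumptions made for illustration.

```python
from collections import deque


def label_ramps(power, window=6, quantile=0.9):
    """Label each step as ramp (True) or non-ramp (False).

    A step is a ramp when the absolute change in generation exceeds a
    threshold re-estimated from the previous `window` changes, so the
    threshold drifts with the data.
    """
    recent = deque(maxlen=window)  # sliding history of absolute changes
    labels = []
    for prev, cur in zip(power, power[1:]):
        change = abs(cur - prev)
        if len(recent) == window:
            ordered = sorted(recent)
            # Empirical quantile of the recent changes (simplified threshold).
            threshold = ordered[min(window - 1, int(quantile * window))]
            labels.append(change > threshold)
        else:
            labels.append(False)  # not enough history yet
        recent.append(change)
    return labels
```

The resulting labels could then be used to route samples into two training sets, one per class of dynamics, which is the role the abstract assigns to the classification step.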
AI Service Placement for Multi-Access Edge Intelligence Systems in 6G

https://doi.org/10.1109/TNSE.2022.3228815

Li, Jiaxin; Lin, Fuhong; Yang, Lei; Huang, Daochao (January 2023, IEEE Transactions on Network Science and Engineering)

Nemo: An Open-Source Transformer-Supercharged Benchmark for Fine-Grained Wildfire Smoke Detection

https://doi.org/10.3390/rs14163979

Yazdi, Amirhessam; Qin, Heyang; Jordan, Connor B.; Yang, Lei; Yan, Feng (August 2022, Remote Sensing)

Deep-learning (DL)-based object detection algorithms can greatly benefit the community at large in fighting fires, advancing climate intelligence, and reducing health complications caused by hazardous smoke particles. Existing DL-based techniques, which are mostly based on convolutional networks, have proven to be effective in wildfire detection. However, there is still room for improvement. First, existing methods tend to have some commercial aspects, with limited publicly available data and models. In addition, studies aiming at the detection of wildfires at the incipient stage are rare. Smoke columns at this stage tend to be small, shallow, and often far from view, with low visibility. This makes finding and labeling enough data to train an efficient deep learning model very challenging. Finally, the inherent locality of convolution operators limits their ability to model long-range correlations between objects in an image. Recently, encoder–decoder transformers have emerged as interesting solutions beyond natural language processing to help capture global dependencies via self- and inter-attention mechanisms. We propose Nemo: a set of evolving, free, and open-source datasets, processed in standard COCO format, and wildfire smoke and fine-grained smoke density detectors, for use by the research community. We adapt Facebook’s DEtection TRansformer (DETR) to wildfire detection, which results in a much simpler technique, where the detection does not rely on convolution filters and anchors. Nemo is the first open-source benchmark for wildfire smoke density detection and Transformer-based wildfire smoke detection tailored to the early incipient stage. Two popular object detection algorithms (Faster R-CNN and RetinaNet) are used as alternatives and baselines for extensive evaluation. Our results confirm the superior performance of the transformer-based method in wildfire smoke detection across different object sizes. 
Moreover, we tested our model with 95 video sequences of wildfire starts from the public HPWREN database. Our model detected 97.9% of the fires in the incipient stage and 80% within 5 min from the start. On average, our model detected wildfire smoke within 3.6 min from the start, outperforming the baselines.
Optimizing inference serving on serverless platforms

https://doi.org/10.14778/3547305.3547313

Ali, Ahsan; Pinciroli, Riccardo; Yan, Feng; Smirni, Evgenia (June 2022, Proceedings of the VLDB Endowment)

Serverless computing is gaining popularity for machine learning (ML) serving workloads due to its autonomous resource scaling, ease of use, and pay-per-use cost model. Existing serverless platforms work well for image-based ML inference, where requests are homogeneous in service demands. That said, recent advances in natural language processing could not fully benefit from existing serverless platforms, as their requests are intrinsically heterogeneous. Batching requests for processing can significantly increase ML serving efficiency while reducing monetary cost, thanks to the pay-per-use pricing model adopted by serverless platforms. Yet, batching heterogeneous ML requests leads to additional computation overhead, as small requests need to be "padded" to the same size as large requests within the same batch. Reaching effective batching decisions (i.e., which requests should be batched together and why) is non-trivial: the padding overhead coupled with the serverless auto-scaling forms a complex optimization problem. To address this, we develop Multi-Buffer Serving (MBS), a framework that optimizes the batching of heterogeneous ML inference serving requests to minimize their monetary cost while meeting their service level objectives (SLOs). The core of MBS is a performance and cost estimator driven by analytical models supercharged by a Bayesian optimizer. MBS is prototyped and evaluated on AWS using bursty workloads. Experimental results show that MBS preserves SLOs while outperforming the state of the art by up to 8x in cost savings, reducing the padding overhead by up to 37x, and using 3x fewer serverless function invocations.
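The padding overhead described above can be made concrete with a small calculation. The sketch below is illustrative only and is not the MBS estimator or its batching policy: the cost model (work proportional to batch size times the longest request) and the helper names `padded_cost`, `padding_overhead`, and `split_into_buffers` are assumptions.

```python
def padded_cost(batch):
    """Total work for a batch when every request is padded to the longest one."""
    return len(batch) * max(batch) if batch else 0


def padding_overhead(batches):
    """Padded work minus useful work, summed over all batches."""
    return sum(padded_cost(b) - sum(b) for b in batches)


def split_into_buffers(lengths, boundaries):
    """Route each request to a buffer by size; `boundaries` must be ascending."""
    buffers = [[] for _ in range(len(boundaries) + 1)]
    for n in lengths:
        idx = sum(n > b for b in boundaries)  # count of boundaries below n
        buffers[idx].append(n)
    return [b for b in buffers if b]


# Six requests of mixed size (e.g., token counts).
lengths = [8, 12, 120, 128, 10, 125]
single = padding_overhead([lengths])
multi = padding_overhead(split_into_buffers(lengths, boundaries=[64]))
print(single, multi)
```

Grouping requests of similar size into separate buffers before batching cuts the wasted padded work sharply in this example, which is the intuition behind batching heterogeneous requests in multiple buffers.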
Data Imputation for Multivariate Time Series Sensor Data With Large Gaps of Missing Data

https://doi.org/10.1109/JSEN.2022.3166643

Wu, Rui; Hamshaw, Scott D.; Yang, Lei; Kincaid, Dustin W.; Etheridge, Randall; Ghasemkhani, Amir (June 2022, IEEE Sensors Journal)
