Search for: All records

Award ID contains: 2140247

Total Resources: 8

Resource Type
Conference Paper: 8
Conference Proceeding: 0
Dataset: 0
Journal Article: 0
Workshop Report: 0

Availability
Full Text / Resource Available: 4
Citation Only: 4

LISSNAS: Locality-based Iterative Search Space Shrinkage for Neural Architecture Search

https://doi.org/10.24963/ijcai.2023/86

Gopal, Bhavna ; Sridhar, Arjun ; Zhang, Tunhou ; Chen, Yiran ( August 2023 , The Thirty-Second International Joint Conference on Artificial Intelligence)

Search spaces hallmark the advancement of Neural Architecture Search (NAS). Large and complex search spaces with versatile building operators and structures provide more opportunities to brew promising architectures, yet pose severe challenges on efficient exploration and exploitation. Subsequently, several search space shrinkage methods optimize by selecting a single sub-region that contains some well-performing networks. Small performance and efficiency gains are observed with these methods but such techniques leave room for significantly improved search performance and are ineffective at retaining architectural diversity. We propose LISSNAS, an automated algorithm that shrinks a large space into a diverse, small search space with SOTA search performance. Our approach leverages locality, the relationship between structural and performance similarity, to efficiently extract many pockets of well-performing networks. We showcase our method on an array of search spaces spanning various sizes and datasets. We accentuate the effectiveness of our shrunk spaces when used in one-shot search by achieving the best Top-1 accuracy in two different search spaces. Our method achieves a SOTA Top-1 accuracy of 77.6% on ImageNet under mobile constraints, best-in-class Kendall-Tau, architectural diversity, and search space size.
Free, publicly-accessible full text available August 1, 2024
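
The LISSNAS abstract above reports a Kendall-Tau score, a rank-correlation measure commonly used in NAS to check how well a cheap proxy ranking of architectures agrees with their true ranking. The following is a minimal sketch of the tau-a variant in plain Python; the accuracy lists are hypothetical illustration values, not results from the paper, and the paper may use a different variant.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    num_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / num_pairs


# Hypothetical proxy (e.g. supernet) vs. stand-alone accuracies for 4 architectures.
proxy_acc = [0.61, 0.70, 0.66, 0.74]
true_acc = [0.70, 0.75, 0.72, 0.78]
print(kendall_tau(proxy_acc, true_acc))  # both lists rank the architectures identically
```

A value near 1 means the proxy preserves the true ordering; a value near -1 means it reverses it.
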
Fed-CBS: A Heterogeneity-Aware Client Sampling Mechanism for Federated Learning via Class-Imbalance Reduction

Zhang, Jianyi ; Li, Ang ; Tang, Minxue ; Sun, Jingwei ; Chen, Xiang ; Zhang, Fan ; Chen, Changyou ; Chen, Yiran ; Li, Hai ( July 2023 , International Conference on Machine Learning)

Due to the often limited communication bandwidth of edge devices, most existing federated learning (FL) methods randomly select only a subset of devices to participate in training at each communication round. Compared with engaging all the available clients, such a random-selection mechanism could lead to significant performance degradation on non-IID (independent and identically distributed) data. In this paper, we present our key observation that the essential reason resulting in such performance degradation is the class-imbalance of the grouped data from randomly selected clients. Based on this observation, we design an efficient heterogeneity-aware client sampling mechanism, namely, Federated Class-balanced Sampling (Fed-CBS), which can effectively reduce class-imbalance of the grouped dataset from the intentionally selected clients. We first propose a measure of class-imbalance which can be derived in a privacy-preserving way. Based on this measure, we design a computation-efficient client sampling strategy such that the actively selected clients will generate a more class-balanced grouped dataset with theoretical guarantees. Experimental results show that Fed-CBS outperforms the status quo approaches in terms of test accuracy and the rate of convergence while achieving comparable or even better performance than the ideal setting where all the available clients participate in the FL training.
Free, publicly-accessible full text available July 23, 2024
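
The Fed-CBS abstract above describes selecting clients so that their pooled data is more class-balanced. The sketch below illustrates that general idea only, under stated assumptions: the imbalance score (squared distance of the pooled class proportions from uniform) and the greedy selection loop are illustrative stand-ins, not the measure or sampling strategy defined in the paper, and the privacy-preserving derivation is not modeled. Every client is assumed to hold at least one sample.

```python
def class_imbalance(counts):
    """Assumed score: squared distance of class proportions from uniform (0 = balanced)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("grouped data must contain at least one sample")
    uniform = 1.0 / len(counts)
    return sum((c / total - uniform) ** 2 for c in counts)


def greedy_balanced_selection(client_counts, num_selected):
    """Greedily add the client that makes the pooled class counts most balanced.

    client_counts[i][k] is the number of class-k samples held by client i.
    Returns (selected client indices, pooled class counts).
    """
    if not 0 < num_selected <= len(client_counts):
        raise ValueError("num_selected must be between 1 and the number of clients")
    grouped = [0] * len(client_counts[0])
    remaining = list(range(len(client_counts)))
    selected = []
    for _ in range(num_selected):
        def score(cid):
            pooled = [g + c for g, c in zip(grouped, client_counts[cid])]
            return class_imbalance(pooled)

        best = min(remaining, key=score)  # ties resolve to the lowest index
        remaining.remove(best)
        selected.append(best)
        grouped = [g + c for g, c in zip(grouped, client_counts[best])]
    return selected, grouped


# Hypothetical per-client label histograms for a 2-class problem.
clients = [[10, 0], [0, 10], [9, 1], [10, 0]]
selected, grouped = greedy_balanced_selection(clients, 2)
print(selected, grouped)
```

Random selection could pick clients 0 and 3 here and train on a single class for that round; the greedy loop avoids that by pairing clients with complementary label distributions.
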
ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models

Zhang, Jianyi ; Muhamed, Aashiq ; Anantharaman, Aditya ; Wang, Guoyin ; Chen, Changyou ; Zhong, Kai ; Cui, Qingjun ; Xu, Yi ; Zeng, Belinda ; Chilimbi, Trishul ; et al ( July 2023 , The 61st Annual Meeting of the Association for Computational Linguistics)

Knowledge Distillation (KD) (Hinton et al., 2015) is one of the most effective approaches for deploying large-scale pre-trained language models in low-latency environments by transferring the knowledge contained in the large-scale models to smaller student models. Previous KD approaches use the soft labels and intermediate activations generated by the teacher to transfer knowledge to the student model parameters alone. In this paper, we show that having access to non-parametric memory in the form of a knowledge base with the teacher’s soft labels and predictions can further enhance student capacity and improve generalization. To enable the student to retrieve from the knowledge base effectively, we propose a new Retrieval-augmented KD framework with a loss function that aligns the relational knowledge in teacher and student embedding spaces. We show through extensive experiments that our retrieval mechanism can achieve state-of-the-art performance for task-specific knowledge distillation on the GLUE benchmark (Wang et al., 2018a).
Free, publicly-accessible full text available July 9, 2024
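
The ReAugKD abstract above describes a student that retrieves from a knowledge base of teacher soft labels. A minimal sketch of one way such retrieval could work is shown below, assuming dot-product similarity, top-k lookup, and softmax-weighted averaging of the retrieved soft labels. These choices, the function name, and the toy knowledge base are assumptions for illustration; the abstract does not specify the retrieval or aggregation details, and the training loss is not modeled.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_soft_label(query, kb_embeddings, kb_soft_labels, k=2, temperature=1.0):
    """Average the soft labels of the k most similar knowledge-base entries.

    Weights are a softmax over the dot-product similarities of the top-k entries.
    """
    if not 0 < k <= len(kb_embeddings):
        raise ValueError("k must be between 1 and the knowledge-base size")
    sims = [dot(query, emb) for emb in kb_embeddings]
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    max_sim = max(sims[i] for i in top)  # subtract the max for numerical stability
    weights = [math.exp((sims[i] - max_sim) / temperature) for i in top]
    norm = sum(weights)
    num_classes = len(kb_soft_labels[0])
    return [
        sum(w * kb_soft_labels[i][c] for w, i in zip(weights, top)) / norm
        for c in range(num_classes)
    ]


# Hypothetical knowledge base: embeddings paired with teacher soft labels.
kb_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
kb_soft_labels = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
student_query = [1.0, 0.0]
print(retrieve_soft_label(student_query, kb_embeddings, kb_soft_labels, k=2))
```

Because the weights are normalized, the output remains a probability distribution whenever the stored soft labels are.
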
NASRec: Weight Sharing Neural Architecture Search for Recommender Systems

https://doi.org/10.1145/3543507.3583446

Zhang, Tunhou ; Cheng, Dehua ; He, Yuchen ; Chen, Zhengxing ; Dai, Xiaoliang ; Xiong, Liang ; Yan, Feng ; Li, Hai ; Chen, Yiran ; Wen, Wei ( April 2023 , the ACM Web Conference 2023)

The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that trains a single supernet and efficiently produces abundant models/sub-architectures by weight sharing. To overcome the data multi-modality and architecture heterogeneity challenges in the recommendation domain, NASRec establishes a large supernet (i.e., search space) to search the full architectures. The supernet incorporates a versatile choice of operators and dense connectivity to minimize the human effort needed to find priors. The scale and heterogeneity in NASRec impose several challenges, such as training inefficiency, operator-imbalance, and degraded rank correlation. We tackle these challenges by proposing single-operator any-connection sampling, operator-balancing interaction modules, and post-training fine-tuning. Our crafted models, NASRecNet, show promising results on three Click-Through Rate (CTR) prediction benchmarks, indicating that NASRec outperforms both manually designed models and existing NAS methods with state-of-the-art performance. Our work is publicly available here.
Free, publicly-accessible full text available April 30, 2024
Mixture Outlier Exposure: Towards Out-of-Distribution Detection in Fine-grained Environments

Zhang, Jingyang ; Inkawhich, Nathan ; Linderman, Randolph ; Chen, Yiran ; Li, Hai ( January 2023 , the IEEE/CVF Winter Conference on Applications of Computer Vision)

Many real-world scenarios in which DNN-based recognition systems are deployed have inherently fine-grained attributes (e.g., bird-species recognition, medical image classification). In addition to achieving reliable accuracy, a critical subtask for these models is to detect out-of-distribution (OOD) inputs. Given the nature of the deployment environment, one may expect such OOD inputs to also be fine-grained w.r.t. the known classes (e.g., a novel bird species), which are thus extremely difficult to identify. Unfortunately, OOD detection in fine-grained scenarios remains largely underexplored. In this work, we aim to fill this gap by first carefully constructing four large-scale fine-grained test environments, in which existing methods are shown to have difficulties. Particularly, we find that even explicitly incorporating a diverse set of auxiliary outlier data during training does not provide sufficient coverage over the broad region where fine-grained OOD samples are located. We then propose Mixture Outlier Exposure (MixOE), which mixes ID data and training outliers to expand the coverage of different OOD granularities, and trains the model such that the prediction confidence linearly decays as the input transitions from ID to OOD. Extensive experiments and analyses demonstrate the effectiveness of MixOE for building OOD detectors in fine-grained environments. The code is available at https://github.com/zjysteven/MixOE.
Full Text Available
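
The MixOE abstract above describes mixing ID data with training outliers so that prediction confidence decays linearly from ID to OOD. A minimal sketch of that idea follows, assuming a simple linear (mixup-style) blend of inputs and a soft target that interpolates between the one-hot label and the uniform distribution. The function names and toy vectors are hypothetical, and the repository linked in the abstract remains the authoritative implementation.

```python
def one_hot(label, num_classes):
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def mix_with_outlier(x_id, label, x_outlier, num_classes, lam):
    """Blend an ID input with an outlier and build the matching soft target.

    lam = 1 keeps the pure ID sample (one-hot target); lam = 0 gives the pure
    outlier (uniform target). The target confidence on the true class is
    lam + (1 - lam) / num_classes, which is linear in lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(x_id) != len(x_outlier):
        raise ValueError("inputs must have the same shape")
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_id, x_outlier)]
    uniform = 1.0 / num_classes
    target = [lam * t + (1.0 - lam) * uniform for t in one_hot(label, num_classes)]
    return mixed, target


# Hypothetical flattened inputs and a 4-class problem.
x_id = [1.0, 0.0, 2.0]
x_outlier = [0.0, 4.0, 0.0]
mixed, target = mix_with_outlier(x_id, label=1, x_outlier=x_outlier, num_classes=4, lam=0.5)
print(mixed)   # [0.5, 2.0, 1.0]
print(target)  # [0.125, 0.625, 0.125, 0.125]
```

Training against such soft targets (for example with a cross-entropy loss) is what would push the model's confidence down as inputs move away from the ID data.
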
An Audio Frequency Unfolding Framework for Ultra-Low Sampling Rate Sensors

https://doi.org/10.1109/ISQED54688.2022.9806149

Gao, Zhihui ; Tang, Minxue ; Li, Ang ; Chen, Yiran ( April 2022 , 2022 23rd International Symposium on Quality Electronic Design)

Full Text Available
FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective

Sun, Jingwei ; Li, Ang ; DiValentin, Louis ; Hassanzadeh, Amin ; Chen, Yiran ; Li, Hai ( December 2021 , Annual Conference on Neural Information Processing Systems (NeurIPS))

Full Text Available
FedMask: Joint Computation and Communication-Efficient Personalized Federated Learning via Heterogeneous Masking

https://doi.org/10.1145/3485730.3485929

Li, Ang ; Sun, Jingwei ; Zeng, Xiao ; Zhang, Mi ; Li, Hai ; Chen, Yiran ( January 2021 , ACM Conference on Embedded Networked Sensor Systems (SenSys'21))

Full Text Available