Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g., image classification on CIFAR or ImageNet. As a result, the performance of NAS approaches in more diverse areas is poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to address the following question: do state-of-the-art NAS methods perform well on diverse tasks? To construct the benchmark, we curate ten tasks spanning a diverse array of application domains, dataset sizes, problem dimensionalities, and learning objectives. Each task is carefully chosen to interoperate with modern CNN-based search methods while possibly being far afield from its original development domain. To speed up and reduce the cost of NAS research, for two of the tasks we release the precomputed performance of 15,625 architectures comprising a standard CNN search space. Experimentally, we show the need for the more robust NAS evaluation that NAS-Bench-360 enables: several modern NAS procedures perform inconsistently across the ten tasks, with many catastrophically poor results. We also demonstrate how NAS-Bench-360 and its associated precomputed results can enable future scientific discoveries by testing whether several recent hypotheses promoted in the NAS literature hold on diverse tasks. NAS-Bench-360 is hosted at https://nb360.ml.cmu.edu.
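For a sense of how such precomputed results are typically consumed, here is a minimal sketch of table-lookup NAS evaluation. The 6-edge, 5-operation encoding matches the 5^6 = 15,625 architecture count in the abstract, but the operation names and the `lookup` interface are hypothetical stand-ins, not the actual NAS-Bench-360 API.

```python
# Minimal sketch of querying a precomputed NAS benchmark table.
# The op set and lookup dict are illustrative assumptions.
import itertools
import random

OPS = ["none", "skip", "conv1x1", "conv3x3", "avgpool"]  # example op set

# A 15,625-entry space arises from 6 cell edges with 5 candidate ops (5**6).
SPACE = list(itertools.product(OPS, repeat=6))
assert len(SPACE) == 5 ** 6  # 15,625 architectures

# Pretend-precomputed accuracies keyed by architecture encoding; a real
# benchmark would load these from released result files.
lookup = {arch: random.random() for arch in SPACE}

def random_search(n_samples: int = 100):
    """Evaluate a NAS baseline instantly via table lookup instead of training."""
    candidates = random.sample(SPACE, n_samples)
    return max(candidates, key=lambda arch: lookup[arch])

best = random_search()
print(best, lookup[best])
```

Because every candidate is a dictionary lookup rather than a multi-hour training run, search algorithms can be compared over many seeds at negligible cost.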
Network Compression via Cooperative Architecture Search and Distillation
Neural architecture search (NAS) and its variants have recently become competitive in many computer vision tasks. In this paper, we develop a Cooperative Architecture Search and Distillation (CASD) method for network compression. Compared with prior art, our method achieves better performance in ResNet-164 pruning on CIFAR-10 and CIFAR-100 image classification, and shows promise for extension to other tasks.
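The abstract does not spell out the CASD objective, but the general search-plus-distillation recipe it builds on can be sketched as follows: a pruned student is trained to mimic the unpruned teacher. The temperature `T` and weighting `alpha` here are illustrative defaults, not values from the CASD paper.

```python
# Hedged sketch of a distillation loss for a pruned (student) network
# supervised by the original (teacher) network.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target KL term transfers the teacher's knowledge; scaling by
    # T*T keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```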
- PAR ID:
- 10335367
- Date Published:
- Journal Name:
- Proceedings IEEE International Conference on Artificial Intelligence for Industries
- ISSN:
- 2770-4718
- Page Range / eLocation ID:
- 42-43
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Bi-level optimization methods in machine learning have proven effective in subdomains such as neural architecture search and data reweighting. However, most of these methods do not account for variations in learning difficulty, which limits their performance in real-world applications. To address this problem, we propose a framework that imitates the learning process of humans. In human learning, learners usually focus more on the topics where mistakes have been made in the past to deepen their understanding and master the knowledge. Inspired by this effective human learning technique, we propose a multilevel optimization framework, learning from mistakes (LFM), for machine learning. We formulate LFM as a three-stage optimization problem: 1) the learner learns, 2) the learner relearns based on the mistakes made before, and 3) the learner validates its learning. We develop an efficient algorithm to solve the optimization problem. We further apply our method to differentiable neural architecture search and data reweighting. Extensive experiments on CIFAR-10, CIFAR-100, ImageNet, and other related datasets demonstrate the effectiveness of our approach. The code of LFM is available at: https://github.com/importZL/LFM.
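Below is a minimal sketch of the three-stage loop described above, assuming plain SGD inner steps and a single learnable scalar mistake weight. It illustrates the formulation only, not the authors' implementation (see the linked repository); the weight updates are functional and local to the step, and a full training loop would write them back.

```python
# Illustrative three-stage LFM step: learn, relearn on mistakes, validate.
import torch
import torch.nn.functional as F
from torch.func import functional_call

def lfm_step(model, meta_w, train_batch, val_batch, lr=0.01, meta_lr=1e-3):
    x, y = train_batch
    xv, yv = val_batch
    params = dict(model.named_parameters())

    # Stage 1: the learner learns (one SGD step on the training loss).
    loss1 = F.cross_entropy(functional_call(model, params, (x,)), y)
    grads = torch.autograd.grad(loss1, tuple(params.values()))
    params = {k: p - lr * g for (k, p), g in zip(params.items(), grads)}

    # Stage 2: the learner relearns, upweighting examples it got wrong;
    # this step is kept differentiable with respect to meta_w.
    logits = functional_call(model, params, (x,))
    wrong = (logits.argmax(1) != y).float()
    per_ex = F.cross_entropy(logits, y, reduction="none")
    loss2 = ((1.0 + meta_w * wrong) * per_ex).mean()
    grads = torch.autograd.grad(loss2, tuple(params.values()), create_graph=True)
    params = {k: p - lr * g for (k, p), g in zip(params.items(), grads)}

    # Stage 3: the learner validates; the validation loss updates meta_w
    # by differentiating through the stage-2 update.
    loss3 = F.cross_entropy(functional_call(model, params, (xv,)), yv)
    meta_grad, = torch.autograd.grad(loss3, meta_w)
    with torch.no_grad():
        meta_w -= meta_lr * meta_grad
    return loss3.item()
```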
-
Learning from one's mistakes is an effective human learning technique in which learners focus more on the topics where mistakes were made, so as to deepen their understanding. In this paper, we investigate whether this human learning strategy can be applied in machine learning. We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on the mistakes during revision. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner relearns, focusing on the mistakes; and 3) the learner validates its learning. We develop an efficient algorithm to solve the LFM problem. We apply the LFM framework to neural architecture search on CIFAR-10, CIFAR-100, and ImageNet. Experimental results demonstrate the effectiveness of our model.
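Since this abstract describes the same three-stage formulation, here is a toy invocation of the `lfm_step` sketch given after the previous abstract; the model, shapes, and hyperparameters are entirely arbitrary.

```python
# Toy usage of the lfm_step sketch above.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
meta_w = torch.tensor(1.0, requires_grad=True)  # learnable mistake weight

train_batch = (torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))
val_batch = (torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))

val_loss = lfm_step(model, meta_w, train_batch, val_batch)
print(f"validation loss: {val_loss:.4f}, mistake weight: {meta_w.item():.4f}")
```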
-
Many researchers have shown that transformers perform as well as convolutional neural networks in many computer vision tasks. Meanwhile, the large computational cost of the attention module hinders further studies and applications on edge devices. Some pruning methods have been developed to construct efficient vision transformers, but most consider only image classification. Inspired by these results, we propose SiDT, a method for pruning vision transformer backbones on more complicated vision tasks such as object detection, based on a search over transformer dimensions. Experiments on the CIFAR-100 and COCO datasets show that backbones with 20% or 40% of dimensions/parameters pruned can achieve similar or even better performance than the unpruned models. We also provide a complexity analysis and comparisons with previous pruning methods.
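As a hedged sketch of what dimension pruning of a transformer block can look like, the snippet below shrinks the hidden dimension of an MLP sub-block. Magnitude scoring stands in for the dimension search that SiDT actually performs; the layer sizes are invented.

```python
# Prune the hidden dimension of a transformer MLP block (fc1 -> act -> fc2)
# by keeping the hidden units with the largest incoming weight norms.
import torch
import torch.nn as nn

def prune_mlp(fc1: nn.Linear, fc2: nn.Linear, keep_ratio: float = 0.6):
    """Rebuild smaller linear layers that keep only the top-scoring units."""
    hidden = fc1.out_features
    k = max(1, int(hidden * keep_ratio))
    scores = fc1.weight.norm(dim=1)          # score each hidden unit
    keep = scores.topk(k).indices.sort().values

    new_fc1 = nn.Linear(fc1.in_features, k)
    new_fc2 = nn.Linear(k, fc2.out_features)
    with torch.no_grad():
        new_fc1.weight.copy_(fc1.weight[keep])
        new_fc1.bias.copy_(fc1.bias[keep])
        new_fc2.weight.copy_(fc2.weight[:, keep])
        new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2

fc1, fc2 = nn.Linear(384, 1536), nn.Linear(1536, 384)
fc1_p, fc2_p = prune_mlp(fc1, fc2, keep_ratio=0.6)  # ~40% of dims pruned
print(fc1_p, fc2_p)
```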
-
The ability to automatically generate a neural network architecture and the corresponding hardware implementation that jointly optimize accuracy and performance characteristics (latency, power) is becoming prevalent for edge-based Artificial Intelligence (AI) applications. Because both neural architecture search (NAS) and hardware implementation have ample design spaces, integrating them with resource-constrained edge computing hardware is very challenging, and current co-search frameworks take several hundred GPU hours to converge. In this paper, we propose SCORCH, a novel neural architecture search and hardware accelerator co-design framework that uses reinforcement learning to maximize accuracy and increase energy efficiency and throughput while converging faster. By predicting the hyperparameters of neural networks together with hardware resources, our reinforcement-based multi-phased controller explores neural architectures that achieve higher accuracy and hardware performance simultaneously by applying customized dataflows, voltage/frequency scaling, and tunable Network-on-Chip (NoC) hardware parameters. Our simulation results on the CIFAR-10/100 and ImageNet datasets show that SCORCH achieves 2.6% higher accuracy and 35.6%, 26.2%, and 65.8% reductions in latency, energy, and area, respectively, compared with state-of-the-art co-search frameworks such as DANCE, NANDS, and NASAIC.
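Below is a drastically simplified sketch of reinforcement-driven co-search: a REINFORCE controller with independent categorical policies over a few invented architecture and hardware knobs, driven by a fabricated reward model. SCORCH's multi-phased controller, dataflows, and NoC parameters are far richer than this.

```python
# Toy RL co-search: sample a joint architecture/hardware config, score it,
# and update the controller's logits with REINFORCE.
import torch

SPACES = {
    "depth": [8, 14, 20],        # architecture knobs
    "width": [16, 32, 64],
    "voltage": [0.7, 0.9, 1.1],  # hardware knobs
    "noc_links": [2, 4, 8],
}
logits = {k: torch.zeros(len(v), requires_grad=True) for k, v in SPACES.items()}
opt = torch.optim.Adam(logits.values(), lr=0.05)

def reward(cfg):
    # Fabricated stand-in for trained accuracy/latency/energy estimators.
    acc = 0.5 + 0.005 * cfg["depth"] + 0.002 * cfg["width"]
    latency = cfg["depth"] * cfg["width"] / (cfg["noc_links"] * cfg["voltage"])
    energy = cfg["voltage"] ** 2 * cfg["width"]
    return acc - 1e-4 * latency - 1e-3 * energy

baseline = 0.0
for step in range(300):
    dists = {k: torch.distributions.Categorical(logits=l) for k, l in logits.items()}
    idx = {k: d.sample() for k, d in dists.items()}
    cfg = {k: SPACES[k][i.item()] for k, i in idx.items()}
    r = reward(cfg)
    baseline = 0.9 * baseline + 0.1 * r     # moving-average variance reduction
    log_prob = sum(d.log_prob(idx[k]) for k, d in dists.items())
    loss = -(r - baseline) * log_prob        # REINFORCE objective
    opt.zero_grad(); loss.backward(); opt.step()

best = {k: SPACES[k][l.argmax().item()] for k, l in logits.items()}
print(best)
```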