NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

OpenICS: Open image compressive sensing toolbox and benchmark

https://doi.org/10.1016/j.simpa.2021.100081

Zhao, Jonathan; Lakatos-Tóth, Márk; Westerham, Matthew; Zhang, Zhikang; Moskoff, Avi; Ren, Fengbo (August 2021, Software impacts)

The real-world application of image compressive sensing is largely limited by the lack of standardization in implementation and evaluation. To address this limitation, we present OpenICS, an image compressive sensing toolbox that implements multiple popular image compressive sensing algorithms into a unified framework with a standardized user interface. Furthermore, a corresponding benchmark is also proposed to provide a fair and complete evaluation of the implemented algorithms. We hope this work can serve the growing research community of compressive sensing and the industry to facilitate the development and application of image compressive sensing.
more » « less
Full Text Available
Light-Weight RetinaNet for Object Detection on Edge Devices

https://doi.org/10.1109/WF-IoT48130.2020.9221150

Li, Yixing; Dua, Akshay; Ren, Fengbo (June 2020, IEEE 6th World Forum on Internet of Things (WF-IoT))
null (Ed.)
This paper aims at reducing computation for Retinanet, an mAP-30-tier network, to facilitate its practical deployment on edge devices for providing IoT-based object detection services. We first validate RetinaNet has the best FLOP-mAP trade-off among all mAP-30-tier network. Then, we propose a light-weight RetinaNet structure with effective computation- accuracy trade-off by only reducing FLOPs in computationally intensive layers. Compared with the most common way of trading off computation with accuracy-input image scaling, the proposed solution shows a consistently better FLOPs-mAP trade-off curve. Light-weight RetinaNet achieves a 0.3% mAP improvement at 1.8x FLOPs reduction point over the original RetinaNet, and gains 1.8x more energy-efficiency on an Intel Arria 10 FPGA accelerator in the context of edge computing. The proposed method potentially can help a wide range of the object detection applications to move closer to a preferred corner for a better runtime and accuracy, while enjoys more energy-efficient inference at the edge.
more » « less
Full Text Available
Systolic-CNN: An OpenCL-defined Scalable Run-time-flexible FPGA Accelerator Architecture for Accelerating Convolutional Neural Network Inference in Cloud/Edge Computing

https://doi.org/10.1109/FCCM48280.2020.00064

Dua, Akshay; Li, Yixing; Ren, Fengbo (May 2020, IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM))
null (Ed.)
Full Text Available
Cra: A Generic Compression Ratio Adapter for End-To-End Data-Driven Image Compressive Sensing Reconstruction Frameworks

https://doi.org/10.1109/ICASSP40776.2020.9052998

Zhang, Zhikang; Xu, Kai; Ren, Fengbo (May 2020, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

End-to-end data-driven image compressive sensing reconstruction (EDCSR) frameworks achieve state-of-the-art reconstruction performance in terms of reconstruction speed and accuracy. However, due to their end-to-end nature, existing EDCSR frameworks can not adapt to a variable compression ratio (CR). For applications that desire a variable CR, existing EDCSR frameworks must be trained from scratch at each CR, which is computationally costly and time-consuming. This paper presents a generic compression ratio adapter (CRA) framework that addresses the variable compression ratio (CR) problem for existing EDCSR frameworks with no modification to given reconstruction models nor enormous rounds of training needed. CRA exploits an initial reconstruction network to generate an initial estimate of reconstruction results based on a small portion of the acquired measurements. Subsequently, CRA approximates full measurements for the main reconstruction network by complementing the sensed measurements with resensed initial estimate. Our experiments based on two public image datasets (CIFAR10 and Set5) show that CRA provides an average of 13.02 dB and 5.38 dB PSNR improvement across the CRs from 5 to 30 over a naive zero-padding approach and the AdaptiveNN approach(a prior work), respectively. CRA addresses the fixed-CR limitation of existing EDCSR frameworks and makes them suitable for resource-constrained compressive sensing applications.
more » « less
Full Text Available
BNN Pruning: Pruning Binary Neural Network Guided by Weight Flipping Frequency

Li, Yixing; Ren, Fengbo (March 2020, International Symposium on Quality Electronic Design)

A binary neural network (BNN) is a compact form of neural network. Both the weights and activations in BNNs can be binary values, which leads to a significant reduction in both parameter size and computational complexity compared to their full-precision counterparts. Such reductions can directly translate into reduced memory footprint and computation cost in hardware, making BNNs highly suitable for a wide range of hardware accelerators. However, it is unclear whether and how a BNN can be further pruned for ultimate compactness. As both 0s and 1s are non-trivial in BNNs, it is not proper to adopt any existing pruning method of full- precision networks that interprets 0s as trivial. In this paper, we present a pruning method tailored to BNNs and illustrate that BNNs can be further pruned by using weight flipping frequency as an indicator of sensitivity to accuracy. The experiments performed on the binary versions of a 9- layer Network-in-Network (NIN) and the AlexNet with the CIFAR-10 dataset show that the proposed BNN-pruning method can achieve 20-40% reduction in binary operations with 0.5-1.0% accuracy drop, which leads to a 15-40% run- time speedup on a TitanX GPU.
more » « less
Full Text Available
Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning

Li, Yixing; Zhang, Shuai; Zhou, Xichuan; Ren, Fengbo (January 2020, Neurocomputing)

Due to the high computational complexity and memory storage requirement, it is hard to directly deploy a full-precision Convolutional neural network (CNN) on embedded devices. The hardware-friendly designs are needed for resource-limited and energy-constrained embedded devices. Emerging solutions are adopted for the neural network compression, e.g., binary/ternary weight network, pruned network and quantized network. Among them, binary neural network (BNN) is believed to be the most hardware-friendly framework due to its small network size and low computational complexity. No existing work has further shrunk the size of BNN. In this work, we explore the redundancy in BNN and build a compact BNN (CBNN) based on the bit-level sensitivity analysis and bit-level data pruning. The input data is converted to a high dimensional bit-sliced format. In the post-training stage, we analyze the impact of different bit slices to the accuracy. By pruning the redundant input bit slices and shrinking the network size, we are able to build a more compact BNN. Our result shows that we can further scale down the network size of the BNN up to 3.9x with no more than a 1% accuracy drop. The actual runtime can be reduced up to 2x and 9.9x compared with the baseline BNN and its full-precision counterpart, respectively.
more » « less
Full Text Available
Learning in the Frequency Domain

https://doi.org/10.1109/CVPR42600.2020.00181

Xu, Kai; Qin, Minghai; Sun, Fei; Wang, Yuhao; Chen, Yen-Kuang; Ren, Fengbo (January 2020, IEEE Conference on Computer Vision and Pattern Recognition)
null (Ed.)
Deep neural networks have achieved remarkable success in computer vision tasks. Existing neural networks mainly operate in the spatial domain with fixed input sizes. For practical applications, images are usually large and have to be downsampled to the predetermined input size of neural networks. Even though the downsampling operations reduce computation and the required communication bandwidth, it removes both redundant and salient information obliviously, which results in accuracy degradation. Inspired by digital signal processing theories, we analyze the spectral bias from the frequency perspective and propose a learning-based frequency selection method to identify the trivial frequency components which can be removed without accuracy loss. The proposed method of learning in the frequency domain leverages identical structures of the well-known neural networks, such as ResNet-50, MobileNetV2, and Mask R-CNN, while accepting the frequency-domain information as the input. Experiment results show that learning in the frequency domain with static channel selection can achieve higher accuracy than the conventional spatial downsampling approach and meanwhile further reduce the input data size. Specifically for ImageNet classification with the same input size, the proposed method achieves 1.60% and 0.63% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1.42%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.
more » « less
Full Text Available
A 34-FPS 698-GOP/s/W Binarized Deep Neural Network-based Natural Scene Text Interpretation Accelerator for Mobile Edge Computing

https://doi.org/10.1109/TIE.2018.2875643

Li, Yixing; Liu, Zichuan; Liu, Wenye; Jiang, Yu; Wang, Yongliang; Goh, Wang Ling; Yu, Hao; Ren, Fengbo (October 2018, IEEE Transactions on Industrial Electronics)

Full Text Available
LAPRAN: A Scalable Laplacian Pyramid Reconstructive Adversarial Network for Flexible Compressive Sensing Reconstruction

Xu, Kai; Zhang, Zhikang; Ren, Fengbo (September 2018, The European Conference on Computer Vision (ECCV))

This paper addresses the single-image compressive sensing (CS) and reconstruction problem. We propose a scalable Laplacian pyramid reconstructive adversarial network (LAPRAN) that enables high-fidelity, flexible and fast CS images reconstruction. LAPRAN progressively reconstructs an image following the concept of the Laplacian pyramid through multiple stages of reconstructive adversarial networks (RANs). At each pyramid level, CS measurements are fused with a contextual latent vector to generate a high-frequency image residual. Consequently, LAPRAN can produce hierarchies of reconstructed images and each with an incremental resolution and improved quality. The scalable pyramid structure of LAPRAN enables high-fidelity CS reconstruction with a flexible resolution that is adaptive to a wide range of compression ratios (CRs), which is infeasible with existing methods. Experimental results on multiple public datasets show that LAPRAN offers an average 7.47dB and 5.98dB PSNR, and an average 57.93% and 33.20% SSIM improvement compared to model-based and data-driven baselines, respectively.
more » « less
Full Text Available
CSVideoNet: A Real-Time End-to-End Learning Framework for High-Frame-Rate Video Compressive Sensing

https://doi.org/10.1109/WACV.2018.00187

Xu, Kai; Ren, Fengbo (March 2018, IEEE Winter Conference on Applications of Computer Vision (WACV))

This paper addresses the real-time encoding-decoding problem for high-frame-rate video compressive sensing (CS). Unlike prior works that perform reconstruction using iterative optimization-based approaches, we propose a noniterative model, named "CSVideoNet", which directly learns the inverse mapping of CS and reconstructs the original input in a single forward propagation. To overcome the limitations of existing CS cameras, we propose a multi-rate CNN and a synthesizing RNN to improve the trade-o. between compression ratio (CR) and spatial-temporal resolution of the reconstructed videos. the experiment results demonstrate that CSVideoNet significantly outperforms state-of-the-art approaches. Without any pre/post-processing, we achieve a 25dB Peak signal-to-noise ratio (PSNR) recovery quality at 100x CR, with a frame rate of 125 fps on a Titan X GPU. Due to the feedforward and high-data-concurrency natures of CSVideoNet, it can take advantage of GPU acceleration to achieve three orders of magnitude speed-up over conventional iterative-based approaches. We share the source code at https://github.com/PSCLab-ASU/CSVideoNet.
more » « less
Full Text Available

Search for: All records