

This content will become publicly available on June 26, 2025

Title: Compression with Attention: Learning in Lower Dimensions
With deep learning models ever ballooning in size to push state-of-the-art accuracy improvements, efforts to find compact models have become necessary. To meet this objective, we propose a novel operation called Personal Self-Attention (PSA). It is designed specifically to learn non-linear 1-D functions faster than existing architectures such as Multi-Layer Perceptron (MLP) and polynomial-based methods, while being highly compatible with gradient backpropagation. We show that by stacking and combining these non-linear functions with linear transformations, we can achieve the same accuracy as a larger model but with a significantly smaller hidden dimension. To test our contribution, we implemented PSA on an MLP-based vision model called ResMLP and evaluated it on vision classification tasks using the SVHN and CIFAR-10 datasets. We show how PSA pushes the Pareto front, achieving the same accuracy with 2× to 6× smaller hidden-dimension sizes compared to conventional MLP structures. Further, by quantizing our non-linear function, PSA can be mapped to a simple lookup table, allowing for very efficient translation to FPGA hardware. We demonstrate this by designing an unrolled high-throughput accelerator for ResMLP that uses nearly 1.5× fewer DSPs with PSA than a conventional MLP architecture while achieving the same accuracy of 86% and throughput of 29k FPS.
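The lookup-table idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the PSA function or its quantization scheme, so everything here is an assumption for illustration: `tanh` stands in for the learned 1-D non-linearity, the input range `[-4, 4]` and the 8-bit table size are arbitrary, and the helper names `build_lut` and `lut_apply` are hypothetical.

```python
import numpy as np

def build_lut(fn, lo, hi, bits):
    """Tabulate a 1-D function at 2**bits evenly spaced inputs in [lo, hi]."""
    grid = np.linspace(lo, hi, 2 ** bits)
    return fn(grid)

def lut_apply(table, x, lo, hi):
    """Evaluate the tabulated function by nearest-entry lookup."""
    n = len(table)
    # Map each input to an integer table index; out-of-range inputs
    # are clamped to the first or last entry.
    idx = np.rint((np.asarray(x) - lo) / (hi - lo) * (n - 1)).astype(int)
    return table[np.clip(idx, 0, n - 1)]

# Assumption: tanh is only a stand-in for the learned non-linear function.
LO, HI, BITS = -4.0, 4.0, 8
table = build_lut(np.tanh, LO, HI, BITS)

# Compare the table against the exact function (clamped to the same range).
xs = np.linspace(-5.0, 5.0, 1001)
exact = np.tanh(np.clip(xs, LO, HI))
max_err = float(np.max(np.abs(lut_apply(table, xs, LO, HI) - exact)))
```

Once the function is tabulated, evaluating it needs only an index computation and a memory read rather than multiplications, which is the general reason a table of this kind can map well to FPGA fabric.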
Award ID(s):
2016390
PAR ID:
10533920
Author(s) / Creator(s):
;
Publisher / Repository:
Design Automation Conference
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Quantitative analysis of brain disorders such as Autism Spectrum Disorder (ASD) is an ongoing field of research. Machine learning and deep learning techniques have been playing an important role in automating the diagnosis of brain disorders by extracting discriminative features from brain data. In this study, we propose a model called Auto-ASD-Network to distinguish subjects with ASD from healthy subjects using only fMRI data. Our model consists of a multilayer perceptron (MLP) with two hidden layers. We use an algorithm called SMOTE to perform data augmentation, generating artificial data to avoid overfitting, which helps increase classification accuracy. We further investigate the discriminative power of the features extracted by the MLP by feeding them to an SVM classifier. To optimize the hyperparameters of the SVM, we use a technique called Auto Tune Models (ATM), which searches the hyperparameter space to find the best values. Our model achieves more than 70% classification accuracy on 4 fMRI datasets, with a highest accuracy of 80%. It improves the performance of the SVM by 26%, the stand-alone MLP by 16%, and the state-of-the-art method in ASD classification by 14%. The implemented code will be available under the GPL license on our lab's GitHub page (https://github.com/PCDS).
  2. This study evaluates the performance of multiple machine learning (ML) algorithms and electrical resistivity (ER) arrays for inversion with comparison to a conventional Gauss-Newton numerical inversion method. Four different ML models and four arrays were used for the estimation of only six variables for locating and characterizing hypothetical subsurface targets. The combination of dipole-dipole with Multilayer Perceptron Neural Network (MLP-NN) had the highest accuracy. Evaluation showed that both MLP-NN and Gauss-Newton methods performed well for estimating the matrix resistivity while target resistivity accuracy was lower, and MLP-NN produced sharper contrast at target boundaries for the field and hypothetical data. Both methods exhibited comparable target characterization performance, whereas MLP-NN had increased accuracy compared to Gauss-Newton in prediction of target width and height, which was attributed to numerical smoothing present in the Gauss-Newton approach. MLP-NN was also applied to a field dataset acquired at U.S. DOE Hanford site. 
  3. Non-line-of-sight (NLOS) imaging is a rapidly advancing technology that provides asymmetric vision: seeing without being seen. Though limited in accuracy, resolution, and depth recovery compared to active methods, the capabilities of passive methods are especially surprising because they typically use only a single, inexpensive digital camera. One of the largest challenges in passive NLOS imaging is ambient background light, which limits the dynamic range of the measurement while carrying no useful information about the hidden part of the scene. In this work we propose a new reconstruction approach that uses an optimized linear transformation to balance the rejection of uninformative light with the retention of informative light, resulting in fast (video-rate) reconstructions of hidden scenes from photographs of a blank wall under high ambient light conditions.
  4. In this paper, a multilayer perceptron (MLP)-type artificial neural network model with a back-propagation training algorithm is used to model bubble growth and bubble dynamics parameters in nucleate boiling under a non-uniform electric field. The influence of the electric field on the parameters that describe bubble behavior, including bubble waiting time, bubble departure frequency, bubble growth time, and bubble departure diameter, is considered. This study models the single-bubble dynamic behavior of R113 created on a heater in a non-uniform electric field using an MLP neural network optimized by four swarm-based optimization algorithms: Salp Swarm Algorithm (SSA), Grey Wolf Optimizer (GWO), Artificial Bee Colony (ABC) algorithm, and Particle Swarm Optimization (PSO). To evaluate the model's effectiveness, the mean-square error (MSE) of the artificial neural network model with the various optimization algorithms is measured and compared. The results suggest that the optimal networks in the two-hidden-layer and three-hidden-layer models for the bubble departure diameter improve MSE by 33.85% and 35.27%, respectively, compared with the best response in the one-hidden-layer model. For bubble growth time, the networks with two and three hidden layers show 44.51% and 45.85% reductions in error, respectively, compared with the network with one hidden layer. For the departure frequency, the error reduction in the two-hidden-layer and three-hidden-layer networks is 46.85% and 62.32%, respectively. For bubble waiting time, the best networks in the two-hidden-layer and three-hidden-layer models improve MSE by 52.44% and 62.27%, respectively, compared with the best one-hidden-layer model response. The SSA and GWO algorithms also compete well with the PSO and ABC algorithms, achieving comparable MSE.
  5. Conventional continuous-wave amplitude-modulated time-of-flight (CWAM ToF) cameras suffer from a fundamental trade-off between light throughput and depth of field (DoF): a larger lens aperture allows more light collection but suffers from significantly lower DoF. However, both high light throughput, which increases signal-to-noise ratio, and a wide DoF, which enlarges the system’s applicable depth range, are valuable for CWAM ToF applications. In this work, we propose EDoF-ToF, an algorithmic method to extend the DoF of large-aperture CWAM ToF cameras by using a neural network to deblur objects outside of the lens’s narrow focal region and thus produce an all-in-focus measurement. A key component of our work is the proposed large-aperture ToF training data simulator, which models the depth-dependent blurs and partial occlusions caused by such apertures. Contrary to conventional image deblurring where the blur model is typically linear, ToF depth maps are nonlinear functions of scene intensities, resulting in a nonlinear blur model that we also derive for our simulator. Unlike extended DoF for conventional photography where depth information needs to be encoded (or made depth-invariant) using additional hardware (phase masks, focal sweeping, etc.), ToF sensor measurements naturally encode depth information, allowing a completely software solution to extended DoF. We experimentally demonstrate EDoF-ToF increasing the DoF of a conventional ToF system by 3.6×, effectively achieving the DoF of a smaller lens aperture that allows 22.1× less light. Ultimately, EDoF-ToF enables CWAM ToF cameras to enjoy the benefits of both high light throughput and a wide DoF.