Title: Branch Prediction with Multilayer Neural Networks: The Value of Specialization
Abstract: Multi-layer neural networks show promise in improving branch prediction accuracy. Tarsa et al. have shown that convolutional neural networks (CNNs) can accurately predict many branches that state-of-the-art branch predictors cannot. Yet, strict latency and storage constraints make naive adoption of typical neural network architectures impractical. Thus, it is necessary to understand the unique characteristics of branch prediction to design constraint-aware neural networks. This paper studies why CNNs are so effective for two hard-to-predict branches from the SPEC benchmark suite. We identify custom prediction algorithms for these branches that are more accurate and cost-efficient than CNNs. Finally, we discuss why out-of-the-box machine learning techniques do not find optimal solutions and propose research directions aimed at solving these inefficiencies.
Award ID(s):
2011145
PAR ID:
10249272
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Machine Learning for Computer Architecture and Systems
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The state-of-the-art branch predictor, TAGE, remains inefficient at identifying correlated branches deep in a noisy global branch history. We argue this inefficiency is a fundamental limitation of runtime branch prediction and not a coincidental artifact due to the design of TAGE. To further improve branch prediction, we need to relax the constraint of runtime-only training and adopt more sophisticated prediction mechanisms. To this end, Tarsa et al. proposed using convolutional neural networks (CNNs) that are trained at compile time to accurately predict branches that TAGE cannot. Given enough profiling coverage, CNNs learn input-independent branch correlations that can accurately predict branches when running a program with unseen inputs. We build on their work and introduce BranchNet, a CNN with a practical on-chip inference engine tailored to the needs of branch prediction. At runtime, BranchNet predicts a few hard-to-predict branches, while TAGE-SC-L predicts the remaining branches. This hybrid approach reduces the MPKI of SPEC2017 Integer benchmarks by 7.6% (and up to 15.7%) when compared to a very large (impractical) MTAGE-SC baseline, demonstrating a fundamental advantage in the prediction capabilities of BranchNet compared to TAGE-like predictors. We also propose a practical resource-constrained variant of BranchNet that improves the MPKI by 9.6% (and up to 17.7%) compared to a 64KB TAGE-SC-L without increasing the prediction latency.
  2. Deep convolutional neural networks (CNNs) are becoming increasingly popular models to predict neural responses in visual cortex. However, contextual effects, which are prevalent in neural processing and in perception, are not explicitly handled by current CNNs, including those used for neural prediction. In primary visual cortex, neural responses are modulated by stimuli spatially surrounding the classical receptive field in rich ways. These effects have been modeled with divisive normalization approaches, including flexible models, where spatial normalization is recruited only to the degree that responses from center and surround locations are deemed statistically dependent. We propose a flexible normalization model applied to midlevel representations of deep CNNs as a tractable way to study contextual normalization mechanisms in midlevel cortical areas. This approach captures nontrivial spatial dependencies among midlevel features in CNNs, such as those present in textures and other visual stimuli, that arise from tiling high-order features geometrically. We expect that the proposed approach can make predictions about when spatial normalization might be recruited in midlevel cortical areas. We also expect this approach to be useful as part of the CNN tool kit, therefore going beyond more restrictive fixed forms of normalization. 
  3. Protein structure prediction algorithms such as AlphaFold2 and ESMFold have dramatically increased the availability of high-quality models of protein structures. Because these algorithms predict only the structure of the protein itself, there is a growing need for methods that can rapidly screen protein structures for ligands. Previous work on similar tasks has shown promise but is lacking scope in the classes of atoms predicted and can benefit from the recent architectural developments in convolutional neural networks (CNNs). In this work, we introduce SE3Lig, a model for semantic in-painting of small molecules in protein structures. Specifically, we report SE(3)-equivariant CNNs trained to predict the atomic densities of common classes of cofactors (hemes, flavins, etc.) and the water molecules and inorganic ions in their vicinity. While the models are trained on high-resolution crystal structures of enzymes, they perform well on structures predicted by AlphaFold2, which suggests that the algorithm correctly represents cofactor-binding cavities. 
  4. Convolutional Neural Networks (CNNs) filter the input data using spatial convolution operators with compact stencils. Commonly, the convolution operators couple features from all channels, which leads to immense computational cost in training and prediction with CNNs. To improve the efficiency of CNNs, we introduce lean convolution operators that reduce the number of parameters and the computational complexity, and that can be used in a wide range of existing CNNs. Here, we exemplify their use in residual networks (ResNets), which have been very reliable for a few years now and have been analyzed intensively. In our experiments on three image classification problems, the proposed LeanResNet yields results that are comparable to other recently proposed reduced architectures using a similar number of parameters.
  5. Continuous and accurate decoding of intended motions is critical for human-machine interactions. Here, we developed a novel approach for real-time continuous prediction of forces in individual fingers using parallel convolutional neural networks (CNNs). We extracted populational motor unit discharge frequency using CNNs organized in a parallel structure. The CNN parameters were trained based on two features from high-density electromyogram (HD-EMG), namely temporal energy heatmaps and frequency spectrum maps. The populational motor unit discharge frequency was then used to continuously predict finger forces based on a linear regression model. The force prediction performance was compared with a motor unit decomposition method and the conventional EMG amplitude-based method. Our results showed that the correlation coefficient between the predicted and the recorded forces was on average 0.91 for the parallel CNN approach, compared with 0.89 for an offline decomposition method, 0.82 for an online decomposition method, and 0.81 for the EMG amplitude method. Additionally, the CNN-based approach showed generalizable performance, with a CNN trained on one finger applying to a different finger. The outcomes suggest that our CNN-based algorithm can offer an accurate and efficient force decoding method for human-machine interactions.