skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: BranchNet: A Convolutional Neural Network to Predict Hard-To-Predict Branches
Abstract—The state-of-the-art branch predictor, TAGE, re- mains inefficient at identifying correlated branches deep in a noisy global branch history. We argue this inefficiency is a fundamental limitation of runtime branch prediction and not a coincidental artifact due to the design of TAGE. To further improve branch prediction, we need to relax the constraint of runtime only training and adopt more sophisticated prediction mechanisms. To this end, Tarsa et al. proposed using convo- lutional neural networks (CNNs) that are trained at compile- time to accurately predict branches that TAGE cannot. Given enough profiling coverage, CNNs learn input-independent branch correlations that can accurately predict branches when running a program with unseen inputs. We build on their work and introduce BranchNet, a CNN with a practical on-chip inference engine tailored to the needs of branch prediction. At runtime, BranchNet predicts a few hard-to-predict branches, while TAGE- SC-L predicts the remaining branches. This hybrid approach reduces the MPKI of SPEC2017 Integer benchmarks by 7.6% (and up to 15.7%) when compared to a very large (impractical) MTAGE-SC baseline, demonstrating a fundamental advantage in the prediction capabilities of BranchNet compared to TAGE- like predictors. We also propose a practical resource-constrained variant of BranchNet that improves the MPKI by 9.6% (and up to 17.7%) compared to a 64KB TAGE-SC-L without increasing the prediction latency.  more » « less
Award ID(s):
2011145
PAR ID:
10249269
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
53rd Annual IEEE/ACM International Symposium on Microarchitecture
Page Range / eLocation ID:
118 to 130
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract—Multi-layer neural networks show promise in im- proving branch prediction accuracy. Tarsa et al. have shown that convolutional neural networks (CNNs) can accurately predict many branches that state-of-the-art branch predictors cannot. Yet, strict latency and storage constraints make naive adoption of typical neural network architectures impractical. Thus, it is necessary to understand the unique characteristics of branch prediction to design constraint-aware neural networks. This paper studies why CNNs are so effective for two hard-to- predict branches from the SPEC benchmark suite. We identify custom prediction algorithms for these branches that are more accurate and cost-efficient than CNNs. Finally, we discuss why out-of-the-box machine learning techniques do not find optimal solutions and propose research directions aimed at solving these inefficiencies. 
    more » « less
  2. Abstract Robotic assistive or rehabilitative devices are promising aids for people with neurological disorders as they help regain normative functions for both upper and lower limbs. However, it remains challenging to accurately estimate human intent or residual efforts non-invasively when using these robotic devices. In this article, we propose a deep learning approach that uses a brightness mode, that is, B-mode, of ultrasound (US) imaging from skeletal muscles to predict the ankle joint net plantarflexion moment while walking. The designed structure of customized deep convolutional neural networks (CNNs) guarantees the convergence and robustness of the deep learning approach. We investigated the influence of the US imaging’s region of interest (ROI) on the net plantarflexion moment prediction performance. We also compared the CNN-based moment prediction performance utilizing B-mode US and sEMG spectrum imaging with the same ROI size. Experimental results from eight young participants walking on a treadmill at multiple speeds verified an improved accuracy by using the proposed US imaging + deep learning approach for net joint moment prediction. With the same CNN structure, compared to the prediction performance by using sEMG spectrum imaging, US imaging significantly reduced the normalized prediction root mean square error by 37.55% ( $ p $  < .001) and increased the prediction coefficient of determination by 20.13% ( $ p $  < .001). The findings show that the US imaging + deep learning approach personalizes the assessment of human joint voluntary effort, which can be incorporated with assistive or rehabilitative devices to improve clinical performance based on the assist-as-needed control strategy. 
    more » « less
  3. Modern data center applications experience frequent branch mispredictions – degrading performance, increasing cost, and reducing energy efficiency in data centers. Even the state-of-the-art branch predictor, TAGE-SC-L, suffers from an average branch Mispredictions Per Kilo Instructions (branch-MPKI) of 3.0 (0.5-7.2) for these applications since their large code footprints exhaust TAGE-SC-L’s intended capacity. In this work, we propose Whisper, a novel profile-guided mechanism to avoid branch mispredictions. Whisper investigates the in-production profile of data center applications to identify precise program contexts that lead to branch mispredictions. Corresponding prediction hints are then inserted into code to strategically avoid those mispredictions during program execution. Whisper presents three novel profile-guided techniques: (1) hashed history correlation which efficiently encodes hard-topredict correlations in branch history using lightweight Boolean formulas, (2) randomized formula testing which selects a locally optimal Boolean formula from a randomly selected subset of possible formulas to predict a branch, and (3) the extension of Read-Once Monotone Boolean Formulas with Implication and Converse Non-Implication to improve the branch history coverage of these formulas with minimal overhead. We evaluate Whisper on 12 widely-used data center applications and demonstrate that Whisper enables traditional branch predictors to achieve a speedup close to that of an ideal branch predictor. Specifically, Whisper achieves an average speedup of 2.8% (0.4%-4.6%) by reducing 16.8% (1.7%-32.4%) of branch mispredictions over TAGE-SC-L and outperforms the state-ofthe-art profile-guided branch prediction mechanisms by 7.9% on average. 
    more » « less
  4. Deep neural network (DNN) is the de facto standard for running a variety of computer vision applications over mobile and embedded systems. Prior to deployment, a DNN is specialized by training to fit the target use scenario (depending on computing power and visual data input). To handle its costly training and meet diverse deployment needs, a “Train Once, Deploy Everywhere” paradigm has been recently proposed by training one super-network and selecting one out of many sub-networks (part of the super-network) for the target scenario; This empowers efficient DNN deployment at low training cost (training once). However, the existing studies tackle some deployment factors like computing power and source data but largely overlook the impact of their runtime dynamics (say, time-varying visual contents and GPU/CPU workloads). In this work, we propose OPA to cover all these deployment factors, particularly those along with runtime dynamics in visual data contents and computing resources. To quickly and accurately learn which sub-network runs “best” in the dynamic deployment scenario, we devise a “One-Predict-All” approach with no need to run all the candidate sub-networks. Instead, we first develop a shallow sub-network to test the water and then use its test results to predict the performance of all other deeper sub-networks. We have implemented and evaluated OPA. Compared to the state-of-the-art, OPA has achieved up to 26% higher Top-1 accuracy for a given latency requirement. 
    more » « less
  5. In this paper, we propose a new deep framework which predicts facial attributes and leverage it as a soft modality to improve face identification performance. Our model is an end to end framework which consists of a convolutional neural network (CNN) whose output is fanned out into two separate branches; the first branch predicts facial attributes while the second branch identifies face images. Contrary to the existing multi-task methods which only use a shared CNN feature space to train these two tasks jointly, we fuse the predicted attributes with the features from the face modality in order to improve the face identification performance. Experimental results show that our model brings benefits to both face identification as well as facial attribute prediction performance, especially in the case of identity facial attributes such as gender prediction. We tested our model on two standard datasets annotated by identities and face attributes. Experimental results indicate that the proposed model outperforms most of the current existing face identification and attribute prediction methods. 
    more » « less