

Title: Augmented Arithmetic Operations Proposed for IEEE-754 2018
Algorithms for extending arithmetic precision through compensated summation or arithmetics like double-double rely on operations commonly called twoSum and twoProduct. The current draft of the IEEE 754 standard specifies these operations under the names augmentedAddition and augmentedMultiplication. These operations were included after three decades of experience because of a motivating new use: bitwise reproducible arithmetic. Standardizing the operations establishes a hardware acceleration target that can deliver at least a 33% speed improvement in reproducible dot products, placing the reproducible dot product almost within a factor of two of the common dot product. This paper provides history and motivation for standardizing these operations. We also define the operations, explain the rationale for all the specific choices, and provide parameterized test cases for new boundary behaviors.
Award ID(s):
1339745
PAR ID:
10089378
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE 25th Symposium on Computer Arithmetic (ARITH)
Page Range / eLocation ID:
45 to 52
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The objective of this paper is to provide a holistic summary of ongoing research related to the development, implementation, assessment, and continuous refinement of an augmented reality (AR) app known as Vectors in Space. This Unity-based app was created by the authors and provides a self-guided learning experience for students to learn about fundamental vector concepts routinely encountered in undergraduate physics and engineering mechanics courses. Vectors are a fundamental tool in mechanics courses as they allow for the precise and comprehensive description of physical phenomena such as forces, moments, and motion. In early engineering coursework, students often perceive vectors as an abstract mathematical concept that requires spatial visualization skills in three dimensions (3D). The app aims to allow students to build these tacit skills while simultaneously allowing them to learn fundamental vector concepts that will be necessary in subsequent coursework. Three self-paced, guided learning activities systematically address concepts that include: (a) Cartesian components of vectors, (b) unit vectors and directional angles, (c) addition, (d) subtraction, (e) cross product using the right-hand rule, (f) angle between vectors using the dot product, and (g) vector projections using the dot product. The authors first discuss the app's scaffolding approach with special attention given to the incorporation of Mayer's principles of multimedia learning as well as the use of animations. The authors' approach to develop the associated statics learning activities, practical aspects of implementation, and lessons learned are shared. The effectiveness of the activities is assessed by applying analysis of covariance (ANCOVA) to pre- and post-activity assessment scores for control and treatment groups. 
Though the sample sizes are relatively small (less than 50 students), the results demonstrate that AR had a positive impact on student learning of the dot product and its applications. Larger sample sizes and refinements to the test instruments will be necessary in the future to draw robust conclusions regarding the other vector topics and operations. Qualitative feedback from student focus groups conducted with undergraduate engineering students identified the app's strengths as well as potential areas of improvement. 
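The vector operations the app teaches are compact enough to illustrate directly. The following is a minimal, dependency-free Python sketch of the dot-product-based operations (f) and (g) above, the angle between vectors and vector projection; the function names are hypothetical and not taken from the Vectors in Space app itself.

```python
import math

def dot(u, v):
    """Dot product of two same-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length |u|."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle in radians via cos(theta) = (u . v) / (|u| |v|)."""
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding drift

def project(u, v):
    """Vector projection of u onto v: ((u . v) / (v . v)) v."""
    c = dot(u, v) / dot(v, v)
    return [c * b for b in v]
```

For example, two perpendicular unit vectors yield an angle of pi/2, and projecting [3, 4, 0] onto the x-axis keeps only the x-component.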
  2. Conventional multiply-accumulate (MAC) operations have long dominated computation time for deep neural networks (DNNs), especially convolutional neural networks (CNNs). Recently, product quantization (PQ) has been applied to these workloads, replacing MACs with memory lookups to pre-computed dot products. To better understand the efficiency tradeoffs of product-quantized DNNs (PQ-DNNs), we create a custom hardware accelerator to parallelize and accelerate nearest-neighbor search and dot-product lookups. Additionally, we perform an empirical study to investigate the efficiency–accuracy tradeoffs of different PQ parameterizations and training methods. We identify PQ configurations that improve performance-per-area for ResNet20 by up to 3.1×, even when compared to a highly optimized conventional DNN accelerator, with similar improvements on two additional compact DNNs. When comparing to recent PQ solutions, we outperform prior work by 4× in terms of performance-per-area with a 0.6% accuracy degradation. Finally, we reduce the bitwidth of PQ operations to investigate the impact on both hardware efficiency and accuracy. With only 2–6-bit precision on three compact DNNs, we were able to maintain DNN accuracy while eliminating the need for DSPs.
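For readers unfamiliar with product quantization, the core trick the abstract above describes, replacing a MAC-heavy dot product with per-subspace nearest-centroid encoding plus table lookups, can be sketched as follows. All sizes and "codebooks" here are toy placeholders; a real PQ-DNN learns the centroids during training and implements the lookups in hardware.

```python
import numpy as np

# Toy product quantization (PQ) setup: hypothetical sizes, random codebooks.
rng = np.random.default_rng(0)
D, M, K = 8, 2, 4                 # vector dim, subspaces, centroids per subspace
d = D // M                        # subvector dimension
codebooks = rng.standard_normal((M, K, d))  # stand-in for learned centroids
w = rng.standard_normal(D)                  # one weight vector

def encode(x):
    """Nearest-centroid index for each subvector of x (the NN search step)."""
    return [int(np.argmin(np.linalg.norm(codebooks[m] - x[m*d:(m+1)*d], axis=1)))
            for m in range(M)]

# Precompute the lookup table of partial dot products: table[m, k] is the
# dot product of w's m-th subvector with centroid k of subspace m.
table = np.array([[w[m*d:(m+1)*d] @ codebooks[m, k] for k in range(K)]
                  for m in range(M)])

def pq_dot(x):
    """Approximate w @ x using M table lookups instead of D multiplies."""
    return sum(table[m, k] for m, k in enumerate(encode(x)))
```

The result equals the exact dot product of w with the quantized reconstruction of x, which is why accuracy hinges on the PQ parameterization (M, K, and bitwidth) that the paper studies.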
  3. The LAProof library provides formal machine-checked proofs of the accuracy of basic linear algebra operations: inner product using conventional multiply and add, inner product using fused multiply-add, scaled matrix-vector and matrix-matrix multiplication, and scaled vector and matrix addition. These proofs can connect to concrete implementations of low-level basic linear algebra subprograms; as a proof of concept we present a machine-checked correctness proof of a C function implementing sparse matrix-vector multiplication using the compressed sparse row format. Our accuracy proofs are backward error bounds and mixed backward-forward error bounds that account for underflow, proved subject to no assumptions except a low-level formal model of IEEE-754 arithmetic. We treat low-order error terms concretely, not approximating as O(u^2). 
  4. Prior work indicates that children have an untrained ability to approximately calculate using their approximate number system (ANS). For example, children can mentally double or halve a large array of discrete objects. Here, we asked whether children can perform a true multiplication operation, flexibly attending to both the multiplier and multiplicand, prior to formal multiplication instruction. We presented 5- to 8-year-olds with nonsymbolic multiplicands (dot arrays) or symbolic multiplicands (Arabic numerals) ranging from 2 to 12 and with nonsymbolic multipliers ranging from 2 to 8. Children compared each imagined product with a visible comparison quantity. Children performed with above-chance accuracy on both nonsymbolic and symbolic approximate multiplication, and their performance was dependent on the ratio between the imagined product and the comparison target. Children who could not solve any single-digit symbolic multiplication equations (e.g., 2 × 3) on a basic math test were nevertheless successful on both our approximate multiplication tasks, indicating that children have an intuitive sense of multiplication that emerges independent of formal instruction about symbolic multiplication. Nonsymbolic multiplication performance mediated the relation between children’s Weber fraction and symbolic math abilities, suggesting a pathway by which the ANS contributes to children’s emerging symbolic math competence. These findings may inform future educational interventions that allow children to use their basic arithmetic intuition as a scaffold to facilitate symbolic math learning.
  5. In this paper we present an approximate division scheme for Scaled Population (SP) arithmetic, a technique that improves on the limitations of stochastic computing (SC). SP arithmetic circuits are designed (a) to perform all operations with a constant delay, and (b) to use scaling operations that help reduce errors compared to SC circuits. As part of this work, we also present a method to correlate two SP numbers with a constant delay. We compare our SP divider with SC dividers, as well as fixed-point dividers, in terms of area, power, and delay. Our 512-bit SP divider has a delay (power) that is 0.08× (0.06×) that of the equivalent fixed-point binary divider. Compared to an equivalent SC divider, our power-delay product is 13× better. Index Terms—Approximate Arithmetic, Stochastic Computing, Computer Arithmetic, Approximate Division, Fast Division