skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Fast Polynomial Evaluation for Correctly Rounded Elementary Functions using the RLIBM Approach
This paper proposes fast polynomial evaluation methods for correctly rounded elementary functions generated using our RLibm approach. The resulting functions produce correct results for all inputs with multiple representations and rounding modes. Given an oracle, the RLibm approach approximates the correctly rounded result rather than the real value of an elementary function. A key observation is that there is an interval of real values around the correctly rounded result such that any real value in it rounds to the correct result. This interval is the maximum freedom available to RLibm’s polynomial generation procedure. Subsequently, the problem of generating correctly rounded elementary functions using these intervals can be structured as a linear programming problem. Our prior work on the RLibm approach uses Horner’s method for polynomial evaluation. This paper explores polynomial evaluation techniques such as Knuth’s coefficient adaptation procedure, parallel execution of operations using Estrin’s procedure, and the use of fused multiply-add operations in the context of the RLibm approach. If we take the polynomial generated by the RLibm approach and subsequently perform polynomial evaluation optimizations, it results in incorrect results due to rounding errors during polynomial evaluation. Hence, we propose to integrate the fast polynomial evaluation procedure in the RLibm’s polynomial generation process. Our new polynomial evaluation procedure that combines parallel execution with fused multiply-add operations outperforms the Horner’s method used by RLibm’s correctly rounded functions. We show the resulting polynomials for 32-bit float are not only correct but also faster than prior functions in RLibm by 24%  more » « less
Award ID(s):
2110861 1908798
PAR ID:
10418147
Author(s) / Creator(s):
;
Date Published:
Journal Name:
CGO 2023: Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and OptimizationFebruary 2023
Page Range / eLocation ID:
95 to 107
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Our RLibm project has recently proposed methods to generate a single implementation for an elementary function that produces correctly rounded results for multiple rounding modes and representations with up to 32-bits. They are appealing for developing fast reference libraries without double rounding issues. The key insight is to build polynomial approximations that produce the correctly rounded result for a representation with two additional bits when compared to the largest target representation and with the “non-standard” round-to-odd rounding mode, which makes double rounding the RLibm math library result to any smaller target representation innocuous. The resulting approximations generated by the RLibm approach are implemented with machine supported floating-point operations with the round-to-nearest rounding mode. When an application uses a rounding mode other than the round-to-nearest mode, the RLibm math library saves the application’s rounding mode, changes the system’s rounding mode to round-to-nearest, computes the correctly rounded result, and restores the application’s rounding mode. This frequent change of rounding modes has a performance cost. This paper proposes two new methods, which we call rounding-invariant outputs and rounding-invariant input bounds, to avoid the frequent changes to the rounding mode and the dependence on the round-to-nearest mode. First, our new rounding-invariant outputs method proposes using the round-to-zero rounding mode to implement RLibm’s polynomial approximations. We propose fast, error-free transformations to emulate a round-to-zero result from any standard rounding mode without changing the rounding mode. Second, our rounding-invariant input bounds method factors any rounding error due to different rounding modes using interval bounds in the RLibm pipeline. Both methods make a different set of trade-offs and improve the performance of resulting libraries by more than 2× 
    more » « less
  2. This paper presents a novel method for generating a single polynomial approximation that produces correctly rounded results for all inputs of an elementary function for multiple representations. The generated polynomial approximation has the nice property that the first few lower degree terms produce correctly rounded results for specific representations of smaller bitwidths, which we call progressive performance. To generate such progressive polynomial approximations, we approximate the correctly rounded result and formulate the computation of correctly rounded polynomial approximations as a linear program similar to our prior work on the RLIBM project. To enable the use of resulting polynomial approximations in mainstream libraries, we want to avoid piecewise polynomials with large lookup tables. We observe that the problem of computing polynomial approximations for elementary functions is a linear programming problem in low dimensions, i.e., with a small number of unknowns. We design a fast randomized algorithm for computing polynomial approximations with progressive performance. Our method produces correct and fast polynomials that require a small amount of storage. A few polynomial approximations from our prototype have already been incorporated into LLVM’s math library. 
    more » « less
  3. Mainstream math libraries for floating point (FP) do not produce correctly rounded results for all inputs. In contrast, CR-LIBM and RLIBM provide correctly rounded implementations for a specific FP representation with one rounding mode. Using such libraries for a representation with a new rounding mode or with different precision will result in wrong results due to double rounding. This paper proposes a novel method to generate a single polynomial approximation that produces correctly rounded results for all inputs for multiple rounding modes and multiple precision configurations. To generate a correctly rounded library forn-bits, our key idea is to generate a polynomial approximation for a representation withn+2-bits using theround-to-oddmode. We prove that the resulting polynomial approximation will produce correctly rounded results for all five rounding modes in the standard and for multiple representations withk-bits such that |E| +1 <k≤n, where |E| is the number of exponent bits in the representation. Similar to our prior work in the RLIBM project, we approximate the correctly rounded result when we generate the library withn+2-bits using the round-to-odd mode. We also generate polynomial approximations by structuring it as a linear programming problem but propose enhancements to polynomial generation to handle the round-to-odd mode. Our prototype is the first 32-bit float library that produces correctly rounded results with all rounding modes in the IEEE standard for all inputs with a single polynomial approximation. It also produces correctly rounded results for any FP configuration ranging from 10-bits to 32-bits while also being faster than mainstream libraries. 
    more » « less
  4. null (Ed.)
    This paper proposes a set of techniques to develop correctly rounded math libraries for 32-bit float and posit types. It enhances our RLIBM approach that frames the problem of generating correctly rounded libraries as a linear programming problem in the context of 16-bit types to scale to 32-bit types. Specifically, this paper proposes new algorithms to (1) generate polynomials that produce correctly rounded outputs for all inputs using counterexample guided polynomial generation, (2) generate efficient piecewise polynomials with bit-pattern based domain splitting, and (3) deduce the amount of freedom available to produce correct results when range reduction involves multiple elementary functions. The resultant math library for the 32-bit float type is faster than state-of-the-art math libraries while producing the correct output for all inputs. We have also developed a set of correctly rounded elementary functions for 32-bit posits. 
    more » « less
  5. Given the importance of floating point (FP) performance in numerous domains, several new variants of FP and its alternatives have been proposed (e.g., Bfloat16, TensorFloat32, and posits). These representations do not have correctly rounded math libraries. Further, the use of existing FP libraries for these new representations can produce incorrect results. This paper proposes a novel approach for generating polynomial approximations that can be used to implement correctly rounded math libraries. Existing methods generate polynomials that approximate the real value of an elementary function 𝑓 (𝑥) and produce wrong results due to approximation errors and rounding errors in the implementation. In contrast, our approach generates polynomials that approximate the correctly rounded value of 𝑓 (𝑥) (i.e., the value of 𝑓 (𝑥) rounded to the target representation). It provides more margin to identify efficient polynomials that produce correctly rounded results for all inputs. We frame the problem of generating efficient polynomials that produce correctly rounded results as a linear programming problem. Using our approach, we have developed correctly rounded, yet faster, implementations of elementary functions for multiple target representations. 
    more » « less