Floating-point types are notorious for their intricate representation. The effective use of mixed precision, i.e., using various precisions in different computations, is critical to achieve a good balance between accuracy and performance. Unfortunately, reasoning about mixed precision is difficult even for numerical experts. Techniques have been proposed to systematically search over floating-point variables and/or program instructions to find a faster, mixed-precision version of a given program. These techniques, however, are characterized by their black box nature, and face scalability limitations due to the large search space. In this paper, we exploit the community structure of floating-point variables to devise a scalable hierarchical search for precision tuning. Specifically, we perform dependence analysis and edge profiling to create a weighted dependence graph that presents a network of floating-point variables. We then formulate hierarchy construction on the network as a community detection problem, and present a hierarchical search algorithm that iteratively lowers precision with regard to communities. We implement our algorithm in the tool HiFPTuner, and show that it exhibits higher search efficiency over the state of the art for 75.9% of the experiments taking 59.6% less search time on average. Moreover, HiFPTuner finds more profitable configurations for 51.7% of the experiments, with one known to be as good as the global optimum found through exhaustive search.
more »
« less
Representing Hyperbolic Space Accurately using Multi-Component Floats
Hyperbolic space is particularly useful for embedding data with hierarchical structure; however, representing hyperbolic space with ordinary floating-point numbers greatly affects the performance due to its \emph{ineluctable} numerical errors. Simply increasing the precision of floats fails to solve the problem and incurs a high computation cost for simulating greater-than-double-precision floats on hardware such as GPUs, which does not support them. In this paper, we propose a simple, feasible-on-GPUs, and easy-to-understand solution for numerically accurate learning on hyperbolic space. We do this with a new approach to represent hyperbolic space using multi-component floating-point (MCF) in the Poincar{\'e} upper-half space model. Theoretically and experimentally we show our model has small numerical error, and on embedding tasks across various datasets, models represented by multi-component floating-points gain more capacity and run significantly faster on GPUs than prior work.
more »
« less
- Award ID(s):
- 2008102
- PAR ID:
- 10387215
- Editor(s):
- Ranzato, M.; Beygelzimer, A.; Dauphin Y.; Liang, P.S.; Wortman Vaughan, J.
- Date Published:
- Journal Name:
- Advances in neural information processing systems
- Volume:
- 34
- ISSN:
- 1049-5258
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Phylogenetic placement, used widely in ecological analyses, seeks to add a new species to an existing tree. A deep learning approach was previously proposed to estimate the distance between query and backbone species by building a map from gene sequences to a high-dimensional space that preserves species tree distances. They then use a distance-based placement method to place the queries on that species tree. In this paper, we examine the appropriate geometry for faithfully representing tree distances while embedding gene sequences. Theory predicts that hyperbolic spaces should provide a drastic reduction in distance distortion compared to the conventional Euclidean space. Nevertheless, hyperbolic embedding imposes its own unique challenges related to arithmetic operations, exponentially-growing functions, and limited bit precision, and we address these challenges. Our results confirm that hyperbolic embeddings have substantially lower distance errors than Euclidean space. However, these better-estimated distances do not always lead to better phylogenetic placement. We then show that the deep learning framework can be used not just to place on a backbone tree but to update it to obtain a fully resolved tree. With our hyperbolic embedding framework, species trees can be updated remarkably accurately with only a handful of genes.more » « less
-
Outlier detection (OD) is a key machine learning task for finding rare and deviant data samples, with many time-critical applications such as fraud detection and intrusion detection. In this work, we propose TOD, the first tensor-based system for efficient and scalable outlier detection on distributed multi-GPU machines. A key idea behind TOD is decomposing complex OD applications into a small collection of basic tensor algebra operators. This decomposition enables TOD to accelerate OD computations by leveraging recent advances in deep learning infrastructure in both hardware and software. Moreover, to deploy memory-intensive OD applications on modern GPUs with limited on-device memory, we introduce two key techniques. First, provable quantization speeds up OD computations and reduces its memory footprint by automatically performing specific floating-point operations in lower precision while provably guaranteeing no accuracy loss. Second, to exploit the aggregated compute resources and memory capacity of multiple GPUs, we introduce automatic batching , which decomposes OD computations into small batches for both sequential execution on a single GPU and parallel execution across multiple GPUs. TOD supports a diverse set of OD algorithms. Evaluation on 11 real-world and 3 synthetic OD datasets shows that TOD is on average 10.9X faster than the leading CPU-based OD system PyOD (with a maximum speedup of 38.9X), and can handle much larger datasets than existing GPU-based OD systems. In addition, TOD allows easy integration of new OD operators, enabling fast prototyping of emerging and yet-to-discovered OD algorithms.more » « less
-
Hyperbolic neural networks have been popular in the re- cent past due to their ability to represent hierarchical data sets effectively and efficiently. The challenge in develop- ing these networks lies in the nonlinearity of the embed- ding space namely, the Hyperbolic space. Hyperbolic space is a homogeneous Riemannian manifold of the Lorentz group which is a semi-Riemannian manifold, i.e. a mani- fold equipped with an indefinite metric. Most existing meth- ods (with some exceptions) use local linearization to de- fine a variety of operations paralleling those used in tra- ditional deep neural networks in Euclidean spaces. In this paper, we present a novel fully hyperbolic neural network which uses the concept of projections (embeddings) fol- lowed by an intrinsic aggregation and a nonlinearity all within the hyperbolic space. The novelty here lies in the projection which is designed to project data on to a lower- dimensional embedded hyperbolic space and hence leads to a nested hyperbolic space representation independently useful for dimensionality reduction. The main theoretical contribution is that the proposed embedding is proved to be isometric and equivariant under the Lorentz transforma- tions, which are the natural isometric transformations in hyperbolic spaces. This projection is computationally effi- cient since it can be expressed by simple linear operations, and, due to the aforementioned equivariance property, it al- lows for weight sharing. The nested hyperbolic space rep- resentation is the core component of our network and there- fore, we first compare this representation – independent of the network – with other dimensionality reduction methods such as tangent PCA, principal geodesic analysis (PGA) and HoroPCA. Based on this equivariant embedding, we develop a novel fully hyperbolic graph convolutional neural network architecture to learn the parameters of the projec- tion. Finally, we present experiments demonstrating com- parative performance of our network on several publicly available data sets.more » « less
-
Abstract Growing constraints on memory utilization, power consumption, and I/O throughput have increasingly become limiting factors to the advancement of high performance computing (HPC) and edge computing applications. IEEE‐754 floating‐point types have been the de facto standard for floating‐point number systems for decades, but the drawbacks of this numerical representation leave much to be desired. Alternative representations are gaining traction, both in HPC and machine learning environments. Posits have recently been proposed as a drop‐in replacement for the IEEE‐754 floating‐point representation. We survey the state‐of‐the‐art and state‐of‐the‐practice in the development and use of posits in edge computing and HPC. The current literature supports posits as a promising alternative to traditional floating‐point systems, both as a stand‐alone replacement and in a mixed‐precision environment. Development and standardization of the posit type is ongoing, and much research remains to explore the application of posits in different domains, how to best implement them in hardware, and where they fit with other numerical representations.more » « less