skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Fast quasi-harmonic weights for geometric data interpolation
We proposequasi-harmonic weightsfor interpolating geometric data, which are orders of magnitude faster to compute than state-of-the-art. Currently, interpolation (or, skinning) weights are obtained by solving large-scale constrained optimization problems with explicit constraints to suppress oscillative patterns, yielding smooth weights only after a substantial amount of computation time. As an alternative, our weights are obtained as minima of an unconstrained problem that can be optimized quickly using straightforward numerical techniques. We consider weights that can be obtained as solutions to a parameterized family of second-order elliptic partial differential equations. By leveraging the maximum principle and careful parameterization, we pose weight computation as an inverse problem of recovering optimal anisotropic diffusivity tensors. In addition, we provide a customized ADAM solver that significantly reduces the number of gradient steps; our solver only requires inverting tens of linear systems that share the same sparsity pattern. Overall, our approach achieves orders of magnitude acceleration compared to previous methods, allowing weight computation in near real-time.  more » « less
Award ID(s):
1838071
PAR ID:
10603027
Author(s) / Creator(s):
 ;  
Publisher / Repository:
Association for Computing Machinery (ACM)
Date Published:
Journal Name:
ACM Transactions on Graphics
Volume:
40
Issue:
4
ISSN:
0730-0301
Format(s):
Medium: X Size: p. 1-15
Size(s):
p. 1-15
Sponsoring Org:
National Science Foundation
More Like this
  1. Radial basis functions are typically used when discretization schemes require inhomogeneous node distributions. While spawning from a desire to interpolate functions on a random set of nodes, they have found successful applications in solving many types of differential equations. However, the weights of the interpolated solution, used in the linear superposition of basis functions to interpolate the solution, and the actual value of the solution are completely different. In fact, these weights mix the value of the solution with the geometrical location of the nodes used to discretize the equation. In this paper, we used nodal radial basis functions, which are interpolants of the impulse function at each node inside the domain. This transformation allows to solve a linear hyperbolic partial differential equation using series expansion rather than the explicit computation of a matrix inverse. This transformation effectively yields an implicit solver which only requires the multiplication of vectors with matrices. Because the solver requires neither matrix inverse nor matrix-matrix products, this approach is numerically more stable and reduces the error by at least two orders of magnitude, compared to solvers using radial basis functions directly. Further, boundary conditions are integrated directly inside the solver, at no extra cost. The method is locally conservative, keeping the error virtually constant throughout the computation. 
    more » « less
  2. Computation of injective (or inversion-free) maps is a key task in geometry processing, physical simulation, and shape optimization. Despite being a longstanding problem, it remains challenging due to its highly nonconvex and combinatoric nature. We propose computation ofvariational quasi-harmonic mapsto obtain smooth inversion-free maps. Our work is built on a key observation about inversion-free maps: A planar map is a diffeomorphism if and only if it is quasi-harmonic and satisfies a special Cauchy boundary condition. We hence equate the inversion-free mapping problem to an optimal control problem derived from our theoretical result, in which we search in the space of parameters that define an elliptic PDE. We show that this problem can be solved by minimizing within a family of functionals. Similarly, our discretized functionals admit exactly injective maps as the minimizers, empirically producing inversion-free discrete maps of triangle meshes. We design efficient numerical procedures for our problem that prioritize robust convergence paths. Experiments show that on challenging examples our methods can achieve up to orders of magnitude improvement over state-of-the-art, in terms of speed or quality. Moreover, we demonstrate how to optimize a generic energy in our framework while restricting to quasi-harmonic maps. 
    more » « less
  3. Abstract Anomaly detection in real-time using autoencoders implemented on edge devices is exceedingly challenging due to limited hardware, energy, and computational resources. We show that these limitations can be addressed by designing an autoencoder with low-resolution non-volatile memory-based synapses and employing an effective quantized neural network learning algorithm. We further propose nanoscale ferromagnetic racetracks with engineered notches hosting magnetic domain walls (DW) as exemplary non-volatile memory-based autoencoder synapses, where limited state (5-state) synaptic weights are manipulated by spin orbit torque (SOT) current pulses to write different magnetoresistance states. The performance of anomaly detection of the proposed autoencoder model is evaluated on the NSL-KDD dataset. Limited resolution and DW device stochasticity aware training of the autoencoder is performed, which yields comparable anomaly detection performance to the autoencoder having floating-point precision weights. While the limited number of quantized states and the inherent stochastic nature of DW synaptic weights in nanoscale devices are typically known to negatively impact the performance, our hardware-aware training algorithm is shown to leverage these imperfect device characteristics to generate an improvement in anomaly detection accuracy (90.98%) compared to accuracy obtained with floating-point synaptic weights that are extremely memory intensive. Furthermore, our DW-based approach demonstrates a remarkable reduction of at least three orders of magnitude in weight updates during training compared to the floating-point approach, implying significant reduction in operation energy for our method. This work could stimulate the development of extremely energy efficient non-volatile multi-state synapse-based processors that can perform real-time training and inference on the edge with unsupervised data. 
    more » « less
  4. A new class of neuromorphic processors promises to provide fast and power-efficient execution of spiking neural networks with on-chip synaptic plasticity. This efficiency derives in part from the fine-grained parallelism as well as event-driven communication mediated by spatially and temporally sparse spike messages. Another source of efficiency arises from the close spatial proximity between synapses and the sites where their weights are applied and updated. This proximity of compute and memory elements drastically reduces expensive data movements but imposes the constraint that only local operations can be efficiently performed, similar to constraints present in biological neural circuits. Efficient weight update operations should therefore only depend on information available locally at each synapse as non-local operations that involve copying, taking a transpose, or normalizing an entire weight matrix are not efficiently supported by present neuromorphic architectures. Moreover, spikes are typically non-negative events, which imposes additional constraints on how local weight update operations can be performed. The Locally Competitive Algorithm (LCA) is a dynamical sparse solver that uses only local computations between non-spiking leaky integrator neurons, allowing for massively parallel implementations on compatible neuromorphic architectures such as Intel's Loihi research chip. It has been previously demonstrated that non-spiking LCA can be used to learn dictionaries of convolutional kernels in an unsupervised manner from raw, unlabeled input, although only by employing non-local computation and signed non-spiking outputs. Here, we show how unsupervised dictionary learning with spiking LCA (S-LCA) can be implemented using only local computation and unsigned spike events, providing a promising strategy for constructing self-organizing neuromorphic chips. 
    more » « less
  5. We describe a paradigm for designing parallel algorithms via approximation, and illustrate it on the b-edgecover problem. A b-edgecover of minimum weight in a graph is a subset $$C$$ of its edges such that at least a specified number $b(v)$ of edges in $$C$$ is incident on each vertex $$v$$, and the sum of the edge weights in $$C$$ is minimum. The Greedy algorithm and a variant, the LSE algorithm, provide $3/2$-approximation guarantees in the worst-case for this problem, but these algorithms have limited parallelism. Hence we design two new $$2$$-approximation algorithms with greater concurrency. The MCE algorithm reduces the computation of a b-edgecover to that of finding a b'-matching, by exploiting the relationship between these subgraphs in an approximation context. The LSE-NW is derived from the LSEalgorithm using static edge weights rather than dynamically computing effective edge weights. This relaxation gives LSE a worse approximation guarantee but makes it more amenable to parallelization. We prove that both the MCE and LSE-NW algorithms compute the same b-edgecover with at most twice the weight of the minimum weight edge cover. In practice, the $$2$$-approximation and $3/2$-approximation algorithms compute edge covers of weight within $$10\%$$ the optimal. We implement three of the approximation algorithms, MCE, LSE, and LSE-NW on shared memory multi-core machines, including an Intel Xeon and an IBM Power8 machine with 8 TB memory. The MCE algorithm is the fastest of these by an order of magnitude or more. It computes an edge cover in a graph with billions of edges in $20$ seconds using two hundred threads on the IBM Power8. We also show that the parallel depth and work can be bounded for the Suitor and b-Suitor algorithms when edge weights are random. 
    more » « less