Learning from complex, multidimensional data has become central to computational mathematics, and among the most successful high-dimensional function approximators are deep neural networks (DNNs). Training DNNs is posed as an optimization problem to learn network weights or parameters that well-approximate a mapping from input to target data. Multiway data or tensors arise naturally in myriad ways in deep learning, in particular as input data and as high-dimensional weights and features extracted by the network, with the latter often being a bottleneck in terms of speed and memory. In this work, we leverage tensor representations and processing to efficiently parameterize DNNs when learning from high-dimensional data. We propose tensor neural networks (t-NNs), a natural extension of traditional fully-connected networks, that can be trained efficiently in a reduced, yet more powerful parameter space. Our t-NNs are built upon matrix-mimetic tensor-tensor products, which retain algebraic properties of matrix multiplication while capturing high-dimensional correlations. Mimeticity enables t-NNs to inherit desirable properties of modern DNN architectures. We exemplify this by extending recent work on stable neural networks, which interpret DNNs as discretizations of differential equations, to our multidimensional framework. We provide empirical evidence of the parametric advantages of t-NNs on dimensionality reduction using autoencoders and classification using fully-connected and stable variants on benchmark imaging datasets MNIST and CIFAR-10.
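The matrix-mimetic tensor-tensor product underlying the t-NNs is, in its best-known instance, the t-product of Kilmer and Martin: transform both tensors along the third mode with an FFT, multiply the frontal slices pairwise, and invert the transform. Below is a minimal NumPy sketch of that product and a t-NN-style forward pass, assuming the t-product instantiation; the shapes, function names, and ReLU choice are illustrative, not the paper's code.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) with B (n2 x m x n3):
    FFT along the third mode, facewise matrix multiply, inverse FFT."""
    n3 = A.shape[2]
    assert B.shape[0] == A.shape[1] and B.shape[2] == n3
    A_hat = np.fft.fft(A, axis=2)   # move tubes to the transform domain
    B_hat = np.fft.fft(B, axis=2)
    # Independent matrix products, one per frontal slice in the transform domain.
    C_hat = np.einsum('ikt,kjt->ijt', A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))  # real inputs give a real t-product

# A t-NN layer mimics a fully-connected layer with the t-product
# replacing matrix multiplication: Z = sigma(W * X), bias omitted here.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 5, 3))    # tensor weights
X = rng.standard_normal((5, 7, 3))    # lateral slices of X act as "samples"
Z = np.maximum(t_product(W, X), 0.0)  # ReLU activation, result is (4, 7, 3)
print(Z.shape)
```

Because the frontal-slice products in the transform domain are independent, this step parallelizes trivially, which is part of the efficiency appeal of tensor layers.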
- Award ID(s): 2309751
- NSF-PAR ID: 10533088
- Editor(s): Zhang, Yanqing
- Publisher / Repository: Frontiers
- Date Published:
- Journal Name: Frontiers in Big Data
- Volume: 7
- ISSN: 2624-909X
- Subject(s) / Keyword(s): tensor algebra, deep learning, machine learning, image classification, inverse problems
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract: Modern machine learning (ML) and deep learning (DL) techniques using high-dimensional data representations have helped accelerate the materials discovery process by efficiently detecting hidden patterns in existing datasets and linking input representations to output properties for a better understanding of the scientific phenomenon. While deep neural networks composed of fully connected layers have been widely used for materials property prediction, simply stacking a large number of layers often runs into the vanishing gradient problem, which degrades performance and limits their usefulness. In this paper, we study and propose architectural principles for improving training and inference performance under a fixed parameter budget. We present a general deep-learning framework based on branched residual learning (BRNet) with fully connected layers that accepts any numerical vector-based representation as input and builds accurate models for predicting materials properties. We train models using numerical vectors representing different composition-based attributes of the respective materials and compare their performance against traditional ML and existing DL architectures. We find that the proposed models are significantly more accurate than the ML/DL baselines across all data sizes and composition-based input attributes. Further, branched learning requires fewer parameters and converges faster during training than existing neural networks, thereby efficiently yielding accurate models for predicting materials properties. (A hedged sketch of a branched residual block appears after this list.)
- We propose a unified optimization framework that combines neural networks with dictionary learning to model complex interactions between resting-state functional MRI and behavioral data. The dictionary learning objective decomposes patient correlation matrices into a collection of shared basis networks and subject-specific loadings. These subject-specific features are simultaneously input into a neural network that predicts multidimensional clinical information. Our novel optimization framework combines the gradient information from the neural network with that of a conventional matrix factorization objective. This procedure collectively estimates the basis networks, subject loadings, and neural network weights most informative of clinical severity. We evaluate our combined model on a multi-score prediction task using 52 patients diagnosed with Autism Spectrum Disorder (ASD). Our integrated framework outperforms state-of-the-art methods in a ten-fold cross-validated setting when predicting three different measures of clinical severity. (A sketch of such a coupled objective appears after this list.)
- Abstract: This paper presents a deep learning method for solving an improved one-dimensional Poisson-Nernst-Planck ion channel (PNPic) model, called the PNPic deep learning solver. The solver combines a novel local neural network, adapted from the neural network with local converging inputs, with an efficient PNPic finite element solver developed in this work. In particular, the local neural network is extended to handle the complexities of the PNPic model: a system of nonlinear convection-diffusion and elliptic equations with multiple subdomains connected by interface conditions. The PNPic finite element solver efficiently generates input and reference datasets for fast training of the local neural network, as well as input datasets for quickly predicting PNPic solutions with high accuracy for a family of PNPic models. Initial numerical tests, involving perturbations of model parameters and interface locations, demonstrate that the PNPic deep learning solver can generate highly accurate numerical solutions. (A toy version of this solver-generates-training-data workflow is sketched after this list.)
- We trained convolutional neural networks (CNNs) to suppress off-axis scattering in the short-time Fourier transform (STFT) domain. Our training data were point-target responses from simulated anechoic cysts. We used random neural architecture search to build CNN models with variable input formulations, layer sizes, and training hyperparameters. Our results showed that CNNs were easier to train, as they required fewer network weights to match the performance of fully connected networks (FCNs). The best CNN models achieved comparable phantom contrast-to-noise ratios (CNRs) with two to three orders of magnitude fewer weights. (A hedged sketch of an STFT-domain CNN appears after this list.)
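For the branched residual learning (BRNet) item above, the abstract does not specify the architecture; the PyTorch sketch below only illustrates the general idea of parallel fully connected branches merged with a skip connection. The layer widths, branch count, feature size, and merge rule are assumptions.

```python
import torch
import torch.nn as nn

class BranchedResidualBlock(nn.Module):
    """Illustrative block: two parallel fully connected branches whose
    outputs are summed with a skip connection (not the exact BRNet)."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.branch_a = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.branch_b = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return torch.relu(x + self.branch_a(x) + self.branch_b(x))  # residual merge

# Composition-based attribute vector -> scalar property prediction.
model = nn.Sequential(
    nn.Linear(145, 64),          # 145 composition features is an assumed size
    BranchedResidualBlock(64, 128),
    BranchedResidualBlock(64, 128),
    nn.Linear(64, 1),
)
y = model(torch.randn(32, 145))  # batch of 32 materials
print(y.shape)                   # torch.Size([32, 1])
```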
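For the dictionary-learning item above, here is a minimal sketch of one way to couple a matrix factorization term (patient correlation matrices factored as B diag(c_n) B^T) with a network prediction loss so that gradients update the bases, loadings, and network weights jointly. The dimensions, variable names, and single Adam loop are assumptions, not the authors' algorithm.

```python
import torch
import torch.nn as nn

n_rois, n_bases, n_scores = 116, 8, 3  # assumed: ROIs, basis networks, clinical scores
B = torch.randn(n_rois, n_bases, requires_grad=True)  # shared basis networks
C = torch.randn(52, n_bases, requires_grad=True)      # loadings for 52 ASD patients
predictor = nn.Sequential(nn.Linear(n_bases, 16), nn.ReLU(), nn.Linear(16, n_scores))

corr = torch.randn(52, n_rois, n_rois)  # stand-in for patient correlation matrices
scores = torch.randn(52, n_scores)      # stand-in for clinical severity measures
opt = torch.optim.Adam([B, C] + list(predictor.parameters()), lr=1e-3)

for step in range(200):
    opt.zero_grad()
    # Dictionary term: reconstruct Gamma_n ~ B diag(c_n) B^T for each subject n.
    recon = torch.einsum('ik,nk,jk->nij', B, C, B)
    loss = ((corr - recon) ** 2).mean() + nn.functional.mse_loss(predictor(C), scores)
    loss.backward()  # gradients flow into B, C, and the network jointly
    opt.step()
```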
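For the PNPic item above, the sketch below mirrors only the data flow described: a cheap PDE solver generates parameter/solution pairs that train a small surrogate network. A 1D Poisson finite-difference problem stands in for the actual nonlinear PNPic system and finite element solver; everything here is a toy assumption.

```python
import numpy as np
import torch
import torch.nn as nn

def solve_poisson(f_vals, h):
    """Finite-difference solve of -u'' = f on [0,1], u(0) = u(1) = 0."""
    n = len(f_vals)
    A = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    return np.linalg.solve(A, f_vals)

n, h = 63, 1.0 / 64
x = np.linspace(h, 1 - h, n)
params = np.random.uniform(0.5, 2.0, size=(500, 1))  # perturbed model parameter
solutions = np.stack([solve_poisson(p * np.sin(np.pi * x), h) for p in params[:, 0]])

# Train a small surrogate: parameter -> discrete solution.
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, n))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
P = torch.tensor(params, dtype=torch.float32)
U = torch.tensor(solutions, dtype=torch.float32)
for step in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(P), U)
    loss.backward()
    opt.step()
```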
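For the STFT-domain beamforming item above, here is a sketch of the kind of small CNN involved, with real and imaginary STFT components stacked as input channels. The layer sizes and tensor shapes are placeholders; the paper's models came from random architecture search and are not reproduced here.

```python
import torch
import torch.nn as nn

class STFTFilterCNN(nn.Module):
    """Illustrative only: a small CNN acting on STFT-domain channel data,
    predicting filtered real/imag components to suppress off-axis scatter."""
    def __init__(self, channels=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):  # x: (batch, 2, array_elements, freq_bins)
        return self.net(x)

stft = torch.randn(8, 2, 65, 16)  # batch of 8 STFT patches (assumed dimensions)
out = STFTFilterCNN()(stft)
print(out.shape)                  # torch.Size([8, 2, 65, 16])
```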