Evaluating the exact first derivative of a feedforward neural network (FFNN) output with respect to the input feature is pivotal for performing the sensitivity analysis of the trained neural network with respect to the inputs. In this paper, a novel method is presented that computes the analytical quality first derivative of a trained feedforward neural network output with respect to the input features without the need for backpropagation. To this end, the complex step derivative approximation is illustrated, and its implementation in the framework of the feedforward neural network is described. Artificial datasets are generated, and the efficacy of the proposed method for both regression and classification tasks is demonstrated. The results obtained for the regression task indicated that the proposed method is capable of obtaining analytical quality derivatives, and in the case of the classification task, the least relevant features could be identified.
more » « less- Award ID(s):
- 1946202
- PAR ID:
- 10248959
- Publisher / Repository:
- Springer Science + Business Media
- Date Published:
- Journal Name:
- Journal of Big Data
- Volume:
- 8
- Issue:
- 1
- ISSN:
- 2196-1115
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Implicit neural networks are a general class of learning models that replace the layers in traditional feedforward models with implicit algebraic equations. Compared to traditional learning models, implicit networks offer competitive performance and reduced memory consumption. However, they can remain brittle with respect to input adversarial perturbations. This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks; our framework blends together mixed monotone systems theory and contraction theory. First, given an implicit neural network, we introduce a related embedded network and show that, given an infinity-norm box constraint on the input, the embedded network provides an infinity-norm box overapproximation for the output of the original network. Second, using infinity-matrix measures, we propose sufficient conditions for well-posedness of both the original and embedded system and design an iterative algorithm to compute the infinity-norm box robustness margins for reachability and classification problems. Third, of independent value, we show that employing a suitable relative classifier variable in our analysis will lead to tighter bounds on the certified adversarial robustness in classification problems. Finally, we perform numerical simulations on a Non-Euclidean Monotone Operator Network (NEMON) trained on the MNIST dataset. In these simulations, we compare the accuracy and run time of our mixed monotone contractive approach with the existing robustness verification approaches in the literature for estimating the certified adversarial robustness.more » « less
-
Transfer learning using ImageNet pre-trained models has been the de facto approach in a wide range of computer vision tasks. However, fine-tuning still requires task-specific training data. In this paper, we propose N3 (Neural Networks from Natural Language) - a new paradigm of synthesizing task-specific neural networks from language descriptions and a generic pre-trained model. N3 leverages language descriptions to generate parameter adaptations as well as a new task-specific classification layer for a pre-trained neural network, effectively “fine-tuning” the network for a new task using only language descriptions as input. To the best of our knowledge, N3 is the first method to synthesize entire neural networks from natural language. Experimental results show that N3 can out-perform previous natural-language based zero-shot learning methods across 4 different zero-shot image classification benchmarks. We also demonstrate a simple method to help identify keywords in language descriptions leveraged by N3 when synthesizing model parameters.more » « less
-
Abstract Sensitivity analysis is a popular feature selection approach employed to identify the important features in a dataset. In sensitivity analysis, each input feature is perturbed one-at-a-time and the response of the machine learning model is examined to determine the feature's rank. Note that the existing perturbation techniques may lead to inaccurate feature ranking due to their sensitivity to perturbation parameters. This study proposes a novel approach that involves the perturbation of input features using a complex-step. The implementation of complex-step perturbation in the framework of deep neural networks as a feature selection method is provided in this paper, and its efficacy in determining important features for real-world datasets is demonstrated. Furthermore, the filter-based feature selection methods are employed, and the results obtained from the proposed method are compared. While the results obtained for the classification task indicated that the proposed method outperformed other feature ranking methods, in the case of the regression task, it was found to perform more or less similar to that of other feature ranking methods.
-
Soltani, Alireza (Ed.)Feedforward network models performing classification tasks rely on highly convergent output units that collect the information passed on by preceding layers. Although convergent output-unit like neurons may exist in some biological neural circuits, notably the cerebellar cortex, neocortical circuits do not exhibit any obvious candidates for this role; instead they are highly recurrent. We investigate whether a sparsely connected recurrent neural network (RNN) can perform classification in a distributed manner without ever bringing all of the relevant information to a single convergence site. Our model is based on a sparse RNN that performs classification dynamically. Specifically, the interconnections of the RNN are trained to resonantly amplify the magnitude of responses to some external inputs but not others. The amplified and non-amplified responses then form the basis for binary classification. Furthermore, the network acts as an evidence accumulator and maintains its decision even after the input is turned off. Despite highly sparse connectivity, learned recurrent connections allow input information to flow to every neuron of the RNN, providing the basis for distributed computation. In this arrangement, the minimum number of synapses per neuron required to reach maximum memory capacity scales only logarithmically with network size. The model is robust to various types of noise, works with different activation and loss functions and with both backpropagation- and Hebbian-based learning rules. The RNN can also be constructed with a split excitation-inhibition architecture with little reduction in performance.more » « less
-
Classification of Aortic Stenosis Using Time-Frequency Features from Chest Cardio-mechanical SignalsObjectives: This paper introduces a novel method for the detection and classification of aortic stenosis (AS) using the time-frequency features of chest cardio-mechanical signals collected from wearable sensors, namely seismo-cardiogram (SCG) and gyro-cardiogram (GCG) signals. Such a method could potentially monitor high-risk patients out of the clinic. Methods: Experimental measurements were collected from twenty patients with AS and twenty healthy subjects. Firstly, a digital signal processing framework is proposed to extract time-frequency features. The features are then selected via the analysis of variance test. Different combinations of features are evaluated using the decision tree, random forest, and artificial neural network methods. Two classification tasks are conducted. The first task is a binary classification between normal subjects and AS patients. The second task is a multi-class classification of AS patients with co-existing valvular heart diseases. Results: In the binary classification task, the average accuracies achieved are 96.25% from decision tree, 97.43% from random forest, and 95.56% from neural network. The best performance is from combined SCG and GCG features with random forest classifier. In the multi-class classification, the best performance is 92.99% using the random forest classifier and SCG features. Conclusion: The results suggest that the solution could be a feasible method for classifying aortic stenosis, both in the binary and multi-class tasks. It also indicates that most of the important time-frequency features are below 11 Hz. Significance: The proposed method shows great potential to provide continuous monitoring of valvular heart diseases to prevent patients from sudden critical cardiac situations.more » « less