

This content will become publicly available on June 18, 2024

Title: Counterfactual Explanations of Neural Network-Generated Response Curves
Response curves exhibit the magnitude of the response of a sensitive system to a varying stimulus. However, the response of such systems may be sensitive to multiple stimuli (i.e., input features) that are not necessarily independent. As a consequence, the shape of response curves generated for a selected input feature (referred to as the “active feature”) might depend on the values of the other input features (referred to as “passive features”). In this work, we consider the case of systems whose response is approximated using regression neural networks. We propose to use counterfactual explanations (CFEs) to identify the features with the highest relevance to the shape of response curves generated by neural network black boxes. CFEs are generated by a genetic algorithm-based approach that solves a multi-objective optimization problem. In particular, given a response curve generated for an active feature, a CFE finds the minimal combination of passive features that need to be modified to alter the shape of the response curve. We tested our method on a synthetic dataset with 1-D inputs and two crop yield prediction datasets with 2-D inputs. The relevance ranking of features and feature combinations obtained on the synthetic dataset coincided with the analysis of the equation that was used to generate the problem. Results obtained on the yield prediction datasets revealed that the impact of passive features on fertilizer responsivity depends on the terrain characteristics of each field.
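The CFE search described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it substitutes an exhaustive search over passive-feature subsets for the genetic multi-objective optimizer, and the toy black-box model, active-feature grid, perturbation sizes, and shape-change tolerance are all invented for the example. Curves are mean-centered before comparison so that only shape changes (not level shifts) count.

```python
import itertools
import numpy as np

def response_curve(model, active_grid, passives):
    """Evaluate the black box over a sweep of the active feature."""
    return np.array([model(a, passives) for a in active_grid])

def find_cfe(model, active_grid, passives, deltas, shape_tol=0.5):
    """Find the smallest set of passive features whose modification changes
    the response-curve shape by more than shape_tol (L2 distance between
    mean-centered curves). Brute force stands in for the paper's GA."""
    base = response_curve(model, active_grid, passives)
    base = base - base.mean()
    n = len(passives)
    for k in range(1, n + 1):                       # prefer fewer changed features
        for idx in itertools.combinations(range(n), k):
            cand = passives.copy()
            for i in idx:
                cand[i] += deltas[i]
            curve = response_curve(model, active_grid, cand)
            curve = curve - curve.mean()
            if np.linalg.norm(curve - base) > shape_tol:
                return idx, cand
    return None, passives

# Toy black box: the active feature's slope depends on passive feature 0,
# while passive feature 1 only shifts the curve's level.
model = lambda a, p: (1.0 + p[0]) * a + 0.1 * p[1]
grid = np.linspace(0, 1, 20)
idx, cand = find_cfe(model, grid, np.array([0.0, 0.0]), np.array([2.0, 2.0]))
```

On this toy model the search returns feature 0 alone, matching the intuition that only slope-coupled passive features alter the curve's shape.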
Award ID(s):
1664858
NSF-PAR ID:
10493698
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
International Joint Conference on Neural Networks (IJCNN)
Page Range / eLocation ID:
01 to 08
Format(s):
Medium: X
Location:
Gold Coast, Australia
Sponsoring Org:
National Science Foundation
More Like this
  1. Domain adaptation techniques using deep neural networks have mainly been used to solve the distribution-shift problem in homogeneous domains, where data usually share similar feature spaces and have the same dimensionalities. Nevertheless, real-world applications often deal with heterogeneous domains that come from completely different feature spaces with different dimensionalities. In our remote sensing application, two remote sensing datasets, collected by an active sensor and a passive one, are heterogeneous. In particular, CALIOP actively measures each atmospheric column. In this study, 25 measured variables/features that are sensitive to cloud phase are used, and they are fully labeled. VIIRS is an imaging radiometer that collects radiometric measurements of the surface and atmosphere in the visible and infrared bands. Recent studies have shown that passive sensors may have difficulty predicting cloud/aerosol types in complicated atmospheres (e.g., overlapping cloud and aerosol layers, cloud over snow/ice surfaces, etc.). To overcome the challenge of cloud property retrieval with passive sensors, we develop a novel VAE-based approach that learns domain-invariant representations capturing the spatial pattern from multiple satellite remote sensing datasets (VDAM), and use it to build a domain-invariant cloud property retrieval method that accurately classifies different cloud types (labels) in the passive sensing dataset. We further exploit a weight-based alignment method on the label space to learn a powerful domain adaptation technique pertinent to the remote sensing application. Experiments demonstrate that our method outperforms other state-of-the-art machine learning methods and achieves higher accuracy in cloud property retrieval on the passive satellite dataset.
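The heterogeneous-domain setup can be illustrated with a minimal sketch: two linear encoders with different input dimensionalities (25 CALIOP-like features, as in the abstract, and an assumed 16 VIIRS-like features) project into a shared latent space, and a crude mean-alignment loss stands in for the distribution-alignment terms a VAE-based method such as VDAM would use. All sizes except the 25 source features, the data, and the optimization details are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Heterogeneous domains: 25 source features (CALIOP-like) vs. 16 target
# features (VIIRS-like; the 16 is an illustrative choice).
x_src = rng.normal(size=(100, 25))
x_tgt = rng.normal(size=(100, 16)) + 1.0          # shifted target domain

d_latent = 8
W_src = rng.normal(scale=0.1, size=(25, d_latent))
W_tgt = rng.normal(scale=0.1, size=(16, d_latent))

def align_loss(W_s, W_t):
    """Squared distance between the domains' mean latent embeddings --
    a crude stand-in for a proper distribution-alignment objective."""
    diff = (x_src @ W_s).mean(0) - (x_tgt @ W_t).mean(0)
    return float(diff @ diff)

# Gradient descent on the target encoder only: for this quadratic loss the
# gradient w.r.t. W_tgt is -2 * outer(mean(x_tgt), diff).
m_t = x_tgt.mean(0)
before = align_loss(W_src, W_tgt)
for _ in range(200):
    diff = (x_src @ W_src).mean(0) - m_t @ W_tgt
    W_tgt += 0.01 * 2 * np.outer(m_t, diff)
after = align_loss(W_src, W_tgt)
```

After training, both domains map into the same 8-dimensional latent space with matched means, which is the precondition for training a single classifier on the shared representation.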
  2. This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position where participants believe a target object is located; level of confidence in the binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participants' perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. To assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively large, properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
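One simple way to combine crowdsourced binary labels with self-reported confidences is a confidence-weighted vote, which also exposes the two features the study found useful for its ML classifiers (mean label and mean confidence). This is an illustrative aggregator, not the paper's method, and the function name and example inputs are invented.

```python
from statistics import mean

def aggregate(labels, confidences):
    """Confidence-weighted vote over crowdsourced binary labels.

    labels: list of +1/-1 responses; confidences: self-reported values in
    [0, 1]. Returns the inferred label plus the mean binary label and mean
    confidence (candidate features for a downstream ML classifier)."""
    score = sum(l * c for l, c in zip(labels, confidences))
    inferred = 1 if score >= 0 else -1
    return inferred, mean(labels), mean(confidences)

# Three workers: two say positive (confidently and hesitantly), one says
# negative with low confidence -> the weighted vote is positive.
label, mean_label, mean_conf = aggregate([1, 1, -1], [0.9, 0.6, 0.4])
```

A plain majority vote would give the same answer here, but the weighted score changes the outcome when a confident minority opposes a hesitant majority.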
  3. Data description This dataset presents the raw and augmented data that were used to train the machine learning (ML) models for classification of printing outcome in projection two-photon lithography (P-TPL). P-TPL is an additive manufacturing technique for the fabrication of cm-scale complex 3D structures with features smaller than 200 nm. The P-TPL process is further described in this article: “Saha, S. K., Wang, D., Nguyen, V. H., Chang, Y., Oakdale, J. S., and Chen, S.-C., 2019, "Scalable submicrometer additive manufacturing," Science, 366(6461), pp. 105-109.” This specific dataset refers to the case wherein a set of five line features were projected and the printing outcome was classified into three classes: ‘no printing’, ‘printing’, and ‘overprinting’. Each datapoint comprises a set of ten inputs (i.e., attributes) and one output (i.e., target) corresponding to these inputs. The inputs are: optical power (P), polymerization rate constant at the beginning of polymer conversion (kp-0), radical quenching rate constant (kq), termination rate constant at the beginning of polymer conversion (kt-0), number of optical pulses (N), kp exponential function shape parameter (A), kt exponential function shape parameter (B), quantum yield of the photoinitiator (QY), initial photoinitiator concentration (PIo), and the threshold degree of conversion (DOCth). The output variable is ‘Class’, which can take three values: -1 for the class ‘no printing’, 0 for the class ‘printing’, and 1 for the class ‘overprinting’. The raw data (i.e., the non-augmented data) refers to the data generated from finite element simulations of P-TPL. The augmented data was obtained from the raw data (1) by changing the DOCth and re-processing a solved finite element model or (2) by applying physics-based prior process knowledge. For example, it is known that if a given set of parameters failed to print, then decreasing the parameters that are positively correlated with printing (e.g., kp-0, power), while keeping the other parameters constant, would also lead to no printing. Here, positive correlation means that individually increasing the input parameter will lead to an increase in the amount of printing. Similarly, increasing the parameters that are negatively correlated with printing (e.g., kq, kt-0), while keeping the other parameters constant, would also lead to no printing. The converse is true for those datapoints that resulted in overprinting. The 'Raw.csv' file contains the datapoints generated from finite element simulations, the 'Augmented.csv' file contains the datapoints generated via augmentation, and the 'Combined.csv' file contains the datapoints from both files. The ML models were trained on the combined dataset that included both raw and augmented data.
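The physics-based augmentation rule can be sketched directly from the description: a failed print stays failed if positively correlated parameters are scaled down (or negatively correlated ones scaled up), and the converse for overprinting. Attribute names follow the dataset description; the scaling factor and dictionary encoding are illustrative choices, not taken from the dataset.

```python
def augment(point, cls, pos_keys=("P", "kp-0"), neg_keys=("kq", "kt-0"), step=0.9):
    """Generate a physics-consistent synthetic datapoint from a labeled one.

    For a 'no printing' point (cls == -1), scaling down positively correlated
    parameters and scaling up negatively correlated ones must also fail to
    print; for 'overprinting' (cls == +1) the directions are reversed.
    'printing' points (cls == 0) are not augmented by this rule."""
    if cls == 0:
        return None
    new = dict(point)
    for k in pos_keys:
        new[k] *= step if cls == -1 else 1 / step
    for k in neg_keys:
        new[k] *= 1 / step if cls == -1 else step
    return new

# A hypothetical 'no printing' datapoint: lower P and kp-0, raise kq and kt-0.
aug = augment({"P": 10.0, "kp-0": 2.0, "kq": 1.0, "kt-0": 5.0}, cls=-1)
```

Each augmented point inherits its parent's class label, which is how the 'Augmented.csv' rows can be generated without re-running the finite element model.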
  4.
    Objective. Background noise experienced during extracellular neural recording limits the number of spikes that can be reliably detected, which ultimately limits the performance of next-generation neuroscientific work. In this study, we aim to utilize stochastic resonance (SR), a technique that can help identify weak signals in noisy environments, to enhance spike detectability. Approach. Previously, an SR-based pre-emphasis algorithm was proposed, in which a particle inside a 1D potential well is driven by a force defined by the extracellular recording, and the output is obtained as the displacement of the particle. In this study, we investigate how the well shape and damping status impact the output signal-to-noise ratio (SNR). We compare the overdamped and underdamped solutions of shallow- and steep-wall monostable wells and bistable wells in terms of SNR improvement using two synthetic datasets. Then, we assess the spike detection performance when thresholding is applied to the output of the well shape-damping status configuration giving the best SNR enhancement. Main results. The SNR depends on the well shape and damping status as well as the input noise level. The underdamped solution of the shallow-wall monostable well can yield more than four orders of magnitude greater SNR improvement than other configurations for low noise intensities. Using this configuration also results in better spike detection sensitivity and positive predictivity than state-of-the-art spike detection algorithms on a public synthetic dataset. For larger noise intensities, the overdamped solution of the steep-wall monostable well provides better spike enhancement than the others. Significance. The dependence of SNR improvement on the input signal noise level can be used to design a detector with multiple outputs, each more sensitive to a certain distance from the electrode.
Such a detector can potentially enhance the performance of a successive spike sorting stage. 
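The SR pre-emphasis idea can be sketched in its simplest form: an overdamped particle in a bistable quartic well, driven by the recording, integrated with forward Euler. The well parameters, drive signal, and step size are illustrative; the paper's underdamped and monostable variants would change the potential and add an inertia term.

```python
import numpy as np

def sr_filter(signal, dt=1e-3, a=1.0, b=1.0):
    """Overdamped particle in a bistable quartic well
    U(x) = -a*x**2/2 + b*x**4/4, driven by the recorded signal:

        dx/dt = a*x - b*x**3 + s(t)

    The particle's displacement is the pre-emphasized output; the well
    shape (a, b) and the damping regime are the knobs the study sweeps."""
    x = 0.0
    out = np.empty_like(signal)
    for i, s in enumerate(signal):
        x += dt * (a * x - b * x**3 + s)   # forward Euler step
        out[i] = x
    return out

# A strong 5 Hz drive makes the particle hop between the two wells.
t = np.arange(0.0, 1.0, 1e-3)
out = sr_filter(50.0 * np.sin(2 * np.pi * 5 * t))
```

A weak drive leaves the particle rattling inside one well, while a sufficiently strong (or noise-assisted) drive produces large inter-well excursions, which is the amplification effect exploited for spike detection.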
  5. Passive remote sensing services are indispensable in modern society because of applications related to climate studies and Earth science. Among those, NASA's Soil Moisture Active Passive (SMAP) mission provides an essential climate variable, the moisture content of the soil, by using microwave radiation within the protected band at 1400-1427 MHz. However, because of increasing active wireless technologies such as the Internet of Things (IoT), unmanned aerial vehicles (UAVs), and 5G wireless communication, SMAP's passive observations are expected to experience an increasing amount of Radio Frequency Interference (RFI). RFI is a well-documented issue, and SMAP has a ground processing unit dedicated to tackling it. However, advanced techniques are needed to address the growing RFI problem for passive sensing systems and to allow communication and sensing systems to coexist. In this paper, we apply a deep learning approach in which a novel Convolutional Neural Network (CNN) architecture is employed for both RFI detection and mitigation. SMAP Level 1A spectrograms of antenna counts and various moments data are used as inputs to the deep learning architecture. We simulate different types of RFI sources such as pulsed, CW, or wideband anthropogenic signals. We then use artificially corrupted SMAP Level 1B antenna measurements in conjunction with RFI labels to train the learning architecture. While the learned detection network classifies input spectrograms as RFI or no-RFI cases, the mitigation network reconstructs the RFI-mitigated antenna temperature images. The proposed learning framework takes advantage of both the existing SMAP data and the simulated RFI scenarios. Future remote sensing systems such as radiometers will suffer from an increasing RFI problem, and spectrum sharing techniques that allow coexistence of sensing and communication systems will be of utmost importance to both parties. RFI detection and mitigation will remain a prerequisite for these radiometers, and the proposed deep learning approach has the potential to provide an additional perspective to existing solutions. We will present a detailed analysis of the selected deep learning architecture, the obtained RFI detection accuracy levels, and the RFI mitigation performance.
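A toy version of the detection task: a simulated spectrogram with a CW tone occupying one frequency bin, flagged by a robust per-bin energy outlier test. This is a simple energy detector standing in for the paper's CNN, and the spectrogram dimensions, noise model, tone power, and threshold are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_spectrogram(n_t=64, n_f=32, cw_bin=None, cw_power=10.0):
    """Toy antenna-count spectrogram: noise-like power in every bin, plus an
    optional continuous-wave RFI tone in one frequency bin (a simplified
    stand-in for the simulated pulsed/CW/wideband sources)."""
    spec = rng.chisquare(df=2, size=(n_t, n_f))
    if cw_bin is not None:
        spec[:, cw_bin] += cw_power
    return spec

def detect_rfi(spec, z_thresh=4.0):
    """Flag RFI if any frequency bin's mean power is a robust outlier
    (median/MAD z-score) relative to the other bins."""
    bin_power = spec.mean(axis=0)
    med = np.median(bin_power)
    mad = np.median(np.abs(bin_power - med)) + 1e-12
    z = (bin_power - med) / (1.4826 * mad)         # MAD scaled to ~sigma
    return bool(np.any(z > z_thresh))

clean = detect_rfi(make_spectrogram())
rfi = detect_rfi(make_spectrogram(cw_bin=7))
```

A learned detector earns its keep on the cases this heuristic misses, such as weak wideband interference spread across many bins, where no single bin is an outlier.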