Title: Vertex finding in neutrino-nucleus interaction: a model architecture comparison
Abstract: We compare different neural network architectures for machine learning algorithms designed to identify the neutrino interaction vertex position in the MINERvA detector. The architectures developed and optimized by hand are compared with the architectures developed in an automated way using the package “Multi-node Evolutionary Neural Networks for Deep Learning” (MENNDL), developed at Oak Ridge National Laboratory. While the domain-expert hand-tuned network was the best performer, the differences were negligible and the auto-generated networks performed comparably well. There is always a trade-off between human and computer resources for network optimization, and this work suggests that automated optimization, assuming resources are available, provides a compelling way to save significant expert time.
Award ID(s):
2111053 2013217
Date Published:
Journal Name:
Journal of Instrumentation
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Groote, Jan Friso; Larsen, Kim Guldstrand (Eds.)
    Deep learning has emerged as an effective approach for creating modern software systems, with neural networks often surpassing hand-crafted systems. Unfortunately, neural networks are known to suffer from various safety and security issues. Formal verification is a promising avenue for tackling this difficulty, by formally certifying that networks are correct. We propose an SMT-based technique for verifying binarized neural networks — a popular kind of neural network, where some weights have been binarized in order to render the neural network more memory and energy efficient, and quicker to evaluate. One novelty of our technique is that it allows the verification of neural networks that include both binarized and non-binarized components. Neural network verification is computationally very difficult, and so we propose here various optimizations, integrated into our SMT procedure as deduction steps, as well as an approach for parallelizing verification queries. We implement our technique as an extension to the Marabou framework, and use it to evaluate the approach on popular binarized neural network architectures. 
  2. openDIEL is a workflow engine that aims to give researchers and users of HPC an efficient way to coordinate, organize, and interconnect many disparate modules of computation in order to effectively utilize and allocate HPC resources [13]. A GUI has been developed to aid in creating workflows, and it allows for the specification of data science jobs, including the specification of neural network architectures, data processing, and hyperparameter tuning. Existing machine learning tools can be readily used in openDIEL, allowing for easy experimentation with various models and approaches.
  3. Neural networks hold a critical place among machine learning algorithms because of their self-adaptiveness and state-of-the-art performance. Before the testing (inference) phases in practical use, sophisticated training (learning) phases are required, calling for efficient training methods with higher accuracy and shorter convergence time. Many existing studies focus on training optimization on high-performance servers or computing clusters, e.g. GPU clusters. However, training neural networks on resource-constrained devices, e.g. mobile platforms, is an important but barely explored research topic. In this paper, we implement AdaLearner, an adaptive distributed mobile learning system for neural networks that trains a single network in parallel with heterogeneous mobile resources under the same local network. To exploit the potential of our system, we adapt the neural network training phase to the resources of each mobile device and sharply decrease the transmission overhead for better system scalability. On three representative neural network structures trained on two image classification datasets, AdaLearner speeds up the training phase significantly. For example, on LeNet, a 1.75-3.37× speedup is achieved when increasing the worker nodes from 2 to 8, thanks to the high execution parallelism and excellent scalability achieved.
  4. An emerging problem in trustworthy machine learning is to train models that produce robust interpretations for their predictions. We take a step towards solving this problem through the lens of axiomatic attribution of neural networks. Our theory is grounded in the recent work, Integrated Gradients (IG) [STY17], in axiomatically attributing a neural network’s output change to its input change. We propose training objectives in classic robust optimization models to achieve robust IG attributions. Our objectives give principled generalizations of previous objectives designed for robust predictions, and they naturally degenerate to classic soft-margin training for one-layer neural networks. We also generalize previous theory and prove that the objectives for different robust optimization models are closely related. Experiments demonstrate the effectiveness of our method, and also point to intriguing problems which hint at the need for better optimization techniques or better neural network architectures for robust attribution training.
  5. Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables a hyperparameter search of one hundred plus candidate models of subgrid cloud and radiation physics, initially implemented in Keras, to be transferred and used in Fortran. Such a process allows the model’s emergent behavior to be assessed, i.e., when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of the optimizer proves unexpectedly critical. This in turn reveals many new neural network architectures that produce considerable improvements in climate model stability including some with reduced error, for an especially challenging training dataset. 