skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Decide: Knowledge-Based Version Incompatibility Detection in Deep Learning Stacks
Version incompatibility issues are prevalent when reusing or reproducing deep learning (DL) models and applications. Compared with official API documentation, which is often incomplete or out-of-date, Stack Overflow (SO) discussions possess a wealth of version knowledge that has not been explored by previous approaches. To bridge this gap, we present Decide, a web-based visualization of a knowledge graph that contains 2,376 version knowledge extracted from SO discussions. As an interactive tool, Decide allows users to easily check whether two libraries are compatible and explore compatibility knowledge of certain DL stack components with or without the version specified. A video demonstrating the usage of Decide is available at https://youtu.be/wqPxF2ZaZo0.  more » « less
Award ID(s):
2333736
PAR ID:
10548055
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400706585
Page Range / eLocation ID:
547 to 551
Format(s):
Medium: X
Location:
Porto de Galinhas Brazil
Sponsoring Org:
National Science Foundation
More Like this
  1. Haldorai, Anandakumar (Ed.)
    Cybersecurity affects us all in our daily lives. New knowledge on best practices, new vulnerabilities, and timely fixes for cybersecurity issues is growing super-linearly, and is spread across numerous, heterogeneous sources. Because of that, community contribution-based, question and answer sites have become clearinghouses for cybersecurity-related inquiries, as they have for many other topics. Historically, Stack Overflow has been the most popular platform for different kinds of technical questions, including for cybersecurity. That has been changing, however, with the advent of Security Stack Exchange, a site specifically designed for cybersecurity-related questions and answers. More recently, some cybersecurity-related subreddits of Reddit, have become hubs for cybersecurity-related questions and discussions. The availability of multiple overlapping communities has created a complex terrain to navigate for someone looking for an answer to a cybersecurity question. In this paper, we investigate how and why people choose among three prominent, overlapping, question and answer communities, for their cybersecurity knowledge needs. We aggregated data of several consecutive years of cybersecurity-related questions from Stack Overflow, Security Stack Exchange, and Reddit, and performed statistical, linguistic, and longitudinal analysis. To triangulate the results, we also conducted user surveys. We found that the user behavior across those three communities is different, in most cases. Likewise, cybersecurity-related questions asked on the three sites are different, more technical on Security Stack Exchange and Stack Overflow, and more subjective and personal on Reddit. Moreover, there appears to have been a differentiation of the communities along the same lines, accompanied by overall popularity trends suggestive of Stack Overflow’s decline and Security Stack Exchange’s rise within the cybersecurity community. Reddit is addressing the more subjective, discussion type needs of the lay community, and is growing rapidly. 
    more » « less
  2. Abstract. Recently, deep learning (DL) has emerged as a revolutionary andversatile tool transforming industry applications and generating new andimproved capabilities for scientific discovery and model building. Theadoption of DL in hydrology has so far been gradual, but the field is nowripe for breakthroughs. This paper suggests that DL-based methods can open up acomplementary avenue toward knowledge discovery in hydrologic sciences. Inthe new avenue, machine-learning algorithms present competing hypotheses thatare consistent with data. Interrogative methods are then invoked to interpretDL models for scientists to further evaluate. However, hydrology presentsmany challenges for DL methods, such as data limitations, heterogeneityand co-evolution, and the general inexperience of the hydrologic field withDL. The roadmap toward DL-powered scientific advances will require thecoordinated effort from a large community involving scientists and citizens.Integrating process-based models with DL models will help alleviate datalimitations. The sharing of data and baseline models will improve theefficiency of the community as a whole. Open competitions could serve as theorganizing events to greatly propel growth and nurture data science educationin hydrology, which demands a grassroots collaboration. The area ofhydrologic DL presents numerous research opportunities that could, in turn,stimulate advances in machine learning as well. 
    more » « less
  3. Deep Learning (DL) is a class of machine learning algorithms that are used in a wide variety of applications. Like any software system, DL programs can have bugs. To support bug localization in DL programs, several tools have been proposed in the past. As most of the bugs that occur due to improper model structure known as structural bugs lead to inadequate performance during training, it is challenging for developers to identify the root cause and address these bugs. To support bug detection and localization in DL programs, in this article, we propose Theia, which detects and localizes structural bugs in DL programs. Unlike the previous works, Theia considers the training dataset characteristics to automatically detect bugs in DL programs developed using two DL libraries,KerasandPyTorch. Since training the DL models is a time-consuming process, Theia detects these bugs at the beginning of the training process and alerts the developer with informative messages containing the bug’s location and actionable fixes which will help them to improve the structure of the model. We evaluated Theia on a benchmark of 40 real-world buggy DL programs obtained fromStack Overflow. Our results show that Theia successfully localizes 57/75 structural bugs in 40 buggy programs, whereas NeuraLint, a state-of-the-art approach capable of localizing structural bugs before training localizes 17/75 bugs. 
    more » « less
  4. Work on optimal protocols for \emph{Eventual Byzantine Agreement} (EBA)---protocols that, in a precise sense, decide as soon as possible in every run and guarantee that all nonfaulty agents decide on the same value---has focused on full-information protocols} (FIPs), where agents repeatedly send messages that completely describe their past observations to every other agent. While it can be shown that, without loss of generality, we can take an optimal protocol to be an FIP, full information exchange is impractical to implement for many applications due to the required message size. We separate protocols into two parts, the information-exchange protocol and the action protocol, so as to be able to examine the effects of more limited information exchange. We then define a notion of optimality with respect to an information-exchange protocol. Roughly speaking, an action protocol P is optimal with respect to an information-exchange protocol E if, with P, agents decide as soon as possible among action protocols that exchange information according to E. We present a knowledge-based EBA program for omission failures all of whose implementations are guaranteed to be correct and are optimal if the information exchange satisfies a certain safety condition. We then construct concrete programs that implement this knowledge-based program in two settings of interest that are shown to satisfy the safety condition. Finally, we show that a small modification of our program results in an FIP that s both optimal and efficiently implementable, settling an open problem posed by Halpern, Moses, and Waarts (SIAM J. Comput., 2001). 
    more » « less
  5. Across basic research studies, cell counting requires significant human time and expertise. Trained experts use thin focal plane scanning to count (click) cells in stained biological tissue. This computer-assisted process (optical disector) requires a well-trained human to select a unique best z-plane of focus for counting cells of interest. Though accurate, this approach typically requires an hour per case and is prone to inter-and intra-rater errors. Our group has previously proposed deep learning (DL)-based methods to automate these counts using cell segmentation at high magnification. Here we propose a novel You Only Look Once (YOLO) model that performs cell detection on multi-channel z-plane images (disector stack). This automated Multiple Input Multiple Output (MIMO) version of the optical disector method uses an entire z-stack of microscopy images as its input, and outputs cell detections (counts) with a bounding box of each cell and class corresponding to the z-plane where the cell appears in best focus. Compared to the previous segmentation methods, the proposed method does not require time-and labor-intensive ground truth segmentation masks for training, while producing comparable accuracy to current segmentation-based automatic counts. The MIMO-YOLO method was evaluated on systematic-random samples of NeuN-stained tissue sections through the neocortex of mouse brains (n=7). Using a cross validation scheme, this method showed the ability to correctly count total neuron numbers with accuracy close to human experts and with 100% repeatability (Test-Retest). 
    more » « less