Title: Client Insourcing: Bringing Ops In-House for Seamless Re-engineering of Full-Stack JavaScript Applications
Modern web applications are distributed across a browser-based client and a cloud-based server. Distribution provides access to remote resources, accessed over the web and shared by clients. Much of the complexity of inspecting and evolving web applications lies in their distributed nature. Also, the majority of mature program analysis and transformation tools work only with centralized software. Inspired by business process re-engineering, in which remote operations can be insourced back in-house to restructure and outsource anew, we bring an analogous approach to the re-engineering of web applications. Our target domain is full-stack JavaScript applications that implement both the client and server code in this language. Our approach is enabled by Client Insourcing, a novel automatic refactoring that creates a semantically equivalent centralized version of a distributed application. This centralized version is then inspected, modified, and redistributed to meet new requirements. After describing the design and implementation of Client Insourcing, we demonstrate its utility and value in addressing changes in security, reliability, and performance requirements. By reducing the complexity of the non-trivial program inspection and evolution tasks performed to meet these requirements, our approach can become a helpful aid in the re-engineering of web applications in this domain.
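The idea of a semantically equivalent centralized version can be illustrated with a small sketch. It is not the paper's implementation: every name below (`priceHandler`, `fakeRemoteCall`, `checkoutDistributed`, `checkoutCentralized`) is hypothetical, and the network hop is simulated with JSON serialization rather than a real HTTP request. The sketch only shows the before/after shape of replacing a remote invocation with a direct local call.

```typescript
// Hypothetical example: all names are illustrative and do not come from the paper.

type PriceRequest = { items: number[]; taxRate: number };
type PriceResponse = { total: number };

// Server-side business logic, as it might sit behind an HTTP route.
function priceHandler(req: PriceRequest): PriceResponse {
  const subtotal = req.items.reduce((sum, x) => sum + x, 0);
  return { total: Math.round(subtotal * (1 + req.taxRate) * 100) / 100 };
}

// Stand-in for the network hop (an assumption): serialize the request,
// run the handler, and serialize the response, as a client/server
// round trip would.
function fakeRemoteCall(req: PriceRequest): PriceResponse {
  const wireRequest = JSON.stringify(req);
  const wireResponse = JSON.stringify(
    priceHandler(JSON.parse(wireRequest) as PriceRequest)
  );
  return JSON.parse(wireResponse) as PriceResponse;
}

// Distributed version: the client reaches the logic through the remote boundary.
function checkoutDistributed(items: number[]): number {
  return fakeRemoteCall({ items, taxRate: 0.08 }).total;
}

// Centralized version: the same client code, with the remote call
// replaced by a direct local call to the server logic.
function checkoutCentralized(items: number[]): number {
  return priceHandler({ items, taxRate: 0.08 }).total;
}

// Both versions should agree on the same input.
const cart = [10, 20, 5];
console.log(checkoutDistributed(cart) === checkoutCentralized(cart));
```

Once both versions produce the same results, tools that expect a single program can, in principle, be applied to the centralized form before the code is distributed again.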
Award ID(s):
1717065 1650540
Journal Name:
WWW '20: Proceedings of The Web Conference 2020
Page Range / eLocation ID:
179 to 189
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Localizing bugs in distributed applications is complicated by the potential presence of server/middleware misconfigurations and intermittent network connectivity. In this paper, we present a novel approach to localizing bugs in distributed web applications, targeting the important domain of full-stack JavaScript applications. The debugged application is first automatically refactored to create its semantically equivalent centralized version by gluing together the application’s client and server parts, thus separating the programmer-written code from configuration/environmental issues as suspected bug causes. The centralized version is then debugged to fix various bugs. Finally, based on the bug-fixing changes to the centralized version, a patch is automatically generated to fix the original application source files. We show how our approach can be used to catch bugs that include performance bottlenecks and memory leaks. These results indicate that our debugging approach can ease the challenges of localizing and fixing bugs in web applications.
  2. The paper introduces a visual programming language and corresponding web and cloud-based development environment called NetsBlox. NetsBlox is an extension of Snap! and builds upon its visual formalism as well as its open source code base. NetsBlox adds distributed programming capabilities by introducing two well-known abstractions to block-based programming: message passing and Remote Procedure Calls (RPC). Messages containing data can be exchanged by two or more NetsBlox programs running on different computers connected to the Internet. RPCs are called on a client program and are executed on the NetsBlox server. These two abstractions make it possible to create distributed programs such as multi-player games or client-server applications. We believe that NetsBlox not only teaches basic distributed programming concepts but also provides increased motivation for high-school students to become creators and not just consumers of technology. 
  3. Distributed applications enhance their execution by using remote resources. However, distributed execution incurs communication, synchronization, fault-handling, and security overheads. If these overheads are not offset by the yet larger execution enhancement, distribution becomes counterproductive. For maximum benefits, the distribution’s granularity cannot be too fine or too crude; it must be just right. In this paper, we present a novel approach to re-architecting distributed applications whose distribution granularity has proven ill-conceived. To adjust the distribution of such applications, our approach automatically reshapes their remote invocations to reduce aggregate latency and resource consumption. To that end, our approach insources a remote functionality for local execution, splits it into separate functions to profile their performance, and determines the optimal redistribution based on a cost function. Redistribution strategies combine separate functions into single remotely invocable units. To automate all the required program transformations, our approach introduces a series of domain-specific automatic refactorings. We have concretely realized our approach as an analysis and automatic program transformation infrastructure for the important domain of full-stack JavaScript applications, and evaluated its value, utility, and performance on a series of real-world cross-platform mobile apps. Our evaluation results indicate that our approach can become a useful tool for software developers charged with the challenges of re-architecting distributed applications.
  4. Obeid, I. ; Selesnick, I. (Ed.)
    The Neural Engineering Data Consortium at Temple University has been providing key data resources to support the development of deep learning technology for electroencephalography (EEG) applications [1-4] since 2012. We currently have over 1,700 subscribers to our resources and have been providing data, software and documentation from our web site [5] since 2012. In this poster, we introduce additions to our resources that have been developed within the past year to facilitate software development and big data machine learning research.
    Major resources released in 2019 include:
    ● Data: The most current release of our open source EEG data is v1.2.0 of TUH EEG and includes the addition of 3,874 sessions and 1,960 patients from mid-2015 through 2016.
    ● Software: We have recently released a package, PyStream, that demonstrates how to correctly read an EDF file and access samples of the signal. This software demonstrates how to properly decode channels based on their labels and how to implement montages. Most existing open source packages to read EDF files do not directly address the problem of channel labels [6].
    ● Documentation: We have released two documents that describe our file formats and data representations: (1) electrodes and channels [6]: describes how to map channel labels to physical locations of the electrodes, and includes a description of every channel label appearing in the corpus; (2) annotation standards [7]: describes our annotation file format and how to decode the data structures used to represent the annotations.
    Additional significant updates to our resources include:
    ● NEDC TUH EEG Seizure (v1.6.0): This release includes the expansion of the training dataset from 4,597 files to 4,702. Calibration sequences have been manually annotated and added to our existing documentation. Numerous corrections were made to existing annotations based on user feedback.
    ● IBM TUSZ Pre-Processed Data (v1.0.0): A preprocessed version of the TUH Seizure Detection Corpus using two methods [8], both of which use an FFT sliding window approach (STFT). In the first method, FFT log magnitudes are used. In the second method, the FFT values are normalized across frequency buckets and correlation coefficients are calculated. The eigenvalues are calculated from this correlation matrix. The eigenvalues and the correlation matrix's upper triangle are used to generate features.
    ● NEDC TUH EEG Artifact Corpus (v1.0.0): This corpus was developed to support modeling of non-seizure signals for problems such as seizure detection. We have been using the data to build better background models. Five artifact events have been labeled: (1) eye movements (EYEM), (2) chewing (CHEW), (3) shivering (SHIV), (4) electrode pop, electrostatic artifacts, and lead artifacts (ELPP), and (5) muscle artifacts (MUSC). The data is cross-referenced to TUH EEG v1.1.0 so you can match patient numbers, sessions, etc.
    ● NEDC Eval EEG (v1.3.0): In this release of our standardized scoring software, the False Positive Rate (FPR) definition of the Time-Aligned Event Scoring (TAES) metric has been updated [9]. The standard definition is the number of false positives divided by the number of false positives plus the number of true negatives: #FP / (#FP + #TN).
    We also recently introduced the ability to download our data from an anonymous rsync server. The rsync command [10] effectively synchronizes both a remote directory and a local directory and copies the selected folder from the server to the desktop. It is available as part of most, if not all, Linux and Mac distributions (unfortunately, there is not an acceptable port of this command for Windows). To use the rsync command to download the content from our website, both a username and password are needed. An automated registration process on our website grants both. An example of a typical rsync command to access our data on our website is: rsync -auxv
    Rsync is a more robust option for downloading data. We have also experimented with Google Drive and Dropbox, but these types of technology are not suitable for such large amounts of data. All of the resources described in this poster are open source and freely available on our website. We will demonstrate how to access and utilize these resources during the poster presentation and collect community feedback on the most needed additions to enable significant advances in machine learning performance.
  5. This paper introduces NetsBlox, a visual programming environment for learning distributed programming principles. Extending both the visual formalism and open source code base of Snap!, NetsBlox provides two accessible distributed programming abstractions to simplify the process of creating networked applications: message passing and Remote Procedure Calls (RPC). Message passing allows NetsBlox applications to send data to other connected NetsBlox clients. Remote Procedure Calls enable seamless integration of third-party services, such as Google Maps, weather, traffic and other public domain data sources, into NetsBlox applications. Other RPCs help coordinate distributed clients, a task that may be difficult for novice programmers, allowing users to more quickly create captivating and sophisticated applications. These abstractions empower users to develop networked programs, including multi-player games and client-server applications. By providing networking support, NetsBlox not only allows users to learn distributed programming concepts but also makes programming more engaging by incorporating diverse services available on the web.