<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>"Bring-your-own" Plug-in Management Middleware for Programmable Science Gateways.</title></titleStmt>
			<publicationStmt>
				<publisher>https://osf.io/meetings/gateways2020/</publisher>
				<date>10/12/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10510370</idno>
					<idno type="doi"></idno>
					<title level='j'>Gateways 2020</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>K B Vekaria</author><author>P Calyam</author><author>R Oruche</author><author>Y Zhang</author><author>S Wang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[There is a growing need for next-generation science gateways to increase the accessibility of data sets and cloud computing resources using the latest technologies. Most science gateways today are built for specific purposes with pre-defined workflows, user interfaces, and fixed computing resources. There is a need to modernize them with middleware that can provide ‘plug in’ support to programmatically increase their extensibility and scalability to meet users’ growing needs. In this paper, we propose a novel middleware that can be integrated into science gateways using a “bring-your-own” plug-in management approach. This approach features microservice architectures to decouple applications, and allows users (i.e., administrators, developers, researchers) to customize and incorporate domain-specific components in an existing science gateway. We detail the application programming interfaces in our middleware for creation of end-to-end pipelines with diverse infrastructure, customized processes, detailed monitoring and flexible programmability for a scientific domain. We also demonstrate via an OnTimeRecommend case study how our “bring-your-own” approach can be seamlessly integrated by a science gateway administrator/developer using a web application.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Science gateways hide the complexity of accessing distributed computing resources and managing big data from scientific domain users. Through easy-to-use domain application interfaces, they handle user needs for a variety of scientific tasks related to research and education. Science gateways have been developed in many scientific domains, including bioinformatics, neuroscience, physics, chemistry, and material science <ref type="bibr">[1]</ref> <ref type="bibr">[2]</ref>. Many of today's science gateways help domain science users with workflow execution, automated data integration, and analysis/visualization of voluminous data on distributed high-performance computing (HPC) resources and cloud resources (e.g., Amazon Web Services, CyVerse <ref type="bibr">[3]</ref>).</p><p>However, most science gateways today are built for specific purposes with pre-defined workflows, user interfaces, and fixed computing resources. This state of practice makes things difficult for users whose science gateway needs constantly evolve, e.g., in terms of data-intensive workflow automation or choice of cloud computing platform. Also, science gateway administrators/developers are often challenged to integrate advanced technologies (e.g., knowledge bases, recommenders, machine learning tools) due to limitations of the original architecture design. To overcome such issues, there is a need to modernize science gateways with middleware that can provide 'plug in' support to programmatically increase their extensibility and scalability to meet users' growing needs. By increasing the modularity of science gateways with such middleware, diverse user needs can be satisfied at the front-end, and dynamic resource management can be supported at the back-end.</p><p>In this paper, we propose a novel middleware that can be integrated into science gateways using a "bring-your-own" plug-in management approach. 
Our middleware to manage plug-in services is inspired by a "pluganized" management framework developed in <ref type="bibr">[4]</ref>, which enables network protocols to be plugged in as extensions to support fast/secure data transmission. Using a plug-in management approach, our middleware features microservice architectures to decouple applications, and allows users (i.e., administrators, developers, researchers) to customize and incorporate domain-specific components in an existing science gateway. We detail the application programming interfaces (APIs) in our middleware for creation of end-to-end pipelines with diverse infrastructure, customized processes, detailed monitoring and flexible programmability for a scientific domain. The APIs provide science gateway administrators/developers with the following benefits:</p><p>&#8226; Modularity: The "bring-your-own" plug-in management approach leverages microservice architectures that decouple the science gateway application code, and enables addition of new services that function independently but interconnect with each other. Consequently, the microservices code can be reused by processes that have similar execution behavior across multiple science gateways.</p><p>&#8226; Extensibility: The "bring-your-own" plug-in management approach allows science gateway administrators/developers to easily extend the middleware with additional application components or plug-ins such as multi-cloud based workflows with templates, execution pipelines involving machine learning algorithms, and more.</p><p>&#8226; Scalability: The "bring-your-own" plug-in management approach allows users to reserve pre-configured and ready-to-use cloud infrastructure resources in order to help them scale their workloads as and when needed. Replacing a microservice with one sized for a different scale of resource needs allows for dynamic resource management as user workflow needs change.</p><p>
&#8226; Programmability: The "bring-your-own" plug-in management approach helps science gateway administrators/developers to program, register and upload various components into their existing setup using a customizable application interface.</p><p>Lastly, we demonstrate the benefits of our "bring-your-own" plug-in management approach via an OnTimeRecommend case study. Specifically, we show how our middleware can be seamlessly integrated by a science gateway administrator/developer using a web application.</p><p>The remainder of the paper is organized as follows: Section II presents related work. Section III details our "bring-your-own" plug-in management middleware design and implementation. Section IV describes an OnTimeRecommend case study to show our middleware integration within a science gateway. Section V discusses the challenges we addressed in the integration of emerging AI/ML tools in next-generation science gateways. Section VI concludes the paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. RELATED WORK</head><p>Designing and developing a successful science gateway takes a significant amount of time, funding and personnel effort. However, science gateways have to continuously evolve to adapt to the changing needs of scientific research/education tasks. Leveraging advanced technologies can help science gateways address this issue. One such technology is the use of microservices via RESTful APIs. For example, GenApp <ref type="bibr">[5]</ref> decouples application code from the science gateway so that researchers need to specify only the input and output parameters to run their command-line applications via a graphical interface. Agave <ref type="bibr">[6]</ref>, a science-as-a-service API platform, was built largely with Docker container-based microservices to seamlessly integrate API management, capacity scaling, and community contributions to provide platform services, science APIs and support services.</p><p>A number of science gateways are looking to use easily reusable and transferable building blocks. Apache Airavata <ref type="bibr">[7]</ref> provides a software suite to compose, manage, execute, and monitor large-scale applications and workflows for science gateways. PaaSage <ref type="bibr">[8]</ref> supports the design and deployment of multi-cloud applications by optimizing and customizing workflows in science gateways. Agave <ref type="bibr">[6]</ref> offers platform-as-a-service for hybrid cloud computing and data management purposes and is being adopted as the API layer by several science gateways, such as CyVerse <ref type="bibr">[3]</ref>. Globus Galaxies <ref type="bibr">[9]</ref> is a domain-independent, cloud-based science gateway providing a web-based interface for creating, executing, sharing, and reusing workflows composed of arbitrary applications, tools, and scripts. 
MiCADO <ref type="bibr">[10]</ref>, a microservice-based application orchestrator middleware, offers scalable Docker container-based microservice deployment by integrating services from federated private and public cloud resource providers. Today's science gateways provide comprehensive user services and convenient workflow automation; however, they largely lack the capability for users to customize programmable plug-ins. The integration of plug-ins requires modular programming to maintain and deploy heterogeneous components. Our middleware development is inspired by these leading science gateways and frameworks to allow highly portable and reusable building blocks in next-generation science gateway development, as well as decoupling of system components to allow users to customize their workflows and computing tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. "BRING-YOUR-OWN" PLUG-INS MANAGEMENT MIDDLEWARE</head><p>In this section, we detail the design of our "Bring-Your-Own" plug-in management middleware and its implementation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. System Design</head><p>The design of the "Bring-Your-Own" plug-in management middleware is shown in Figure <ref type="figure">1</ref>. The plug-ins are integrated with the Application Layer, which features a science gateway such as CyNeuro <ref type="bibr">[2]</ref>, with the help of an OnTimeRecommend application that is used by science gateway administrators/developers. In addition, the plug-ins also interface with the Infrastructure Layer through REST APIs offered by resource providers such as CyVerse or AWS. The Infrastructure Layer can include cloud templates, machine learning tools, artificial intelligence platforms featuring chatbots, recommender modules or domain-specific knowledge bases. The Application Layer is assumed to comprise two categories of user roles: end user researchers/scientists accessing a science gateway such as CyNeuro, and science gateway administrators/developers who use the OnTimeRecommend web application. The OnTimeRecommend web application features an 'Admin interface' to integrate, manage and execute different plug-ins using functional components implemented as microservices. Once integrated, augmented interfaces are provided to end users (researchers, students) when using the domain-specific gateways to access and use the capabilities provided by the plug-ins. As shown in Figure <ref type="figure">1</ref>, several application-specific or infrastructure-specific microservices can be chained to customize capabilities and allow information sharing between the different microservices to execute specific functional components.</p><p>We implement our "Bring-Your-Own" plug-in management middleware as an end-to-end framework that provides administrators/developers with plug-in management capabilities. Herein, we detail the 'Admin interface' functionality with the following plug-in components:</p><p>&#8226; Plug-in Registry is the repository for all client- and plug-in-related data. 
It also includes the metadata of plug-ins (the configuration needed to execute the processes related to the client's plug-in selection) as well as the science gateway application details. This metadata is formatted as JSON and stored in the database whenever add or update actions are taken on plug-ins/processes. Our current implementation of this registry uses the Hibernate framework for object-relational mapping over a relational database, i.e., a MySQL backend.</p><p>&#8226; Plug-in Process Manager is used to create plug-in processes for a science gateway. Plug-in management and execution can be broken down into multiple processes. We implemented microservices that help users to configure and execute processes for specific plug-ins. This component also acts as a client of the Plug-in Registry, persistently storing all process details and metadata for each plug-in process.</p><p>&#8226; Plug-in Orchestrator abstracts the configuration of the middleware queuing layer for individual clients. Specifically, it configures execution parameters for the different processes of plug-ins, and then queues those processes. This component also acts as a client of the Plug-in Registry, persistently storing all parameter configurations and the queuing requests for processes.</p><p>&#8226; Render, Execute and Monitor Plug-in provides the web-based user interface to the application layer using a client software development kit that we implemented. The Execute plug-ins component executes different processes for specific plug-ins based on user requests, and the Monitor plug-ins component checks the execution status of the plug-ins.</p></div>
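<div xmlns="http://www.tei-c.org/ns/1.0"><p>The interplay between the Plug-in Registry and the Plug-in Orchestrator described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Hibernate/MySQL implementation: the class names, method names, the in-memory dictionary, the first-in-first-out queue, and the example plug-in and process names are all assumptions made for illustration.</p>
<p><![CDATA[
```python
import json
from collections import deque


class PluginRegistry:
    """In-memory stand-in for the Plug-in Registry (hypothetical API)."""

    def __init__(self):
        self._records = {}  # plug-in name -> JSON-encoded metadata

    def save(self, name, metadata):
        # Metadata is serialized to JSON on every add or update action.
        self._records[name] = json.dumps(metadata, sort_keys=True)

    def load(self, name):
        return json.loads(self._records[name])


class PluginOrchestrator:
    """Configures execution parameters for plug-in processes and queues them."""

    def __init__(self, registry):
        self._registry = registry
        self._queue = deque()  # FIFO ordering is an assumption of this sketch

    def configure_and_queue(self, plugin, process, params):
        metadata = self._registry.load(plugin)
        metadata["processes"][process] = params
        self._registry.save(plugin, metadata)  # persist the configuration
        self._queue.append((plugin, process))

    def next_process(self):
        # Processes leave the queue in the order they were submitted.
        return self._queue.popleft()


# Hypothetical plug-in, client and process names, for illustration only.
registry = PluginRegistry()
registry.save("publication-recommender", {"client": "CyNeuro", "processes": {}})
orchestrator = PluginOrchestrator(registry)
orchestrator.configure_and_queue("publication-recommender", "collect-data", {"max_records": 100})
orchestrator.configure_and_queue("publication-recommender", "train-model", {"epochs": 5})
```
]]></p>
<p>In this sketch the orchestrator writes each parameter configuration back through the registry before queuing the process, mirroring its role as a registry client.</p></div>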
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Implementation Details</head><p>We have implemented microservices that follow the standard practices of RESTful API design. All microservices have been developed in the backend using Spring Boot, which is a widely used framework in Java. Spring Boot is pre-configured and pre-sugared with a set of technologies that greatly reduce the manual effort of configuration compared to conventional frameworks. In addition, we have used Apache Maven, which is a comprehensive build management tool, to manage dependencies and versions, compile source code, run tests, and package code into deployment-ready file formats; the final production code instance is deployed using Docker containers.</p><p>The microservice architecture involves enabling flexible interactions between multiple services. Each instance of a service exposes a remote REST API at a particular location (host and port), and the number of service instances as well as their locations change dynamically. In this case, a combination of a service registry and client-side service discovery allows services to find and communicate with each other without hard-coding host names and ports. The service registry tracks details of services such as their instances and locations. Service instances are registered with the service registry on startup and are de-registered on shutdown. We have implemented both the service registry and the discovery client as microservices using Spring Cloud <ref type="bibr">[11]</ref> and Eureka. We have also developed a gateway edge service using Spring Cloud and Zuul to enable dynamic routing in our middleware.</p><p>In addition, we have implemented the OAuth 2.0 security protocol with Spring Boot and Spring Security to provide authentication support to science gateway clients. It enables third-party applications (e.g., GitHub API, Google API) to obtain limited access to web applications. 
This allows science gateway administrators/developers to enable access control for the plug-in services they want to provision. With the integration of OAuth 2.0, we can validate users by allowing them to sign on to the web application with the necessary authorized permissions.</p></div>
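<div xmlns="http://www.tei-c.org/ns/1.0"><p>The registration and lookup behavior described in this subsection can be sketched without any framework. The following Python example is a simplified stand-in and does not reproduce the Spring Cloud or Eureka APIs: the class names, the round-robin selection policy, the service name and the addresses are assumptions chosen for illustration.</p>
<p><![CDATA[
```python
class ServiceRegistry:
    """Tracks the live instances (host, port) of each named service."""

    def __init__(self):
        self._instances = {}  # service name -> list of (host, port) tuples

    def register(self, name, host, port):
        # Called by a service instance on startup.
        self._instances.setdefault(name, []).append((host, port))

    def deregister(self, name, host, port):
        # Called by a service instance on shutdown.
        self._instances[name].remove((host, port))

    def instances(self, name):
        return list(self._instances.get(name, []))


class DiscoveryClient:
    """Client-side discovery: resolve a service name, then pick an instance."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = 0

    def resolve(self, name):
        candidates = self._registry.instances(name)
        if not candidates:
            raise LookupError("no live instance of " + name)
        # Round-robin choice; the selection policy is an assumption.
        host, port = candidates[self._counter % len(candidates)]
        self._counter += 1
        return "http://{}:{}".format(host, port)


# Hypothetical service name and addresses, for illustration only.
registry = ServiceRegistry()
registry.register("plugin-registry", "10.0.0.5", 8081)
registry.register("plugin-registry", "10.0.0.6", 8081)
client = DiscoveryClient(registry)
first = client.resolve("plugin-registry")
second = client.resolve("plugin-registry")
registry.deregister("plugin-registry", "10.0.0.5", 8081)  # instance shuts down
third = client.resolve("plugin-registry")
```
]]></p>
<p>Callers refer to a service only by name, so an instance can start or stop without any caller changing a hard-coded host or port.</p></div>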
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Plug-in Management Middleware Benefits</head><p>The implementation of our plug-in management middleware allows developers/administrators to integrate and monitor the use of plug-ins in science gateway applications. Herein, we detail salient benefits of our middleware architecture for administrators/developers:</p><p>&#8226; It enables administrators/developers to modularize the plug-in services used to develop microservices in science gateway applications. Decoupling processes into microservices in this way allows plug-ins to operate independently.</p><p>&#8226; It supports customizable design and deployment to augment scientific workflows. The architecture uses Docker containers to construct deployment patterns across the distributed resources to optimize the pre-defined workflows commonly used in a domain science community (e.g., neuroscience, bioinformatics).</p><p>&#8226; It also provides the flexibility for disparate code bases to be integrated through microservices. Administrators/developers typically create microservices using their preferred coding language, and our architecture allows them to keep doing so when creating customizable science gateways.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. ONTIMERECOMMEND CASE STUDY</head><p>In our recent prior work, we implemented OnTimeRecommend <ref type="bibr">[2]</ref>, which features a variety of recommender modules to help novice/expert users with knowledge discovery through data sources such as publications, funding records, cloud templates and Jupyter notebooks. OnTimeRecommend also features a Vidura Advisor (i.e., a chatbot) built using Google DialogFlow to provide a guided user interface with step-by-step navigational support. Vidura generates distinct responses from the OnTimeRecommend recommenders based on novice/expert user intent to accomplish targeted research and education tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. "Bring-Your-Own" Plug-in Management Middleware Integration in OnTimeRecommend</head><p>Herein, we detail how we customized our "Bring-Your-Own" plug-in management middleware for the OnTimeRecommend system. The integration involves: (i) a "Bring-Your-Own-Recommender" customized middleware and Admin interface for OnTimeRecommend to add, manage, and execute recommender modules, and (ii) a recommender user interface for providing recommendations on available resources to science gateway end users. Using these two integration thrusts, OnTimeRecommend provides a 'recommender-as-a-service' functionality to the CyNeuro science gateway that is currently being used by researchers and educators in the neuroscience community. We integrate OAuth 2.0 to authenticate and authorize the clients of the CyNeuro science gateway, allowing them to customize their desired plug-in services. The plug-ins in this case study are the recommender modules within the OnTimeRecommend system, which can be integrated as shown in Figure <ref type="figure">2</ref>. The details of the recommender modules are as follows:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Plug-in Management Middleware Integration Steps</head><p>Figure <ref type="figure">3</ref> illustrates the steps to configure middleware components to add and execute the different recommender modules. The steps for a science gateway administrator/developer to customize OnTimeRecommend modules for a science gateway are as follows:</p><p>&#8226; Step-1: register different recommender modules in OnTimeRecommend using the web application interface. This interface uses Plug-in Registry services to register information related to recommender modules for a science gateway.</p><p>&#8226; Step-2: register scientific plug-in processes such as data collection/processing parameters and the knowledge base, such that all the necessary information is provided to execute relevant recommenders in OnTimeRecommend using the 'Plug-in Workflow Manager'.</p><p>&#8226; Step-3: add a science gateway client (i.e., CyNeuro in this case study) and link recommender modules with the client to allow users to use recommenders on their science gateway interface.</p><p>&#8226; Step-4: configure parameters of recommender processes and queue processes of recommender modules for a specific science gateway client.</p><p>&#8226; Step-5: execute all recommender processes using the provided configurations and publish recommender outputs to end users on their science gateway interface.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Our novel middleware framework for managing plug-in services provides new capabilities for users (e.g., administrators/developers, researchers) in the science gateways community. The core feature of our middleware is the integration of plug-in management components for AI/ML execution pipelines, which pose new challenges and open questions for next-generation science gateways. We showed how our plug-in management for application providers could be designed to keep up with the rapid increase of voluminous data, tools, and various resources openly available to scientific communities. 
Maintenance of the plug-ins is also an important aspect of our middleware. It involves collecting up-to-date data to train and re-train the models so that the plug-in services, especially the recommender modules in OnTimeRecommend, provide the latest guidance.</p><p>Proper facilitation of updated resources via plug-in management also needs to be adapted with respect to human cognition. For instance, it is crucial to develop flexible features that ensure users are able to customize the plug-in management middleware to be relevant across different science domains (e.g., neuroscience, bioinformatics) that have unique workflow requirements. Moreover, our middleware can be augmented with additional plug-ins that support natural language processing as well as context-aware chatbots to handle the diversity of user requirements and to ensure that the knowledge bases are updated on a regular basis. Failure to address these demands in the middleware to manage plug-in services could result in irrelevant information in the recommender responses. Consequently, users may not be willing to adopt the new tools being considered in this work, which provide user guidance to meet big data handling needs in next-generation science gateways.</p></div>
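<div xmlns="http://www.tei-c.org/ns/1.0"><p>The five integration steps of the case study (Step-1 through Step-5) can be summarized as a small executable sketch. This is a hedged Python illustration under stated assumptions, not the OnTimeRecommend code: the class, its method names, the toy recommender, and the parameter name top_k are hypothetical, and only the order of the steps is taken from the text.</p>
<p><![CDATA[
```python
from collections import deque


class RecommenderMiddleware:
    """Hypothetical facade that walks through the five integration steps."""

    def __init__(self):
        self.modules = {}     # module name -> callable producing recommendations
        self.processes = {}   # module name -> default process parameters
        self.clients = {}     # client name -> list of linked module names
        self.queue = deque()  # (client, module, params) awaiting execution

    def register_module(self, name, run):              # Step-1
        self.modules[name] = run

    def register_process(self, module, defaults):      # Step-2
        self.processes[module] = dict(defaults)

    def add_client(self, client, modules):             # Step-3
        self.clients[client] = list(modules)

    def configure_and_queue(self, client, overrides):  # Step-4
        for module in self.clients[client]:
            params = dict(self.processes[module])
            params.update(overrides.get(module, {}))
            self.queue.append((client, module, params))

    def execute_all(self):                             # Step-5
        published = {}
        while self.queue:
            client, module, params = self.queue.popleft()
            output = self.modules[module](params)
            published.setdefault(client, {})[module] = output
        return published


def publication_recommender(params):
    # Toy recommender: returns the first top_k titles from a fixed list.
    catalog = ["paper-a", "paper-b", "paper-c"]
    return catalog[:params["top_k"]]


middleware = RecommenderMiddleware()
middleware.register_module("publications", publication_recommender)
middleware.register_process("publications", {"top_k": 3})
middleware.add_client("CyNeuro", ["publications"])
middleware.configure_and_queue("CyNeuro", {"publications": {"top_k": 2}})
results = middleware.execute_all()
```
]]></p>
<p>Keeping the per-client parameter overrides (Step-4) separate from the registered defaults (Step-2) lets one recommender module serve several science gateway clients with different settings.</p></div>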
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. CONCLUSION</head><p>In this paper, we presented a novel "Bring-Your-Own" plug-in management middleware design and implementation to enable science gateways to leverage advanced technologies in a customizable manner to increase their extensibility and scalability. Through integration of our middleware with an Application Layer and an Infrastructure Layer, the changing needs of domain scientists can be satisfied. The Application Layer involves a science gateway such as CyNeuro <ref type="bibr">[2]</ref> for neuroscience researchers/educators. The Infrastructure Layer involves REST APIs offered by resource providers such as CyVerse or AWS. The Infrastructure Layer can include cloud templates, machine learning tools, artificial intelligence platforms featuring chatbots, recommender modules or domain-specific knowledge bases. Through an OnTimeRecommend case study, we showed how a science gateway administrator/developer can configure relevant recommender modules for the science gateway users (i.e., CyNeuro in the case study) to perform knowledge discovery of cloud templates, Jupyter notebooks, publications and domain experts. Our plug-in management approach involving microservices can be generally applied to extend and scale any science gateway through a series of integration steps supported by our middleware via a web application.</p><p>As part of future work, we plan to develop additional plug-in support features in our middleware for: (a) integration of chatbots for guided interfaces, and (b) knowledge bases for intelligent data analytics, in other domain science gateways such as bioinformatics and health informatics.</p></div></body>
		</text>
</TEI>
