Title: Recommender-as-a-Service with Chatbot Guided Domain-science Knowledge Discovery in a Science Gateway
Scientists in disciplines such as neuroscience and bioinformatics increasingly rely on science gateways for experimentation on voluminous data, as well as for analysis and visualization from multiple perspectives. Though current science gateways provide easy access to computing resources, datasets and tools specific to the disciplines, scientists often resort to slow and tedious manual effort to perform the knowledge discovery needed for their research/education tasks. Recommender systems can provide expert guidance, helping them to navigate and discover relevant publications, tools and datasets, or even automating cloud resource configurations suitable for a given scientific task. To realize the potential of integrating recommenders in science gateways and thereby spur research productivity, we present a novel “OnTimeRecommend” recommender system. OnTimeRecommend comprises several integrated recommender modules implemented as microservices that can be added to a science gateway in the form of a recommender-as-a-service. Use of the recommender modules in a science gateway is guided by a chatbot plug-in, Vidura Advisor. To validate OnTimeRecommend, we integrate it into two exemplar science gateways, one in neuroscience (CyNeuro) and the other in bioinformatics (KBCommons), and show benefits for both novice and expert users in domain-specific knowledge discovery.
Award ID(s):
1730655 2006816
NSF-PAR ID:
10311944
Journal Name:
Wiley Concurrency and Computation: Practice and Experience
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Neuroscientists are increasingly relying on high performance/throughput computing resources for experimentation on voluminous data, analysis and visualization at multiple neural levels. Though current science gateways provide access to computing resources, datasets and tools specific to the disciplines, neuroscientists require guided knowledge discovery at various levels to accomplish their research/education tasks. The guidance can help them to navigate through relevant publications, tools, topic associations and cloud platform options as they accomplish important research and education activities. To address this need and to spur research productivity and rapid learning platform development, we present “OnTimeRecommend”, a novel recommender system that comprises several recommender modules integrated through RESTful web services. We detail a neuroscience use case in a CyNeuro science gateway, and show how the OnTimeRecommend design can enable novice/expert user interfaces, as well as template-driven control of heterogeneous cloud resources.
  2. Neuroscientists are increasingly relying on parallel and distributed computing resources for analysis and visualization of their neuron simulations. Although science gateways have democratized relevant high performance/throughput resources, users require expert knowledge about programming and infrastructure configuration that is beyond the repertoire of most neuroscience programs. These factors become deterrents for the successful adoption and the ultimate diffusion (i.e., systemic spread) of science gateways in the neuroscience community. In this paper, we present a novel intuitionistic fuzzy logic based conversational recommender that can provide guidance to users when using science gateways for research and education workflows. The users interact with a context-aware chatbot that is embedded within custom web-portals to obtain simulation tools/resources to accomplish their goals. In order to ensure user goals are met, the chatbot profiles a user’s cyberinfrastructure and neuroscience domain proficiency level using a ‘usability quadrant’ approach. Simulation of user queries for an exemplary neuroscience use case demonstrates that our chatbot can provide step-by-step navigational support and generate distinct responses based on user proficiency.
  3. Neuroscientists are increasingly relying on parallel and distributed computing resources for analysis and visualization of their neuron simulations. Although science gateways have democratized relevant high performance/throughput resources, users require expert knowledge about programming and infrastructure configuration that is beyond the repertoire of most neuroscience programs. These factors become deterrents for the successful adoption and the ultimate diffusion (i.e., systemic spread) of science gateways in the neuroscience community. In this paper, we present a novel intuitionistic fuzzy logic based conversational recommender that can provide guidance to users when using science gateways for research and education workflows. The users interact with a context-aware chatbot that is embedded within custom web-portals to obtain simulation tools/resources to accomplish their goals. In order to ensure user goals are met, the chatbot profiles a user’s cyberinfrastructure and neuroscience domain proficiency level using a ‘usability quadrant’ approach. Simulation of user queries for an exemplary neuroscience use case demonstrates that our chatbot can provide step-by-step navigational support and generate distinct responses based on user proficiency.
  4. Neuroscientists are increasingly relying on parallel and distributed computing resources for analysis and visualization of their neuron simulations. This requires expert knowledge of programming and cyberinfrastructure configuration, which is beyond the repertoire of most neuroscience programs. This paper presents early experiences from a one-credit graduate research training course titled ECE 8001 “Software and Cyber Automation in Neuroscience” at the University of Missouri for engendering multi-disciplinary collaborations between computational neuroscience and cyberinfrastructure students and faculty. Specifically, we discuss the course organization and exemplar outcomes involving a next-generation science gateway for training novice users on exemplar neuroscience use cases that involve using tools such as NEURON and MATLAB on local as well as Neuroscience Gateway resources. We also discuss our vision towards a course sequence curriculum for graduate/undergraduate students from biological/psychological sciences and computer science/engineering to jointly build “self-service” training modules using Jupyter Notebook platforms. Thus, our efforts show how we can create scalable and sustainable cyber and software automation for fulfilling a broad set of neuroscience research and education use cases.
  5. Machine learning techniques underlying Big Data analytics have the potential to benefit data-intensive communities in domain sciences such as bioinformatics and neuroscience. Today’s innovative advances in these domain communities are increasingly built upon multi-disciplinary knowledge discovery and cross-domain collaborations. Consequently, shortening the time to knowledge discovery is a challenge when investigating new methods, developing new tools, or integrating datasets. The challenge for a domain scientist lies particularly in obtaining guidance by querying massive amounts of information from a diverse text corpus comprising a wide-ranging set of topics. In this paper, we propose a novel “domain-specific topic model” (DSTM) that can drive conversational agents for users to discover latent knowledge patterns about relationships among research topics, tools and datasets from exemplar scientific domains. The goal of DSTM is to perform data mining to obtain meaningful guidance via a chatbot for domain scientists to choose the relevant tools or datasets pertinent to solving a computational and data-intensive research problem at hand. Our DSTM is a Bayesian hierarchical model that extends the Latent Dirichlet Allocation (LDA) model and uses a Markov chain Monte Carlo algorithm to infer latent patterns within a specific domain in an unsupervised manner. We apply our DSTM to large collections of data from bioinformatics and neuroscience domains that include hundreds of papers from reputed journal archives and hundreds of tools and datasets. Through evaluation experiments with a perplexity metric, we show that our model has better generalization performance within a domain for discovering highly specific latent topics.
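The perplexity evaluation mentioned in item 5 can be illustrated with a minimal sketch. It assumes the conventional held-out perplexity for LDA-style topic models, i.e. the exponential of the negative average per-token log-likelihood; the abstract does not state the exact formula used for DSTM. The function name `perplexity` and the arguments `docs`, `theta` and `phi` are hypothetical and chosen only for illustration.

```python
import math


def perplexity(docs, theta, phi):
    """Perplexity of an LDA-style topic model (illustrative assumption).

    docs  : list of documents, each a list of integer word ids
    theta : theta[d][k] = probability of topic k in document d
    phi   : phi[k][w]   = probability of word w under topic k
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Marginalize over topics to get the probability of this token.
            p_word = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p_word)
            n_tokens += 1
    # Lower perplexity means the model predicts the tokens better.
    return math.exp(-log_likelihood / n_tokens)


# Toy check: two topics, both uniform over a 4-word vocabulary.
docs = [[0, 1, 2, 3]]
theta = [[0.5, 0.5]]
phi = [[0.25] * 4, [0.25] * 4]
print(perplexity(docs, theta, phi))
```

Under this definition a model that is uniform over a vocabulary of V words has perplexity V, so lower values indicate better generalization to the evaluated text.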