Inconsistent and incomplete applications of metadata standards and unsatisfactory approaches to connecting repository holdings across the global research infrastructure inhibit data discovery and reusability. The Realities of Academic Data Sharing (RADS) Initiative has found that institutions and researchers create and have access to the most complete metadata, but that valuable metadata found in these local institutional repositories (IRs) are not making their way into global data infrastructure such as DataCite or Crossref. This panel examines the local-to-global spectrum of metadata completeness, including the challenges of obtaining quality metadata at a local level, specifically at Cornell University, and the loss of metadata during the transfer processes from IRs into global data infrastructure. Metadata completeness increases over time as users reuse data and contribute to the metadata. As metadata improves and grows, users find and develop connections within data not previously visible to them. By feeding local IR metadata into the global data infrastructure, the global infrastructure starts giving back in the form of these connections. We believe that this information will be helpful in coordinating metadata better and more effectively across data repositories and creating more robust interoperability and reusability between and among IRs.
DataChat: Prototyping a Conversational Agent for Dataset Search and Visualization
Data users need relevant context and research expertise to effectively search for and identify relevant datasets. Leading data providers, such as the Inter-university Consortium for Political and Social Research (ICPSR), offer standardized metadata and search tools to support data search. Metadata standards emphasize the machine-readability of data and its documentation. There are opportunities to enhance dataset search by improving users' ability to learn about, and make sense of, information about data. Prior research has shown that context and expertise are two main barriers users face in effectively searching for, evaluating, and deciding whether to reuse data. In this paper, we propose a novel chatbot-based search system, DataChat, that leverages a graph database and a large language model to provide novel ways for users to interact with and search for research data. DataChat complements data archives' and institutional repositories' ongoing efforts to curate, preserve, and share research data for reuse by making it easier for users to explore and learn about available research data.
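The retrieval step that the abstract describes — pulling connected metadata out of a graph database and handing it to a large language model as context — can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the dataset identifiers, schema fields, and prompt format are all invented, and the graph database is mocked with in-memory dictionaries.

```python
# Hypothetical one-hop metadata graph; in a real system this would be
# a query against a property-graph store, not a dict lookup.
DATASETS = {
    "dataset-001": {
        "title": "Example Survey on Health Behaviors",   # placeholder title
        "variables": ["age", "state", "substance_use"],
        "related": ["dataset-002"],
    },
    "dataset-002": {
        "title": "Example Longitudinal Youth Study",     # placeholder title
        "variables": ["age", "grade", "substance_use"],
        "related": [],
    },
}

def gather_context(dataset_id: str) -> str:
    """Walk one hop of the metadata graph and flatten it into prompt context."""
    node = DATASETS[dataset_id]
    lines = [f"Dataset: {node['title']}",
             f"Variables: {', '.join(node['variables'])}"]
    for rel in node["related"]:
        lines.append(f"Related dataset: {DATASETS[rel]['title']}")
    return "\n".join(lines)

def build_prompt(dataset_id: str, question: str) -> str:
    """Combine graph-derived context with the user's question for the LLM."""
    return f"{gather_context(dataset_id)}\n\nQuestion: {question}"

print(build_prompt("dataset-001", "Which age groups are covered?"))
```

The point of the graph hop is that the prompt carries not just the target dataset's metadata but also its neighbors, so the model can surface related data the user did not ask about by name.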
- Award ID(s): 2121789
- PAR ID: 10478071
- Publisher / Repository: Association for Information Science and Technology
- Date Published:
- Journal Name: Proceedings of the Association for Information Science and Technology
- Volume: 60
- Issue: 1
- ISSN: 2373-9231
- Page Range / eLocation ID: 586 to 591
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Modern science generates large, complicated, heterogeneous collections of data. In order to effectively exploit these data, researchers must find relevant data, and enough of its associated metadata to understand it and put it in context. This problem exists across a wide range of research domains and is ripe for a general solution. Existing ventures address these issues using ad hoc purpose-built tools. These tools explicitly represent the data relationships by embedding them in their data storage mechanisms and in their applications. While producing useful tools, these approaches tend to be difficult to extend, and data relationships are not necessarily traversable symmetrically. We are building a general system for navigational metadata. The relationships between data, and between annotations and data, are stored as first-class objects in the system. They can be viewed as instances drawn from a small set of graph types. General-purpose programs can be written which allow users to explore these graphs and gain insights into their data. This process of data navigation, successive inclusion and filtering of objects, provides a powerful paradigm for data exploration.
- Social media provides unique opportunities for researchers to learn about a variety of phenomena—it is often publicly available, highly accessible, and affords more naturalistic observation. However, as research using social media data has increased, so too has public scrutiny, highlighting the need to develop ethical approaches to social media data use. Prior work in this area has explored users’ perceptions of researchers’ use of social media data in the context of a single platform. In this paper, we expand on that work, exploring how platforms and their affordances impact how users feel about social media data reuse. We present results from three factorial vignette surveys, each focusing on a different platform—dating apps, Instagram, and Reddit—to assess users’ comfort with research data use scenarios across a variety of contexts. Although our results highlight different expectations between platforms depending on the research domain, purpose of research, and content collected, we find that the factor with the greatest impact across all platforms is consent—a finding which presents challenges for big data researchers. We conclude by offering a sociotechnical approach to ethical decision-making. This approach provides recommendations on how researchers can interpret and respond to platform norms and affordances to predict potential data use sensitivities. The approach also recommends that researchers respond to the predominant expectation of notification and consent for research participation by bolstering awareness of data collection on digital platforms.
- Agapito, G. (Ed.) The portable document format (PDF) is currently one of the most popular formats for offline sharing of biomedical information. Recently, HTML-based formats for web-first biomedical information sharing have gained popularity. However, machine-interpretable information is required by literature search engines, such as Google Scholar, to index articles in a context-aware manner for accurate biomedical literature searches. On the other hand, the lack of technological infrastructure to add machine-interpretable metadata to expanding biomedical information renders it unreachable to search engines. Therefore, we developed a portable technical infrastructure (goSemantically) and packaged it as a Google Docs add-on. The “goSemantically” assists authors in adding machine-interpretable metadata at the terminology and document-structure levels while authoring biomedical content. The “goSemantically” leverages the NCBO Bioportal resources and introduces a mechanism to annotate biomedical information with relevant machine-interpretable metadata (semantic vocabularies). The “goSemantically” also acquires schema.org meta tags designed for search engine optimization and tailored to accommodate biomedical information. Thus, individual authors can conveniently author and publish biomedical content in a truly decentralized fashion. Users can also export and host content with relevant machine-interpretable metadata (semantic vocabularies) in interoperable formats such as HTML and JSON-LD. To experience the described features, run this code with Google Docs.
- In the aftermath of earthquake events, reconnaissance teams are deployed to gather vast numbers of images, moving quickly to capture perishable data documenting the performance of infrastructure before it is destroyed. Learning from such data enables engineers to gain new knowledge about the real-world performance of structures. This new knowledge, extracted from such visual data, is critical to mitigate the risks (e.g., damage and loss of life) associated with our built environment in future events. Currently, this learning process is entirely manual, requiring considerable time and expense. Thus, unfortunately, only a tiny portion of these images are shared, curated, and actually utilized. The power of computers and artificial intelligence enables a new approach to organize and catalog such visual data with minimal manual effort. Here we discuss the development and deployment of an organizational system to automate the analysis of large volumes of post-disaster visual data (images). Our application, named the Automated Reconnaissance Image Organizer (ARIO), allows a field engineer to rapidly and automatically categorize their reconnaissance images. ARIO exploits deep convolutional neural networks and trained classifiers, and yields a structured report combined with useful metadata. Classifiers are trained using our ground-truth visual database, which includes over 140,000 images from past earthquake reconnaissance missions to study post-disaster buildings in the field. Here we discuss the novel deployment of the ARIO application within a cloud-based system that we named VISER (Visual Structural Expertise Replicator), a comprehensive cloud-based visual data analytics system with a novel Netflix-inspired technical search capability. Field engineers can exploit this research and our application to search an image repository for visual content. We anticipate that these tools will empower engineers to more rapidly learn new lessons from earthquakes using reconnaissance data.
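The navigational-metadata abstract above describes storing relationships between data, and between annotations and data, as first-class objects so that graphs can be traversed symmetrically. A toy illustration of that design (not the authors' system; class and relation names are invented) indexes each relation by both endpoints, so incoming and outgoing edges are equally cheap to follow:

```python
# Relations as first-class objects, indexed from both ends so traversal
# is symmetric. Relation kinds here are invented examples.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    kind: str      # e.g. "annotates", "derived_from"
    source: str
    target: str

class MetadataGraph:
    def __init__(self):
        self._by_source = defaultdict(list)
        self._by_target = defaultdict(list)

    def add(self, rel: Relation) -> None:
        self._by_source[rel.source].append(rel)
        self._by_target[rel.target].append(rel)

    def outgoing(self, node: str):
        """Relations whose source is `node`."""
        return self._by_source[node]

    def incoming(self, node: str):
        """Relations whose target is `node` — the symmetric direction."""
        return self._by_target[node]

g = MetadataGraph()
g.add(Relation("annotates", "note-1", "dataset-A"))
g.add(Relation("derived_from", "dataset-B", "dataset-A"))
# Everything pointing at dataset-A...
print([r.source for r in g.incoming("dataset-A")])   # ['note-1', 'dataset-B']
# ...and the same relation followed forward from the annotation.
print([r.target for r in g.outgoing("note-1")])      # ['dataset-A']
```

Because the relation is a standalone object rather than a field embedded in one record, neither direction of traversal is privileged — the asymmetry the abstract criticizes in purpose-built tools.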
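The goSemantically abstract above mentions exporting content with schema.org metadata in formats such as JSON-LD. A minimal sketch of what such an export might look like follows; the field values are placeholders and the helper function is hypothetical, not part of the goSemantically tool:

```python
# Emitting schema.org metadata as JSON-LD for a scholarly article.
# Titles, authors, and keywords below are invented placeholders.
import json

def article_jsonld(title: str, authors: list[str], keywords: list[str]) -> dict:
    """Build a schema.org ScholarlyArticle description as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "headline": title,
        "author": [{"@type": "Person", "name": a} for a in authors],
        "keywords": keywords,
    }

doc = article_jsonld(
    "Example biomedical article",
    ["A. Author"],
    ["genomics", "annotation"],
)
# In HTML this would be embedded as:
#   <script type="application/ld+json"> ... </script>
print(json.dumps(doc, indent=2))
```

Search engines read such blocks to index the article in a context-aware manner, which is the gap the abstract says unannotated biomedical content falls into.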
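The ARIO abstract above describes categorizing reconnaissance images with trained classifiers and producing a structured report. The triage loop can be sketched as below; the classifier is mocked with filename keywords (the real system runs deep convolutional networks), and the category names are assumptions, not ARIO's actual label set:

```python
# Illustrative ARIO-style image triage. The CNN is replaced by a
# keyword stub; categories and keywords are invented for the sketch.
from collections import Counter

KEYWORDS = {
    "building": "building_overview",
    "damage": "structural_damage",
    "gauge": "measurement",
}
CATEGORIES = list(KEYWORDS.values()) + ["irrelevant"]

def classify(image_name: str) -> str:
    """Stand-in for a CNN forward pass; keys off the filename here."""
    for kw, cat in KEYWORDS.items():
        if kw in image_name:
            return cat
    return "irrelevant"

def build_report(image_names):
    """Group images by predicted category into a structured report."""
    report = {cat: [] for cat in CATEGORIES}
    for name in image_names:
        report[classify(name)].append(name)
    counts = Counter({cat: len(v) for cat, v in report.items()})
    return report, counts

report, counts = build_report(
    ["building_001.jpg", "damage_wall.jpg", "selfie.jpg"]
)
print(counts)
```

The value of the structured report is that a field engineer gets a per-category inventory immediately after upload, instead of sorting thousands of images by hand.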
 An official website of the United States government