Search for: All records

Creators/Authors contains: "Caragea, C"

  1. Free, publicly-accessible full text available November 14, 2025
  2. Scholarly digital libraries provide access to scientific publications and are useful resources for researchers who search for literature on specific subject areas. CiteSeerX is an example of such a digital library search engine; it provides access to more than 10 million academic documents and has nearly one million users and three million hits per day. Artificial Intelligence (AI) technologies are used in many components of CiteSeerX, including Web crawling, document ingestion, and metadata extraction. CiteSeerX also uses an unsupervised algorithm called noun phrase chunking (NP-Chunking) to extract keyphrases from documents. However, NP-Chunking often extracts many unimportant noun phrases. In this paper, we investigate and contrast three supervised keyphrase extraction models to explore their deployment in CiteSeerX for extracting high-quality keyphrases. To perform user evaluations on the keyphrases predicted by the different models, we integrate a voting interface into CiteSeerX. We describe the development and deployment of the keyphrase extraction models and their maintenance requirements.
  3. We develop an enhanced version of the CORD-19 dataset released by the Allen Institute for AI. Tools in the SeerSuite project are used to exploit information in the original articles that is not directly provided in the CORD-19 datasets. We add 728 new abstracts, 70,102 figures, and 31,446 tables with captions that are not provided in the current data release. We also built a vertical search engine, COVIDSeer, based on the new dataset we created. COVIDSeer has a relatively simple architecture with features such as keyword filtering and similar-paper recommendation. The goal was to provide a system and dataset that can help scientists better navigate the literature concerning COVID-19. The enriched dataset can serve as a supplement to the existing dataset. The search engine, which offers keyphrase-enhanced search, will hopefully help biomedical and life science researchers, medical students, and the general public explore coronavirus-related literature more effectively. The entire dataset and the system will be made open source.
  4. Every day people share personal stories online, reaching millions of users around the world through blogs, social media, and news websites. Why are some of these stories more attractive to readers than others? What features of these personal narratives make readers empathize with the storyteller? Do the readers’ personal characteristics and experiences play a role in feeling a connection to the story they read? Experimental studies in psychology show that several factors increase empathy in the aggregate, but there is a need for a deeper understanding of empathetic feelings at the individual level of storyteller, story, and reader. Here, we present the design and analysis of a survey that studied the impact of story features and of reader predispositions and perceptions on the empathy readers feel when reading online stories. We use causal trees to find the individual-level causal factors for empathy and to understand the heterogeneity in the treatment effects. One of our main findings is that empathy is contextual: while reader personality plays a significant role in evoking empathy, the mood of the reader prior to reading the story and the linguistic features of the story have an impact as well. The results of our analyses can be used to help people create content that others care about and to help them communicate more effectively.