


Creators/Authors contains: "Wang, Xuan"


  1. Baeza-Yates, Ricardo ; Bonchi, Francesco (Ed.)
    Fine-grained entity typing (FET), which assigns context-sensitive, fine-grained semantic types to entities in text, is a basic but important task for knowledge extraction from unstructured text. FET has been studied extensively in natural language processing and typically relies on human-annotated corpora for training, which is costly and difficult to scale. Recent studies explore the utilization of pre-trained language models (PLMs) as a knowledge base to generate rich and context-aware weak supervision for FET. However, PLMs still require direction and guidance to serve as a knowledge base, as they often generate a mixture of coarse and fine-grained types, or tokens unsuitable for typing. In this study, we envision that an ontology provides a semantics-rich, hierarchical structure, which will help select the best results generated by multiple PLMs and head words. Specifically, we propose a novel annotation-free, ontology-guided FET method, ONTOTYPE, which follows a type ontological structure, from coarse to fine, ensembles multiple PLM prompting results to generate a set of type candidates, and refines its type resolution under the local context with a natural language inference model. Our experiments on the OntoNotes, FIGER, and NYT datasets using their associated ontological structures demonstrate that our method outperforms the state-of-the-art zero-shot fine-grained entity typing methods as well as a typical LLM method, ChatGPT. Our error analysis shows that refinement of the existing ontology structures will further improve fine-grained entity typing.
    Free, publicly-accessible full text available August 24, 2025
  2. Free, publicly-accessible full text available April 1, 2025
  3. This paper presents the development of a novel control algorithm designed for tasks involving human-robot collaboration. Using an 8-DOF robotic arm, our approach aims to counteract human-induced uncertainties added to the robot's nominal trajectory. To address this challenge, we incorporate a variable within the regular Model Predictive Control (MPC) framework to account for human uncertainties, which are modeled as following a normal distribution with a non-zero mean and variance. Our solution involves formulating and solving an uncertainty-aware Discrete Algebraic Riccati Equation (ua-DARE), which yields the optimal control law for all joints to mitigate the impact of these uncertainties. We validate our methodology through theoretical analysis, demonstrating the effectiveness of the ua-DARE in providing an optimal control strategy. Our approach is further validated through simulation experiments using a Fetch robot model, where the results highlight a significant improvement in performance over a baseline algorithm that does not consider human uncertainty when solving for the optimal control law.
    Free, publicly-accessible full text available May 12, 2025
  4. Abstract

    Submarine cables have experienced problems during extreme geomagnetic disturbances because of geomagnetically induced voltages adding or subtracting from the power feed to the repeaters. This is still a concern for modern fiber‐optic cables because they contain a copper conductor to carry power to the repeaters. This paper provides a new examination of geomagnetic induction in submarine cables and makes calculations of the voltages experienced by the TAT‐8 trans‐Atlantic submarine cable during the March 1989 magnetic storm. It is shown that the cable itself experiences an induced electromotive force (emf) and that induction in the ocean also leads to changes of potential of the land at each end of the cable. The process for calculating the electric fields induced in the sea and in the cable from knowledge of the seawater depth and conductivity and subsea conductivity is explained. The cable route is divided into 9 sections and the seafloor electric field is calculated for each section. These are combined to give the total induced emf in the cable. In addition, induction in the seawater and leakage of induced currents through the underlying resistive layers are modeled using a transmission line model of the ocean and underlying layers to determine the change in Earth potentials at the cable ends. The induced emf in the cable and the end potentials are then combined to give the total voltage change experienced by the cable power feed equipment. This gives results very close to those recorded on the TAT‐8 cable in March 1989.

     
    Free, publicly-accessible full text available February 1, 2025
  5. Free, publicly-accessible full text available February 1, 2025
  6. Abstract

    Estimating fire emissions prior to the satellite era is challenging because observations are limited, leading to large uncertainties in the calculated aerosol climate forcing following the preindustrial era. This challenge further limits the ability of climate models to accurately project future climate change. Here, we reconstruct a gridded dataset of global biomass burning emissions from 1750 to 2010 using inverse analysis that leverages a global array of 31 ice core records of black carbon deposition fluxes, two different historical emission inventories as a priori estimates, and emission-deposition sensitivities simulated by the atmospheric chemical transport model GEOS-Chem. The reconstructed emissions exhibit greater temporal variability, which is more consistent with paleoclimate proxies. Our ice core constrained emissions reduce the uncertainties in simulated cloud condensation nuclei and aerosol radiative forcing associated with the discrepancy in preindustrial biomass burning emissions. The derived emissions can also be used in studies of ocean and terrestrial biogeochemistry.

     
  7. Human whole-brain functional connectivity networks have been shown to exhibit both local/quasilocal (e.g., a set of functional sub-circuits induced by node or edge attributes) and non-local (e.g., higher-order functional coordination patterns) properties. Nonetheless, the non-local properties of topological strata induced by local/quasilocal functional sub-circuits have yet to be addressed. To that end, we propose a homological formalism that enables the quantification of higher-order characteristics of human brain functional sub-circuits. Our results indicate that each homological order uniquely unravels diverse, complementary properties of human brain functional sub-circuits. Notably, the H1 homological distance between rest and motor task was observed at both the whole-brain and sub-circuit consolidated levels, which suggests the self-similarity property of human brain functional connectivity unraveled by a homological kernel. Furthermore, at the whole-brain level, the rest–task differentiation was found to be most prominent between rest and different tasks at different homological orders: (i) Emotion task (H0), (ii) Motor task (H1), and (iii) Working memory task (H2). At the functional sub-circuit level, the rest–task functional dichotomy of the default mode network is found to be most prominent at the first and second homological scaffolds. Also at this scale, we found that the limbic network plays a significant role in homological reconfiguration across both the task and subject domains, which paves the way for subsequent investigations of the complex neuro-physiological role of this network. From a wider perspective, our formalism can be applied, beyond brain connectomics, to study the non-localized coordination patterns of localized structures stretching across complex network fibers.

     
    Free, publicly-accessible full text available February 1, 2025
  8. Rogers, Anna ; Boyd-Graber, Jordan ; Okazaki, Naoaki (Ed.)
    The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we propose TagReal, which automatically generates quality query prompts and retrieves support information from large text corpora to probe knowledge from a PLM for KG completion. The results show that TagReal achieves state-of-the-art performance on two benchmark datasets. We find that TagReal has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
  9. Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design. Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts. Consequently, the scarcity of sufficient training data poses an obstacle to the progress of related models in this domain. In this paper, we propose REACTIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions. Additionally, we adopt synthetic data from patent records as distant supervision to incorporate domain knowledge into the model. Experiments demonstrate that REACTIE achieves substantial improvements and outperforms all existing baselines. 