

Search for: All records

Creators/Authors contains: "Wang, Jiaqi"



  1. In an era dominated by web-based intelligent customer services, Sentence Pair Matching has remarkably broad applications. Web agents, for example, automatically respond to customer queries by retrieving similar past questions, significantly reducing customer service expenses. While current large language models (LLMs) offer powerful text generation capabilities, they suffer from opacity, potential text toxicity, and difficulty handling domain-specific and confidential business inquiries. Consequently, the widespread adoption of web-based intelligent customer services in real-world business still relies heavily on query-based interactions. In this paper, we introduce a series of model-agnostic techniques aimed at enhancing both the accuracy and interpretability of Chinese pairwise sentence-matching models. Our contributions include (1) an Edit-distance-weighted fine-tuning method, (2) a Bayesian Iterative Prediction algorithm, (3) a Lexical-based Dual Ranking Interpreter, and (4) a Bi-criteria Denoising strategy. Experimental results on the Large-scale Chinese Question Matching Corpus (LCQMC) with a disturbed test set demonstrate that our fine-tuning and prediction methods steadily improve matching accuracy on top of current state-of-the-art models. Moreover, our interpreter combined with the denoising strategy markedly enhances the rationality and loyalty of token-level interpretations. In both matching accuracy and interpretation, our approaches outperform classic methods and even LLMs.
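To make the edit-distance weighting concrete, here is a minimal, hypothetical sketch of deriving per-pair loss weights from normalized Levenshtein distance during fine-tuning. The weighting function, its range, and its direction are illustrative assumptions, not the paper's actual scheme.

```python
# Hypothetical sketch: weighting each sentence pair's fine-tuning loss by
# normalized edit distance. pair_weight() is an illustrative assumption,
# not the weighting used in the paper.
import torch.nn.functional as F

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming character edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def pair_weight(s1: str, s2: str) -> float:
    """Map normalized edit distance to a loss weight in [1, 2]
    (assumption: lexically distant pairs get more emphasis)."""
    d = levenshtein(s1, s2) / max(len(s1), len(s2), 1)
    return 1.0 + d

def weighted_loss(logits, labels, weights):
    """Per-example cross-entropy scaled by pair weights (a 1-D tensor
    aligned with the batch)."""
    per_example = F.cross_entropy(logits, labels, reduction="none")
    return (per_example * weights).mean()
```

One plausible rationale: near-identical pairs can be matched by string overlap alone, so up-weighting lexically distant pairs pushes the model toward semantic evidence; the paper's actual weighting direction may differ.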
  2. Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the “pre-training & fine-tuning” learning paradigm significantly improves VQA performance, the adversarial robustness of this paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create adversarial image-text pairs and then transferring them to attack target VQA models. Correspondingly, we propose a novel VQATTACK model, which iteratively generates both image and text perturbations with two designed modules: the large language model (LLM)-enhanced image attack module and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes a latent representation-based loss to generate feature-level image perturbations. It then incorporates an LLM to further enhance the image perturbations by optimizing a designed masked answer anti-recovery loss. The cross-modal joint attack module is triggered at a specific iteration and updates the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the “pre-training & fine-tuning” paradigm on VQA tasks. The source code is available at https://github.com/ericyinyzy/VQAttack.

     
    Free, publicly-accessible full text available March 25, 2025
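As an illustration of the feature-level image-perturbation step described above, here is a minimal PGD-style sketch under a CLIP-like interface. `encode_image` is an assumed API; the LLM-enhanced masked answer anti-recovery loss and the cross-modal joint attack module are not reproduced here.

```python
# Hypothetical sketch of one feature-level image-attack iteration
# (PGD-style). This is not the actual VQATTACK implementation; see the
# linked repository for the real modules and losses.
import torch
import torch.nn.functional as F

def image_attack_step(model, adv_image, clean_image, clean_feat,
                      eps=8 / 255, alpha=1 / 255):
    """Push the adversarial image's latent representation away from the
    clean feature, staying inside an L-infinity ball of radius eps."""
    adv_image = adv_image.clone().detach().requires_grad_(True)
    feat = model.encode_image(adv_image)      # assumed CLIP-style encoder
    # gradient ascent on negative similarity == maximizing feature distance
    loss = -F.cosine_similarity(feat, clean_feat, dim=-1).mean()
    loss.backward()
    with torch.no_grad():
        adv_image = adv_image + alpha * adv_image.grad.sign()
        adv_image = clean_image + (adv_image - clean_image).clamp(-eps, eps)
        adv_image = adv_image.clamp(0.0, 1.0)  # keep a valid pixel range
    return adv_image.detach()
```

Per the abstract, text perturbations would then be updated from gradients in the word embedding space together with synonym-based substitution.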
  3. Based on a large group/cluster catalog recently constructed from the DESI Legacy Imaging Surveys DR9 using an extended halo-based group finder, we measure and model the group–galaxy weak-lensing signals for groups/clusters in several redshift bins within the range 0.1 ≤ z < 0.6. The background shear signals are obtained from the DECaLS survey shape catalog, derived with the Fourier_Quad method. We divide the lens samples into five equispaced redshift bins and seven mass bins, which allows us to probe the redshift and mass dependence of the lensing signals and hence the resulting halo properties. In addition to these sample selections, we also check the signals around different group centers, e.g., the brightest central galaxy, the luminosity-weighted center, and the number-weighted center. We use a lensing model that includes off-centering to describe the lensing signals measured for all mass and redshift bins. The results demonstrate that our model predictions for the halo masses, biases, and concentrations are stable and self-consistent among different samples and group centers. Taking advantage of this very large and complete sample of groups/clusters, as well as the reliable estimates of their halo masses, we provide measurements of the cumulative halo mass function up to redshift z = 0.6, with a mass precision of 0.03 ∼ 0.09 dex.

     
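For readers unfamiliar with the off-centering correction mentioned above, the following numpy sketch shows the standard miscentering construction: azimuthally averaging a centered surface-density profile Σ(R) around a displaced center, then mixing centered and off-centered terms. The function names and the single-offset simplification are illustrative assumptions; the paper's actual profile, priors, and offset distribution are not reproduced.

```python
# Hypothetical sketch of a miscentered stacked-lensing model. sigma_cen is
# any centered surface-density profile Sigma(R) (e.g., projected NFW); a
# real analysis would also integrate over a distribution of offsets.
import numpy as np

def sigma_offset(sigma_cen, R, R_off, n_theta=256):
    """Azimuthal average of Sigma around a center displaced by R_off:
    Sigma_off(R) = (1/2pi) * Integral of
    Sigma(sqrt(R^2 + R_off^2 + 2 R R_off cos t)) dt."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r = np.sqrt(R**2 + R_off**2 + 2.0 * R * R_off * np.cos(theta))
    return np.mean(sigma_cen(r))

def sigma_model(sigma_cen, R, f_cen, R_off):
    """Mixture of well-centered and miscentered stacks."""
    return f_cen * sigma_cen(R) + (1.0 - f_cen) * sigma_offset(sigma_cen, R, R_off)
```

The free centering fraction f_cen and offset scale R_off are what let such a model absorb imperfect group centers (e.g., a misidentified brightest central galaxy) when fitting the measured signals.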