Search for: All records

Creators/Authors contains: "Cao, Tianyu"

Note: Clicking a Digital Object Identifier (DOI) takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. As large language models (LLMs) expand the power of natural language processing to handle long inputs, rigorous and systematic analyses are necessary to understand their abilities and behavior. A salient application is summarization, due to its ubiquity and controversy (e.g., researchers have declared the death of summarization). In this paper, we use financial report summarization as a case study because financial reports are not only long but also use numbers and tables extensively. We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Cohere. We find that GPT-3.5 and Cohere fail to perform this summarization task meaningfully. For Claude 2 and GPT-4, we analyze the extractiveness of the summary and identify a position bias in LLMs. This position bias disappears after shuffling the input for Claude, which suggests that Claude recognizes important information regardless of where it appears. We also conduct a comprehensive investigation into the use of numeric data in LLM-generated summaries and offer a taxonomy of numeric hallucinations. We employ prompt engineering to improve GPT-4's use of numbers, with only limited success. Overall, our analyses highlight the strong capability of Claude 2 in handling long multimodal inputs compared to GPT-4. The generated summaries and evaluation code are available at https://github.com/ChicagoHAI/characterizing-multimodal-long-form-summarization.
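     The diagnostics named in this abstract (extractiveness, position bias, numeric hallucination) can be illustrated with a short sketch. The code below is not the authors' released evaluation code (that lives at the linked repository); it is a minimal, assumed implementation that uses verbatim n-gram overlap as an extractiveness proxy, first-occurrence positions of matched n-grams in the source as a position-bias diagnostic, and a regex check for summary numbers that never appear in the source. All function names and heuristics here are illustrative choices.

```python
import re

def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def extractiveness(source: str, summary: str, n: int = 4) -> float:
    """Fraction of summary n-grams that also appear verbatim in the source."""
    src, summ = source.lower().split(), summary.lower().split()
    summ_ngrams = ngrams(summ, n)
    if not summ_ngrams:
        return 0.0
    return len(summ_ngrams & ngrams(src, n)) / len(summ_ngrams)

def source_positions(source: str, summary: str, n: int = 4):
    """Relative positions (0 = start of source, 1 = end) at which summary
    n-grams first occur; a skew toward 0 would indicate a lead/position bias."""
    src, summ = source.lower().split(), summary.lower().split()
    first_seen = {}
    for i in range(len(src) - n + 1):
        first_seen.setdefault(tuple(src[i:i + n]), i)
    return [first_seen[g] / max(len(src) - n, 1)
            for g in ngrams(summ, n) if g in first_seen]

def unsupported_numbers(source: str, summary: str):
    """Numbers in the summary that never appear in the source: a crude proxy
    for numeric hallucination (ignores rounding and values derived by arithmetic)."""
    num = re.compile(r"\d[\d,]*\.?\d*")
    src_nums = {m.group().replace(",", "") for m in num.finditer(source)}
    return [m.group() for m in num.finditer(summary)
            if m.group().replace(",", "") not in src_nums]
```

     A real evaluation of financial reports would also need to handle tables, unit conversions, rounding, and derived figures, which these heuristics deliberately ignore.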
  2. Stable auroral red (SAR) arcs are luminous subauroral emissions produced by the collisional excitation of oxygen atoms during geomagnetically active times. While traditionally attributed to inner magnetospheric electron heating, recent observations and simulations challenge the exclusivity of this mechanism. Here, we resolve the ionospheric origin of SAR arcs using multi‐instrument observations and numerical simulations during the March 2015 geomagnetic storm. Both magnetospheric heat flux and ion‐neutral frictional heating, driven by subauroral plasma flows, independently generate SAR arcs with intensities surpassing background airglow by hundreds of Rayleighs. While thermal electron impact dominates red‐line emissions in both cases, the vertical structures diverge: frictional heating localizes emissions to altitudes of 250–400 km, whereas magnetospheric heating extends emissions above ∼280 km with broader altitudinal coverage. These results redefine SAR arc generation as a product of competing magnetospheric and ionospheric energy pathways, advancing our understanding of cross‐scale interactions in geospace.