Search for: All records

Award ID contains: 1838770

  1. This paper studies the performance of large language models (LLMs), particularly regarding demographic fairness, in solving real-world healthcare tasks. We evaluate state-of-the-art LLMs with three prevalent learning frameworks across six diverse healthcare tasks and find significant challenges in applying LLMs to real-world healthcare tasks and persistent fairness issues across demographic groups. We also find that explicitly providing demographic information yields mixed results, while LLMs' ability to infer such details raises concerns about biased health predictions. Utilizing LLMs as autonomous agents with access to up-to-date guidelines does not guarantee performance improvement. We believe these findings reveal the critical limitations of LLMs in healthcare fairness and the urgent need for specialized research in this area.
    Free, publicly-accessible full text available January 19, 2026
  2. Free, publicly-accessible full text available October 15, 2025
  3. Free, publicly-accessible full text available June 16, 2025
  4. Health coaching helps patients achieve personalized, lifestyle-related goals, effectively managing chronic conditions and alleviating mental health issues. It is particularly beneficial for low-socioeconomic-status populations, yet cost-prohibitive for them because of its highly personalized and labor-intensive nature. In this paper, we propose a neuro-symbolic goal summarizer to support health coaches in keeping track of the goals and a text-units-text dialogue generation model that converses with patients and helps them create and accomplish specific goals for physical activities. Our models outperform the previous state of the art while eliminating the need for a predefined schema and corresponding annotation. We also propose a new health coaching dataset extending previous work and a metric to measure the unconventionality of the patient's response based on data difficulty, facilitating potential coach alerts during deployment.
    Free, publicly-accessible full text available May 1, 2025
  5. Health coaching helps patients identify and accomplish lifestyle-related goals, effectively improving the control of chronic diseases and mitigating mental health conditions. However, health coaching is cost-prohibitive due to its highly personalized and labor-intensive nature. In this paper, we propose to build a dialogue system that converses with the patients, helps them create and accomplish specific goals, and can address their emotions with empathy. However, building such a system is challenging since real-world health coaching datasets are limited and empathy is subtle. Thus, we propose a modularized health coaching dialogue with simplified NLU and NLG frameworks combined with mechanism-conditioned empathetic response generation. Through automatic and human evaluation, we show that our system generates more empathetic, fluent, and coherent responses and outperforms the state-of-the-art in NLU tasks while requiring less annotation. We view our approach as a key step towards building automated and more accessible health coaching systems. 
  6. Prevalent imitation learning methods seek to produce behavior that matches or exceeds average human performance. This often prevents achieving expert-level or superhuman performance when identifying the better demonstrations to imitate is difficult. We instead assume demonstrations are of varying quality and seek to induce behavior that is unambiguously better (i.e., Pareto dominant or minimally subdominant) than all human demonstrations. Our minimum subdominance inverse optimal control training objective is primarily defined by high quality demonstrations; lower quality demonstrations, which are more easily dominated, are effectively ignored instead of degrading imitation. With increasing probability, our approach produces superhuman behavior incurring lower cost than demonstrations on the demonstrator's unknown cost function, even if that cost function differs for each demonstration. We apply our approach on a computer cursor pointing task, producing behavior that is 78% superhuman, while minimizing demonstration suboptimality provides 50% superhuman behavior (and only 72% even after selective data cleaning).
  7. Background: Over half of US adults have at least one chronic disease, including obesity. Although physical activity is an important component of chronic disease self-management, few reach the recommended physical activity goals. Individuals who identify as racial and ethnic minorities are disproportionately affected by chronic diseases and physical inactivity. Interventions using consumer-based wearable devices have shown promise for increasing physical activity among patients with chronic diseases; however, populations with the most to gain, such as minorities, have been poorly represented to date.
    Objective: This study aims to assess the feasibility, acceptability, and preliminary outcomes of an 8-week text-based coaching and Fitbit program aimed at increasing the number of steps in a predominantly overweight ethnic minority population.
    Methods: Overweight patients (BMI >25 kg/m²) were recruited from an internal medicine clinic located in an inner-city academic medical center. Fitbit devices were provided. Using 2-way SMS text messaging, health coaches (HCs) guided patients to establish weekly step goals that were specific, measurable, attainable, realistic, and time-bound. SMS text messaging and Fitbit activities were managed using a custom-designed app. Program feasibility was assessed via the recruitment rate, retention rate (the proportion of eligible participants completing the 8-week program), and patient engagement (based on the number of weekly text message goals set with the HC across the 8-week period). Acceptability was assessed using a qualitative, summative evaluation. Exploratory statistical analysis included evaluating the average weekly steps in week 1 compared with week 8 using a paired t test (2-tailed) and modeling daily steps over time using a linear mixed model.
    Results: Of the 33 patients initially screened, 30 (91%) were enrolled in the study. At baseline, the average BMI was 39.3 (SD 9.3) kg/m², with 70% (23/33) of participants presenting as obese. A total of 30% (9/30) of participants self-rated their health as either fair or poor, and 73% (22/30) of participants set up ≥6 weekly goals across the 8-week program. In total, 93% (28/30) of participants completed a qualitative summative evaluation, and 10 themes emerged from the evaluation: patient motivation, convenient SMS text messaging experience, social support, supportive accountability, technology support, self-determined goals, achievable goals, feedback from Fitbit, challenges, and habit formation. There was no significant group change in the average weekly steps for week 1 compared with week 8 (mean difference 7.26, SD 6209.3; P=.99). However, 17% (5/30) of participants showed a significant increase in their daily steps.
    Conclusions: Overall, the results demonstrate the feasibility and acceptability of a remotely delivered walking study that included an HC; SMS text messaging; a wearable device (Fitbit); and specific, measurable, attainable, realistic, and time-bound goals within an ethnic minority patient population. Results support further development and testing in larger samples to explore efficacy.
  8. Regular physical activity is associated with a reduced risk of chronic diseases such as type 2 diabetes and improved mental well-being. Yet, more than half of the US population is insufficiently active. Health coaching has been successful in promoting healthy behaviors. In this paper, we present our work towards assisting health coaches by extracting the physical activity goal the user and coach negotiate via text messages. We show that information captured by dialogue acts can help to improve the goal extraction results. We employ both traditional and transformer-based machine learning models for dialogue act prediction and find them statistically indistinguishable in performance on our health coaching dataset. Moreover, we discuss the feedback provided by the health coaches when evaluating the correctness of the extracted goal summaries. This work is a step towards building a virtual assistant health coach to promote a healthy lifestyle.
  9. Dialogue systems, also called chatbots, are now used in a wide range of applications. However, they still have some major weaknesses. One key weakness is that they are typically trained from manually labeled data and/or written with handcrafted rules, and their knowledge bases (KBs) are also compiled by human experts. Due to the huge amount of manual effort involved, they are difficult to scale and also tend to produce many errors owing to their limited ability to understand natural language and the limited knowledge in their KBs. Thus, the level of user satisfaction is often low. In this paper, we propose to dramatically improve the situation by endowing chatbots with the ability to continually learn, by themselves during conversation, (1) new world knowledge, (2) new language expressions to ground them to actions, and (3) new conversational skills. As they chat more with users, they become more knowledgeable, better able to understand diverse natural language expressions, and more skilled in conversation.
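
As a rough illustration of the paired t test mentioned in item 7, the sketch below computes the t statistic from the summary values reported there (mean difference 7.26, SD 6209.3). It is not the study's analysis code. It assumes that the reported SD is the standard deviation of the week 8 minus week 1 differences and that all 30 enrolled participants contributed a pair; the abstract does not state either point. The function name `paired_t_statistic` is hypothetical.

```python
import math


def paired_t_statistic(mean_diff: float, sd_diff: float, n_pairs: int) -> float:
    """Return the paired t statistic: mean difference divided by its standard error."""
    if n_pairs < 2:
        raise ValueError("a paired t test needs at least two pairs")
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error


# Summary values from item 7. n_pairs = 30 is an assumption (all enrolled
# participants), as is reading the SD as the SD of the paired differences.
n_pairs = 30
t_stat = paired_t_statistic(mean_diff=7.26, sd_diff=6209.3, n_pairs=n_pairs)
degrees_of_freedom = n_pairs - 1

print(f"t = {t_stat:.4f}, df = {degrees_of_freedom}")
```

Under these assumptions the statistic is close to zero, which is in line with the reported P=.99: the group-level change in weekly steps is tiny relative to the spread between participants.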