Reflective listening is a fundamental communication skill in behavioral health counseling. It enables counselors to demonstrate an understanding of and empathy for clients’ experiences and concerns. Training to acquire and refine reflective listening skills is essential for counseling proficiency. Yet, it faces significant barriers, notably the need for specialized and timely feedback to improve counseling skills. In this work, we evaluate and compare several computational models, including transformer-based architectures, for their ability to assess the quality of counselors’ reflective listening skills. We explore a spectrum of neural-based models, ranging from compact, specialized RoBERTa models to advanced large-scale language models such as Flan, Mistral, and GPT-3.5, to score psychotherapy reflections. We introduce a psychotherapy dataset that encompasses three basic levels of reflective listening skills. Through comparative experiments, we show that a finetuned small RoBERTa model with a custom learning objective (Prompt-Aware margIn Ranking (PAIR)) effectively provides constructive feedback to counselors in training. This study also highlights the potential of machine learning in enhancing the training process for motivational interviewing (MI) by offering scalable and effective feedback alternatives for counseling training.
UTSA NLP at SemEval-2022 Task 4: An Exploration of Simple Ensembles of Transformers, Convolutional, and Recurrent Neural Networks
Appearing kind or helpful while conveying a feeling of superiority, through condescending and patronizing language, can have serious mental health implications for those who experience it. Thus, detecting this condescending and patronizing language online can be useful for online moderation systems. In this manuscript, we describe the system developed by Team UTSA for SemEval-2022 Task 4, Detecting Patronizing and Condescending Language. Our approach explores the use of several deep learning architectures, including RoBERTa, convolutional neural networks, and Bidirectional Long Short-Term Memory Networks. Furthermore, we explore simple and effective methods to create ensembles of neural network models. Overall, we experimented with several ensemble models and found that a simple combination of five RoBERTa models achieved an F-score of .6441 on the development dataset and .5745 on the final test dataset. Finally, we also performed a comprehensive error analysis to better understand the limitations of the model and provide ideas for further research.
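The abstract does not say how the five RoBERTa models were combined. As an illustration only, the sketch below assumes one common "simple combination": averaging each model's positive-class probability, thresholding the mean, and scoring the result with F1 on the positive class. The function names, the 0.5 threshold, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_probabilities(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-example positive-class probabilities across models.

    model_probs[m][i] is the probability model m assigns to example i.
    Averaging is an assumed combination rule, not the paper's stated method.
    """
    n_examples = len(model_probs[0])
    if any(len(probs) != n_examples for probs in model_probs):
        raise ValueError("every model must score the same number of examples")
    n_models = len(model_probs)
    return [sum(probs[i] for probs in model_probs) / n_models
            for i in range(n_examples)]


def threshold_predictions(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Convert averaged probabilities to binary labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up probabilities from five hypothetical models on four examples.
toy_model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.5, 0.6],
    [0.7, 0.3, 0.7, 0.3],
    [0.9, 0.2, 0.6, 0.2],
    [0.6, 0.1, 0.4, 0.5],
]
toy_labels = [1, 0, 0, 0]  # made-up gold labels

ensemble_probs = average_probabilities(toy_model_probs)
ensemble_preds = threshold_predictions(ensemble_probs)
ensemble_f1 = f1_score(toy_labels, ensemble_preds)
```

Majority voting over each model's hard labels would be an equally simple alternative; probability averaging is shown here only because it keeps a tunable decision threshold.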
- Award ID(s):
- 1947697
- PAR ID:
- 10412929
- Date Published:
- Journal Name:
- Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
- Page Range / eLocation ID:
- 379 - 386
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Detecting harmful content on social media, such as Twitter, is made difficult by the fact that the seemingly simple yes/no classification conceals a significant amount of complexity. Unfortunately, while several datasets have been collected for training classifiers in hate and offensive speech, there is a scarcity of datasets labeled with a finer granularity of target classes and specific targets. In this paper, we introduce THOS, a dataset of 8.3k tweets manually labeled with fine-grained annotations about the target of the message. We demonstrate that this dataset makes it feasible to train classifiers, based on Large Language Models, to perform classification at this level of granularity.
-
Recent studies have demonstrated significant success in detecting attacks on the Controller Area Network (CAN) bus network using machine learning and deep learning models, including convolutional neural networks and transformer-based architectures. Building on this foundation, our work investigates the use of large language models (LLMs) not only for intrusion detection but also for providing interpretable explanations of their decisions. We fine-tuned three LLMs, i.e., SecureBERT, LLaMA-2, and LLaMA-3, for intrusion detection on CAN bus data. Among them, LLaMA-3 delivered the best results, achieving SOTA performance on the Car-Hacking dataset. Beyond attack classification, we evaluated LLaMA-3’s ability to generate reasoning for its decisions through zero-shot prompting. The model successfully articulated its rationale, particularly for Denial-of-Service (DoS) attacks, demonstrating strong potential for explainability in intrusion detection systems. These findings highlight the potential of LLMs to serve as a highly accurate intrusion detection system while simultaneously providing interpretable explanations, thereby enhancing the investigative capabilities of cybersecurity professionals.
-
We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss. Specifically, we ask the following question: are minimizers that rely on different mechanisms for making their predictions connected via simple paths of low loss? We provide a definition of mechanistic similarity as shared invariances to input transformations and demonstrate that lack of linear connectivity between two models implies they use dissimilar mechanisms for making their predictions. Relevant to practice, this result helps us demonstrate that naive fine-tuning on a downstream dataset can fail to alter a model’s mechanisms, e.g., fine-tuning can fail to eliminate a model’s reliance on spurious attributes. Our analysis also motivates a method for targeted alteration of a model’s mechanisms, named connectivity-based fine-tuning (CBFT), which we analyze using several synthetic datasets for the task of reducing a model’s reliance on spurious attributes.
-
Large Language Models (LLMs) have achieved remarkable success in natural language tasks, yet understanding their reasoning processes remains a significant challenge. We address this by introducing XplainLLM, a dataset accompanying an explanation framework designed to enhance LLM transparency and reliability. Our dataset comprises 24,204 instances where each instance interprets the LLM’s reasoning behavior using knowledge graphs (KGs) and graph attention networks (GAT), and includes explanations of LLMs such as the decoder-only Llama-3 and the encoder-only RoBERTa. XplainLLM also features a framework for generating grounded explanations and the debugger-scores for multidimensional quality analysis. Our explanations include why-choose and why-not-choose components, reason-elements, and debugger-scores that collectively illuminate the LLM’s reasoning behavior. Our evaluations demonstrate XplainLLM’s potential to reduce hallucinations and improve grounded explanation generation in LLMs. XplainLLM is a resource for researchers and practitioners to build trust and verify the reliability of LLM outputs. Our code and dataset are publicly available.

