Ensuring effective public understanding of algorithmic decisions that are powered by machine learning techniques has become an urgent task with the increasing deployment of AI systems into our society. In this work, we present a concrete step toward this goal by redesigning confusion matrices for binary classification to support non-experts in understanding the performance of machine learning models. Through interviews (n=7) and a survey (n=102), we mapped out two major sets of challenges lay people have in understanding standard confusion matrices: the general terminologies and the matrix design. We further identified three sub-challenges regarding the matrix design, namely, confusion about the direction of reading the data, layered relations, and the quantities involved. We then conducted an online experiment with 483 participants to evaluate how effectively a series of alternative representations targets each of those challenges in the context of an algorithm for making recidivism predictions. We developed three levels of questions to evaluate users’ objective understanding. We assessed the effectiveness of our alternatives for accuracy in answering those questions, completion time, and subjective understanding. Our results suggest that (1) only by contextualizing terminologies can we significantly improve users’ understanding and (2) flow charts, which help point out the direction of reading the data, were most useful in improving objective understanding. Our findings set the stage for developing more intuitive and generally understandable representations of the performance of machine learning models.
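As a rough illustration of what a confusion matrix for binary classification contains, and of what contextualized terminology might look like, the following Python sketch counts the four outcomes and relabels them for a recidivism setting. The function names, the sample data, and the plain-language labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count the four outcomes of a binary classifier (True = positive class)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = Counter(zip(actual, predicted))
    return {
        "tp": pairs[(True, True)],    # positive, predicted positive
        "fn": pairs[(True, False)],   # positive, predicted negative
        "fp": pairs[(False, True)],   # negative, predicted positive
        "tn": pairs[(False, False)],  # negative, predicted negative
    }


def contextualize(counts):
    """Relabel the generic terms with hypothetical plain-language wording."""
    return {
        "Reoffended, and predicted to reoffend": counts["tp"],
        "Reoffended, but predicted not to reoffend": counts["fn"],
        "Did not reoffend, but predicted to reoffend": counts["fp"],
        "Did not reoffend, and predicted not to reoffend": counts["tn"],
    }


# Made-up example data: True means "reoffended" / "predicted to reoffend".
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
labeled = contextualize(counts)
for label, value in labeled.items():
    print(f"{label}: {value}")
```

The relabeling step only changes the wording attached to each cell; the underlying counts are the same as in a standard confusion matrix.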
A Smart Mobile App to Simplify Medical Documents and Improve Health Literacy: System Design and Feasibility Validation
Background: People with low health literacy experience more challenges in understanding instructions given by their health providers, following prescriptions, and understanding their health care system sufficiently to obtain the maximum benefits. People with insufficient health literacy have high risk of making medical mistakes, more chances of experiencing adverse drug effects, and inferior control of chronic diseases.
Objective: This study aims to design, develop, and evaluate a mobile health app, MediReader, to help individuals better understand complex medical materials and improve their health literacy.
Methods: MediReader is designed and implemented through several steps, which are as follows: measure and understand an individual’s health literacy level; identify medical terminologies that the individual may not understand based on their health literacy; annotate and interpret the identified medical terminologies tailored to the individual’s reading skill levels, with meanings defined in the appropriate external knowledge sources; evaluate MediReader using task-based user study and satisfaction surveys.
Results: On the basis of the comparison with a control group, user study results demonstrate that MediReader can improve users’ understanding of medical documents. This improvement is particularly significant for users with low health literacy levels. The satisfaction survey showed that users are satisfied with the tool in general.
Conclusions: MediReader provides an easy-to-use interface for users to read and understand medical documents. It can effectively identify medical terms that a user may not understand, and then, annotate and interpret them with appropriate meanings using languages that the user can understand. Experimental results demonstrate the feasibility of using this tool to improve an individual’s understanding of medical materials.
- Award ID(s): 1722913
- PAR ID: 10320334
- Date Published:
- Journal Name: JMIR Formative Research
- Volume: 6
- Issue: 4
- ISSN: 2561-326X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Information Retrieval (IR) plays a pivotal role in diverse Software Engineering (SE) tasks, e.g., bug localization and triaging, bug report routing, code retrieval, requirements analysis, etc. SE tasks operate on diverse types of documents including code, text, stack-traces, and structured, semi-structured and unstructured meta-data that often contain specialized vocabularies. As the performance of any IR-based tool critically depends on the underlying document types, and given the diversity of SE corpora, it is essential to understand which models work best for which types of SE documents and tasks. We empirically investigate the interaction between IR models and document types for two representative SE tasks (bug localization and relevant project search), carefully chosen as they require a diverse set of SE artifacts (mixtures of code and text), and confirm that the models’ performance varies significantly with the mix of document types. Leveraging this insight, we propose a generalized framework, SRCH, to automatically select the most favorable IR model(s) for a given SE task. We evaluate SRCH w.r.t. these two tasks and confirm its effectiveness. Our preliminary user study shows that SRCH’s intelligent adaptation of the IR model(s) to the task at hand not only improves precision and recall for SE tasks but may also improve users’ satisfaction.
-
The growing amount of online information today has increased the opportunity to discover interesting and useful information. Various recommender systems have been designed to help people discover such information. No matter how accurately the recommender algorithms perform, users’ engagement with recommended results has been criticized as less than ideal. In this study, we touched on two human-centered objectives for recommender systems: user satisfaction and curiosity, both of which are believed to play roles in maintaining user engagement and sustaining such engagement in the long run. Specifically, we leveraged the concept of surprise and used an existing computational model of surprise to identify relevantly surprising health articles, aiming at improving user satisfaction and inspiring their curiosity. We designed a user study to first test the validity of the surprise model in a health news recommender system, called LuckyFind. Then user satisfaction and curiosity were evaluated. We find that the computational surprise model helped identify surprising recommendations at little cost to user satisfaction. Users gave higher ratings on interestingness than usefulness for those surprising recommendations. Curiosity was inspired more for those individuals who have a larger capacity to experience curiosity. Over half of the users changed their preferences after using LuckyFind, either discovering new areas, reinforcing their existing interests, or no longer following those they did not want anymore. The insights of the research will make researchers and practitioners rethink the objectives of today’s recommender systems as being more human-centered beyond algorithmic accuracy.
-
Background Supporting mental health and wellness is of increasing interest due to a growing recognition of the prevalence and burden of mental health issues. Mood is a central aspect of mental health, and several technologies, especially mobile apps, have helped people track and understand it. However, despite formative work on and dissemination of mood-tracking apps, it is not well understood how mood-tracking apps used in real-world contexts might benefit people and what people hope to gain from them. Objective To address this gap, the purpose of this study was to understand motivations for and experiences in using mood-tracking apps from people who used them in real-world contexts. Methods We interviewed 22 participants who had used mood-tracking apps using a semistructured interview and card sorting task. The interview focused on their experiences using a mood-tracking app. We then conducted a card sorting task using screenshots of various data entry and data review features from mood-tracking apps. We used thematic analysis to identify themes around why people use mood-tracking apps, what they found useful about them, and where people felt these apps fell short. Results Users of mood-tracking apps were primarily motivated by negative life events or shifts in their own mental health that prompted them to engage in tracking and improve their situation. In general, participants felt that using a mood-tracking app facilitated self-awareness and helped them to look back on a previous emotion or mood experience to understand what was happening. Interestingly, some users reported less inclination to document their negative mood states and preferred to document their positive moods. There was a range of preferences for personalization and simplicity of tracking. Overall, users also liked features in which their previous tracked emotions and moods were visualized in figures or calendar form to understand trends. 
One gap in available mood-tracking apps was the lack of app-facilitated recommendations or suggestions for how to interpret their own data or improve their mood. Conclusions Although people find various features of mood-tracking apps helpful, the way people use mood-tracking apps, such as avoiding entering negative moods, tracking infrequently, or wanting support to understand or change their moods, demonstrates opportunities for improvement. Understanding why and how people are using current technologies can provide insights to guide future designs and implementations.
-
Abstract It is widely recognized that the Web contributes to user polarization, and such polarization affects not just politics but also people’s stances about public health, such as vaccination. Understanding polarization in social networks is challenging because it depends not only on user attitudes but also their interactions and exposure to information. We adopt Social Judgment Theory to operationalize attitude shift and model user behavior based on empirical evidence from past studies. We design a social simulation to analyze how content sharing affects user satisfaction and polarization in a social network. We investigate the influence of varying tolerance in users and selectively exposing users to congenial views. We find that (1) higher user tolerance slows down polarization and leads to lower user satisfaction; (2) higher selective exposure leads to higher polarization and lower user reach; and (3) both higher tolerance and higher selective exposure lead to a more homophilic social network.

