-
Topic modeling includes a variety of machine learning techniques for identifying latent themes in a corpus of documents. Generating an exact solution (i.e., finding the global optimum) is often computationally intractable. Various optimization techniques (e.g., Variational Bayes or Gibbs Sampling) are employed to generate approximate topic solutions by finding local optima. Such an approximation often begins with a random initialization, so different initializations lead to different results. The term “stability” refers to a topic model’s ability to produce solutions that are partially or completely identical across multiple runs with different random initializations. Although a variety of work has analyzed, measured, or improved stability, no single paper has provided a thorough review of the different stability metrics or of the various techniques that improve the stability of a topic model. This paper fills that gap and provides a systematic review of different approaches to measuring stability and of various techniques intended to improve it. It also describes differences and similarities between stability measures and other metrics (e.g., generality, coherence). Finally, the paper discusses the importance of analyzing both stability and quality metrics to assess and compare topic models.
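To make the notion of stability concrete, here is a minimal sketch of one possible way to score agreement between two runs: each topic is represented by its set of top words, and every topic in the first run is matched to its most similar topic in the second run by Jaccard overlap. This is an illustrative assumption, not a metric taken from the paper; the function names and the toy word lists are hypothetical, and the matching is greedy rather than one-to-one.

```python
from typing import Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two word sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def run_stability(run_a: Sequence[Sequence[str]],
                  run_b: Sequence[Sequence[str]]) -> float:
    """Average, over topics in run_a, of the best Jaccard match found in run_b.

    Assumption: a topic is summarized by its list of top words. The matching
    is greedy, so two topics in run_a may pick the same topic in run_b.
    """
    if not run_a or not run_b:
        raise ValueError("each run needs at least one topic")
    sets_b = [set(topic) for topic in run_b]
    best = [max(jaccard(set(topic), s) for s in sets_b) for topic in run_a]
    return sum(best) / len(best)


# Hypothetical top-word lists from two runs with different random seeds.
run_1 = [["gene", "dna", "cell"], ["court", "law", "judge"]]
run_2 = [["law", "court", "trial"], ["dna", "gene", "cell"]]

score = run_stability(run_1, run_2)
print(score)  # 0.75: one topic matches exactly, the other shares 2 of 4 words
```

A score of 1.0 means every topic in the first run reappears with the same top words in the second; lower values indicate that the random initialization changed the solution. Because the measure is not symmetric, a practical variant might average `run_stability(a, b)` and `run_stability(b, a)`.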
-
The emergent, dynamic nature of privacy concerns in a shifting sociotechnical landscape creates a constant need for privacy-related resources and education. One response to this need is community-based privacy groups. We studied privacy groups that host meetings in diverse urban communities and interviewed the meeting organizers to see how they grapple with potentially varied and changeable privacy concerns. Our analysis identified three features of how privacy groups are organized to serve diverse constituencies: situating (finding the right venue for meetings), structuring (finding the right format/content for the meeting), and providing support (offering varied dimensions of assistance). We use these findings to inform a discussion of "privacy pluralism" as a perennial challenge for the HCI privacy research community, and we use the practices of privacy groups as an anchor for reflection on research practices.
-
As AI-mediated communication (AI-MC) becomes more prevalent in everyday interactions, it becomes increasingly important to develop a rigorous understanding of its effects on interpersonal relationships and on society at large. Controlled experimental studies offer a key means of developing such an understanding, but various complexities make it difficult for experimental AI-MC research to simultaneously achieve the criteria of experimental realism, experimental control, and scalability. After outlining these methodological challenges, this paper offers the concept of methodological middle spaces as a means to address these challenges. This concept suggests that the key to simultaneously achieving all three of these criteria is to abandon the perfect attainment of any single criterion. This concept's utility is demonstrated via its use to guide the design of a platform for conducting text-based AI-MC experiments. Through a series of three example studies, the paper illustrates how the concept of methodological middle spaces can inform the design of specific experimental methods. Doing so enabled these studies to examine research questions that would have been either difficult or impossible to investigate using existing approaches. The paper concludes by describing how future research could similarly apply the concept of methodological middle spaces to expand methodological possibilities for AI-MC research in ways that enable contributions not currently possible.
-
The rise of automated text processing systems has led to the development of tools designed for a wide variety of application domains. These technologies are often developed to support non-technical users such as domain experts, yet they are often developed in isolation from the tools' primary users. While such developments are exciting, less attention has been paid to domain experts’ expectations about the values embedded in these automated systems. As a step toward addressing that gap, we examined the values expectations of journalists and legal experts. Both of these domains involve extensive text processing and place high importance on values in professional practice. We engaged participants from two non-profit organizations in two separate co-speculation design workshops centered around several speculative automated text processing systems. This study makes three interrelated contributions. First, we provide a detailed investigation of domain experts’ values expectations around future NLP systems. Second, the speculative design fiction concepts, which we specifically crafted for these investigative journalists and legal experts, illuminated a series of tensions around the technical implementation details of automation. Third, our findings highlight the utility of design fiction in eliciting not-to-design implications, not only about automated NLP but also about technology more broadly. Overall, our study findings provide groundwork for including the values of domain experts, whose expertise lies outside the field of computing, in the design of automated NLP systems.
-
Miller, Jody (Ed.)
Until recently, national-level data on criminal victimization in the United States did not include information on immigrant or citizenship status of respondents. This data-infrastructure limitation has hindered scientific understanding of whether immigrants are more or less likely than native-born Americans to be criminally victimized and how victimization may vary among immigrants of different statuses. We address these issues in the present study by using new data from the 2017–2018 National Crime Victimization Survey (NCVS) to explore the association between citizenship status and victimization risk in a nationally representative sample of households and persons aged 12 years and older. The research is guided by a theoretical framing that integrates insights from studies of citizenship with the literature on immigration and crime, as well as with theories of victimization. We find that a person’s foreign-born status (but not their acquired U.S. citizenship) confers protection against victimization. We also find that the protective benefit associated with being foreign born does not extend to those with ambiguous citizenship status, who in our data exhibit attributes similar to the known characteristics of undocumented immigrants. We conclude by discussing the implications of our findings and the potential ways to extend the research.
-
Investigative data journalists work with a variety of data sources to tell a story. Prior work has indicated a close relationship between journalists' data work practices and those of data scientists. However, these relationships and data work practices have not been empirically examined, and understanding them is crucial to inform the design of tools that are used by different groups of people, including data scientists and data journalists. To bridge this gap, we studied investigative reporters' data work practices with one non-profit investigative newsroom. Our study design includes two activities: 1) semi-structured interviews with journalists, and 2) a sketching activity allowing journalists to depict examples of their work practices. By analyzing these data and synthesizing them across related prior work, we propose the major phases in the data-driven investigative journalism story idea generation process. Our study findings show that the journalists employ a collection of multiple, iterative, cyclic processes to identify journalistically "interesting" story ideas. These processes significantly resemble, yet differ subtly from, the data science work practices identified in prior research. We further verified our proposal through a member check with key informants. This work offers three primary contributions. First, it provides a close glimpse into the main phases of investigative journalists' data-driven story idea generation technique. Second, it complements prior work studying formal data science practices by examining data-driven investigative journalists, whose primary expertise lies outside computing. Third, it identifies particular points in the data exploration processes that would benefit from design interventions and suggests future research directions.
-
Eating disorders (EDs) constitute a mental illness with the highest mortality. Today, mobile health apps provide ED patients with a promising means of managing their condition. Apps enable users to monitor their eating habits, thoughts, and feelings, and offer analytic insights for behavior change. However, not only have scholars critiqued the clinical validity of these apps, but their underlying design principles are also not well understood. Through a review of 34 ED apps, we uncovered 11 different data types that ED apps collect and 9 strategies they employ to support collection and reflection. Drawing upon personal health informatics and visualization frameworks, we found that most apps did not adhere to best practices on what and how data should be collected from and reflected to users, or how data-driven insights should be communicated. Our review offers suggestions for improving the design of ED apps such that they can be useful and meaningful in ED recovery.
-
Design fiction has become so widely adopted that it regularly appears in contexts ranging from CEO speeches to dedicated tracks at academic conferences. However, evaluating this kind of work is difficult; it is not clear what good or bad design fiction is or what the judgment criteria should be. In this paper we assert that design fiction is a heterogeneous set of methods and practices able to produce a diversity of scholarly and design contributions. We argue that locating these diverse practices under the single header of "design fiction" has resulted in epistemological confusion over the appropriate method of evaluation. We identify different traditions within the HCI literature (critical design; narratology and literary theory; studio-based design "crits"; user studies; scenarios and persona development; and thought experiments) to articulate a typology of evaluative frames. There is often a mismatch between the standards to which design fiction is held and the knowledge that speculative methods seek to produce. We argue that evaluating a given instance of design fiction requires us to select the right epistemological tool for the job.
