Title: Tags, Borders, and Catalogs: Social Re-Working of Genre on LibraryThing
Through a computational reading of the online book reviewing community LibraryThing, we examine the dynamics of a collaborative tagging system and learn how its users refine and redefine literary genres. LibraryThing tags are overlapping and multi-dimensional, created in a shared space by thousands of users, including readers, bookstore owners, and librarians. A common understanding of genre is that it relates to the content of books, but this resource allows us to view genre as an intersection of user communities and reader values and interests. We explore different methods of computational genre measurement within the open space of user-created tags. We measure overlap between books, tags, and users, and we also measure the homogeneity of communities associated with genre tags and correlate this homogeneity with reviewing behavior. Finally, by analyzing the text of reviews, we identify the thematic signatures of genres on LibraryThing, revealing similarities and differences between them. These measurements are intended to elucidate the genre conceptions of the users, not, as in prior work, to normalize the tags or enforce a hierarchy. We find that LibraryThing users make sense of genre through a variety of values and expectations, many of which fall outside common definitions and understandings of genre.
Award ID(s):
1652536
PAR ID:
10328856
Journal Name:
Proceedings of the ACM on Human-Computer Interaction
Volume:
5
Issue:
CSCW1
ISSN:
2573-0142
Page Range / eLocation ID:
1 to 29
Sponsoring Org:
National Science Foundation
More Like this
  1. Making online social communities ‘better’ is a challenging undertaking, as online communities are extraordinarily varied in their size, topical focus, and governance. As such, what is valued by one community may not be valued by another. However, community values are challenging to measure as they are rarely explicitly stated. In this work, we measure community values through the first large-scale survey of community values, including 2,769 reddit users in 2,151 unique subreddits. Through a combination of survey responses and a quantitative analysis of publicly available reddit data, we characterize how these values vary within and across communities. Amongst other findings, we show that community members disagree about how safe their communities are, that longstanding communities place 30.1% more importance on trustworthiness than newer communities, and that community moderators want their communities to be 56.7% less democratic than non-moderator community members. These findings have important implications, including suggesting that care must be taken to protect vulnerable community members, and that participatory governance strategies may be difficult to implement. Accurate and scalable modeling of community values enables research and governance which is tuned to each community's different values. To this end, we demonstrate that a small number of automatically quantifiable features capture a significant yet limited amount of the variation in values between communities with a ROC AUC of 0.667 on a binary classification task. However, substantial variation remains, and modeling community values remains an important topic for future work. We make our models and data public to inform community design and governance.
  2. The dominant research strategy within the field of music perception and cognition has typically involved new data collection and primary analysis techniques. As a result, numerous information-rich yet underexplored datasets exist in publicly accessible online repositories. In this paper we contribute two secondary analysis methodologies to overcome two common challenges in working with previously collected data: lack of participant stimulus ratings and lack of physiological baseline recordings. Specifically, we focus on methodologies that unlock previously unexplored musical preference questions. Preferred music plays important roles in our personal, social, and emotional well-being, and is capable of inducing emotions that result in psychophysiological responses. Therefore, we select the Study Forrest dataset “auditory perception” extension as a case study, which provides physiological and self-report demographics data for participants (N = 20) listening to clips from different musical genres. In Method 1, we quantitatively model self-report genre preferences using the MUSIC five-factor model: a tool recognized for genre-free characterization of musical preferences. In Method 2, we calculate synthetic baselines for each participant, allowing us to compare physiological responses (pulse and respiration) across individuals. With these methods, we uncover average changes in breathing rate as high as 4.8%, which correlate with musical genres in this dataset (p < .001). High-level musical characteristics from the MUSIC model (mellowness and intensity) further reveal a linear breathing rate trend among genres (p < .001). Although no causation can be inferred given the nature of the analysis, the significant results obtained demonstrate the potential for previous datasets to be more productively harnessed for novel research. 
  3. Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of these patterns reflect important real-world phenomena driving interactions between the various users and items; other patterns may be irrelevant or reflect undesired discrimination, such as discrimination in publishing or purchasing against authors who are women or ethnic minorities. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to one dimension of social concern, namely content creator gender. Using publicly available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms tend to propagate at least some of each user’s tendency to rate or read male or female authors into their resulting recommendations, although they differ in both the strength of this propagation and the variance in the gender balance of the recommendation lists they produce. The data, experimental design, and statistical methods are designed to be reusable for studying potentially discriminatory social dimensions of recommendations in other domains and settings as well.
  4. Recent work has recognized the importance of developing and deploying software systems that reflect human values and has explored different approaches for eliciting these values from stakeholders. However, prior studies have also shown that it can be challenging for stakeholders to specify a diverse set of product-related human values. In this paper we therefore explore the use of ChatGPT for generating user stories that describe candidate human values. These generated stories provide inspiration for stakeholder discussions and enrich the human-created user stories. We engineer a series of ChatGPT prompts to retrieve a list of common stakeholders and candidate features for a targeted product, and then, for each pairwise combination of role and feature, and for each individual Schwartz value, we issue an additional prompt to generate a candidate user story reflecting that value. We present the candidate user stories to stakeholders and, as part of a creative requirements engineering session, we ask them to assess and prioritize the generated user stories, and then use them as inspiration for discussing and specifying their own product-related human values. Through a series of focus groups we compare the human values created by stakeholders with and without the benefit of the ChatGPT examples. Results are evaluated with respect to coverage of values, clarity of expression, internal completeness, and feedback from our participants. Our analysis shows that the ChatGPT-generated user stories provide creativity triggers that help stakeholders specify human values for a product.
  5. As schools and districts across the United States adopt computer science standards and curriculum for K-12 computer science education, they look to integrate the foundational concepts of computational thinking (CT) into existing core subjects of elementary-age students. Research has shown the effectiveness of teaching CT elements (abstraction, generalization, decomposition, algorithmic thinking, debugging) using non-programming, unplugged approaches. These approaches address common barriers teachers face with lack of knowledge, familiarity, or technology tools. Picture books and graphic novels present an unexplored non-programming, unplugged resource for teachers to integrate computational thinking into their CT or CT-integrated lessons. This analysis examines 27 picture books and graphic novels published between 2015 and 2020 targeted to K-6 students for representation of computational thinking elements. Using the computational thinking curriculum framework for K-6, we identify the grade-level competencies of the CT elements featured in the books compared to the books’ target age groups. We compare grade-level competencies to interest level to identify each CT element representation as “foundational,” “on-target,” or “advanced.” We conclude that literature offers teachers a non-programming unplugged resource to expose students to CT and enhance CT and CT-integrated lessons, while also personalizing learning based on CT readiness and interest level. 