Title: Neural User Factor Adaptation for Text Classification: Learning to Generalize Across Author Demographics
Language use varies across different demographic factors, such as gender, age, and geographic location. However, most existing document classification methods ignore demographic variability. In this study, we examine empirically how text data can vary across four demographic factors: gender, age, country, and region. We propose a multitask neural model to account for demographic variations via adversarial training. In experiments on four English-language social media datasets, we find that classification performance improves when adapting for user factors.
Journal Name: Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
Page Range or eLocation-ID: 136 to 146
Sponsoring Org: National Science Foundation
More Like this
  1. This paper has two aims. One aim is to consider non-structural (language attitude and use) variables as valid in the field of dialect and linguistic geography in an inner Himalayan valley of Nepal, where four languages have traditionally co-existed asymmetrically and which demonstrate different degrees of vitality vs. endangerment. The other aim is an application of modified spatiality as it aligns with speaker attitudes and practices amidst recent and ongoing socio-economic and population changes. We demonstrate that variation in self-reported attitudes and practices across languages in this region can be explained as much with adjusted spatial factors (labeled ‘social space’) as with traditional social factors (e.g. gender, age, formal education, occupation, etc.). As such, our study contributes to a discourse on the role and potential of spatiality in sociolinguistic analyses of smaller language communities.
  2. Demographic information such as gender, age, ethnicity, level of education, disabilities, employment, and socio-economic status is important in social science, surveys, and marketing. However, it is difficult to obtain demographic information from users because of their reluctance to participate and low response rates. Through automated demographics prediction from smartphone sensor data, researchers can obtain this valuable information in a nonintrusive and cost-effective manner. We approach the problem of demographic prediction, namely, classification of gender, age group, and job type, through the use of a graphical-feature-based framework. The framework represents information collected from sensor networks as graphs, extracts useful and relevant graphical features, and predicts demographic information. We evaluated our approach on the Nokia Mobile Phone dataset for the three classification tasks: gender, age group, and job type. Our approach produced results comparable with most state-of-the-art methods while having the additional advantage of general applicability to sensor networks, without using sophisticated and application-specific feature generation techniques, background knowledge, or special techniques to address class imbalance.
  3. Introduction and Theoretical Frameworks Our study draws upon several theoretical foundations to investigate and explain the educational experiences of Black students majoring in ME, CpE, and EE: intersectionality, critical race theory, and community cultural wealth theory. Intersectionality explains how gender operates together with race, not independently, to produce multiple, overlapping forms of discrimination and social inequality (Crenshaw, 1989; Collins, 2013). Critical race theory recognizes the unique experiences of marginalized groups and strives to identify the micro- and macro-institutional sources of discrimination and prejudice (Delgado & Stefancic, 2001). Community cultural wealth integrates an asset-based perspective to our analysis of engineering education to assist in the identification of factors that contribute to the success of engineering students (Yosso, 2005). These three theoretical frameworks are buttressed by our use of Racial Identity Theory, which expands understanding about the significance and meaning associated with students’ sense of group membership. Sellers and colleagues (1997) introduced the Multidimensional Model of Racial Identity (MMRI), in which they indicated that racial identity refers to the “significance and meaning that African Americans place on race in defining themselves” (p. 19). The development of this model was based on the reality that individuals vary greatly in the extent to which they attach meaning to being a member of the Black racial group. Sellers et al. (1997) posited that there are four components of racial identity: 1. Racial salience: “the extent to which one’s race is a relevant part of one’s self-concept at a particular moment or in a particular situation” (p. 24). 2. Racial centrality: “the extent to which a person normatively defines himself or herself with regard to race” (p. 25). 3. 
Racial regard: “a person’s affective or evaluative judgment of his or her race in terms of positive-negative valence” (p. 26). This element consists of public regard and private regard. 4. Racial ideology: “composed of the individual’s beliefs, opinions and attitudes with respect to the way he or she feels that the members of the race should act” (p. 27). The resulting 56-item inventory, the Multidimensional Inventory of Black Identity (MIBI), provides a robust measure of Black identity that can be used across multiple contexts. Research Questions Our 3-year, mixed-method study of Black students in computer (CpE), electrical (EE) and mechanical engineering (ME) aims to identify institutional policies and practices that contribute to the retention and attrition of Black students in electrical, computer, and mechanical engineering. Our four study institutions include historically Black institutions as well as predominantly white institutions, all of which are in the top 15 nationally in the number of Black engineering graduates. We are using a transformative mixed-methods design to answer the following overarching research questions: 1. Why do Black men and women choose and persist in, or leave, EE, CpE, and ME? 2. What are the academic trajectories of Black men and women in EE, CpE, and ME? 3. In what way do these pathways vary by gender or institution? 4. What institutional policies and practices promote greater retention of Black engineering students? Methods This study of Black students in CpE, EE, and ME reports initial results from in-depth interviews at one HBCU and one PWI. We asked students about a variety of topics, including their sense of belonging on campus and in the major, experiences with discrimination, the impact of race on their experiences, and experiences with microaggressions. 
For this paper, we draw on two methodological approaches that allowed us to move beyond a traditional, linear approach to in-depth interviews, allowing for more diverse experiences and narratives to emerge. First, we used an identity circle to gain a better understanding of the relative importance to the participants of racial identity, as compared to other identities. The identity circle is a series of three concentric circles, surrounding an “inner core” representing one’s “core self.” Participants were asked to place various identities from a provided list that included demographic, family-related, and school-related identities on the identity circle to reflect the relative importance of the different identities to participants’ current engineering education experiences. Second, participants were asked to complete an 8-item survey which measured the “centrality” of racial identity as defined by Sellers et al. (1997). Following Enders’ (2018) reflection on the MMRI and Nigrescence Theory, we chose to use the measure of racial centrality as it is generally more stable across situations and best “describes the place race holds in the hierarchy of identities an individual possesses and answers the question ‘How important is race to me in my life?’” (p. 518). Participants completed the MIBI items at the end of the interview to allow us to learn more about the participants’ identification with their racial group, to avoid biasing their responses to the Identity Circle, and to avoid potentially creating a stereotype threat at the beginning of the interview. This paper focuses on the results of the MIBI survey and the identity circles to investigate whether these measures were correlated. Recognizing that Blackness (race) is not monolithic, we were interested in knowing the extent to which the participants considered their Black identity as central to their engineering education experiences. 
Combined with discussion about the identity circles, this approach allowed us to learn more about how other elements of identity may shape the participants’ educational experiences and outcomes and revealed possible differences in how participants may enact various points of their identity. Findings For this paper, we focus on the results for five HBCU students and 27 PWI students who completed the MIBI and identity circle. The overall MIBI average for HBCU students was 43 (out of a possible 56) and the overall MIBI scores ranged from 36-51; the overall MIBI average for the PWI students was 40; the overall MIBI scores for the PWI students ranged from 24-51. Twenty-one students placed race in the inner circle, indicating that race was central to their identity. Five placed race on the second, middle circle; three placed race on the third, outer circle. Three students did not place race on their identity circle. For our cross-case qualitative analysis, we will choose cases across the two institutions that represent low, medium and high MIBI scores and different ranges of centrality of race to identity, as expressed in the identity circles. Our final analysis will include descriptive quotes from these in-depth interviews to further elucidate the significance of race to the participants’ identities and engineering education experiences. The results will provide context for our larger study of a total of 60 Black students in engineering at our four study institutions. Theoretically, our study represents a new application of Racial Identity Theory and will provide a unique opportunity to apply the theories of intersectionality, critical race theory, and community cultural wealth theory. Methodologically, our findings provide insights into the utility of combining our two qualitative research tools, the MIBI centrality scale and the identity circle, to better understand the influence of race on the education experiences of Black students in engineering.
  4. The explosive growth in citizen science combined with a recalcitrance on the part of mainstream science to fully embrace this data collection technique demands a rigorous examination of the factors influencing data quality and project efficacy. Patterns of contributor effort and task performance have been well reviewed in online projects; however, studies of hands-on citizen science are lacking. We used a single hands-on, out-of-doors project—the Coastal Observation and Seabird Survey Team (COASST)—to quantitatively explore the relationships among participant effort, task performance, and social connectedness as a function of the demographic characteristics and interests of participants, placing these results in the context of a meta-analysis of 54 citizen science projects. Although online projects were typified by high (>90%) rates of one-off participation and low retention (<10%) past 1 y, regular COASST participants were highly likely to continue past their first survey (86%), with 54% active 1 y later. Project-wide, task performance was high (88% correct species identifications over the 31,450 carcasses and 163 species found). However, there were distinct demographic differences. Age, birding expertise, and previous citizen science experience had the greatest impact on participant persistence and performance, albeit occasionally in opposite directions. Gender and sociality were relatively inconsequential, although highly gregarious social types, i.e., “nexus people,” were extremely influential at recruiting others. Our findings suggest that hands-on citizen science can produce high-quality data especially if participants persist, and that understanding the demographic data of participation could be used to maximize data quality and breadth of participation across the larger societal landscape.
  5. Understanding factors contributing to variation in ‘biological age’ is essential to understanding variation in susceptibility to disease and functional decline. One factor that could accelerate biological aging in women is reproduction. Pregnancy is characterized by extensive, energetically-costly changes across numerous physiological systems. These ‘costs of reproduction’ may accumulate with each pregnancy, accelerating biological aging. Despite evidence for costs of reproduction using molecular and demographic measures, it is unknown whether parity is linked to commonly-used clinical measures of biological aging. We use data collected between 1999 and 2010 from the National Health and Nutrition Examination Survey (n = 4418) to test whether parity (number of live births) predicted four previously-validated composite measures of biological age and system integrity: Levine Method, homeostatic dysregulation, Klemera–Doubal method biological age, and allostatic load. Parity exhibited a U-shaped relationship with accelerated biological aging when controlling for chronological age, lifestyle, health-related, and demographic factors in post-menopausal, but not pre-menopausal, women, with biological age acceleration being lowest among post-menopausal women reporting between three and four live births. Our findings suggest a link between reproductive function and physiological dysregulation, and allude to possible compensatory mechanisms that buffer the effects of reproductive function on physiological dysregulation during a woman’s reproductive lifespan. Future work should continue to investigate links between parity, menopausal status, and biological age using targeted physiological measures and longitudinal studies.