Creators/Authors contains: "Lerman, Kristina"

  1. Abstract

    Preferential attachment, homophily, and their consequences such as scale-free (i.e. power-law) degree distributions, the glass ceiling effect (the unseen, yet unbreakable barrier that keeps minorities and women from rising to the upper rungs of the corporate ladder, regardless of their qualifications or achievements) and perception bias are well-studied in undirected networks. However, such consequences and the factors that lead to their emergence in directed networks (e.g. author–citation graphs, Twitter) are yet to be coherently explained in an intuitive, theoretically tractable manner using a single dynamical model. To this end, we present a theoretical and numerical analysis of the novel Directed Mixed Preferential Attachment (DMPA) model to explain the emergence of scale-free degree distributions and the glass ceiling effect in directed networks with two groups (minority and majority). Specifically, we first derive closed-form expressions for the power-law exponents of the in-degree and out-degree distributions of each of the two groups and then compare the derived exponents with each other to obtain useful insights. These insights answer questions such as: When does the minority group have an out-degree (or in-degree) distribution with a heavier tail than the majority group? What factors cause the tail of a group's out-degree distribution to be heavier than the tail of its own in-degree distribution? What effect does frequent addition of edges between existing nodes have on the in-degree and out-degree distributions of the majority and minority groups? Answers to these questions shed light on the interplay between structure (i.e. the in-degree and out-degree distributions of the two groups) and dynamics (characterized collectively by the homophily, preferential attachment, group sizes and growth dynamics) of various real-world directed networks.
We also provide a novel definition of the glass ceiling faced by a group via the number of individuals with large in-degree (i.e. those with many followers) normalized by the number of individuals with large out-degree (i.e. those who follow many others) and then use it to characterize the conditions that cause the glass ceiling effect to emerge in a directed network. Our analytical results are supported by detailed numerical experiments. The DMPA model and its theoretical and numerical analysis provided in this article are useful for analysing various phenomena on directed networks in fields such as network science and computational social science.

  2. Social networks are very important carriers of information. For instance, the political leaning of our friends can serve as a proxy to identify our own political preferences. This explanatory power is leveraged in many scenarios ranging from business decision-making to scientific research to infer missing attributes using machine learning. However, factors affecting the performance and the direction of bias of these algorithms are not well understood. To this end, we systematically study how structural properties of the network and the training sample influence the results of collective classification. Our main findings show that (i) mean classification performance can empirically and analytically be predicted by structural properties such as homophily, class balance, edge density and sample size, (ii) small training samples are enough for heterophilic networks to achieve high and unbiased classification performance, even with imperfect model estimates, (iii) homophilic networks are more prone to bias issues and low performance when group size differences increase, (iv) when sampling budgets are small, partial crawls achieve the most accurate model estimates, and degree sampling achieves the highest overall performance. Our findings help practitioners to better understand and evaluate their results when sampling budgets are small or when no ground-truth is available.
  3. Abstract

    Diachronic word embeddings—vector representations of words over time—offer remarkable insights into the evolution of language and provide a tool for quantifying sociocultural change from text documents. Prior work has used such embeddings to identify shifts in the meaning of individual words. However, simply knowing that a word has changed in meaning is insufficient to identify the instances of word usage that convey the historical meaning or the newer meaning. In this study, we link diachronic word embeddings to documents, by situating those documents as leaders or laggards with respect to ongoing semantic changes. Specifically, we propose a novel method to quantify the degree of semantic progressiveness in each word usage, and then show how these usages can be aggregated to obtain scores for each document. We analyze two large collections of documents, representing legal opinions and scientific articles. Documents that are scored as semantically progressive receive a larger number of citations, indicating that they are especially influential. Our work thus provides a new technique for identifying lexical semantic leaders and demonstrates a new link between progressive use of language and influence in a citation network.

  4. We consider SIS contagion processes over networks, where a classical assumption is that individuals' decisions to adopt a contagion are based on their immediate neighbors. However, recent literature shows that some attributes are more correlated between two-hop neighbors, a concept referred to as monophily. This motivates us to explore monophilic contagion, the case where a contagion (e.g. a product, disease) is adopted by considering two-hop neighbors instead of immediate neighbors (e.g. you ask your friend about the new iPhone and she passes along the opinion of one of her friends). We show that the phenomenon called the friendship paradox makes it easier for the monophilic contagion to spread widely. We also consider the case where the underlying network stochastically evolves in response to the state of the contagion (e.g. depending on the severity of a flu virus, people restrict their interactions with others to avoid getting infected) and show that the dynamics of such a process can be approximated by a differential equation whose trajectory satisfies an algebraic constraint restricting it to a manifold. Our results shed light on how graph-theoretic consequences affect contagions and provide simple deterministic models to approximate the collective dynamics of contagions over stochastic graph processes.
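
The friendship paradox mentioned in the last abstract can be illustrated numerically. The following is a minimal sketch of the standard form of the paradox (a node reached by following a random edge has, on average, at least as many neighbors as a uniformly random node); it is not the monophilic contagion model from the paper. The function names and the example graphs are assumptions chosen for illustration, and the graph is taken to be undirected with no isolated nodes.

```python
from collections import defaultdict


def degrees(edges):
    """Return a dict mapping each node to its degree in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def mean_degree(edges):
    """Average degree of a uniformly random node (isolated nodes are not represented)."""
    deg = degrees(edges)
    return sum(deg.values()) / len(deg)


def mean_neighbor_degree(edges):
    """Average degree of a random endpoint of a uniformly random edge.

    This equals E[d^2] / E[d], which is never smaller than E[d].
    """
    deg = degrees(edges)
    total = sum(deg[u] + deg[v] for u, v in edges)
    return total / (2 * len(edges))


# Hypothetical example graphs: a star (one hub, five leaves) and a triangle.
star = [(0, leaf) for leaf in range(1, 6)]
triangle = [(0, 1), (1, 2), (2, 0)]

if __name__ == "__main__":
    for name, graph in [("star", star), ("triangle", triangle)]:
        print(name, mean_degree(graph), mean_neighbor_degree(graph))
```

In the star graph the gap is large because every edge leads to the hub, while in the triangle (a regular graph) the two averages coincide. The size of this gap is what makes sampling via neighbors differ from sampling nodes uniformly.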