
Creators/Authors contains: "Singh, R."


  1. Debiased machine learning is a meta-algorithm based on bias correction and sample splitting to calculate confidence intervals for functionals, i.e., scalar summaries, of machine learning algorithms. For example, an analyst may seek the confidence interval for a treatment effect estimated with a neural network. We present a non-asymptotic debiased machine learning theorem that encompasses any global or local functional of any machine learning algorithm that satisfies a few simple, interpretable conditions. Formally, we prove consistency, Gaussian approximation and semiparametric efficiency by finite-sample arguments. The rate of convergence is $n^{-1/2}$ for global functionals, and it degrades gracefully for local functionals. Our results culminate in a simple set of conditions that an analyst can use to translate modern learning theory rates into traditional statistical inference. The conditions reveal a general double robustness property for ill-posed inverse problems. 
  2. Social media has become an important method for information sharing. This has also created opportunities for bad actors to easily spread disinformation and manipulate public opinion. This paper explores the possibility of applying Authorship Verification to online communities to mitigate abuse, analyzing the writing style of online accounts to identify accounts managed by the same person. We expand on our similarity-based authorship verification approach, previously applied to long fanfiction texts, and show that it works in open-world settings and on shorter documents, and that it is largely topic-agnostic. Our expanded model can link Reddit accounts based on the writing style of only 40 comments with an AUC of 0.95, and the performance increases to 0.98 given more content. We apply this model to a set of suspicious Reddit accounts associated with the disinformation campaign surrounding the 2016 U.S. presidential election and show that the writing style of these accounts is inconsistent, indicating that each account was likely maintained by multiple individuals. We also apply this model to Reddit user accounts that commented on the WallStreetBets subreddit around the 2021 GameStop short squeeze and show that a number of account pairs share very similar writing styles. Finally, we show that this approach can link accounts across Reddit and Twitter with an AUC of 0.91 even when training data is very limited.
  3. Opioid addiction constitutes a significant and multifaceted contemporary health crisis. Modeling the epidemiology of any addiction is challenging in its own right. For opioid addiction, the challenge is exacerbated by the difficulty of collecting real-time data and by the limited information opioid users may disclose, owing to the stigma associated with prescription misuse. Given this context, identifying the progression of individuals through the stages of (opioid) addiction is one of the more acute problems in epidemiological modeling, and its solution is crucial for designing specific interventions at both personal and population levels. We describe a computational approach for determining and characterizing the addiction stages of opioid users from their social media posts. The proposed approach combines recurrent neural network learning with information-theoretic analysis of word associations and context-based word embedding to determine addiction stage-specific language usage. Users who have a high likelihood of relapsing into drug use are identified and characterized using propensity score matching and logistic regression. Experimental evaluations indicate that the proposed approach can distinguish between various addiction stages and identify users prone to relapse with high accuracy, as evidenced by F1 scores of 0.88 and 0.79, respectively.
  4. null (Ed.)
  5. Ma, Jian (Ed.)
  6. Olanoff, D. ; Johnson, K. ; & Spitzer, S. (Ed.)
    In this study, we explore the relationships between the types of student exclamations in an enacted lesson (e.g., “Wow!”) and the varying dramatic tensions created by the unfolding content. By analyzing student exclamations in six specially designed high school mathematics lessons, we explore how the dynamic tension between the mathematical ideas revealed at a given moment and what is yet to be known connects with the student's aesthetic pull to react. As students work through novel problems with limited information, their joys and frustrations are expressed in the form of exclamations.
  7. Olanoff, D. ; Johnson, K. ; & Spitzer, S. (Ed.)
    How does the design of lessons impact the types of questions teachers and students ask during enacted high school mathematics lessons? In this study, we present data suggesting that lessons designed with the mathematical story framework to elicit a specific aesthetic response (“MCLEs”) have a positive influence on the types of questions teachers and students ask during the lesson. Our findings suggest that when teachers plan and enact lessons with the mathematical story framework, teachers and students are more likely to ask questions that explore mathematical relationships and focus on meaning making. In addition, teachers are less likely to ask short recall or procedural questions in MCLEs. These findings point to the role of lesson design in the quality of questions asked by teachers and students.
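
Illustrative note on record 1 (debiased machine learning): the abstract describes a meta-algorithm built on bias correction and sample splitting that yields confidence intervals for scalar summaries of a learned model. The sketch below shows one well-known special case, a partially linear model estimated by cross-fitted residual-on-residual regression. It is not taken from the listed paper; the function names (`fit_predict_ols`, `dml_partially_linear`, `simulate`), the simulated data, and the use of ordinary least squares as a stand-in for a machine learning algorithm are all assumptions made for illustration.

```python
import numpy as np


def fit_predict_ols(X_train, y_train, X_test):
    """Stand-in learner (assumption): least squares with an intercept.

    Any machine learning regressor could be swapped in here.
    """
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def dml_partially_linear(y, d, X, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in y = theta * d + g(X) + noise.

    Returns the point estimate and an approximate 95% confidence interval.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        # Sample splitting: nuisance functions are fit on the other folds only.
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        y_hat = fit_predict_ols(X[train_mask], y[train_mask], X[test_idx])
        d_hat = fit_predict_ols(X[train_mask], d[train_mask], X[test_idx])
        y_res[test_idx] = y[test_idx] - y_hat
        d_res[test_idx] = d[test_idx] - d_hat

    # Bias correction: regress outcome residuals on treatment residuals.
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)

    # Plug-in standard error from the estimated influence function.
    jacobian = np.mean(d_res ** 2)
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / jacobian ** 2 / n)
    return theta, (theta - 1.96 * se, theta + 1.96 * se)


def simulate(n, theta, seed):
    """Hypothetical data-generating process used only for the demo."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    d = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)
    y = theta * d + X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)
    return y, d, X


if __name__ == "__main__":
    y, d, X = simulate(n=2000, theta=2.0, seed=1)
    theta_hat, (lower, upper) = dml_partially_linear(y, d, X)
    print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval here relies on the usual large-sample Gaussian approximation with a fixed 1.96 critical value. The finite-sample guarantees and the treatment of local functionals that the abstract mentions are beyond what this sketch demonstrates.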