Title: Conducting User Experiments in Recommender Systems
This tutorial provides practical training in designing and conducting online user experiments with recommender systems, and in statistically analyzing the results of such experiments. It covers the development of a research question and hypotheses, the selection of study participants, the manipulation of system aspects and measurement of behaviors, perceptions and user experiences, and the evaluation of subjective measurement scales and study hypotheses. Interested parties can find the slides, example dataset, and other resources at https://www.usabart.nl/QRMS/.
Award ID(s):
2232552
PAR ID:
10552820
Author(s) / Creator(s):
;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400705052
Page Range / eLocation ID:
1272 to 1273
Format(s):
Medium: X
Location:
Bari, Italy
Sponsoring Org:
National Science Foundation
More Like this
  1. Mura, Cameron (Ed.)
    Machine learning (ML) is increasingly being used to guide biological discovery in biomedicine, for example by prioritizing promising small molecules in drug discovery. In those applications, ML models are used to predict the properties of biological systems, and researchers use these predictions to prioritize candidates as new biological hypotheses for downstream experimental validations. However, when applied to unseen situations, these models can be overconfident and produce a large number of false positives. One solution to address this issue is to quantify the model’s prediction uncertainty and provide a set of hypotheses with a controlled false discovery rate (FDR) pre-specified by researchers. We propose CPEC, an ML framework for FDR-controlled biological discovery. We demonstrate its effectiveness using enzyme function annotation as a case study, simulating the discovery process of identifying the functions of less-characterized enzymes. CPEC integrates a deep learning model with a statistical tool known as conformal prediction, providing accurate and FDR-controlled function predictions for a given protein enzyme. Conformal prediction provides rigorous statistical guarantees to the predictive model and ensures that the expected FDR will not exceed a user-specified level with high probability. Evaluation experiments show that CPEC achieves reliable FDR control, better or comparable prediction performance at a lower FDR than existing methods, and accurate predictions for enzymes under-represented in the training data. We expect CPEC to be a useful tool for biological discovery applications where a high yield rate in validation experiments is desired but the experimental budget is limited.
  2. Molecular interaction networks are a vital tool for studying biological systems. While many tools exist that visualize a protein or a pathway within a network, no tool provides the ability for a researcher to consider a protein's position in a network in the context of a specific biological process or pathway. We developed ProteinWeaver, a web-based tool designed to visualize and analyze non-human protein interaction networks by integrating known biological functions. ProteinWeaver provides users with an intuitive interface to situate a user-specified protein in a user-provided biological context (as a Gene Ontology term) in five model organisms. ProteinWeaver also reports the presence of physical and regulatory network motifs within the queried subnetwork and statistics about the protein's distance to the biological process or pathway within the network. These insights can help researchers generate testable hypotheses about the protein's potential role in the process or pathway under study. Two cell biology case studies demonstrate ProteinWeaver's potential to generate hypotheses from the queried subnetworks. ProteinWeaver is available at https://proteinweaver.reedcompbio.org/. 
  3. Adopting new technology is challenging for volunteer moderation teams of online communities. Challenges are aggravated when communities increase in size. In a prior qualitative study, Kiene et al. found evidence that moderator teams adapted to challenges by relying on their experience in other technological platforms to guide the creation and adoption of innovative custom moderation "bots." In this study, we test three hypotheses on the social correlates of user-innovated bot usage drawn from that qualitative study. We find strong evidence of the proposed relationship between community size and the use of user-innovated bots. Although previous work suggests that smaller teams of moderators will be more likely to use these bots and that users with experience moderating in the previous platform will be more likely to do so, we find little evidence in support of either proposition.
  4. Product/service systems (PSSs) are increasingly found in markets, and more resources are being invested in PSS design. Despite the substantial research into PSS design, the current literature exhibits an incomplete understanding of it as a cognitive activity. This article demonstrates that the methods used to analyze product designers’ cognitive behavior can be used to produce comparable and commensurable results when analyzing PSS designers. It also generates empirical grounding for the development of hypotheses based on a cognitive study of a PSS design session in a laboratory environment using protocol analysis. This study is a part of a larger project comparing PSS design with product design. The results, which are based on the function–behavior–structure coding scheme, show that PSS design, when coded using this scheme, can be quantitatively compared with product design. Five hypotheses were developed based on the results of the study of this design session concerning where and how designers expend their cognitive design effort. These hypotheses can be used to design experiments that test them and provide the grounding for a fuller understanding of PSS design.
  5. Digital experiments are routinely used to test the value of a treatment relative to a status quo control setting, for instance a new search relevance algorithm for a website or a new results layout for a mobile app. As digital experiments have become increasingly pervasive in organizations and a wide variety of research areas, their growth has prompted a new set of challenges for experimentation platforms. One challenge is that experiments often focus on the average treatment effect (ATE) without explicitly considering differences across major sub-groups, that is, heterogeneous treatment effects (HTEs). This is especially problematic because ATEs have decreased in many organizations as the more obvious benefits have already been realized. However, questions abound regarding the pervasiveness of user HTEs and how best to detect them. We propose a framework for detecting and analyzing user HTEs in digital experiments. Our framework combines an array of user characteristics with double machine learning. Analysis of 27 real-world experiments spanning 1.76 billion sessions and simulated data demonstrates the effectiveness of our detection method relative to existing techniques. We also find that transaction, demographic, engagement, satisfaction, and lifecycle characteristics exhibit statistically significant HTEs in 10% to 20% of our real-world experiments. This underscores the importance of considering user heterogeneity when analyzing experiment results; otherwise, personalized features and experiences cannot be delivered, which reduces effectiveness. In terms of the number of experiments and user sessions, we are not aware of any study that has examined user HTEs at this scale. Our findings have important implications for information retrieval, user modeling, platforms, and digital experience contexts, in which online experiments are often used to evaluate the effectiveness of design artifacts.