skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: The Many Facets of Data Equity
Data-driven systems can be unfair, in many different ways. All too often, as data scientists, we focus narrowly on one technical aspect of fairness. In this paper, we attempt to address equity broadly, and identify the many different ways in which it is manifest in data-driven systems.  more » « less
Award ID(s):
1934464
PAR ID:
10287321
Author(s) / Creator(s):
; ;
Editor(s):
Costa, Constantinos; Pitoura, Evaggelia
Date Published:
Journal Name:
theWorkshop Proceedings of the EDBT/ICDT 2021 Joint Conference
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary Many real‐world scientific processes are governed by complex non‐linear dynamic systems that can be represented by differential equations. Recently, there has been an increased interest in learning, or discovering, the forms of the equations driving these complex non‐linear dynamic systems using data‐driven approaches. In this paper, we review the current literature on data‐driven discovery for dynamic systems. We provide a categorisation to the different approaches for data‐driven discovery and a unified mathematical framework to show the relationship between the approaches. Importantly, we discuss the role of statistics in the data‐driven discovery field, describe a possible approach by which the problem can be cast in a statistical framework and provide avenues for future work. 
    more » « less
  2. There are more than 7,000 public transit agencies in the U.S. (and many more private agencies), and together, they are responsible for serving 60 billion passenger miles each year. A well-functioning transit system fosters the growth and expansion of businesses, distributes social and economic benefits, and links the capabilities of community members, thereby enhancing what they can accomplish as a society. Since affordable public transit services are the backbones of many communities, this work investigates ways in which Artificial Intelligence (AI) can improve efficiency and increase utilization from the perspective of transit agencies. This book chapter discusses the primary requirements, objectives, and challenges related to the design of AI-driven smart transportation systems. We focus on three major topics. First, we discuss data sources and data. Second, we provide an overview of how AI can aid decision-making with a focus on transportation. Lastly, we discuss computational problems in the transportation domain and AI approaches to these problems. 
    more » « less
  3. Corlu, C. G.; Hunter, S. R.; Lam, H.; Onggo, B. S.; Shortle, J.; Biller, B. (Ed.)
    Experiments that are games played among a network of players are widely used to study human behavior. Furthermore, bots or intelligent systems can be used in these games to produce contexts that elicit particular types of human responses. Bot behaviors could be specified solely based on experimental data. In this work, we take a different perspective, called the Probability Calibration (PC) approach, to simulate networked group anagram games with certain players having bot-like behaviors. The proposed method starts with data-driven models and calibrates in principled ways the parameters that alter player behaviors. It can alter the performance of each type of agent (e.g., bot) in group anagram games. Further, statistical methods are used to test whether the PC models produce results that are statistically different from those of the original models. Case studies demonstrate the merits of the proposed method. 
    more » « less
  4. null (Ed.)
    Online advertising, as a vast market, has gained significant attention in various platforms ranging from search engines, third-party websites, social media, and mobile apps. The prosperity of online campaigns is a challenge in online marketing and is usually evaluated by user response through different metrics, such as clicks on advertisement (ad) creatives, subscriptions to products, purchases of items, or explicit user feedback through online surveys. Recent years have witnessed a significant increase in the number of studies using computational approaches, including machine learning methods, for user response prediction. However, existing literature mainly focuses on algorithmic-driven designs to solve specific challenges, and no comprehensive review exists to answer many important questions. What are the parties involved in the online digital advertising eco-systems? What type of data are available for user response prediction? How do we predict user response in a reliable and/or transparent way? In this survey, we provide a comprehensive review of user response prediction in online advertising and related recommender applications. Our essential goal is to provide a thorough understanding of online advertising platforms, stakeholders, data availability, and typical ways of user response prediction. We propose a taxonomy to categorize state-of-the-art user response prediction methods, primarily focusing on the current progress of machine learning methods used in different online platforms. In addition, we also review applications of user response prediction, benchmark datasets, and open source codes in the field. 
    more » « less
  5. Dynamical systems that evolve continuously over time are ubiquitous throughout science and engineering. Machine learning (ML) provides data-driven approaches to model and predict the dynamics of such systems. A core issue with this approach is that ML models are typically trained on discrete data, using ML methodologies that are not aware of underlying continuity properties. This results in models that often do not capture any underlying continuous dynamics—either of the system of interest, or indeed of any related system. To address this challenge, we develop a convergence test based on numerical analysis theory. Our test verifies whether a model has learned a function that accurately approximates an underlying continuous dynamics. Models that fail this test fail to capture relevant dynamics, rendering them of limited utility for many scientific prediction tasks; while models that pass this test enable both better interpolation and better extrapolation in multiple ways. Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications. 
    more » « less