Diversity, group representation, and similar needs often apply to query results, which in turn require constraints on the sizes of various subgroups in the result set. Traditional relational queries only specify conditions as part of the query predicate(s), and do not support such restrictions on the output. In this paper, we study the problem of modifying queries to have the result satisfy constraints on the sizes of multiple subgroups in it. This problem, in the worst case, cannot be solved in polynomial time. Yet, with the help of provenance annotation, we are able to develop a query refinement method that works quite efficiently, as we demonstrate through extensive experiments.
Erica: Query Refinement for Diversity Constraint Satisfaction
Relational queries are commonly used to support decision making in critical domains like hiring and college admissions. For example, a college admissions officer may need to select a subset of the applicants for in-person interviews, who individually meet the qualification requirements (e.g., have a sufficiently high GPA) and are collectively demographically diverse (e.g., include a sufficient number of candidates of each gender and of each race). However, traditional relational queries only support selection conditions checked against each input tuple, and they do not support diversity conditions checked against multiple, possibly overlapping, groups of output tuples. To address this shortcoming, we present Erica, an interactive system that proposes minimal modifications for selection queries to have them satisfy constraints on the cardinalities of multiple groups in the result. We demonstrate the effectiveness of Erica using several real-life datasets and diversity requirements.
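The admissions example above can be made concrete with a small sketch. This is not Erica's algorithm (the abstract does not describe it); it is a brute-force illustration, under stated assumptions, of refining a single selection predicate so that the result meets lower bounds on group cardinalities. The applicant data, the function names, and the restriction to lowering one GPA threshold are all hypothetical.

```python
# Hypothetical applicant records: (name, gpa, gender). Illustrative data only.
APPLICANTS = [
    ("a", 3.9, "F"),
    ("b", 3.8, "M"),
    ("c", 3.7, "M"),
    ("d", 3.4, "F"),
    ("e", 3.2, "F"),
    ("f", 3.0, "M"),
]


def select(rows, min_gpa):
    """Selection query: keep tuples whose GPA meets the threshold."""
    return [r for r in rows if r[1] >= min_gpa]


def group_counts(rows):
    """Count result tuples per gender group."""
    counts = {}
    for _, _, gender in rows:
        counts[gender] = counts.get(gender, 0) + 1
    return counts


def satisfies(rows, lower_bounds):
    """Check cardinality constraints of the form 'at least k tuples in group g'."""
    counts = group_counts(rows)
    return all(counts.get(g, 0) >= k for g, k in lower_bounds.items())


def refine_min_gpa(rows, min_gpa, lower_bounds):
    """Return the largest threshold <= min_gpa whose result satisfies the
    lower bounds, or None if no relaxation of this one predicate works.

    Assumption: 'minimal modification' here means the smallest decrease of
    the threshold; only thresholds at observed GPA values need checking.
    """
    candidates = sorted({r[1] for r in rows if r[1] <= min_gpa} | {min_gpa},
                        reverse=True)
    for threshold in candidates:
        if satisfies(select(rows, threshold), lower_bounds):
            return threshold
    return None


if __name__ == "__main__":
    # The original query (GPA >= 3.5) returns one F and two M applicants.
    print(group_counts(select(APPLICANTS, 3.5)))
    # Requiring at least two applicants of each gender relaxes the threshold.
    print(refine_min_gpa(APPLICANTS, 3.5, {"F": 2, "M": 2}))
```

With several predicates and overlapping groups, the space of candidate refinements grows combinatorially, so exhaustive search of this kind does not scale to the general setting the abstract describes.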
- Award ID(s): 2106176
- PAR ID: 10482057
- Publisher / Repository: VLDB
- Date Published:
- Journal Name: Proceedings of the VLDB Endowment
- Volume: 16
- Issue: 12
- ISSN: 2150-8097
- Page Range / eLocation ID: 4070 to 4073
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The Greater Equity, Access, and Readiness for Engineering and Technology (GEARSET) Program, an NSF-funded S-STEM program, was developed to address several institutional needs at the university. The original target population for the GEARSET program was identified as a subset of the students who applied to the College of Engineering, did not meet all the admissions requirements, and were admitted to an Exploratory Studies major in the university’s University College. Historical data indicate that approximately 170 students per year with a high school GPA of 3.00 or higher are admitted to Exploratory Studies because they do not meet the College of Engineering admissions criteria. Of these, roughly 78 students remain at the University after one year. Of those 78, only about 45 students per year transition to College of Engineering majors by the end of their first year. These numbers do not accurately reflect the ability of these students, but rather are due in part to curricular bottlenecks, lack of institutional support, and lack of significant relevant exposure of students to material meant to engage their engineering future selves. These data motivated the creation of the GEARSET program. Specifically, the program was designed to: 1. Increase recruitment, retention, student success, and transfer rates into engineering of students who are not admitted directly to engineering but who are instead admitted to the university’s University College. 2. Increase the meaningfulness and engineering relevance of the pre-engineering curriculum. 3. Increase diversity within the student population of various engineering departments in the College of Engineering. 4. Remove bottlenecks in the curriculum, improve access to engineering, and decrease time to degree. A key aspect of the program is a curated curriculum.
All students in the GEARSET program are enrolled in multiple courses historically proven to promote better understanding of the key areas of Math, Chemistry, and Physics needed to be successful engineers. All students have access to advisors within the College of Engineering (COE) to help them better understand the programs, curriculum, and professional outcomes of each discipline of Engineering. Another key component of the program is that low-income students in the GEARSET cohort who successfully transfer to a major within the COE after one year receive scholarship support. Here we describe the Program, the results to date, and the impact of the recent global pandemic and the subsequent transition to test-optional admissions criteria on the definition of the GEARSET cohort, program implementation, and student participation.
-
We consider the problem of learning causal relationships from relational data. Existing approaches rely on queries to a relational conditional independence (RCI) oracle to establish and orient causal relations in such a setting. In practice, queries to a RCI oracle have to be replaced by reliable tests for RCI against available data. Relational data present several unique challenges in testing for RCI. We study the conditions under which traditional iid-based CI tests yield reliable answers to RCI queries against relational data. We show how to conduct CI tests against relational data to robustly recover the underlying relational causal structure. Results of our experiments demonstrate the effectiveness of our proposed approach.
-
Low-income students are underrepresented in engineering and are more likely to struggle in engineering programs. Such students may be academically talented and perform well in high school, but may have relatively weak academic preparation for college compared to students who attended better-resourced schools. Four-year engineering and computer science curricula are designed for students who are calculus-ready, but many students who are eager to become engineers or computer scientists need additional time and support to succeed. The NSF-funded Redshirt in Engineering Consortium was formed in 2016 as a collaborative effort to build on the success of three existing “academic redshirt” programs and expand the model to three new schools. The Consortium takes its name from the practice of redshirting in college athletics, with the idea of providing an extra year and support to promising engineering students from low-income backgrounds. The goal of the program is to enhance the students’ ability to successfully graduate with engineering or computer science degrees. This Work in Progress paper describes the redshirt programs at each of the six Consortium institutions, providing a variety of models for how an extra preparatory year or other intensive academic preparatory programs can be accommodated. This paper will pay particular attention to the ways that institutional context shapes the implementation of the redshirt model. For instance, what do the redshirt admissions and selection processes look like at schools with direct-to-college admissions versus schools with post-general education admissions? What substantive elements of the first-year curriculum are consistent across the consortium? Where variation in curriculum occurs, what are the institutional factors that produce this variation? How does the redshirt program fit with other pre-existing academic support services on campus, and what impact does this have on the redshirt program’s areas of focus? 
Program elements covered include first-year curricula, pre-matriculation summer programs, academic advising and support services, admissions and selection processes, and financial aid. Ongoing assessment efforts and research designed to investigate how the various redshirt models influence faculty and student experiences will be described.

