Title: Small Data
Data is becoming increasingly personal. Individuals regularly interact with a wide variety of structured data, from SQLite databases on phones, to HR spreadsheets, to personal sensors, to open government data appearing in news articles. Although these workloads are important, many of the classical challenges associated with scale and Big Data do not apply. This panel brings together experts in a variety of fields to explore the new opportunities and challenges presented by "Small Data".
Award ID(s):
1617586
PAR ID:
10175101
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
IEEE 33rd International Conference on Data Engineering (ICDE)
Page Range / eLocation ID:
1475 to 1476
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Personal visualizations present a separate class of visualizations where users interact with their own data to draw inferences about themselves. In this paper, we study how a realistic understanding of personal visualizations can be gained from analyzing user interactions. We designed an interface presenting visualizations of the personal data gathered in a prior study and logged interactions from 369 participants as they each explored their own data. We found that the participants spent different amounts of time exploring their data and used a variety of physical devices, which could have affected their engagement with the visualizations. Our findings also suggest that the participants made more comparisons between their own data instances than with the provided baselines, and that certain interface design choices, such as the ordering of options, influenced their exploratory behaviors.
  2. The eruption of big data, with the increasing collection and processing of vast volumes and varieties of data, has led to breakthrough discoveries and innovation in science, engineering, medicine, commerce, criminal justice, and national security that would not have been possible in the past. While there are many benefits to the collection and usage of big data, there are also growing concerns among the general public about what personal information is collected and how it is used. In addition to legal policies and regulations, technological tools and statistical strategies also exist to promote and safeguard individual privacy while releasing and sharing useful population-level information. In this overview, I introduce some of these approaches, as well as the existing challenges and opportunities in statistical data privacy research and applications to better meet the practical needs of privacy protection and information sharing.
  3. Researchers and educators have explored a variety of approaches for addressing diversity, equity, inclusion, and belonging (DEIB) challenges in engineering and design. This research builds on recommendations to teach future engineers and designers about DEIB principles and applications, and to challenge the dissociation of engineering and societal concerns. This paper analyses 25 student reflections from a course on Inclusive Engineering and Design to reveal engaging topics, perceived learning, and personal growth. Our conclusion is that such courses are meaningful and worthwhile contributions to the curriculum. 
  4. New laws such as the European Union’s General Data Protection Regulation (GDPR) grant users unprecedented control over personal data stored and processed by businesses. Compliance can require expensive manual labor or retrofitting of existing systems, e.g., to handle data retrieval and removal requests. We argue for treating these new requirements as an opportunity for new system designs. These designs should make data ownership a first-class concern and achieve compliance with privacy legislation by construction. A compliant-by-construction system could build a shared database, with performance similar to that of current systems, from personal databases that let users contribute, audit, retrieve, and remove their personal data through easy-to-understand APIs. Realizing compliant-by-construction systems requires new cross-cutting abstractions that make data dependencies explicit and that augment classic data processing pipelines with ownership information. We suggest what such abstractions might look like, and highlight existing technologies that we believe make compliant-by-construction systems feasible today. We believe that progress towards such systems is at hand, and highlight challenges for researchers to address to make them a reality.
  5. Bipolar Disorder, a mood disorder with recurrent mania and depression, requires ongoing monitoring and specialty management. Current monitoring strategies are clinically-based, engaging highly specialized medical professionals who are becoming increasingly scarce. Automatic speech-based monitoring via smartphones has the potential to augment clinical monitoring by providing inexpensive and unobtrusive measurements of a patient’s daily life. The success of such an approach is contingent on the ability to successfully utilize “in-the-wild” data. However, most existing work on automatic mood detection uses datasets collected in clinical or laboratory settings. This study presents experiments in automatically detecting depression severity in individuals with Bipolar Disorder using data derived from clinical interviews and from personal conversations. We find that mood assessment is more accurate using data collected from clinical interactions, in part because of their highly structured nature. We demonstrate that although the features that are most effective in clinical interactions do not extend well to personal conversational data, we can identify alternative features relevant in personal conversational speech to detect mood symptom severity. Our results highlight the challenges unique to working with “in-the-wild” data, providing insight into the degree to which the predictive ability of speech features is preserved outside of a clinical interview. 