It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keywords to ontology concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vaguenessmore »
Smart Webcam Cover: Exploring the Design of an Intelligent Webcam Cover to Improve Usability and Trust
Laptop webcams can be covertly activated by malware and law enforcement agencies. Consequently, 59% percent of Americans manually cover their webcams to avoid being surveilled. However, manual covers are prone to human error---through a survey with 200 users, we found that 61.5% occasionally forget to re-attach their cover after using their webcam. To address this problem, we developed Smart Webcam Cover (SWC): a thin film that covers the webcam (PDLC-overlay) by default until a user manually uncovers the webcam, and automatically covers the webcam when not in use. Through a two-phased design iteration process, we evaluated SWC with 20 webcam cover users through a remote study with a video prototype of SWC, compared to manual operation, and discussed factors that influence users' trust in the effectiveness of SWC and their perceptions of its utility.
- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name:
- Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
- Page Range or eLocation-ID:
- 1 to 21
- Sponsoring Org:
- National Science Foundation
More Like this
Two common approaches for automating IoT smart spaces are having users write rules using trigger-action programming (TAP) or training machine learning models based on observed actions. In this paper, we unite these approaches. We introduce and evaluate Trace2TAP, a novel method for automatically synthesizing TAP rules from traces (time-stamped logs of sensor readings and manual actuations of devices). We present a novel algorithm that uses symbolic reasoning and SAT-solving to synthesize TAP rules from traces. Compared to prior approaches, our algorithm synthesizes generalizable rules more comprehensively and fully handles nuances like out-of-order events. Trace2TAP also iteratively proposes modified TAP rules when users manually revert automations. We implemented our approach on Samsung SmartThings. Through formative deployments in ten offices, we developed a clustering/ranking system and visualization interface to intelligibly present the synthesized rules to users. We evaluated Trace2TAP through a field study in seven additional offices. Participants frequently selected rules ranked highly by our clustering/ranking system. Participants varied in their automation priorities, and they sometimes chose rules that would seem less desirable by traditional metrics like precision and recall. Trace2TAP supports these differing priorities by comprehensively synthesizing TAP rules and bringing humans into the loop during automation.
When journalists cover a news story, they can cover the story from multiple angles or perspectives. These perspectives are called “frames,” and usage of one frame or another may influence public perception and opinion of the issue at hand. We develop a web-based system for analyzing frames in multilingual text documents. We propose and guide users through a five-step end-to-end computational framing analysis framework grounded in media framing theory in communication research. Users can use the framework to analyze multilingual text data, starting from the exploration of frames in user’s corpora and through review of previous framing literature (step 1-3) to frame classification (step 4) and prediction (step 5). The framework combines unsupervised and supervised machine learning and leverages a state-of-the-art (SoTA) multilingual language model, which can significantly enhance frame prediction performance while requiring a considerably small sample of manual annotations. Through the interactive website, anyone can perform the proposed computational framing analysis, making advanced computational analysis available to researchers without a programming background and bridging the digital divide within the communication research discipline in particular and the academic community in general. The system is available online at http://www.openframing.org, via an API http://www.openframing.org:5000/docs/, or through our GitHub page https://github.com/vibss2397/openFraming.
We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of m sets over n elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities. On one hand, first we show an adaptation of the streaming algorithm presented in  to the sub-linear query model, that returns an α-approximate cover using Õ(m(n/k)^1/(α–1) + nk) queries to the input, where k denotes the value of a minimum set cover. We then complement this upper bound by proving that for lower values of k, the required number of queries is , even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements would require Ω(nk) queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter k. On the other hand, we show that this bound is not optimal for larger values of the parameter k, as there exists a (1 + ε)-approximation algorithm with Õ(mn/kε^2) queries. We show that this bound is essentially tight for sufficiently small constant ε, by establishing amore »
We describe a 3/2-approximation algorithm, \lse, for computing a b-edgecover of minimum weight in a graph with weights on the edges. The b-edgecover problem is a generalization of the better-known Edge Cover problem in graphs, where the objective is to choose a subset C of edges in the graph such that at least a specified number b(v) of edges in C are incident on each vertex v. In the weighted b-edgecover problem, we minimize the sum of the weights of the edges in C. We prove that the Locally Subdominant edge (LSE) algorithm computes the same b-edge cover as the one obtained by the Greedy algorithm for the problem. However, the Greedy algorithm requires edges to be sorted by their effective weights, and these weights need to be updated after each iteration. These requirements make the Greedy algorithm sequential and impractical for massive graphs. The LSE algorithm avoids the sorting step, and is amenable for parallelization. We implement the algorithm on a serial machine and compare its performance against a collection of approximation algorithms for the b-edge cover problem. Our results show that the algorithm is 3 to 5 times faster than the Greedy algorithm on a serial processor. Themore »