-
Building Interdisciplinarity in Engineering Doctoral Education: Insights from DTAIS Summer Incubator
In 2021, GW Engineering was awarded funding to launch an interdisciplinary program on trustworthy AI. Designing Trustworthy AI in Systems (DTAIS) brings together PhD students from systems engineering and computer science to co-design research and tackle the conceptual and methodological bridge-building that cross-disciplinary work demands. This paper focuses on how this work has been accomplished thus far, in the context of the cornerstone summer incubator, and shares some of the lessons learned. The 10-week summer incubator course, which was designed specifically for this program, brings systems engineering and computer science PhD students together to make sense of “AI in the wild” (real-world settings) and build short-run research prototypes. Leveraging the interdisciplinarity of the core program faculty, the group established a fertile middle ground where a mixed-methods ethos, a design-sprint rhythm, and an intentional sense of community enliven the normative student-advisor modality most PhD students experience. Along the way, the definitional challenge of what exactly is meant by trust and trustworthiness within a particular problem domain and literature is given plenty of room to form, fall apart, and form again through discussion, practice, and reflection. Drawing on two iterations of the summer incubator course, we report on the difficulties of rewiring student-advisor dynamics and the positive effects of growing a diverse community. These findings offer a potential roadmap for how to scaffold interdisciplinarity in engineering doctoral education.
-
We consider the ability of CLIP features to support text-driven image retrieval. Traditional image-based queries sometimes misalign with user intentions because they focus on irrelevant image components. To overcome this, we explore the potential of text-based image retrieval, specifically using Contrastive Language-Image Pretraining (CLIP) models. CLIP models, trained on large datasets of image-caption pairs, offer a promising approach by allowing natural language descriptions for more targeted queries. We explore the effectiveness of text-driven image retrieval based on CLIP features by evaluating the image similarity for progressively more detailed queries. We find that there is a sweet spot of detail in the text that gives the best results, and that words describing the “tone” of a scene (such as messy or dingy) are quite important in maximizing text-image similarity.
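As a rough illustration of the retrieval step described above, the sketch below ranks images by cosine similarity between a text-query embedding and image embeddings. It assumes the embeddings have already been computed by some CLIP-style model; the three-dimensional vectors, file names, and function names are hypothetical placeholders, not values or code from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(text_embedding, embedding))
        for name, embedding in image_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for real model outputs.
image_embeddings = {
    "tidy_room.jpg": [0.9, 0.1, 0.0],
    "messy_room.jpg": [0.2, 0.9, 0.1],
    "street.jpg": [0.0, 0.1, 0.9],
}
query_embedding = [0.1, 1.0, 0.0]  # stands in for a text query's embedding

ranking = rank_images(query_embedding, image_embeddings)
```

Comparing how the top-ranked score changes as the query text gains detail is one simple way to probe the "sweet spot" effect the abstract reports.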
-
Large amounts of high-dimensional, heterogeneous data appear in practical applications and are often published to third parties for data analysis, recommendations, targeted advertising, and reliable predictions. However, publishing these data may disclose sensitive personal information, raising growing concern about privacy violations. Privacy-preserving data publishing has received considerable attention in recent years. Unfortunately, the differentially private publication of high-dimensional data remains a challenging problem. In this paper, we propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases: a Markov-blanket-based attribute clustering phase and an invariant post randomization (PRAM) phase. Specifically, splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable allocation of the privacy budget, while a double-perturbation mechanism satisfying local differential privacy facilitates an invariant PRAM to ensure no loss of statistical information and thus significantly preserves data utility. We also extend our DP2-Pub mechanism to the scenario with a semi-honest server, where it satisfies local differential privacy. We conduct extensive experiments on four real-world datasets, and the experimental results demonstrate that our mechanism can significantly improve the utility of the published data while satisfying differential privacy.
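To make the idea of a locally differentially private perturbation of a categorical attribute concrete, here is a minimal sketch of generalized (k-ary) randomized response with an unbiased count estimator. This is a standard textbook mechanism used for illustration only; it is not the DP2-Pub double-perturbation or invariant PRAM procedure, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def grr_probabilities(epsilon, k):
    """Return (p, q): the probability of reporting the true value, and the
    probability of reporting each specific other value, for a domain of size k.
    The ratio p / q equals exp(epsilon), which gives epsilon-local DP."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def perturb(value, domain, epsilon, rng):
    """Report the true value with probability p, else a uniform other value."""
    p, _ = grr_probabilities(epsilon, len(domain))
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])


def estimate_counts(reports, domain, epsilon):
    """Unbiased estimates of the true per-value counts from noisy reports."""
    n = len(reports)
    p, q = grr_probabilities(epsilon, len(domain))
    observed = Counter(reports)
    return {v: (observed[v] - n * q) / (p - q) for v in domain}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
domain = ["a", "b", "c", "d"]
true_values = ["a"] * 600 + ["b"] * 300 + ["c"] * 100
reports = [perturb(v, domain, 1.0, rng) for v in true_values]
estimates = estimate_counts(reports, domain, 1.0)
```

The estimator inverts the known perturbation probabilities, so aggregate statistics can be recovered in expectation even though each individual report is noisy.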
-
In this paper, we explore the potential of using timestamps as labels for deep learning on images from webcams, surveillance cameras, and other fixed-viewpoint settings. Specifically, we explore whether learning to classify images by the time they were taken uncovers interesting patterns and behaviors in the scenes captured by these cameras. We describe approaches to building datasets with large quantities of images and their accompanying labels, making them suitable for large-scale deep learning approaches. We share our results from the initial deep learning experiments.
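One way to turn capture times into classification labels is sketched below. The choice of hour-of-day and month classes and the file-name pattern are assumptions made for illustration; the abstract does not specify how the labels or file names were defined.

```python
from datetime import datetime


def time_labels(timestamp):
    """Map a capture time to zero-based class indices.

    hour:  0..23 (hour of day)
    month: 0..11 (January is 0)
    """
    return {"hour": timestamp.hour, "month": timestamp.month - 1}


def labels_from_filename(filename, fmt="%Y%m%d_%H%M%S.jpg"):
    """Parse a timestamp from a file name (hypothetical naming scheme)
    and convert it to class labels."""
    return time_labels(datetime.strptime(filename, fmt))


example = labels_from_filename("20210704_153000.jpg")
```

Because the labels come for free from the camera's clock, a dataset of (image, label) pairs can be assembled without manual annotation.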
-
We explore the use of deep convolutional neural networks (CNNs) trained on overhead imagery of biomass sorghum to ascertain the relationship between single nucleotide polymorphisms (SNPs), or groups of related SNPs, and the phenotypes they control. We consider two kinds of CNNs. The first is trained explicitly on the classification task of predicting whether an image shows a plant with a reference or alternate version of various SNPs. The second is trained to create data-driven features, learned so that images from the same plot are more similar than images from different plots; the features this network learns are then used for genetic marker classification. We characterize how efficient both approaches are at predicting the presence or absence of a genetic marker, and visualize which parts of the images are most important for those predictions. We find that the data-driven approaches give somewhat higher prediction performance but have visualizations that are harder to interpret. We also give suggestions for potential future machine learning research and discuss the possibilities of using this approach to uncover unknown genotype × phenotype relationships.
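The "same plot more similar than different plots" objective can be illustrated with a triplet margin loss, sketched below on plain feature vectors. Using a triplet loss with Euclidean distance is an assumption made for illustration; the abstract does not state which similarity objective the authors used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero when the same-plot image (positive) is closer to the anchor
    than the different-plot image (negative) by at least `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D features: same-plot pair is close, other plot is far.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Minimizing such a loss over many triplets pulls same-plot features together, and the resulting features can then feed a separate classifier for each genetic marker.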