The subtle human values we acquire through life experiences govern our thoughts and are reflected in our speech. They play an integral part in capturing the essence of our individuality, which makes it imperative to identify such values in computational systems that mimic human actions. Computational argumentation is a field that deals with the argumentation capabilities of humans and can benefit from identifying such values. Motivated by that, we present an ensemble approach for detecting human values from argument text. Our ensemble comprises three models: (i) an entailment-based model for determining the human values based on their descriptions, (ii) a RoBERTa-based classifier that predicts the set of human values from an argument, and (iii) a RoBERTa-based classifier that predicts a reduced set of human values from an argument. We experiment with different ways of combining the models and report our results. Our best combination achieves an overall F1 score of 0.48 on the main test set.
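As a rough illustration of what "combining the models" can mean for multi-label predictions, the sketch below merges three sets of predicted labels by union and by majority vote. The abstract does not say which combination rules were tried, so both rules, the function names, and the example label names are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, List, Set


def combine_union(predictions: Iterable[Set[str]]) -> Set[str]:
    """Keep every label predicted by at least one model."""
    combined: Set[str] = set()
    for labels in predictions:
        combined |= labels
    return combined


def combine_majority(predictions: List[Set[str]], min_votes: int = 2) -> Set[str]:
    """Keep only labels predicted by at least `min_votes` models."""
    votes = Counter(label for labels in predictions for label in labels)
    return {label for label, count in votes.items() if count >= min_votes}


# Hypothetical outputs of the three models for a single argument.
# The label names are placeholders, not taken from the abstract.
entailment_pred = {"Security: personal", "Benevolence: caring"}
full_classifier_pred = {"Security: personal", "Achievement"}
reduced_classifier_pred = {"Security: personal"}  # can only emit the reduced label set
all_preds = [entailment_pred, full_classifier_pred, reduced_classifier_pred]

union_labels = combine_union(all_preds)
majority_labels = combine_majority(all_preds, min_votes=2)
print(sorted(union_labels))
print(sorted(majority_labels))
```

Union tends to favor recall and majority voting tends to favor precision. Because a reduced-set classifier can never vote for labels outside its set, those labels need agreement between the two remaining models under a two-vote threshold.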
Observations from an expedition to Costa Rican peatlands.
Notes from the field. In July and August of 2023, we visited Costa Rica to examine some of the country’s peatlands. The purpose of our trip was to collect peat samples from a variety of wetland habitats from the coast to the highlands for future analysis. We summarize our observations in this short essay. Editorial review only.
- Award ID(s): 2142177
- PAR ID: 10506697
- Publisher / Repository: Society of Wetland Scientists
- Date Published:
- Journal Name: The Society of Wetland Scientists bulletin
- Volume: 42
- Issue: 1
- ISSN: 1943-6254
- Page Range / eLocation ID: 145-149
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract: The evolutionary classification of protein domains (ECOD) classifies protein domains using a combination of sequence and structural data (http://prodata.swmed.edu/ecod). Here we present the culmination of our previous efforts at classifying domains from predicted structures, principally from the AlphaFold Database (AFDB), by integrating these domains with our existing classification of PDB structures. This combined classification includes both domains from our previous, purely experimental, classification of domains as well as domains from our provisional classification of 48 proteomes in AFDB predicted from model organisms and organisms of concern to global health. ECOD classifies over 1.8 M domains from over 1 000 000 proteins collectively deposited in the PDB and AFDB. Additionally, we have changed the F-group classification reference used for ECOD, deprecating our original ECODf library and instead relying on direct collaboration with the Pfam sequence family database to inform our classification. Pfam provides similar coverage of ECOD with family classification while being more accurate and less redundant. By eliminating duplication of effort, we can improve both classifications. Finally, we discuss the initial deployment of DrugDomain, a database of domain-ligand interactions, on ECOD and discuss future plans.
-
We present a new algorithm that synthesizes functional reactive programs from observation data. The key novelty is to iterate between a functional synthesis step, which attempts to generate a transition function over observed states, and an automata synthesis step, which adds any additional latent state necessary to fully account for the observations. We develop a functional reactive DSL called Autumn that can express a rich variety of causal dynamics in time-varying, Atari-style grid worlds, and apply our method to synthesize Autumn programs from data. We evaluate our algorithm on a benchmark suite of 30 Autumn programs as well as a third-party corpus of grid-world-style video games. We find that our algorithm synthesizes 27 out of 30 programs in our benchmark suite and 21 out of 27 programs from the third-party corpus, including several programs describing complex latent state transformations, and from input traces containing hundreds of observations. We expect that our approach will provide a template for how to integrate functional and automata synthesis in other induction domains.
-
Law, Edith; Vaughan, Jennifer W (Ed.) In this paper, we analyze PAC learnability from labels produced by crowdsourcing. In our setting, unlabeled examples are drawn from a distribution and labels are crowdsourced from workers who operate under classification noise, each with their own noise parameter. We develop an end-to-end crowdsourced PAC learning algorithm that takes unlabeled data points as input and outputs a trained classifier. Our three-step algorithm incorporates majority voting, pure-exploration bandits, and noisy-PAC learning. We prove several guarantees on the number of tasks labeled by workers for PAC learning in this setting and show that our algorithm improves upon the baseline by reducing the total number of tasks given to workers. We demonstrate the robustness of our algorithm by exploring its application to additional realistic crowdsourcing settings.
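The majority-voting ingredient named in the abstract above can be sketched in isolation as follows. This is only a toy simulation of binary labels from workers with individual noise rates; it is not the paper's three-step algorithm, and the function names, noise rates, and seed are hypothetical.

```python
import random
from typing import List


def noisy_label(true_label: int, noise_rate: float, rng: random.Random) -> int:
    """Return the true binary label, flipped with probability `noise_rate`."""
    return 1 - true_label if rng.random() < noise_rate else true_label


def majority_vote(labels: List[int]) -> int:
    """Return 1 if strictly more than half of the binary labels are 1, else 0."""
    return 1 if 2 * sum(labels) > len(labels) else 0


def crowdsource(true_labels: List[int], noise_rates: List[float], seed: int = 0) -> List[int]:
    """Ask every worker for every example and aggregate by majority vote.

    Each worker has their own noise rate (assumed values, for illustration).
    Use an odd number of workers so that ties cannot occur.
    """
    rng = random.Random(seed)
    return [
        majority_vote([noisy_label(y, eta, rng) for eta in noise_rates])
        for y in true_labels
    ]


truth = [0, 1, 1, 0, 1]
aggregated = crowdsource(truth, noise_rates=[0.1, 0.2, 0.3])
print(aggregated)
```

Asking every worker about every example is the simplest baseline. The abstract's stated contribution is to reduce the total number of tasks given to workers relative to a baseline, which this sketch does not attempt.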
-
As requirements for swift and sustainable data sharing are growing, questions of where and how researchers are sharing data are becoming increasingly important for institutions to answer. One of the goals of the Reality of Academic Data Sharing (RADS) Initiative, comprised of six academic institutions from the Data Curation Network (DCN), was to answer this question. This presentation will discuss the process of how RADS determined where data from our researchers are shared. To do this, we programmatically pulled DOIs from DataCite, making the naive assumption that the information we were collecting, the metadata fields we were utilizing, and the platforms we were using would present us with a neutral and unbiased view of where data from our affiliated researchers were shared. However, as we dug into the data, we found inconsistencies in the use and completeness of the necessary metadata fields for our questions, as well as differences in how DOIs were assigned across repositories. While we expected some differences, we did not anticipate that these subtle differences would dramatically affect how we interpret the answer to the question of where data are shared. Our presentation will highlight examples in our work that show how these subtleties in the data are systematic and challenge our assumptions of neutrality of not just the data, but of our platforms and practices as well. By examining these biases, we are forced to reexamine the decisions behind how we practice and, as we move forward as information and repository managers, how to reduce bias and assumptions of neutrality. As a community, we often rely on data-driven decisions, and decision makers need to be aware of these biases, especially as we are likely to see increased investments due to the evolving data policies and practices.

