Abstract Each language has its own way of marking grammatical information such as gender, number, and tense. For example, English marks number and tense/aspect information with morphological suffixes (e.g., -s or -ed). These morphological suffixes are crucial for language acquisition because they are the basic building blocks of syntax, encode relationships, and convey meaning. Previous research shows that English-learning infants recognize morphological suffixes attached to nonce words by the end of the first year, although even 8-month-olds recognize them when they are attached to known words. These results support an acquisition trajectory in which discovery of meaning guides infants' acquisition of morphological suffixes. In this paper, we re-evaluated English-learning infants' knowledge of morphological suffixes in the first year of life. We found that 6-month-olds successfully segmented nonce words suffixed with -s, -ing, -ed, and a pseudo-morpheme -sh. Additionally, they related nonce words suffixed with -s to their stems, but not nonce words suffixed with -ing, -ed, or the pseudo-morpheme -sh. By 8 months, infants were also able to relate nonce words suffixed with -ing to their stems. Our results show that infants demonstrate knowledge of morphological relatedness from the earliest stages of acquisition, even in the absence of access to meaning. Based on these results, we argue for a developmental timeline in which the acquisition of morphology is at least concurrent with the acquisition of phonology and meaning.
Dead Science: Most Resources Linked in Biomedical Articles Disappear in Eight Years
Scientific progress critically depends on disseminating analytic pipelines and datasets that make results reproducible and replicable. Increasingly, researchers make resources available for wider reuse and embed links to them in their published manuscripts. Previous research has shown that these resources become unavailable over time, but the extent and causes of this problem in open access publications have not been explored well. Using 1.9 million articles from PubMed Open Access, we estimate that half of all resources become unavailable after 8 years. We find that the number of times a resource has been used, the international (int) and organization (org) domain suffixes, and the number of affiliations are positively related to resources being available. In contrast, we find that the length of the URL, the Indian (in), European Union (eu), and Chinese (cn) domain suffixes, and abstract length are negatively related to resources being available. Our results contribute to the understanding of resource sharing in science and provide some guidance for addressing resource decay.
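As a rough illustration of two of the link-level predictors named in the abstract (URL length and domain suffix), the following Python sketch derives them from a link. The function name `url_features` and the example URL are hypothetical and do not come from the article; the study's actual feature extraction is not described here and may differ.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Return the length of a URL and the last label of its host name.

    Illustrative only: this does not special-case IP-address hosts or
    multi-part suffixes such as "co.uk".
    """
    host = urlparse(url).hostname or ""
    # The suffix is the final dot-separated label of the host, e.g. "org".
    suffix = host.rsplit(".", 1)[-1].lower() if "." in host else ""
    return {"url_length": len(url), "suffix": suffix}


# Hypothetical link of the kind embedded in a manuscript.
example = "https://example.org/data/set.csv"
features = url_features(example)
print(features["suffix"], features["url_length"])
```

Features like these could then serve as inputs to whatever availability model one chooses; the sketch makes no claim about how the reported relationships were estimated.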
- PAR ID: 10196522
- Date Published:
- Journal Name: iConference 2019, LNCS 11420
- Volume: 11420
- Page Range / eLocation ID: 170–176
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Applications and middleware services, such as data placement engines, I/O scheduling, and prefetching engines, require low-latency access to telemetry data in order to make optimal decisions. However, typical monitoring services store their telemetry data in a database so that applications can query it, resulting in significant latency penalties. This work presents Apollo: a low-latency monitoring service that aims to provide applications and middleware libraries with direct access to relational telemetry data. Monitoring the system can create interference and overhead, slowing down the raw performance of the resources for the job. However, a current view of the system can help middleware services make better decisions, which can ultimately improve overall performance. Apollo has been designed from the ground up to provide low latency, using Publish-Subscribe (Pub-Sub) semantics, and low overhead, using adaptive intervals to change the length of time between polls of a resource for telemetry data and machine learning to predict changes to the telemetry data between actual polls. This work also provides high-level abstractions called I/O curators, which can further help middleware libraries and applications make optimal decisions. Evaluations show that Apollo can achieve sub-millisecond latency for acquiring complex insights, with a memory overhead of ~57MB and a CPU overhead only 7% higher than existing state-of-the-art systems.
- This working group aims to identify available datasets within the context of computing education research. One particular area of interest is programming education, and the data in question may include students' steps, progress, or submissions in the form of program code. To achieve this goal, the working group will review well-known data resources and repositories (e.g., DataShop, GitHub, NSF Public Access Repository, and IEEE DataPort) and recent papers published within the SIGCSE community. As a result of the review process, the working group will create an overview of available datasets and characterize them while reflecting on current data practices, challenges, and the consequences of limited access to research data. Additionally, the group intends to propose a path for the community to become more open and move toward open data practices. This proposal highlights the importance of sharing research data within the computing education research community to make it stronger and more productive.
- Abstract The prolonged COVID-19 pandemic has tied up significant medical resources, and its management poses a challenge for public health care decision making. Accurate predictions of hospitalizations are crucial for decision makers to make informed decisions about medical resource allocation. This paper proposes a method named County Augmented Transformer (CAT) to generate accurate predictions of four-week-ahead COVID-19 related hospitalizations for every state in the United States. Inspired by modern deep learning techniques, our method is based on a self-attention model (known as the transformer model) that is actively used in Natural Language Processing. Our transformer-based model can capture both short-term and long-term dependencies within the time series while remaining computationally efficient. Our model is a data-based approach that uses publicly available information, including COVID-19 related confirmed cases, deaths, and hospitalizations data, as well as household median income data. Our numerical experiments demonstrate the strength and usability of our model as a potential tool for assisting medical resource allocation.
- Cloud computing has become an emerging trend for the software industry, with its requirement for large infrastructure and resources. The future success of cloud computing depends on how effectively the infrastructure is instantiated and the available resources are utilized. Load balancing ensures that these conditions are met, improving the cloud environment for users. Load balancing dynamically distributes the workload among the nodes in such a way that no single resource is either overwhelmed with tasks or underutilized. In this paper we propose a threshold-based load balancing algorithm to ensure the equal distribution of the workload among the nodes. The main objective of the algorithm is to prevent the VMs in the cloud from being overloaded with tasks, or from sitting idle for lack of allocated tasks while there are active tasks. We have simulated our proposed algorithm in the Cloudanalyst simulator with real-world data scenarios. Simulation results show that our proposed threshold-based algorithm can provide better response times for tasks/requests and better data processing times for the datacenters compared to existing algorithms such as First Come First Serve (FCFS), Round Robin (RR), and Equally Spread Current Execution Load Balancing (ESCELB).
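
The general idea of threshold-based load balancing described in the last abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm from that paper: the `VM` class, the `assign` function, the use of task count as the load measure, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """A virtual machine whose load is measured by its number of tasks."""
    name: str
    tasks: list = field(default_factory=list)


def assign(task, vms, threshold):
    """Place a task on the least-loaded VM that is below the threshold.

    Returns the chosen VM, or None when every VM has reached the threshold.
    """
    candidates = [vm for vm in vms if len(vm.tasks) < threshold]
    if not candidates:
        return None
    # Choosing the least-loaded candidate keeps VMs from idling
    # while others accumulate work.
    target = min(candidates, key=lambda vm: len(vm.tasks))
    target.tasks.append(task)
    return target


vms = [VM("vm-a"), VM("vm-b")]
for task_id in range(4):
    assign(task_id, vms, threshold=2)
print([len(vm.tasks) for vm in vms])
```

With two VMs and a threshold of 2, four tasks are split evenly and a fifth is rejected; a real scheduler would need a queueing or scale-out policy for that case.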

