- NSF-PAR ID: 10088599
- Date Published:
- Journal Name: Proceedings of the Association for Information Science and Technology
- Volume: 55
- Issue: 1
- ISSN: 2373-9231
- Page Range / eLocation ID: 337 to 346
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Open Educational Resources (OER) are widely used instructional materials that are freely available and promote equitable access. OER research at the undergraduate level largely focuses on measuring student experiences with the low-cost resources, and on instructor awareness of resources and perceived barriers to use. Little is known about how instructors work with materials based on their unique teaching context. To explore how instructors engage with OER, we surveyed users of CourseSource, an open-access, peer-reviewed journal that publishes lessons primarily for undergraduate biology courses. We asked questions aligned with the OER life cycle, a framework that includes the phases Search, Evaluation, Adaptation, Use, and Share. The results show that OER users come from a variety of institution types and positions, generally have positions that focus more on teaching than research, and use scientific teaching practices. To determine how instructors engage throughout the OER life cycle, we examined the frequency of survey responses. Notable trends include that instructors search and evaluate OER based on alignment to course needs, quality of the materials, and ease of implementation. In addition, instructors frequently modify the published materials for their classroom context and use them in a variety of course environments. The results of this work can help developers design current and future OER repositories to better coincide with undergraduate instructor needs and aid content producers in creating materials that encourage implementation by their colleagues.
-
Abstract
The Twin Falls, Idaho wastewater treatment plant (WWTP) currently operates solely to achieve regulatory permit compliance. Research was conducted to evaluate conversion of the WWTP to a water resource recovery facility (WRRF) and to assess the WRRF environmental sustainability; process configurations were evaluated to produce five resources: reclaimed water, biosolids, struvite, biogas, and bioplastics (polyhydroxyalkanoates, PHA). PHA production occurred using fermented dairy manure. State‐of‐the‐art biokinetic modeling, performed using Dynamita's SUMO process model, was coupled with environmental life cycle assessment to quantify environmental sustainability. Results indicate that electricity production via combined heat and power (CHP) was most important in achieving environmental sustainability; energy offset ranged from 43% to 60%, thereby reducing demand for external fossil fuel‐based energy. While struvite production helps maintain a resilient enhanced biological phosphorus removal (EBPR) process, MgO2 production exhibits negative environmental impacts; integration with CHP negates the adverse consequences. Integrating dairy manure to produce bioplastics diversifies the resource recovery portfolio while maintaining WRRF environmental sustainability; pilot‐scale evaluations demonstrated that WRRF effluent quality was not affected by the addition of effluent from PHA production. Collectively, results show that a WRRF integrating dairy manure can yield a diverse portfolio of products while operating in an environmentally sustainable manner.
Practitioner points
Wastewater carbon recovery via anaerobic digestion with combined heat/power production significantly reduces water resource recovery facility (WRRF) environmental emissions.
Wastewater phosphorus recovery is of value; however, struvite production exhibits negative environmental impacts due to MgO2 production emissions.
Bioplastics production on imported organic‐rich agri‐food waste can diversify the WRRF portfolio.
Dairy manure can be successfully integrated into a WRRF for bioplastics production without compromising WRRF performance.
Diversifying the WRRF products portfolio is a strategy to maximize resource recovery from wastewater while concurrently achieving environmental sustainability.
-
Recently, the Allen Institute for Artificial Intelligence released the Semantic Scholar Open Research Corpus (S2ORC), one of the largest open-access scholarly big datasets with more than 130 million scholarly paper records. S2ORC contains a significant portion of automatically generated metadata. The metadata quality could impact downstream tasks such as citation analysis, citation prediction, and link analysis. In this project, we assess the document linking quality and estimate the document conflation rate for the S2ORC dataset. Using semi-automatically curated ground truth corpora, we estimated that the overall document linking quality is high, with 92.6% of documents correctly linking to six major databases, but the linking quality varies depending on subject domains. The document conflation rate is around 2.6%, meaning that about 97.4% of documents are unique. We further quantitatively compared three near-duplicate detection methods using the ground truth created from S2ORC. The experiments indicated that locality-sensitive hashing was the best method in terms of effectiveness and scalability, achieving high performance (F1=0.960) and a much reduced runtime. Our code and data are available at https://github.com/lamps-lab/docconflation.
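To illustrate the general idea behind locality-sensitive hashing for near-duplicate detection, the following is a minimal MinHash sketch with banding. It is not the authors' implementation (their code is at the URL above); the function names, the 3-word shingle size, and the 64-hash, 16-band configuration are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased string (k is an assumed value)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash(shingle_set, num_perm=64):
    """MinHash signature: for each seed, the minimum salted hash over all shingles."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "little",
            )
            for s in shingle_set
        ))
    return tuple(signature)


def candidate_pairs(docs, num_perm=64, bands=16):
    """Return pairs of document ids whose signatures agree on at least one band."""
    rows = num_perm // bands
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash(shingles(text), num_perm)
        for b in range(bands):
            # Documents sharing an identical band land in the same bucket.
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    # Hypothetical records: "a" and "b" differ only in letter case.
    records = {
        "a": "Assessing document linking quality in a large scholarly corpus",
        "b": "assessing document linking quality in a LARGE scholarly corpus",
        "c": "environmental life cycle assessment of wastewater resource recovery",
    }
    print(candidate_pairs(records))
```

Banding avoids comparing every pair of records: only documents that collide in some band become candidates, which is where the scalability benefit of this family of methods comes from. A real pipeline would typically verify each candidate pair with an exact similarity measure afterwards.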
-
Automatically extracted metadata from scholarly documents in PDF format is usually noisy and heterogeneous, often containing incomplete fields and erroneous values. One common way of cleaning metadata is to use a bibliographic reference dataset. The challenge is to match records between corpora with high precision. The existing solution, which is based on information retrieval and string similarity on titles, works well only if the titles are cleaned. We introduce a system designed to match scholarly document entities with noisy metadata against a reference dataset. The blocking function uses the classic BM25 algorithm to find the matching candidates from the reference data that has been indexed by ElasticSearch. The core components use supervised methods which combine features extracted from all available metadata fields. The system also leverages available citation information to match entities. The combination of metadata and citation achieves high accuracy that significantly outperforms the baseline method on the same test dataset. We apply this system to match the database of CiteSeerX against Web of Science, PubMed, and DBLP. This method will be deployed in the CiteSeerX system to clean metadata and link records to other scholarly big datasets.
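The BM25 blocking step described above can be sketched in a few lines. This is a self-contained, in-memory illustration and not the system's code: the abstract states that the reference data is indexed by ElasticSearch, whereas here the class name `BM25`, the whitespace tokenizer, the parameters `k1=1.5` and `b=0.75`, and the non-negative IDF variant are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed simplification)."""
    return text.lower().split()


class BM25:
    """Minimal in-memory BM25 scorer over a non-empty list of reference titles."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [Counter(tokenize(d)) for d in corpus]
        self.lengths = [sum(c.values()) for c in self.docs]
        self.avgdl = sum(self.lengths) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of titles containing each term.
        df = Counter(t for c in self.docs for t in c)
        self.idf = {
            t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()
        }

    def score(self, query, i):
        """BM25 score of reference title i for the given query string."""
        doc, dl = self.docs[i], self.lengths[i]
        total = 0.0
        for t in tokenize(query):
            tf = doc.get(t, 0)
            if tf == 0:
                continue  # terms absent from the title contribute nothing
            norm = tf + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[t] * tf * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=3):
        """Indices of up to k best-scoring titles, dropping zero-score ones."""
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        ranked = sorted(scored, key=lambda x: (-x[0], x[1]))
        return [i for s, i in ranked[:k] if s > 0]


if __name__ == "__main__":
    # Hypothetical reference titles and a noisy extracted title.
    reference = [
        "deep learning for citation parsing",
        "wastewater resource recovery with bioplastics",
        "open educational resources in biology",
    ]
    index = BM25(reference)
    print(index.top_k("Deep Learning citation parsing noisy", k=2))
```

In a blocking role, the candidates returned by `top_k` would only narrow the search; the final match decision would then be made by a separate classifier over the full metadata, as the abstract describes.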