Search for: All records

Creators/Authors contains: "Zheng, Yan"


  1. Word vector embeddings have been shown to contain and amplify biases in the data they are extracted from. Consequently, many techniques have been proposed to identify, mitigate, and attenuate these biases in word representations. In this paper, we utilize interactive visualization to increase the interpretability and accessibility of a collection of state-of-the-art debiasing techniques. To aid this, we present the Visualization of Embedding Representations for deBiasing (“VERB”) system, an open-source web-based visualization tool that helps users gain a technical understanding and visual intuition of the inner workings of debiasing techniques, with a focus on their geometric properties. In particular, VERB offers easy-to-follow examples that explore the effects of these debiasing techniques on the geometry of high-dimensional word vectors. To help understand how various debiasing techniques change the underlying geometry, VERB decomposes each technique into interpretable sequences of primitive transformations and highlights their effect on the word vectors using dimensionality reduction and interactive visual exploration. VERB is designed to target natural language processing (NLP) practitioners who are designing decision-making systems on top of word embeddings, and also researchers working with the fairness and ethics of machine learning systems in NLP. It can also serve as a visual medium for education, which helps an NLP novice understand and mitigate biases in word embeddings. 
    Free, publicly-accessible full text available January 1, 2025
  2. Free, publicly-accessible full text available December 15, 2024
  3. Background and Aims: Rice accounts for around 20% of the calories consumed by humans. Essential nutrients like zinc (Zn) are crucial for rice growth and for populations relying on rice as a staple food. No well-established study method exists. As a result, we lack a clear picture of the chemical forms of zinc in rice grain. Furthermore, we do not understand the effects of widespread and variable zinc deficiency in soils on the Zn speciation, and to a lesser extent, its concentration, in grain. Methods: The composition and Zn speciation of Cambodian rice grain were analyzed using synchrotron-based microprobe X-ray fluorescence (µ-XRF) and extended X-ray absorption fine-structure spectroscopy (EXAFS). We developed a method to quantify Zn species in different complexes based on the coordination numbers of Zn to oxygen and sulfur at characteristic bond lengths. Results: Zn levels in brown rice grain ranged from 15 to 30 mg kg⁻¹ and were not correlated to Zn availability in soils. Between 72% and 90% of Zn in rice grains is present as Zn-phytate, generally not bioavailable, while smaller quantities of Zn are bound as labile nicotianamine complexes, Zn minerals like ZnCO₃, or thiols. Conclusion: Zn speciation in rice grain is affected by Zn deficiency more than previously recognized. A majority of Zn was bound in phytate complexes in rice grain. Zinc phytate complexes were found in higher concentrations, and also in higher proportions, in Zn-deficient soils, consistent with increased phytate production under Zn deficiency. Phytates are generally not bioavailable to humans, so low soil Zn fertility may not only impact grain yields, but also decrease the fraction of grain Zn bioavailable to human consumers. The potential impact of abundant Zn-phytate in environments deficient in Zn on human bioavailability and Zn deficiency requires additional research.
    Free, publicly-accessible full text available October 1, 2024
  4. Word vector embeddings have been shown to contain and amplify biases in the data they are extracted from. Consequently, many techniques have been proposed to identify, mitigate, and attenuate these biases in word representations. In this paper, we utilize interactive visualization to increase the interpretability and accessibility of a collection of state-of-the-art debiasing techniques. To aid this, we present the Visualization of Embedding Representations for deBiasing (“VERB”) system, an open-source web-based visualization tool that helps users gain a technical understanding and visual intuition of the inner workings of debiasing techniques, with a focus on their geometric properties. In particular, VERB offers easy-to-follow examples that explore the effects of these debiasing techniques on the geometry of high-dimensional word vectors. To help understand how various debiasing techniques change the underlying geometry, VERB decomposes each technique into interpretable sequences of primitive transformations and highlights their effect on the word vectors using dimensionality reduction and interactive visual exploration. VERB is designed to target natural language processing (NLP) practitioners who are designing decision-making systems on top of word embeddings, and also researchers working with the fairness and ethics of machine learning systems in NLP. It can also serve as a visual medium for education, which helps an NLP novice understand and mitigate biases in word embeddings. 
  5. Bougainvillea Comm. ex Juss. is one of the renowned genera in the Nyctaginaceae, but despite its recognized horticultural value, the taxonomy and phylogeny of the genus are not well studied. Phylogenetic reconstructions based on plastid genomes showed that B. pachyphylla and B. peruviana are basal taxa, while B. spinosa is sister to two distinct clades: the predominantly cultivated Bougainvillea clade (B. spectabilis, B. glabra, B. arborea, B. cultivar, B. praecox) and the clade containing wild species of Bougainvillea (B. berberidifolia, B. campanulata, B. infesta, B. modesta, B. luteoalba, B. stipitata, and B. stipitata var. grisebachiana). Early divergence of B. peruviana, B. pachyphylla, and B. spinosa is highly supported; thus, the previously proposed division of Bougainvillea into two subgenera (Bougainvillea and Tricycla) was not reflected in this study. Morphological analysis also revealed that leaf arrangement, size, and indumentum, together with the perianth tube and anthocarp shape and indumentum, are important characteristics in differentiating the species of Bougainvillea. In the present study, 11 species and one variety are recognized in Bougainvillea. Six names are newly reduced to synonymy, and lectotypes are designated for 27 names. In addition, a revised identification key and illustrations of the distinguishing parts are also provided in the paper.