Search for: All records

Creators/Authors contains: "Fu, Xiao"


  1. The recent integration of deep learning and pairwise similarity annotation-based constrained clustering, i.e., deep constrained clustering (DCC), has proven effective for incorporating weak supervision into massive data clustering: Less than 1% of pair similarity annotations can often substantially enhance the clustering accuracy. However, beyond empirical successes, there is a lack of understanding of DCC. In addition, many DCC paradigms are sensitive to annotation noise, but performance-guaranteed noisy DCC methods have been largely elusive. This work first takes a deep look into a recently emerged logistic loss function of DCC, and characterizes its theoretical properties. Our result shows that the logistic DCC loss ensures the identifiability of data membership under reasonable conditions, which may shed light on its effectiveness in practice. Building upon this understanding, a new loss function based on geometric factor analysis is proposed to fend against noisy annotations. It is shown that even under unknown annotation confusions, the data membership can still be provably identified under our proposed learning criterion. The proposed approach is tested over multiple datasets to validate our claims.
    Free, publicly-accessible full text available July 23, 2024
  2. Dipyridyl molecular junctions often show intriguing conductance switching behaviors under mechanical modulation, but the mechanisms are not yet fully understood. By applying the ab initio-based adiabatic simulation method, the configuration evolution and electron transport properties of dipyridyl molecular junctions in stretching and compressing processes are systematically investigated. The numerical results reveal that the dipyridyl molecular junctions tend to form specific contact configurations during formation processes. In small electrode gaps, the pyridyls adsorb almost vertically on the second Au layers of the tip electrodes by pushing the top Au atoms aside. These specific contact configurations result in stronger molecule–electrode couplings and larger electronic incident cross-sectional areas, which consequently lead to large breaking forces and high conductance. On further elongating the molecular junctions, the pyridyls shift to the top Au atoms of the tip electrodes. The additional scattering of the top Au atoms dramatically decreases the conductance and switches the molecular junctions to the lower conductive states. By repeatedly stretching and compressing the molecular junctions, perfect cyclical conductance switches are obtained, as observed in the experiments. The O atom in the side-group tends to hinder the pyridyl from adsorbing on the second Au layer and further inhibits the conductance switch of the dipyridyl molecular junction.
    Free, publicly-accessible full text available August 3, 2024
  3. Using noisy crowdsourced labels from multiple annotators, a deep learning-based end-to-end (E2E) system aims to learn the label correction mechanism and the neural classifier simultaneously. To this end, many E2E systems concatenate the neural classifier with multiple annotator-specific label confusion layers and co-train the two parts in a parameter-coupled manner. The formulated coupled cross-entropy minimization (CCEM)-type criteria are intuitive and work well in practice. Nonetheless, theoretical understanding of the CCEM criterion has been limited. The contribution of this work is twofold: First, performance guarantees of the CCEM criterion are presented. Our analysis reveals for the first time that the CCEM can indeed correctly identify the annotators' confusion characteristics and the desired "ground-truth" neural classifier under realistic conditions, e.g., when only incomplete annotator labeling and finite samples are available. Second, based on the insights learned from our analysis, two regularized variants of the CCEM are proposed. The regularization terms provably enhance the identifiability of the target model parameters in various more challenging cases. A series of synthetic and real data experiments are presented to showcase the effectiveness of our approach.
  4. Unsupervised mixture learning (UML) aims at identifying linearly or nonlinearly mixed latent components in a blind manner. UML is known to be challenging: Even learning linear mixtures requires highly nontrivial analytical tools, e.g., independent component analysis or nonnegative matrix factorization. In this work, the post-nonlinear (PNL) mixture model, where unknown element-wise nonlinear functions are imposed onto a linear mixture, is revisited. The PNL model is widely employed in different fields ranging from brain signal classification, speech separation, remote sensing, to causal discovery. To identify and remove the unknown nonlinear functions, existing works often assume different properties on the latent components (e.g., statistical independence or probability-simplex structures). This work shows that under a carefully designed UML criterion, the existence of a nontrivial null space associated with the underlying mixing system suffices to guarantee identification/removal of the unknown nonlinearity. Compared to prior works, our finding largely relaxes the conditions of attaining PNL identifiability, and thus may benefit applications where no strong structural information on the latent components is known. A finite-sample analysis is offered to characterize the performance of the proposed approach under realistic settings. To implement the proposed learning criterion, a block coordinate descent algorithm is proposed. A series of numerical experiments corroborate our theoretical claims.
  5. This paper focuses on downlink channel state information (CSI) acquisition. A frequency division duplex (FDD) massive MIMO system is considered. In such systems, the base station (BS) obtains the downlink CSI from the mobile users' feedback. A key consideration is to reduce the feedback overhead while ensuring that the BS accurately recovers the downlink CSI. Existing approaches often resort to dictionary-based or tensor/matrix decomposition techniques, which either exhibit unsatisfactory accuracy or induce heavy computational load at the mobile end. To circumvent these challenges, this work formulates the limited channel feedback problem as a quantized and compressed matrix recovery problem. The formulation presents a computationally challenging maximum likelihood estimation (MLE) problem. An ADMM algorithm leveraging existing harmonic retrieval tools is proposed to effectively tackle the optimization problem. Simulations show that the proposed method attains promising channel estimation accuracy while using far fewer feedback bits than existing methods.
  6. Nonlinear independent component analysis (nICA) aims at recovering statistically independent latent components that are mixed by unknown nonlinear functions. Central to nICA is the identifiability of the latent components, which had been elusive until very recently. Specifically, Hyvärinen et al. have shown that the nonlinearly mixed latent components are identifiable (up to often inconsequential ambiguities) under a generalized contrastive learning (GCL) formulation, given that the latent components are independent conditioned on a certain auxiliary variable. The GCL-based identifiability of nICA is elegant, and establishes interesting connections between nICA and popular unsupervised/self-supervised learning paradigms in representation learning, causal learning, and factor disentanglement. However, existing identifiability analyses of nICA all build upon an unlimited sample assumption and the use of ideal universal function learners, which creates a non-negligible gap between theory and practice. Closing the gap is a nontrivial challenge, as there is no established “textbook” routine for finite sample analysis of such unsupervised problems. This work puts forth a finite-sample identifiability analysis of GCL-based nICA. Our analytical framework judiciously combines the properties of the GCL loss function, statistical generalization analysis, and numerical differentiation. Our framework also takes the learning function’s approximation error into consideration, and reveals an intuitive trade-off between the complexity and expressiveness of the employed function learner. Numerical experiments are used to validate the theorems.
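
As an illustration of the confusion-layer idea described in record 3, the following is a minimal NumPy sketch of a coupled cross-entropy loss. It is not taken from the paper: the abstract does not give the exact formulation, so the function name `ccem_loss`, the argument names, the softmax parameterization of the confusion matrices, and the boolean mask for incomplete labeling are all assumptions made for illustration. The sketch assumes each annotator m has a K-by-K matrix whose column k is the distribution of that annotator's label given true class k, and that the predicted noisy-label distribution is this matrix applied to the classifier's output.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ccem_loss(class_probs, confusion_logits, labels, mask):
    """Coupled cross-entropy over all observed (sample, annotator) pairs.

    All names here are hypothetical, chosen for this sketch.
    class_probs:      (N, K) classifier output probabilities.
    confusion_logits: (M, K, K) unconstrained parameters; a softmax over
                      axis 1 makes every column of each annotator's
                      matrix a probability distribution.
    labels:           (N, M) integer annotator labels; entries where
                      mask is False are ignored.
    mask:             (N, M) boolean, True where annotator m labeled
                      sample n (models incomplete labeling).
    """
    # A[m, j, k] = P(annotator m reports j | true class is k)
    A = softmax(confusion_logits, axis=1)
    # noisy[n, m, j] = sum_k A[m, j, k] * class_probs[n, k]
    noisy = np.einsum("mjk,nk->nmj", A, class_probs)
    # Pick the probability assigned to each observed annotator label.
    n_idx, m_idx = np.nonzero(mask)
    picked = noisy[n_idx, m_idx, labels[n_idx, m_idx]]
    # Small constant guards against log(0).
    return -np.mean(np.log(picked + 1e-12))
```

In a trainable version, `class_probs` would come from a neural network and both parts would be optimized jointly with automatic differentiation, which is the parameter-coupled co-training the abstract refers to. The regularized variants mentioned in the abstract are not sketched here because the abstract does not specify them.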