Creators/Authors contains: "Zhang, Xinyang"

  1. A real-world text corpus sometimes comprises not only text documents, but also semantic links between them (e.g., academic papers in a bibliographic network are linked by citations and co-authorships). Text documents and semantic connections form a text-rich network, which empowers a wide range of downstream tasks such as classification and retrieval. However, pretraining methods for such structures are still lacking, making it difficult to build one generic model that can be adapted to various tasks on text-rich networks. Current pretraining objectives, such as masked language modeling, model text alone and do not take inter-document structure into account. To this end, we propose PATTON, our PretrAining on TexT-Rich NetwOrk framework. PATTON includes two pretraining strategies, network-contextualized masked language modeling and masked node prediction, to capture the inherent dependency between textual attributes and network structure. We conduct experiments on four downstream tasks across five datasets from both academic and e-commerce domains, where PATTON outperforms baselines significantly and consistently.
    Free, publicly-accessible full text available July 10, 2024
  2. Resistivity saturation is found on both superconducting and insulating sides of an “avoided” magnetic-field-tuned superconductor-to-insulator transition (H-SIT) in a two-dimensional In/InOx composite, where the anomalous metallic behavior cuts off conductivity or resistivity divergence in the zero-temperature limit. The granular morphology of the material implies a system of Josephson junctions (JJs) with a broad distribution of Josephson coupling E_J and charging energy E_C, with an H-SIT determined by the competition between E_J and E_C. By virtue of self-duality across the true H-SIT, we invoke macroscopic quantum tunneling effects to explain the temperature-independent resistance where the “failed superconductor” side is a consequence of phase fluctuations and the “failed insulator” side results from charge fluctuations. While true self-duality is lost in the avoided transition, its vestiges are argued to persist, owing to the incipient duality of the percolative nature of the dissipative path in the underlying random JJ system.
  3. Safety and security play critical roles in the success of Autonomous Driving (AD) systems. Since AD systems rely heavily on AI components, the safety and security of such components has also received great research attention in recent years. While it is widely recognized that AI component-level (mis)behavior does not necessarily lead to AD system-level impacts, most existing work still adopts only component-level evaluation. To fill this critical methodological gap between component-level and real system-level impact, a system-driven evaluation platform jointly constructed by the community could be the solution. In this paper, we present PASS (Platform for Auto-driving Safety and Security), a system-driven evaluation prototype based on simulation. By sharing our platform-building concept and preliminary efforts, we hope to call on the community to build a uniform and extensible platform that makes AI safety and security work sufficiently meaningful at the system level.
  4. Text categorization is an essential task in Web content analysis. Considering the ever-evolving Web data and newly emerging categories, in this paper we focus not on the laborious supervised setting but on the minimally supervised setting, which aims to categorize documents effectively with only a few seed documents annotated per category. We recognize that texts collected from the Web are often structure-rich, i.e., accompanied by various metadata. One can easily organize the corpus into a text-rich network, joining raw text documents with document attributes, high-quality phrases, and label surface names as nodes, and their associations as edges. Such a network provides a holistic view of the corpus’ heterogeneous data sources and enables a joint optimization for network-based analysis and deep textual model training. We therefore propose a novel framework for minimally supervised categorization by learning from the text-rich network. Specifically, we jointly train two modules with different inductive biases: a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning. Each module generates pseudo training labels from the unlabeled document set, and the two modules enhance each other by co-training on the pooled pseudo labels. We test our model on two real-world datasets. On the challenging e-commerce product categorization dataset with 683 categories, our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%, significantly outperforming all compared methods; our accuracy is less than 2% below that of the supervised BERT model trained on about 50K labeled documents.
  6. Many experiments investigating the magnetic-field-tuned superconductor-insulator transition (H-SIT) exhibit low-temperature resistance saturation, which is interpreted as an anomalous metallic phase emerging from a ‘failed superconductor’, thus challenging conventional theory. Here we study a random granular array of indium islands grown on a gateable layer of indium-oxide. By tuning the intergrain couplings, we reveal a wide range of magnetic fields where resistance saturation is observed, under conditions of careful electromagnetic filtering and within a wide range of linear response. Exposure to external broadband noise or microwave radiation is shown to strengthen the tendency toward superconductivity, where at low field a global superconducting phase is restored. Increasing the magnetic field unveils an ‘avoided H-SIT’ that exhibits granularity-induced logarithmic divergence of the resistance/conductance above/below that transition, pointing to possible vestiges of the original emergent duality observed in a true H-SIT. We conclude that the anomalous metallic phase is intimately associated with inherent inhomogeneities, exhibiting robust behavior at attainable temperatures for strongly granular two-dimensional systems.
  7. The magnetic-field–tuned superconductor-to-insulator transition was studied in a hybrid system of superconducting indium islands, deposited on an indium oxide (InOx) thin film, which exhibits global superconductivity at low magnetic fields. Vacuum annealing was used to tune the conductivity of the InOx film, thereby tuning the intergrain coupling and the nature of the transition. The hybrid system exhibits a “giant” magnetoresistance above the magnetic-field–tuned superconductor-to-insulator transition (H-SIT), with critical behavior similar to that of uniform InOx films but at much lower magnetic fields, that manifests the duality between Cooper pairs and vortices. A key feature of this hybrid system is the separation between the quantum criticality and the onset of nonequilibrium behavior.
  8. The automated construction of topic taxonomies can benefit numerous applications, including web search, recommendation, and knowledge discovery. One of the major advantages of automatic taxonomy construction is the ability to capture corpus-specific information and adapt to different scenarios. To better reflect the characteristics of a corpus, we take the metadata of documents into consideration and view the corpus as a text-rich network. In this paper, we propose NetTaxo, a novel automatic topic taxonomy construction framework, which goes beyond the existing paradigm and allows text data to collaborate with network structure. Specifically, we learn term embeddings using both text and network as contexts. Network motifs are adopted to capture appropriate network contexts. We conduct an instance-level selection of motifs, which further refines term embeddings according to the granularity and semantics of each taxonomy node. Clustering is then applied to obtain sub-topics under a taxonomy node. Extensive experiments on two real-world datasets demonstrate the superiority of our method over the state of the art, and further verify the effectiveness and importance of instance-level motif selection.
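
The co-training loop described in the minimally supervised text categorization abstract above (two modules that each propose pseudo labels, which are pooled and used to retrain both) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that each module is a nearest-centroid classifier over its own feature view, that confidence is the distance gap between the two nearest centroids, and that a fixed number of proposals is accepted per round. The names `co_train`, `text_view`, `net_view`, `rounds`, and `per_round` are hypothetical.

```python
from math import dist


def centroids(points, labels):
    """Compute the mean vector of each class from labeled points."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(cents, p):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    ranked = sorted((dist(p, c), y) for y, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin


def co_train(text_view, net_view, seeds, rounds=3, per_round=2):
    """Co-train two stand-in modules; seeds maps document index -> label."""
    pooled = dict(seeds)  # shared pool of seed labels plus pseudo labels
    for _ in range(rounds):
        proposals = {}
        for view in (text_view, net_view):  # one stand-in module per view
            idx = list(pooled)
            cents = centroids([view[i] for i in idx], [pooled[i] for i in idx])
            scored = [(predict(cents, view[i]), i)
                      for i in range(len(view)) if i not in pooled]
            scored.sort(key=lambda s: -s[0][1])  # most confident first
            for (label, margin), i in scored[:per_round]:
                # If both modules propose document i, keep the more confident label.
                if i not in proposals or margin > proposals[i][1]:
                    proposals[i] = (label, margin)
        if not proposals:  # nothing left to label
            break
        pooled.update({i: lab for i, (lab, _) in proposals.items()})
    return pooled


# Toy data (made up): documents 0-2 belong to class "a", documents 3-5 to class "b".
text_view = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
net_view = [(1.0, 0.0), (0.9, 0.2), (1.1, 0.1), (-1.0, 0.0), (-0.9, 0.1), (-1.2, 0.0)]
result = co_train(text_view, net_view, seeds={0: "a", 3: "b"})
```

Pooling the proposals before retraining is what lets each module benefit from documents the other module was confident about; a real system would replace the centroid classifiers with the text and network models and would likely use a calibrated confidence threshold instead of a fixed count.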