Title: Graph Learning Indexer: A Contributor-Friendly and Metadata-Rich Platform for Graph Learning Benchmarks
Establishing open and general benchmarks has been a critical driving force behind the success of modern machine learning techniques. As machine learning is applied to broader domains and tasks, richer and more diverse benchmarks are needed to better reflect the reality of application scenarios. Graph learning is an emerging field of machine learning that urgently needs more and better benchmarks. To accommodate this need, we introduce Graph Learning Indexer (GLI), a benchmark curation platform for graph learning. In comparison to existing graph learning benchmark libraries, GLI highlights two novel design objectives. First, GLI is designed to incentivize dataset contributors. In particular, we incorporate various measures to minimize the effort of contributing and maintaining a dataset, increase the usability of the contributed dataset, and encourage attribution to the dataset's different contributors. Second, GLI is designed to curate a knowledge base, instead of a plain collection, of benchmark datasets. We use multiple sources of meta information to augment the benchmark datasets with rich characteristics, so that they can be easily selected and used in downstream research or development. The source code of GLI is available at https://github.com/Graph-Learning-Benchmarks/gli.
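To make the intended workflow concrete, the sketch below shows how a curated dataset paired with a declared task might be loaded from such a platform. The module path and the get_gli_dataset call are assumptions based on the GLI repository's documented examples and should be checked against the current release.

    # Hypothetical usage sketch; call names are assumptions, not a confirmed API.
    import gli

    # In GLI, a benchmark couples raw graph data with a task configuration,
    # so requesting both together yields a ready-to-use dataset object.
    dataset = gli.dataloading.get_gli_dataset("cora", "NodeClassification")
    print(dataset)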
Award ID(s):
1633370
PAR ID:
10388232
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the First Learning on Graphs Conference (LoG 2022)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Label Distribution Learning (LDL), as a more general learning setting than generic single-label and multi-label learning, has been commonly used in computer vision and many other applications. To date, existing LDL approaches have been designed for and applied to data without considering the interdependence between instances. In this paper, we propose a Graph Label Distribution Learning (GLDL) framework, which explicitly models three types of relationships: instance-instance, label-label, and instance-label, to learn the label distribution for networked data. A label-label network is learned to capture label-to-label correlation, through which GLDL can accurately learn label distributions for nodes. Dual graph convolution network (GCN) co-training with heterogeneous message passing ensures that the two GCNs, one focusing on the instance-instance relationship and the other targeting label-label correlation, are jointly trained so that the instance-instance relationship helps induce label-label correlation and vice versa. Our theoretical study derives the error bound of GLDL. For verification, four benchmark datasets with label distributions for nodes are created from common graph benchmarks. The experiments show that considering dependency helps learn better label distributions for networked data compared to state-of-the-art LDL baselines. In addition, GLDL not only outperforms simple GCN and graph attention networks (GAT) trained with a distribution loss but is also superior to its variant that treats the label-label relationship as a static network. GLDL and its benchmarks are the first research endeavor to address LDL for graphs. Code and benchmark data are released for public access.
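As a minimal illustration of what a distribution loss over nodes can look like, the sketch below penalizes the KL divergence between predicted and ground-truth label distributions. It is a generic formulation for intuition, not the paper's exact objective.

    import torch
    import torch.nn.functional as F

    def label_distribution_loss(logits, target_dist):
        # logits: (num_nodes, num_labels) raw scores from a GCN or GAT.
        # target_dist: (num_nodes, num_labels), each row summing to 1.
        log_pred = F.log_softmax(logits, dim=-1)
        # KL(target || prediction), averaged over nodes.
        return F.kl_div(log_pred, target_dist, reduction="batchmean")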
  2. We present the OPEN GRAPH BENCHMARK (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we also perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. Finally, OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes the process of graph data loading, experimental setup, and model evaluation. OGB will be regularly updated and welcomes inputs from the community. OGB datasets as well as data loaders, evaluation scripts, baseline code, and leaderboards are publicly available at https://ogb.stanford.edu. 
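The standardized loading and evaluation pipeline can be sketched as below, following OGB's published examples for a node property prediction dataset; exact signatures should be verified against the current ogb release.

    from ogb.nodeproppred import NodePropPredDataset, Evaluator

    # Download (on first use) and load the dataset with its official split.
    dataset = NodePropPredDataset(name="ogbn-arxiv")
    split_idx = dataset.get_idx_split()   # dict with "train" / "valid" / "test"
    graph, labels = dataset[0]            # graph dict and node labels

    # The matching Evaluator enforces the dataset's official metric.
    evaluator = Evaluator(name="ogbn-arxiv")
    # evaluator.eval({"y_true": ..., "y_pred": ...}) returns the metric dict.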
  3. We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are large-scale, span years in duration, incorporate both node- and edge-level prediction tasks, and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use cases. We extensively benchmark each dataset and find that the performance of common models can vary drastically across datasets. In addition, on dynamic node property prediction tasks, we show that simple methods often achieve superior performance compared to existing temporal graph models. We believe these findings open up opportunities for future research on temporal graphs. Finally, TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup, and performance evaluation. TGB will be maintained and updated on a regular basis and welcomes community feedback. TGB datasets, data loaders, example code, evaluation setups, and leaderboards are publicly available at https://tgb.complexdatalab.com/.
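One core ingredient of realistic temporal-graph evaluation is splitting edges chronologically rather than at random, so that models train on the past and are tested on the future. The sketch below is a generic illustration of that protocol, not TGB's own loader code.

    import numpy as np

    def chronological_split(timestamps, val_frac=0.15, test_frac=0.15):
        # Sort edge indices by time: earliest edges train, latest edges test.
        order = np.argsort(timestamps)
        n = len(order)
        n_test, n_val = int(n * test_frac), int(n * val_frac)
        train = order[: n - n_val - n_test]
        val = order[n - n_val - n_test : n - n_test]
        test = order[n - n_test :]
        return train, val, test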
  4. Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for their intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a sensitivity profile that is based on how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in the selection and development of adequate graph benchmarks and in better-informed evaluation of future GNN methods. Finally, our approach is designed to be extendable to multiple graph prediction task types and future datasets.
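The sensitivity profile idea can be outlined generically: re-evaluate a GNN under a set of controlled graph perturbations and record how performance shifts relative to the unperturbed graph. In the sketch below, the evaluation routine and the specific perturbations are user-supplied assumptions, not the paper's implementation.

    import numpy as np

    def sensitivity_profile(evaluate, graph, perturbations):
        # evaluate: callable mapping a graph to a scalar test metric.
        # perturbations: dict of name -> callable returning a perturbed copy.
        baseline = evaluate(graph)
        return {name: evaluate(perturb(graph)) - baseline
                for name, perturb in perturbations.items()}

    # Example perturbation: destroy feature information by shuffling rows
    # of the node feature matrix (stored under the hypothetical key "x").
    def shuffle_features(graph):
        g = dict(graph)
        g["x"] = np.random.permutation(graph["x"])
        return g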
  5. Abstract: Benchmark datasets and benchmark problems have been a key aspect of the success of modern machine learning applications in many scientific domains. Consequently, an active discussion about benchmarks for applications of machine learning has also started in the atmospheric sciences. Such benchmarks allow for the comparison of machine learning tools and approaches in a quantitative way and enable a separation of concerns for domain and machine learning scientists. However, a clear definition of benchmark datasets for weather and climate applications is missing, with the result that many domain scientists are confused. In this paper, we equip the domain of atmospheric sciences with a recipe for building proper benchmark datasets, present a (nonexclusive) list of domain-specific challenges for machine learning, and elaborate on where and what benchmark datasets will be needed to tackle these challenges. We hope that the creation of benchmark datasets will help the machine learning efforts in atmospheric sciences become more coherent and, at the same time, target the efforts of machine learning scientists and high-performance computing experts toward the most imminent challenges in atmospheric sciences. We focus on benchmarks for atmospheric sciences (weather, climate, and air-quality applications). However, much of this paper also holds for other areas of the Earth system sciences or is at least transferable.
Significance Statement: Machine learning is the study of computer algorithms that learn automatically from data. The atmospheric sciences have started to explore sophisticated machine learning techniques, and the community is making rapid progress on the uptake of new methods for a large number of application areas. This paper provides a clear definition of so-called benchmark datasets for weather and climate applications that help to share data and machine learning solutions between research groups, to reduce time spent in data processing, to generate synergies between groups, and to make tool developments more targeted and comparable. Furthermore, a list of benchmark datasets that will be needed to tackle important challenges for the use of machine learning in atmospheric sciences is provided.