

Title: Forecasting the future of artificial intelligence with machine learning-based link prediction in an exponentially growing knowledge network
Abstract

A tool that could suggest new personalized research directions and ideas by taking insights from the scientific literature could profoundly accelerate the progress of science. A field that might benefit from such an approach is artificial intelligence (AI) research, where the number of scientific publications has been growing exponentially over recent years, making it challenging for human researchers to keep track of the progress. Here we use AI techniques to predict the future research directions of AI itself. We introduce a graph-based benchmark based on real-world data—the Science4Cast benchmark, which aims to predict the future state of an evolving semantic network of AI. For that, we use more than 143,000 research papers and build up a knowledge network with more than 64,000 concept nodes. We then present ten diverse methods to tackle this task, ranging from pure statistical to pure learning methods. Surprisingly, the most powerful methods use a carefully curated set of network features, rather than an end-to-end AI approach. These results indicate a great potential that can be unleashed for purely ML approaches without human knowledge. Ultimately, better predictions of new future research directions will be a crucial component of more advanced research suggestion tools.
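The link-prediction task described in the abstract can be illustrated with a minimal sketch. It is not one of the ten methods from the paper. It assumes that concepts co-occurring in a paper are linked, and that two hand-picked features (common neighbors and a preferential-attachment product of degrees) stand in for the "carefully curated set of network features". The function names and the toy paper list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(papers):
    """Link every pair of concepts that co-occur in at least one paper."""
    neighbors = defaultdict(set)
    for concepts in papers:
        for a, b in combinations(sorted(set(concepts)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def pair_features(neighbors, a, b):
    """Two illustrative hand-crafted features for a concept pair (assumed, not from the paper)."""
    common = len(neighbors[a] & neighbors[b])                # shared neighbors
    pref_attachment = len(neighbors[a]) * len(neighbors[b])  # product of degrees
    return common, pref_attachment


def rank_unconnected_pairs(neighbors):
    """Rank currently unconnected pairs, most likely future link first."""
    nodes = sorted(neighbors)
    candidates = [(a, b) for a, b in combinations(nodes, 2) if b not in neighbors[a]]
    return sorted(candidates, key=lambda p: pair_features(neighbors, *p), reverse=True)


# Toy data: each paper is reduced to the list of concepts it mentions.
papers = [
    ["neural network", "graph", "link prediction"],
    ["neural network", "reinforcement learning"],
    ["graph", "reinforcement learning"],
    ["link prediction", "benchmark"],
]
graph = build_graph(papers)
ranking = rank_unconnected_pairs(graph)
print(ranking[0])  # ('link prediction', 'reinforcement learning')
```

In this toy network the top-ranked pair shares two neighbors, so it is the candidate judged most likely to be connected by a future paper. A learned model would replace the fixed sort key with a classifier trained on such features.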

 
NSF-PAR ID:
10469383
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ; ; ; ; ; ;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Nature Machine Intelligence
Volume:
5
Issue:
11
ISSN:
2522-5839
Format(s):
Medium: X Size: p. 1326-1335
Sponsoring Org:
National Science Foundation
More Like this
  1. A tool that could suggest new personalized research directions and ideas by taking insights from the scientific literature could significantly accelerate the progress of science. A field that might benefit from such an approach is artificial intelligence (AI) research, where the number of scientific publications has been growing exponentially over recent years, making it challenging for human researchers to keep track of the progress. Here, we use AI techniques to predict the future research directions of AI itself. We develop a new graph-based benchmark based on real-world data, the Science4Cast benchmark, which aims to predict the future state of an evolving semantic network of AI. For that, we use more than 100,000 research papers and build up a knowledge network with more than 64,000 concept nodes. We then present ten diverse methods to tackle this task, ranging from pure statistical to pure learning methods. Surprisingly, the most powerful methods use a carefully curated set of network features, rather than an end-to-end AI approach. This indicates a great potential that can be unleashed for purely ML approaches without human knowledge. Ultimately, better predictions of new future research directions will be a crucial component of more advanced research suggestion tools.
  2. Abstract

    Near‐term freshwater forecasts, defined as sub‐daily to decadal future predictions of a freshwater variable with quantified uncertainty, are urgently needed to improve water quality management as freshwater ecosystems exhibit greater variability due to global change. Shifting baselines in freshwater ecosystems due to land use and climate change prevent managers from relying on historical averages for predicting future conditions, necessitating near‐term forecasts to mitigate freshwater risks to human health and safety (e.g., flash floods, harmful algal blooms) and ecosystem services (e.g., water‐related recreation and tourism). To assess the current state of freshwater forecasting and identify opportunities for future progress, we synthesized freshwater forecasting papers published in the past 5 years. We found that freshwater forecasting is currently dominated by near‐term forecasts of water quantity and that near‐term water quality forecasts are fewer in number and in the early stages of development (i.e., non‐operational) despite their potential as important preemptive decision support tools. We contend that more freshwater quality forecasts are critically needed and that near‐term water quality forecasting is poised to make substantial advances based on examples of recent progress in forecasting methodology, workflows, and end‐user engagement. For example, current water quality forecasting systems can predict water temperature, dissolved oxygen, and algal bloom/toxin events 5 days ahead with reasonable accuracy. Continued progress in freshwater quality forecasting will be greatly accelerated by adapting tools and approaches from freshwater quantity forecasting (e.g., machine learning modeling methods). In addition, future development of effective operational freshwater quality forecasts will require substantive engagement of end users throughout the forecast process, funding, and training opportunities. Looking ahead, near‐term forecasting provides a hopeful future for freshwater management in the face of increased variability and risk due to global change, and we encourage the freshwater scientific community to incorporate forecasting approaches in water quality research and management.

     
  3. Abstract

    Although systematic reviews are intended to provide trusted scientific knowledge to meet the needs of decision-makers, their reliability can be threatened by bias and irreproducibility. To help decision-makers assess the risks in systematic reviews that they intend to use as the foundation of their action, we designed and tested a new approach to analyzing the evidence selection of a review: its coverage of the primary literature and its comparison to other reviews. Our approach could also help anyone using or producing reviews understand diversity or convergence in evidence selection. The basis of our approach is a new network construct called the inclusion network, which has two types of nodes: primary study reports (PSRs, the evidence) and systematic review reports (SRRs). The approach assesses risks in a given systematic review (the target SRR) by first constructing an inclusion network of the target SRR and other systematic reviews studying similar research questions (the companion SRRs) and then applying a three-step assessment process that utilizes visualizations, quantitative network metrics, and time series analysis. This paper introduces our approach and demonstrates it in two case studies. We identified the following risks: missing potentially relevant evidence, epistemic division in the scientific community, and recent instability in evidence selection standards. We also compare our inclusion network approach to knowledge assessment approaches based on another influential network construct, the claim-specific citation network, discuss current limitations of the inclusion network approach, and present directions for future work.

     
  4. While scientific collaboration is critical for a scholar, some collaborators can be more significant than others, e.g., lifetime collaborators. It has been shown that lifetime collaborators are more influential on a scholar’s academic performance. However, little research has been done on predicting such special relationships in academic networks. To this end, we propose Scholar2vec, a novel neural network embedding for representing scholar profiles. First, our approach creates a research interest vector for each scholar from textual information, such as demographics, research, and influence. After bridging research interests with a collaboration network, vector representations of scholars are obtained through graph learning. Meanwhile, since scholars have various attributes, we propose to incorporate four types of scholar attributes for learning scholar vectors. Finally, the early-stage similarity sequence based on Scholar2vec is used to predict lifetime collaborators with machine learning methods. Extensive experiments on two real-world datasets show that Scholar2vec outperforms state-of-the-art methods in lifetime collaborator prediction. Our work presents a new way to measure the similarity between two scholars by vector representation, which bridges the gap between network embedding and academic relationship mining.
  5. GeoAI, or geospatial artificial intelligence, has become a trending topic and the frontier for spatial analytics in Geography. Although much progress has been made in exploring the integration of AI and Geography, there is yet no clear definition of GeoAI, its scope of research, or a broad discussion of how it enables new ways of problem solving across social and environmental sciences. This paper provides a comprehensive overview of GeoAI research used in large-scale image analysis, and its methodological foundation, most recent progress in geospatial applications, and comparative advantages over traditional methods. We organize this review of GeoAI research according to different kinds of image or structured data, including satellite and drone images, street views, and geo-scientific data, as well as their applications in a variety of image analysis and machine vision tasks. While different applications tend to use diverse types of data and models, we summarized six major strengths of GeoAI research, including (1) enablement of large-scale analytics; (2) automation; (3) high accuracy; (4) sensitivity in detecting subtle changes; (5) tolerance of noise in data; and (6) rapid technological advancement. As GeoAI remains a rapidly evolving field, we also describe current knowledge gaps and discuss future research directions. 