Text analysis is an interesting research area in data science with applications in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to recent neural language models. In particular, we review Topic-SCORE, a statistical approach to topic modeling, and discuss how to use it to analyze the Multi-Attribute Data Set on Statisticians (MADStat), a data set on statistical publications that we collected and cleaned. Applying Topic-SCORE and other methods to MADStat leads to interesting findings. For example, we identify 11 representative topics in statistics. For each journal, the evolution of topic weights over time can be visualized, and these results are used to analyze trends in statistical research. We also propose a new statistical model for ranking the citation impacts of the 11 topics, and we build a cross-topic citation graph to illustrate how research results on different topics spread to one another. The results on MADStat provide a data-driven picture of statistical research from 1975 to 2015, from a text analysis perspective.
Natural and Artificial Dynamics in Graphs: Concept, Progress, and Future
Graph structures have attracted much research attention because they carry complex relational information. Many graph-based algorithms and tools have been proposed and developed for real-world tasks such as recommendation, fraud detection, and molecule design. In this paper, we first discuss three topics of graph research: graph mining, graph representations, and graph neural networks (GNNs). We then define natural dynamics and artificial dynamics in graphs and survey related work on how each boosts the aforementioned graph research topics. We also discuss current limitations and future opportunities.
- Award ID(s): 2117902
- PAR ID: 10441822
- Date Published:
- Journal Name: Frontiers in Big Data
- Volume: 5
- ISSN: 2624-909X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
It is known that there is a one-to-one mapping between oriented directed graphs and zero-sum replicator dynamics (Lotka–Volterra equations) and that furthermore these dynamics are Hamiltonian in an appropriately defined nonlinear Poisson bracket. In this paper, we investigate the problem of determining whether these dynamics are Liouville–Arnold integrable, building on prior work in graph decloning by Evripidou et al (2022 J. Phys. A: Math. Theor. 55 325201) and graph embedding by Paik and Griffin (2024 Phys. Rev. E 107 L052202). Using the embedding procedure from Paik and Griffin, we show (with certain caveats) that when a graph producing integrable dynamics is embedded in another graph producing integrable dynamics, the resulting graph structure also produces integrable dynamics. We also construct a new family of graph structures that produces integrable dynamics that does not arise either from embeddings or decloning. We use these results, along with numerical methods, to classify the dynamics generated by almost all oriented directed graphs on six vertices, with three hold-out graphs that generate integrable dynamics and are not part of a natural taxonomy arising from known families and graph operations. These hold-out graphs suggest more structure is available to be found. Moreover, the work suggests that oriented directed graphs leading to integrable dynamics may be classifiable in an analogous way to the classification of finite simple groups, creating the possibility that there is a deep connection between integrable dynamics and combinatorial structures in graphs.
-
Knowledge graphs are graph-based data models that can represent real-time data that grows constantly as new information is added. Question-answering systems over knowledge graphs (KGQA) retrieve answers to a natural language question from the knowledge graph. Most existing KGQA systems use static knowledge bases for offline training. After deployment, they fail to learn from unseen new entities added to the graph. There is a need for dynamic algorithms that can adapt to evolving graphs and give interpretable results. In this research work, we propose using new auction algorithms for question answering over knowledge graphs. These algorithms can adapt to changing environments in real time, making them suitable for offline and online training. An auction algorithm computes paths connecting an origin node to one or more destination nodes in a directed graph and uses node prices to guide the search for the path. The prices are initially assigned arbitrarily and updated dynamically based on defined rules. The algorithm navigates the graph from the high-price to the low-price nodes. When nodes and edges are dynamically added or removed in an evolving knowledge graph, the algorithm can adapt by reusing the prices of existing nodes and assigning arbitrary prices to the new nodes. For subsequent related searches, the “learned” prices provide the means to “transfer knowledge” and act as a “guide” that steers the search toward the lower-priced nodes. Our approach reduces the computational effort of the search by 60% in our experiments. The resulting path given by the algorithm can be mapped to the attributes of entities and relations in knowledge graphs to provide an explainable answer to the query. We discuss some applications for which our method can be used.
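To make the price-guided search concrete, here is a minimal Python sketch of the classical auction shortest-path iteration (extend the path when the terminal node's price matches its best outgoing option, otherwise raise the price and contract). It is not the new auction algorithms proposed in the abstract: this classical variant assumes positive arc lengths and starts from all-zero prices rather than arbitrary ones, and it does not reuse prices across searches. The function name, the graph encoding, and the example graph are illustrative assumptions.

```python
import math

def auction_shortest_path(graph, origin, destination, max_iters=100_000):
    """Classical auction path search (illustrative sketch).

    graph: dict mapping node -> dict of {successor: positive arc length}.
    Returns (path, prices); path is None if the destination is unreachable.
    """
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    nodes.add(origin)
    prices = {n: 0 for n in nodes}  # zero prices are valid for positive lengths
    path = [origin]
    if origin == destination:
        return path, prices
    for _ in range(max_iters):
        i = path[-1]
        # Best outgoing option: minimize arc length plus successor price.
        best_j, best = None, math.inf
        for j, length in graph.get(i, {}).items():
            candidate = length + prices[j]
            if candidate < best:
                best_j, best = j, candidate
        if prices[i] < best:
            # Price is too low: raise it, then back up one step.
            prices[i] = best
            if i == origin:
                if math.isinf(best):
                    return None, prices  # no route out of the origin
            else:
                path.pop()
        else:
            if best_j is None:
                return None, prices
            # Price matches the best option: move to the lower-priced node.
            path.append(best_j)
            if best_j == destination:
                return path, prices
    raise RuntimeError("iteration limit reached")

# Hypothetical example graph (not from the source).
example_graph = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 1, "t": 5},
    "b": {"t": 1},
}
path, prices = auction_shortest_path(example_graph, "s", "t")
print(path, prices)
```

Along the returned path each node's price equals the arc length plus the price of its successor, so prices strictly decrease from the origin to the destination, which matches the high-price to low-price navigation described above.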
-
On June 4-6, 2019, the NSTC NITRD Program, in collaboration with NSTC’s MLAI Subcommittee, held a workshop to assess the research challenges and opportunities at the intersection of cybersecurity and artificial intelligence. The workshop brought together senior members of the government, academic, and industrial communities to discuss the current state of the art and future research needs, and to identify key research gaps. This report is a summary of those discussions, framed around research questions and possible topics for future research directions.
-
Population structure affects the outcome of natural selection. These effects can be modeled using evolutionary games on graphs. Recently, conditions were derived for a trait to be favored under weak selection, on any weighted graph, in terms of coalescence times of random walks. Here we consider isothermal graphs, which have the same total edge weight at each node. The conditions for success on isothermal graphs take a simple form, in which the effects of graph structure are captured in the ‘effective degree’, a measure of the effective number of neighbors per individual. For two update rules (death-Birth and birth-Death), cooperative behavior is favored on a large isothermal graph if the benefit-to-cost ratio exceeds the effective degree. For two other update rules (Birth-death and Death-birth), cooperation is never favored. We relate the effective degree of a graph to its spectral gap, thereby linking evolutionary dynamics to the theory of expander graphs. Surprisingly, we find graphs of infinite average degree that nonetheless provide strong support for cooperation.
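The isothermal condition stated above (the same total edge weight at each node) can be checked directly. The following Python sketch assumes the graph is given as a symmetric weight matrix in list-of-lists form; the function name and the two example graphs are hypothetical, and the sketch does not compute the effective degree, whose formula is not given here.

```python
import math

def is_isothermal(weights, rel_tol=1e-9):
    """Return True if every node has the same total incident edge weight.

    weights: square list of lists, weights[i][j] = weight of edge {i, j}
    (assumed symmetric; 0 means no edge).
    """
    strengths = [sum(row) for row in weights]
    if not strengths:
        return True  # an empty graph satisfies the condition vacuously
    return all(math.isclose(s, strengths[0], rel_tol=rel_tol) for s in strengths)

# Unweighted 4-cycle: every node has total weight 2.
cycle = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
# Star with center 0: the center has total weight 3, each leaf has 1.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(is_isothermal(cycle), is_isothermal(star))
```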

