Search for: All records

Creators/Authors contains: "Chen, Dong"

  1. Free, publicly-accessible full text available October 3, 2023
  2. Free, publicly-accessible full text available August 1, 2023
  3. Free, publicly-accessible full text available January 1, 2024
  4. Data movement is a common performance bottleneck, and its chief remedy is caching. Traditional cache management is transparent to the workload: which data are kept in cache is determined by recency information alone, while program information, i.e., future data reuses, is not communicated to the cache. This has changed with a new cache design named Lease Cache. Program control is passed to the lease cache by a compiler technique called Compiler Assigned Reference Lease (CARL). This technique collects the reuse interval distribution for each reference and uses it to compute and assign a lease value to each reference. In this article, we prove that CARL is optimal under certain statistical assumptions. Based on this optimality, we prove miss curve convexity, which is useful for optimizing shared cache, and sub-partitioning monotonicity, which simplifies lease compilation. We evaluate the potential using scientific kernels from PolyBench and show that compiler insertions of up to 34 leases in program code achieve similar or better cache utilization (in a variable-size cache) than the optimal fixed-size caching policy, which has been unattainable with automatic caching but is now within the potential of cache programming for all tested programs and most cache sizes.
  5. Abstract Background

    More details about human movement patterns are needed to evaluate relationships between daily travel and malaria risk at finer scales. A multiagent mobility simulation model was built to simulate the movements of villagers between home and their workplaces in 2 townships in Myanmar.


    An agent-based model (ABM) was built to simulate daily travel to and from work based on responses to a travel survey. Key elements for the ABM were land cover, travel time, travel mode, occupation, malaria prevalence, and a detailed road network. Most visited network segments for different occupations and for malaria-positive cases were extracted and compared. Data from a separate survey were used to validate the simulation.


    Mobility characteristics for different occupation groups showed that while certain patterns were shared among some groups, there were also patterns that were unique to an occupation group. Forest workers were estimated to be the most mobile occupation group, and also had the highest potential malaria exposure associated with their daily travel in Ann Township. In Singu Township, forest workers were not the most mobile group; however, they were estimated to visit regions with a higher prevalence of malaria infection than those visited by other occupation groups.


    Using an ABM to simulate daily travel generated mobility patterns for different occupation groups. These spatial patterns varied by occupation. Our simulation identified occupations at a higher risk of being exposed to malaria and where these exposures were more likely to occur.
  6. Abstract Accurate theoretical predictions of desired properties of materials play an important role in materials research and development. Machine learning (ML) can accelerate materials design by building a model from input data. For complex datasets, such as those of crystalline compounds, a vital issue is how to construct low-dimensional representations of input crystal structures that carry chemical insight. In this work, we introduce an algebraic topology-based method, called atom-specific persistent homology (ASPH), as a unique representation of crystal structures. ASPH can capture both pairwise and many-body interactions and reveal the topology-property relationship of a group of atoms at various scales. Combined with composition-based attributes, the ASPH-based ML model provides a highly accurate prediction of the formation energy calculated by density functional theory (DFT). After training with more than 30,000 different structure types and compositions, our model achieves a mean absolute error of 61 meV/atom in cross-validation, which outperforms previous approaches such as Voronoi tessellations and the Coulomb matrix method using the same ML algorithm and datasets. Our results indicate that the proposed topology-based method provides a powerful computational tool for predicting materials properties.
  7. Abstract The ability to predict molecular properties is of great significance to drug discovery, human health, and environmental protection. Despite considerable efforts, quantitative prediction of various molecular properties remains a challenge. Although some machine learning models, such as bidirectional encoder from transformer, can incorporate massive unlabeled molecular data into molecular representations via a self-supervised learning strategy, they neglect three-dimensional (3D) stereochemical information. Algebraic graphs, specifically element-specific multiscale weighted colored algebraic graphs, embed complementary 3D molecular information into graph invariants. We propose an algebraic graph-assisted bidirectional transformer (AGBT) framework that fuses representations generated by algebraic graphs and bidirectional transformers with a variety of machine learning algorithms, including decision trees, multitask learning, and deep neural networks. We validate the proposed AGBT framework on eight molecular datasets, involving quantitative toxicity, physical chemistry, and physiology datasets. Extensive numerical experiments show that AGBT is a state-of-the-art framework for molecular property prediction.