Achieving the Performance of Global Adaptive Routing using Local Information on Dragonfly through Deep Learning

Chaulagain, Ram Sharan; Liza, Fatema Tabassum; Chunduri, Sudheer; Yuan, Xin; Lang, Michael

Citation Details

The Universal Globally Adaptive Load-balance Routing (UGAL) with global information, referred to as UGAL-G, represents an ideal form of adaptive routing on Dragonfly. UGAL-G is impractical to implement, however, since the global information cannot be maintained accurately. Practical adaptive routing schemes, such as UGAL with local information (UGAL-L), perform noticeably worse than UGAL-G. In this work, we investigate a machine learning approach for routing on Dragonfly. Specifically, we develop a machine learning-based routing scheme, called UGAL-ML, that is capable of making routing decisions like UGAL-G based only on the information local to each router. Our preliminary evaluation indicates that UGAL-ML can achieve comparable performance to UGAL-G for some traffic patterns.
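
As a rough illustration of the kind of decision the abstract refers to, UGAL is commonly described as comparing a minimal path against a non-minimal (Valiant) path by weighting queue occupancy with hop count. The sketch below assumes that form of the rule; it is not the authors' implementation, and the function name `ugal_choose`, its parameters, and the `bias` term are hypothetical. With UGAL-L the queue values would be those visible at the source router, with UGAL-G they would reflect the true global state, and UGAL-ML, per the abstract, aims to make UGAL-G-like decisions from local inputs only.

```python
def ugal_choose(q_min, hops_min, q_vlb, hops_vlb, bias=0):
    """Pick a path class using a UGAL-style comparison (illustrative only).

    q_min, q_vlb: queue occupancy estimates for the minimal and the
                  non-minimal (Valiant) candidate paths (assumed inputs).
    hops_min, hops_vlb: hop counts of the two candidate paths.
    bias: hypothetical threshold favoring the minimal path.
    """
    # Estimated delay is approximated as queue occupancy times hop count.
    if q_min * hops_min <= q_vlb * hops_vlb + bias:
        return "MIN"
    return "VLB"


# Lightly loaded minimal path: 2 * 3 = 6 <= 4 * 5 = 20, so stay minimal.
print(ugal_choose(q_min=2, hops_min=3, q_vlb=4, hops_vlb=5))
# Congested minimal path: 10 * 3 = 30 > 2 * 5 = 10, so route non-minimally.
print(ugal_choose(q_min=10, hops_min=3, q_vlb=2, hops_vlb=5))
```

The quality of the decision depends entirely on how accurate the queue estimates are, which is why a scheme limited to local information can diverge from one with global information.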

Award ID(s): 1822737

PAR ID: 10231745

Author(s) / Creator(s): Chaulagain, Ram Sharan; Liza, Fatema Tabassum; Chunduri, Sudheer; Yuan, Xin; Lang, Michael

Date Published: 2020-11-23

Journal Name: ACM/IEEE SC tech poster

Format(s): Medium: X

Sponsoring Org: National Science Foundation

The DOI is not currently available.
