Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
                                            Some full text articles may not yet be available without a charge during the embargo (administrative interval).
                                        
                                        
                                        
                                            
                                                
                                             What is a DOI Number?
                                        
                                    
                                
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
            The Dragonfly networks have been adopted in the current supercomputers, and will be deployed in future generation supercomputers and data centers. Effective routing on Dragonfly is challenging. Universal Globally Adaptive Load-balanced routing (UGAL) is the state-of-the-art routing algorithm for Dragonfly. For each packet, UGAL selects either a minimal path or a non-minimal path based on their estimated latencies. Practical UGAL makes routing decisions with local information, deriving the estimated latency for each path from the local queue occupancy and path hop count information. In this work, we develop techniques to improve the accuracy of the latency estimation for UGAL with local information, which results in more effective routing decisions. In particular, our schemes are able to proactively mitigate the potential network congestion with imbalanced network traffic. Extensive simulation experiments using synthetic traffic patterns and application workloads demonstrate that our enhanced UGAL schemes significantly improve the routing performance for many common traffic conditions.more » « less
- 
            null (Ed.)The Jellyfish network has recently been proposed as an alternative to the fat-tree network for data centers and high-performance computing clusters. Jellyfish uses a random regular graph as its switch-level topology and has shown to be more cost-effective than fat-trees. Effective routing on Jellyfish is challenging. It is known that shortest path routing and equal cost multi-path routing (ECMP) do not work well on Jellyfish. Existing schemes use variations of k-shortest path routing (KSP). In this work, we study two routing components for Jellyfish: path selection that decides the paths to route traffic, and routing mechanisms that decide which path to be used for each packet. We show that the performance of the existing KSP can be significantly improved by incorporating two heuristics, randomization and edge-disjointness. We evaluate a range of routing mechanisms, including traffic oblivious and traffic adaptive schemes, and identify an adaptive routing scheme with noticeably higher performance than others.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                     Full Text Available
                                                Full Text Available