Characterizing Node Orderings for Improved Performance

Carl Albing

High Performance Computing (HPC) job performance can vary with a job's location in the HPC system interconnect. Contiguous, compact allocations of compute nodes for parallel jobs are ideal but become infeasible once other jobs have been placed. Non-contiguous allocation of individual nodes is effective, but increased dispersal can decrease performance. Reasonable performance of parallel applications has been achieved through application placement in 3D-torus HPC systems using allocation strategies based on an ordered, one-dimensional sequence of nodes. The ordering of this list, called the node ordering, is an inexpensive way to incorporate topological information into the placement decision. With several orderings from which to choose (and others that could be created), what is the basis for choosing one ordering over others? Can the choice be made without expensive, time-consuming benchmarks?

This paper describes a method for evaluating node orderings for grid and torus interconnects, based on a measure of application compactness at locations throughout the ordering. Several node orderings are evaluated for actual HPC systems across three sizes. The results provide visually compelling guidance on the choice of node ordering. The choice depends not only on system size but may also depend on typical job sizes. Some orderings are a consistently poor choice regardless of job size; most orderings are better only for a range of job sizes. Rarely is an ordering consistently better over the full range of job sizes, and then only for a certain size and shape of system interconnect. This evaluation method provides a cost-effective way to evaluate orderings and thereby provide generally improved application performance on HPC systems.
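As a rough illustration of evaluating an ordering at locations throughout its length, the following Python sketch slides a job-sized window along a one-dimensional node ordering of a small torus and scores each window. The compactness measure used here (mean pairwise hop distance), the 4x4x4 torus, the two example orderings, and all function names are assumptions made for illustration; they are not necessarily the measure or the orderings used in the paper.

```python
import random
from itertools import combinations, product


def torus_distance(a, b, dims):
    """Hop count between two nodes on a torus, taking wraparound links into account."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def mean_pairwise_distance(nodes, dims):
    """Assumed compactness measure: average hop count over all node pairs (lower is more compact)."""
    pairs = list(combinations(nodes, 2))
    return sum(torus_distance(a, b, dims) for a, b in pairs) / len(pairs)


def compactness_profile(ordering, dims, job_size):
    """Score every window of job_size consecutive nodes along the ordering."""
    last_start = len(ordering) - job_size
    return [
        mean_pairwise_distance(ordering[i:i + job_size], dims)
        for i in range(last_start + 1)
    ]


def row_major(dims):
    """Simple ordering: the last coordinate varies fastest."""
    return list(product(*(range(d) for d in dims)))


def shuffled(dims, seed=0):
    """Baseline ordering that carries no topological information."""
    nodes = row_major(dims)
    random.Random(seed).shuffle(nodes)
    return nodes


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical small 3D torus
    for job_size in (4, 8, 16):
        for name, ordering in (("row-major", row_major(dims)), ("shuffled", shuffled(dims))):
            profile = compactness_profile(ordering, dims, job_size)
            mean_score = sum(profile) / len(profile)
            print(f"job_size={job_size:2d} {name:9s} mean={mean_score:.2f} worst={max(profile):.2f}")
```

Plotting each profile against the window's starting position, one curve per ordering and one plot per job size, would give the kind of visual comparison the abstract describes, without running any application benchmark.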