An Initial Evaluation of Arm’s Scalable Matrix Extension

Finn Wilkinson, Simon McIntosh-Smith
University of Bristol, UK

Expanding upon their Scalable Vector Extension (SVE), Arm have introduced the Scalable Matrix Extension (SME) to improve in-core performance for matrix operations such as matrix multiplication. Because neither hardware nor cycle-accurate simulations that support SME are currently available, it is unclear how effective this new instruction set extension will be, and which types of application will benefit most from it.

By adapting The Simulation Engine (SimEng) from the University of Bristol’s High Performance Computing Group to support SME, we aim to compare the simulated performance of a Fujitsu A64FX core (with native SVE support) to that of a like-for-like hypothetical core with added SME support. By simulating a wide range of Streaming Vector Lengths for our hypothetical SME core model, we present and discuss first-of-a-kind results for an SME implementation, and then outline future work to further evaluate the suitability of SME.