Chasoň: Supporting Cross HBM Channel Data Migration to Enable Efficient Sparse Algebraic Acceleration

Abstract

High bandwidth memory (HBM) equipped sparse accelerators are emerging as a new class of accelerators that offer concurrent data accesses and parallel execution to mitigate the memory-bound behavior of sparse kernels. However, because of their underlying non-zero scheduling scheme, state-of-the-art HBM-based sparse accelerators (e.g., Serpens) suffer from high resource underutilization, which causes suboptimal performance and inefficiency. To solve this challenge, we propose Chasoň, an HBM-based streaming accelerator for sparse kernels, specifically sparse matrix-vector multiplication. Chasoň supports our novel non-zero scheduling scheme, called Cross-HBM Channel out-of-order (OoO) Scheduling (CrHCS), to enable data migration across HBM channels and mitigate resource underutilization. We implement Chasoň on an AMD Alveo U55C, achieving a 301 MHz clock frequency, and evaluate it on the SuiteSparse and SNAP matrix collections. Chasoň improves resource utilization and achieves up to 8×, 20.33×, 11.65×, and 2.67× performance improvement and 2.03×, 34.72×, 19.48×, and 14.61× better energy efficiency over Serpens, Nvidia RTX 4090, Nvidia RTX 6000 Ada, and Intel Core i9-11980HK, respectively. The source code of Chasoň is available at https://github.com/UbaidHunts/Chason.

Publication
Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture (MICRO 2025)
Amirmahdi Namjoo
Bachelor of Science Student in Computer Engineering

My research interests include computer security and privacy, high performance computing, operating systems, computer architecture, and software engineering.