SIMAI 2025

Feasibility and benefits of integrating Hierarchical Matrices in the Deep Neural Network context

  • Mele, Valeria (University of Naples Federico II)
  • Carracciuolo, Luisa (Italian National Research Council - CNR)


This work investigates the feasibility and potential benefits of integrating Hierarchical Matrices (H-matrices) into Deep Neural Networks, in particular Graph Convolutional Deep Neural Networks (GC-DNNs), focusing on the computational aspects of matrix polynomial evaluation, a central operation in GC-DNN training and inference. The study is motivated by the growing demand for scalable methods in deep learning and by the structural advantages of H-matrices, especially in the context of Exascale Computing.

GC-DNNs process graph-structured data through spectral filtering, which applies a polynomial filter $p_{K,\theta}(L)$ of the graph Laplacian matrix $L$ to the node features. Two operations dominate the computational cost: the evaluation of the matrix polynomial $p_{K,\theta}(L) = \sum_{k=0}^{K} \theta_k L^k$ and the matrix-vector multiplication $y = p_{K,\theta}(L)s$. Both become particularly expensive for large graphs. H-matrices address this by compressing matrix blocks via low-rank approximations, reducing storage requirements and computational complexity and potentially lowering asymptotic costs from $O(m^2)$ to $O(m \log m)$ or better.

The work also highlights the parallel structure of H-matrix algorithms, which rely on tree-based recursive procedures, and presents a set of such algorithms in sequential and parallel versions. This structure supports efficient execution on modern hybrid architectures, including CPUs and GPUs: recursive decomposition and block-wise independence naturally enable fine-grained parallelism with minimal synchronization overhead. A theoretical estimate of the performance gains achievable through this integration is provided.

Overall, the study suggests that H-matrices offer a promising pathway toward more scalable and efficient graph-based deep learning models, and it sets the stage for further empirical and theoretical investigation.
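The matrix-vector product $y = p_{K,\theta}(L)s$ mentioned above can be computed without ever forming a power $L^k$, using only $K$ products with $L$. The sketch below (a minimal NumPy illustration, not the authors' implementation; the Laplacian, coefficients $\theta_k$, and feature vector $s$ are hypothetical) uses Horner's scheme:

```python
import numpy as np

def apply_poly_filter(L, theta, s):
    """Apply p_{K,theta}(L) = sum_{k=0}^K theta_k L^k to a vector s
    via Horner's scheme: only K matrix-vector products with L are
    needed, and no matrix power L^k is ever formed explicitly."""
    K = len(theta) - 1
    y = theta[K] * s
    for k in range(K - 1, -1, -1):
        y = L @ y + theta[k] * s
    return y

# Small example: combinatorial Laplacian of a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
theta = [0.5, -0.2, 0.1]   # K = 2, illustrative filter coefficients
s = np.ones(4)             # illustrative node-feature vector
y = apply_poly_filter(L, theta, s)
```

For a dense $m \times m$ Laplacian each matvec costs $O(m^2)$; this is the cost that an H-matrix representation of $L$ would reduce toward $O(m \log m)$.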
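The low-rank block compression underlying H-matrices can be illustrated with a truncated SVD of a single admissible block. The sketch below is purely illustrative (the kernel, cluster geometry, and tolerance are assumptions, and a real H-matrix code would use a cluster-tree partitioning and cheaper compressors such as ACA rather than a full SVD):

```python
import numpy as np

def truncate_block(B, eps):
    """Compress a matrix block to low rank with an SVD, keeping the
    singular values above eps * sigma_max. Returns factors U, V with
    B ~= U @ V, where U is (m x r) and V is (r x n)."""
    U, sv, Vt = np.linalg.svd(B, full_matrices=False)
    r = int(np.sum(sv > eps * sv[0]))
    return U[:, :r] * sv[:r], Vt[:r, :]

# A block of the smooth kernel 1/(x_i - y_j) between two
# well-separated point clusters is numerically low-rank:
x = np.linspace(0.0, 1.0, 50)
y = np.linspace(3.0, 4.0, 50)
B = 1.0 / (x[:, None] - y[None, :])
U, V = truncate_block(B, eps=1e-8)
# Storage drops from 50*50 entries to r*(50+50) with r << 50,
# and a matvec with the block drops from O(m*n) to O(r*(m+n)).
```

Applying such compression across all admissible off-diagonal blocks of a partitioned matrix is what yields the $O(m \log m)$-type costs cited in the abstract.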