Fast Kronecker Matrix-Matrix Multiplications on GPUs (PPoPP 2024 - Main Conference)

Who

Abhinav Jangda, Mohit Yadav

Track

PPoPP 2024 Main Conference

When

Wed 6 Mar 2024 10:20 - 10:40 (GMT) at Moorfoot - Linear Algebra. Chair(s): I-Ting Angelina Lee

Abstract

Kronecker Matrix-Matrix Multiplication (Kron-Matmul) is the multiplication of a matrix with the Kronecker product of several smaller matrices. Kron-Matmul is a core operation in many scientific and machine learning computations. State-of-the-art Kron-Matmul implementations are built from existing tensor algebra operations, such as matrix multiplication, transpose, and tensor matrix multiplication. However, this design choice prevents several Kron-Matmul-specific optimizations, leaving significant performance on the table. To address this issue, we present FastKron, an efficient technique for Kron-Matmul on single and multiple GPUs. FastKron is independent of linear algebra operations, which enables several new optimizations for Kron-Matmul. As a result, it performs up to 8.50× and 4.15× faster than existing implementations on 1 and 16 GPUs, respectively.
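
To make the operation concrete, the following is a minimal CPU sketch in plain Python of what Kron-Matmul computes. It is not FastKron and does not reflect its GPU kernels. The function names (`kron`, `kron_matmul_naive`, `kron_matmul_factored`) are hypothetical and chosen only for illustration. The first version materializes the full Kronecker product; the second applies one factor at a time using a multiply-then-reorder loop, which is one common way to avoid building the large product and is similar in spirit to the matmul-plus-transpose designs the abstract mentions.

```python
from functools import reduce


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Plain matrix product of lists of rows."""
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in A]


def kron_matmul_naive(X, factors):
    """Y = X (F1 ⊗ F2 ⊗ ... ⊗ FN), building the full Kronecker product."""
    return matmul(X, reduce(kron, factors))


def kron_matmul_factored(X, factors):
    """Same result without materializing the Kronecker product.

    Each row of X is viewed as a tensor with one axis per factor.
    Factors are applied from last to first: the trailing axis (size P)
    is contracted with F (P x Q), and the resulting Q axis is moved to
    the front of the row so the next factor's axis becomes trailing.
    """
    Y = [row[:] for row in X]
    for F in reversed(factors):
        P, Q = len(F), len(F[0])
        next_Y = []
        for row in Y:
            K = len(row) // P          # number of length-P slices in this row
            out = [0] * (Q * K)
            for s in range(K):
                for q in range(Q):
                    acc = sum(row[s * P + p] * F[p][q] for p in range(P))
                    out[q * K + s] = acc   # store with the new axis in front
            next_Y.append(out)
        Y = next_Y
    return Y
```

For an input with M rows and N factors of shape P × P, the naive version multiplies by a P^N × P^N matrix, while the factored version performs N small contractions per row, which is why practical implementations work factor by factor.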

Link to Publication

https://dl.acm.org/doi/pdf/10.1145/3627535.3638489

DOI

https://doi.org/10.1145/3627535.3638489

Abhinav Jangda

Microsoft Research

United States

Mohit Yadav

University of Massachusetts Amherst