Introduce really rough linalg matrix proposal #404
base: main
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,110 @@ | ||||||
<!-- {% raw %} --> | ||||||
|
||||||
# Linear Algebra Matrix | ||||||
|
||||||
* Proposal: [NNNN](NNNN-linalg-matrix.md) | ||||||
* Author(s): [Chris Bieneman](https://github.com/llvm-beanz) | ||||||
* Sponsor: TBD | ||||||
* Status: **Under Consideration** | ||||||
|
||||||
## Introduction | ||||||
|
||||||
GPUs are exceptional parallel data processors, but it is becoming increasingly | ||||||
important to model operations with cross-thread data dependencies. In HLSL and | ||||||
Direct3D these operations are called Wave operations; in Vulkan they are called | ||||||
Subgroup operations. Related terms like Quad or derivatives have similar | ||||||
meaning in different scoping contexts. Vulkan has also recently introduced the | ||||||
term "cooperative" for operations that require participation from multiple | ||||||
threads. These can be viewed much like derivative operations, but across the | ||||||
full SIMD unit instead of a subset of threads. | ||||||
|
||||||
All of these terms refer to the way the underlying instructions execute, not | ||||||
necessarily what they do. One big part of this proposal is to take 5 steps back | ||||||
and talk about what they do: linear algebra. | ||||||
|
||||||
## Motivation | ||||||
|
||||||
HLSL has a Vulkan extension for SIMD matrix types, [0021 - vk::Cooperative | ||||||
Matrix](0021-vk-coop-matrix.md), and DirectX previewed a similar feature in | ||||||
SM 6.8 called [Wave Matrix](https://github.com/microsoft/hlsl-specs/pull/61). | ||||||
This proposal aims to merge the two into a unified language feature that | ||||||
can be supported on all platforms (with some platform-specific limitations). | ||||||
|
||||||
## Proposed solution | ||||||
|
||||||
Below is a proposed pseudo-HLSL API. The proposal uses C++20 concepts to | ||||||
represent template type constraints so as to avoid needing SFINAE complications. | ||||||
|
||||||
```c++ | ||||||
namespace hlsl { | ||||||
|
||||||
template <class T> | ||||||
concept ArithmeticScalar = std::is_arithmetic<T>::value; | ||||||
|
||||||
namespace linalg { | ||||||
|
||||||
template <typename ComponentTy, uint M, uint N> | ||||||
requires ArithmeticScalar<ComponentTy> | ||||||
class Matrix { | ||||||
template <typename NewCompTy> Matrix<NewCompTy, M, N> cast(); | ||||||
|
||||||
Matrix operator+(Matrix); | ||||||
Matrix operator-(Matrix); | ||||||
Matrix operator*(Matrix); | ||||||
Matrix operator/(Matrix); | ||||||
Review comment: What does
Review comment: elementwise division, will add comments
Review comment: All of the operators are element wise aligning with our conventions for
||||||
|
||||||
template <typename T> | ||||||
requires ArithmeticScalar<T> | ||||||
Matrix operator+(T); | ||||||
template <typename T> | ||||||
requires ArithmeticScalar<T> | ||||||
Matrix operator-(T); | ||||||
template <typename T> | ||||||
requires ArithmeticScalar<T> | ||||||
Matrix operator*(T); | ||||||
template <typename T> | ||||||
requires ArithmeticScalar<T> | ||||||
Matrix operator/(T); | ||||||
Comment on lines +56 to +67:
Review comment: Are these element-wise?
Review comment: Yea, I'll add comments. This aligns with our convention for matrix and vector operators for other builtin types
||||||
|
||||||
static Matrix Splat(ElTy Val); | ||||||
Suggested change
Review comment: Actually, I suspect that this code might have used all of
Review comment: Oh... yea I crossed some template wires. Let me fix that.
||||||
static Matrix Load(ByteAddressBuffer Res, uint StartOffset, uint Stride, | ||||||
bool ColMajor, uint Align = sizeof(ComponentTy)); | ||||||
static Matrix Load(RWByteAddressBuffer Res, uint StartOffset, uint Stride, | ||||||
bool ColMajor, uint Align = sizeof(ComponentTy)); | ||||||
|
||||||
static Matrix Load(groupshared ElTy Arr[], uint StartIdx, uint Stride, | ||||||
bool ColMajor); | ||||||
|
||||||
void Store(RWByteAddressBuffer Res, uint StartOffset, uint Stride, | ||||||
bool ColMajor, uint Align = sizeof(ComponentTy)); | ||||||
|
||||||
void Store(groupshared ElTy Arr[], uint StartIdx, uint Stride, bool ColMajor); | ||||||
|
||||||
void MultiplyAccumulate(const ref Matrix<T, N, M>); | ||||||
void SumAccumulate(const ref Matrix<T, N, M>); | ||||||
}; | ||||||
|
||||||
template <typename T, uint M, uint N, uint K> | ||||||
Matrix<T, M, N> Multiply(const ref Matrix<T, M, K>, const ref Matrix<T, K, N>); | ||||||
|
||||||
} // namespace linalg | ||||||
} // namespace hlsl | ||||||
``` | ||||||
|
||||||
## Detailed design | ||||||
|
||||||
## Outstanding Questions | ||||||
|
||||||
|
||||||
* Do we need a "scope" parameter or is it reasonable to assume Subgroup scope | ||||||
for all operations at least for an initial feature? | ||||||
* Do we need the usage to be part of the type? | ||||||
* Vulkan has a "use" template parameter, which serves a similar purpose to the | ||||||
D3D WaveMatrix "Left" and "Right" types. The compiler should be able to | ||||||
detect the usage and introduce memory shuffling automatically (with | ||||||
potential performance impact). | ||||||
* Do we need element-wise accessors? The Vulkan extension doesn't support | ||||||
element manipulations, but this has been identified as an important feature. | ||||||
* What will the DXIL representation look like? | ||||||
* This will be addressed in a separate proposal. | ||||||
|
||||||
<!-- {% endraw %} --> |
Worth calling out relationship to https://github.com/microsoft/hlsl-specs/blob/main/proposals/0031-hlsl-vector-matrix-operations.md?