[AMDGPU][docs] Replace gfx940 and gfx941 with gfx942 in llvm/docs #126887
@llvm/pr-subscribers-backend-amdgpu
Author: Fabian Ritter (ritter-x2a)
Changes
gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base.
This PR removes all documentation occurrences of gfx940/gfx941 except for the gfx940 ISA description, which will be the subject of a separate PR.
For SWDEV-512631
Patch is 24.54 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/126887.diff
2 Files Affected:
diff --git a/llvm/docs/AMDGPUOperandSyntax.rst b/llvm/docs/AMDGPUOperandSyntax.rst
index ff6ec6cf71ff2..e8a76322fe76a 100644
--- a/llvm/docs/AMDGPUOperandSyntax.rst
+++ b/llvm/docs/AMDGPUOperandSyntax.rst
@@ -63,7 +63,7 @@ Note: *N* and *K* must satisfy the following conditions:
* 0 <= *K* <= 255.
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
pairs of *vector* registers must be even-aligned
(first register must be even).
@@ -183,7 +183,7 @@ Note: *N* and *K* must satisfy the following conditions:
* 0 <= *K* <= 255.
* *K-N+1* must be in the range from 1 to 12 or equal to 16 or 32.
-GFX90A and GFX940 have an additional alignment requirement:
+GFX90A and GFX942 have an additional alignment requirement:
pairs of *accumulator* registers must be even-aligned
(first register must be even).
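For reviewers who want to sanity-check the range rules quoted in the two hunks above, here is a minimal Python sketch. It is not taken from LLVM: the function name `check_register_range` and the `even_aligned_pairs` flag are hypothetical. It reads "pairs" literally as ranges of exactly two registers, since the quoted text does not say whether wider ranges are also constrained, and it assumes register indices are non-negative.

```python
def check_register_range(n, k, even_aligned_pairs=False):
    """Return a list of violated constraints for the register range [n:k].

    even_aligned_pairs models the extra rule quoted for GFX90A and GFX942.
    This helper is illustrative only and is not part of LLVM.
    """
    problems = []
    if n < 0:
        # Assumption: register indices are never negative.
        problems.append("N must be non-negative")
    if not 0 <= k <= 255:
        problems.append("K must satisfy 0 <= K <= 255")
    width = k - n + 1
    if not (1 <= width <= 12 or width in (16, 32)):
        problems.append("K-N+1 must be in 1..12 or equal to 16 or 32")
    # Literal reading of "pairs": only two-register ranges are checked.
    if even_aligned_pairs and width == 2 and n % 2 != 0:
        problems.append("first register of a pair must be even")
    return problems


if __name__ == "__main__":
    print(check_register_range(0, 1, even_aligned_pairs=True))
    print(check_register_range(1, 2, even_aligned_pairs=True))
```

Returning a list of messages rather than a boolean keeps every violated rule visible when more than one applies.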
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 83ec1eecb6e5e..14b3b6fce9e70 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -323,7 +323,7 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
Add product
names.
- **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX940-GFX942-CDNA3]_
+ **GCN GFX9 (Vega)** [AMD-GCN-GFX900-GFX904-VEGA]_ [AMD-GCN-GFX906-VEGA7NM]_ [AMD-GCN-GFX908-CDNA1]_ [AMD-GCN-GFX90A-CDNA2]_ [AMD-GCN-GFX942-CDNA3]_
-----------------------------------------------------------------------------------------------------------------------
``gfx900`` ``amdgcn`` dGPU - xnack - Absolute - *rocm-amdhsa* - Radeon Vega
flat - *pal-amdhsa* Frontier Edition
@@ -378,20 +378,6 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
- Ryzen 3 Pro 4350G
- Ryzen 3 Pro 4350GE
- ``gfx940`` ``amdgcn`` dGPU - sramecc - Architected *TBA*
- - tgsplit flat
- - xnack scratch .. TODO::
- - kernarg preload - Packed
- work-item Add product
- IDs names.
-
- ``gfx941`` ``amdgcn`` dGPU - sramecc - Architected *TBA*
- - tgsplit flat
- - xnack scratch .. TODO::
- - kernarg preload - Packed
- work-item Add product
- IDs names.
-
``gfx942`` ``amdgcn`` dGPU - sramecc - Architected - AMD Instinct MI300X
- tgsplit flat - AMD Instinct MI300A
- xnack scratch
@@ -583,10 +569,10 @@ Generic processor code objects are versioned. See :ref:`amdgpu-generic-processor
- ``v_dot2_f32_f16``
- ``gfx9-4-generic`` ``amdgcn`` - ``gfx940`` - sramecc - Architected FP8 and BF8 instructions,
- - ``gfx941`` - tgsplit flat scratch FP8 and BF8 conversion
- - ``gfx942`` - xnack - Packed instructions, as well as
- - ``gfx950`` - kernarg preload work-item instructions with XF32 format
+ ``gfx9-4-generic`` ``amdgcn`` - ``gfx942`` - sramecc - Architected FP8 and BF8 instructions,
+ - ``gfx950`` - tgsplit flat scratch FP8 and BF8 conversion
+ - xnack - Packed instructions, as well as
+ - kernarg preload work-item instructions with XF32 format
IDs support are not available.
``gfx10-1-generic`` ``amdgcn`` - ``gfx1010`` - xnack - Absolute flat - The following instructions are
@@ -4974,7 +4960,7 @@ The fields used by CP for code objects before V3 also match those specified in
bytes
383:352 4 bytes COMPUTE_PGM_RSRC3 GFX6-GFX9
Reserved, must be 0.
- GFX90A, GFX940
+ GFX90A, GFX942
Compute Shader (CS)
program settings used by
CP to set up
@@ -5059,7 +5045,7 @@ The fields used by CP for code objects before V3 also match those specified in
463:460 4 bits Reserved, must be 0.
470:464 7 bits KERNARG_PRELOAD_SPEC_LENGTH GFX6-GFX9
- Reserved, must be 0.
- GFX90A, GFX940
+ GFX90A, GFX942
- The number of dwords from
the kernarg segment to preload
into User SGPRs before kernel
@@ -5067,7 +5053,7 @@ The fields used by CP for code objects before V3 also match those specified in
:ref:`amdgpu-amdhsa-kernarg-preload`).
479:471 9 bits KERNARG_PRELOAD_SPEC_OFFSET GFX6-GFX9
- Reserved, must be 0.
- GFX90A, GFX940
+ GFX90A, GFX942
- An offset in dwords into the
kernarg segment to begin
preloading data into User
@@ -5093,7 +5079,7 @@ The fields used by CP for code objects before V3 also match those specified in
GFX6-GFX9
- vgprs_used 0..256
- max(0, ceil(vgprs_used / 4) - 1)
- GFX90A, GFX940
+ GFX90A, GFX942
- vgprs_used 0..512
- vgprs_used = align(arch_vgprs, 4)
+ acc_vgprs
@@ -5559,7 +5545,7 @@ The fields used by CP for code objects before V3 also match those specified in
..
- .. table:: compute_pgm_rsrc3 for GFX90A, GFX940
+ .. table:: compute_pgm_rsrc3 for GFX90A, GFX942
:name: amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table
======= ======= =============================== ===========================================================================
@@ -9970,15 +9956,15 @@ only accessed by a single thread, and is always write-before-read, there is
never a need to invalidate these entries from the L1 cache. Hence all cache
invalidates are done as ``*_vol`` to only invalidate the volatile cache lines.
-The code sequences used to implement the memory model for GFX940, GFX941, GFX942
-are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table`.
+The code sequences used to implement the memory model for GFX942 are defined in
+table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx942-table`.
- .. table:: AMDHSA Memory Model Code Sequences GFX940, GFX941, GFX942
- :name: amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx941-gfx942-table
+ .. table:: AMDHSA Memory Model Code Sequences GFX942
+ :name: amdgpu-amdhsa-memory-model-code-sequences-gfx942-table
============ ============ ============== ========== ================================
LLVM Instr LLVM Memory LLVM Memory AMDGPU AMDGPU Machine Code
- Ordering Sync Scope Address GFX940, GFX941, GFX942
+ Ordering Sync Scope Address GFX942
Space
============ ============ ============== ========== ================================
**Non-Atomic**
@@ -10013,18 +9999,12 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
load *none* *none* - local 1. ds_load
store *none* *none* - global - !volatile & !nontemporal
- generic
- - private 1. GFX940, GFX941
+ - private 1. GFX942
- constant buffer/global/flat_store
- sc0=1 sc1=1
- GFX942
- buffer/global/flat_store
- !volatile & nontemporal
- 1. GFX940, GFX941
- buffer/global/flat_store
- nt=1 sc0=1 sc1=1
- GFX942
+ 1. GFX942
buffer/global/flat_store
nt=1
@@ -10696,11 +10676,8 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
**Release Atomic**
------------------------------------------------------------------------------------
- store atomic release - singlethread - global 1. GFX940, GFX941
+ store atomic release - singlethread - global 1. GFX942
- wavefront - generic buffer/global/flat_store
- sc0=1 sc1=1
- GFX942
- buffer/global/flat_store
store atomic release - singlethread - local *If TgSplit execution mode,
- wavefront local address space cannot
@@ -10738,10 +10715,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
store that is being
released.
- 2. GFX940, GFX941
- buffer/global/flat_store
- sc0=1 sc1=1
- GFX942
+ 2. GFX942
buffer/global/flat_store
sc0=1
store atomic release - workgroup - local *If TgSplit execution mode,
@@ -10802,10 +10776,7 @@ are defined in table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx940-gfx9
store that is being
released.
- 3. GFX940, GFX941
- buffer/global/flat_store
- sc0=1 sc1=1
- GFX942
+ 3. GFX942
buffer/global/flat_store
sc1=1
store atomic release - system - global 1. buffer_wbl2 sc0=1 sc1=1
@@ -17563,11 +17534,7 @@ in this description.
CDNA 2 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx90a<AMDGPU/AMDGPUAsmGFX90a>`
- CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx940<AMDGPU/AMDGPUAsmGFX940>`
-
- :doc:`gfx941<AMDGPU/AMDGPUAsmGFX940>`
-
- :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
+ CDNA 3 :doc:`GFX9<AMDGPU/AMDGPUAsmGFX9>` :doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`
RDNA 1 :doc:`GFX10 RDNA1<AMDGPU/AMDGPUAsmGFX10>` :doc:`gfx1010<AMDGPU/AMDGPUAsmGFX10>`
@@ -17605,7 +17572,7 @@ combinations of operands, refer to one of instruction set architecture manuals
[AMD-GCN-GFX6]_, [AMD-GCN-GFX7]_, [AMD-GCN-GFX8]_,
[AMD-GCN-GFX900-GFX904-VEGA]_, [AMD-GCN-GFX906-VEGA7NM]_,
[AMD-GCN-GFX908-CDNA1]_, [AMD-GCN-GFX90A-CDNA2]_,
-[AMD-GCN-GFX940-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
+[AMD-GCN-GFX942-CDNA3]_, [AMD-GCN-GFX10-RDNA1]_, [AMD-GCN-GFX10-RDNA2]_,
[AMD-GCN-GFX11-RDNA3]_ and [AMD-GCN-GFX11-RDNA3.5]_.
Operands
@@ -18118,7 +18085,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`
``.amdhsa_user_sgpr_private_segment_buffer`` 0 GFX6-GFX10 Controls ENABLE_SGPR_PRIVATE_SEGMENT_BUFFER in
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
- GFX940)
+ GFX942)
``.amdhsa_user_sgpr_dispatch_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_DISPATCH_PTR in
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
``.amdhsa_user_sgpr_queue_ptr`` 0 GFX6-GFX12 Controls ENABLE_SGPR_QUEUE_PTR in
@@ -18129,7 +18096,7 @@ terminated by an ``.end_amdhsa_kernel`` directive.
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
``.amdhsa_user_sgpr_flat_scratch_init`` 0 GFX6-GFX10 Controls ENABLE_SGPR_FLAT_SCRATCH_INIT in
(except :ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
- GFX940)
+ GFX942)
``.amdhsa_user_sgpr_private_segment_size`` 0 GFX6-GFX12 Controls ENABLE_SGPR_PRIVATE_SEGMENT_SIZE in
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
``.amdhsa_wavefront_size32`` Target GFX10-GFX12 Controls ENABLE_WAVEFRONT_SIZE32 in
@@ -18140,8 +18107,8 @@ terminated by an ``.end_amdhsa_kernel`` directive.
:ref:`amdgpu-amdhsa-kernel-descriptor-v3-table`.
``.amdhsa_system_sgpr_private_segment_wavefront_offset`` 0 GFX6-GFX10 Controls ENABLE_PRIVATE_SEGMENT in
(except :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
- GFX940)
- ``.amdhsa_enable_private_segment`` 0 GFX940, Controls ENABLE_PRIVATE_SEGMENT in
+ GFX942)
+ ``.amdhsa_enable_private_segment`` 0 GFX942, Controls ENABLE_PRIVATE_SEGMENT in
GFX11-GFX12 :ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
``.amdhsa_system_sgpr_workgroup_id_x`` 1 GFX6-GFX12 Controls ENABLE_SGPR_WORKGROUP_ID_X in
:ref:`amdgpu-amdhsa-compute_pgm_rsrc2-gfx6-gfx12-table`.
@@ -18162,14 +18129,14 @@ terminated by an ``.end_amdhsa_kernel`` directive.
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
``.amdhsa_accum_offset`` Required GFX90A, Offset of a first AccVGPR in the unified register file.
- GFX940 Used to calculate ACCUM_OFFSET in
+ GFX942 Used to calculate ACCUM_OFFSET in
:ref:`amdgpu-amdhsa-compute_pgm_rsrc3-gfx90a-table`.
``.amdhsa_reserve_vcc`` 1 GFX6-GFX12 Whether the kernel may use the special VCC SGPR.
Used to calculate GRANULATED_WAVEFRONT_SGPR_COUNT in
:ref:`amdgpu-amdhsa-compute_pgm_rsrc1-gfx6-gfx12-table`.
``.amdhsa_reserve_flat_scratch`` 1 GFX7-GFX10 Whether the kernel may use flat instructions to access
...
[truncated]
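The VGPR accounting touched by the kernel descriptor hunk above can be illustrated with a short Python sketch. The function names are hypothetical, and `align(x, 4)` is assumed to mean rounding up to the next multiple of 4. The excerpt is cut off before it shows how `vgprs_used` is encoded on GFX90A/GFX942, so the sketch stops at computing `vgprs_used` for those targets.

```python
def granulated_vgpr_count_gfx6_gfx9(vgprs_used):
    """Encoding quoted for GFX6-GFX9: max(0, ceil(vgprs_used / 4) - 1)."""
    if not 0 <= vgprs_used <= 256:
        raise ValueError("vgprs_used must be in 0..256 on GFX6-GFX9")
    # (x + 3) // 4 is an integer-only ceil(x / 4).
    return max(0, (vgprs_used + 3) // 4 - 1)


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (assumed meaning of align)."""
    return (value + alignment - 1) // alignment * alignment


def vgprs_used_gfx90a_gfx942(arch_vgprs, acc_vgprs):
    """Combined count quoted for GFX90A/GFX942: align(arch_vgprs, 4) + acc_vgprs."""
    used = align_up(arch_vgprs, 4) + acc_vgprs
    if not 0 <= used <= 512:
        raise ValueError("vgprs_used must be in 0..512 on GFX90A/GFX942")
    return used


if __name__ == "__main__":
    print(granulated_vgpr_count_gfx6_gfx9(5))
    print(vgprs_used_gfx90a_gfx942(5, 3))
```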
gfx940 and gfx941 are no longer supported. This is one of a series of PRs to remove them from the code base. This PR removes all documentation occurrences of gfx940/gfx941 except for the gfx940 ISA description, which will be the subject of a separate PR. For SWDEV-512631
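One way to check whether any mentions remain after a change like this is a small text scan. The following Python sketch is only an illustration, not something this PR uses; `find_retired_targets` is a hypothetical helper. It flags every case-insensitive occurrence of `gfx940` or `gfx941`, so intentional leftovers such as the `AMDGPUAsmGFX940` document name still need manual review.

```python
import re

# Matches gfx940 and gfx941 in any letter case, but not gfx942.
RETIRED_TARGETS = re.compile(r"gfx94[01]", re.IGNORECASE)


def find_retired_targets(text):
    """Return (line_number, matched_text) for each gfx940/gfx941 mention."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in RETIRED_TARGETS.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        "GFX90A and GFX942 have an additional alignment requirement:\n"
        ":doc:`gfx942<AMDGPU/AMDGPUAsmGFX940>`\n"
    )
    for lineno, name in find_retired_targets(sample):
        print(f"line {lineno}: {name}")
```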