Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect Cost Calculations for Shuffle Ports on Icelake-client / Icelake-server #47454

Closed
Omnisip mannequin opened this issue Nov 8, 2020 · 7 comments
Closed

Incorrect Cost Calculations for Shuffle Ports on Icelake-client / Icelake-server #47454

Omnisip mannequin opened this issue Nov 8, 2020 · 7 comments
Labels
backend:X86 Scheduler Models Accuracy of X86 scheduler models backend:X86 bugzilla Issues migrated from bugzilla

Comments

@Omnisip
Copy link
Mannequin

Omnisip mannequin commented Nov 8, 2020

Bugzilla Link 48110
Version trunk
OS All
Blocks #31672
CC @adibiagio,@topperc,@RKSimon,@MattPD,@phoebewang,@rotateright

Extended Description

Hi there,

It looks like llvm-mca is treating icelake-client and icelake-server as having only one shuffle port. This causes incorrect cost calculations for any operation that would use said shuffle ports. A block diagram showing the architecture can be found here: https://en.wikichip.org/wiki/intel/microarchitectures/sunny_cove (sunny cove is the core used by both icelake-server and icelake-client). Similarly, you can find the uops usage for any given shuffle here: https://uops.info/table.html?search=shuf&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_SKL=on&cb_ICL=on&cb_measurements=on&cb_base=on&cb_avx=on

If the code generating the costs for LLVM-MCA is used by anything else in the toolchain, it's likely this will yield a performance benefit for other users targeting icelake-server and icelake-client.

@topperc
Copy link
Collaborator

topperc commented Nov 8, 2020

llvm-mca and the reset of llvm is using the skylake-server scheduling information for icelake-client and server.

@RKSimon
Copy link
Collaborator

RKSimon commented Nov 8, 2020

This isn't an llvm-mca issue, it just shows the problem - as Craig said icelake doesn't have its own scheduler models.

Ideally we need someone who has the CPUs that can fork the skylake models, adjust them based off llvm-exegesis reports and the intel hw docs, and then confirm that it actually improves performance.

@RKSimon
Copy link
Collaborator

RKSimon commented Aug 30, 2021

https://reviews.llvm.org/D108914

I've put up a candidate patch that creates a Icelake model as a direct copy of the existing SkylakeServer model - this would then allow iterative improvements to address microarch differences such as this.

@RKSimon
Copy link
Collaborator

RKSimon commented Sep 4, 2021

https://reviews.llvm.org/D108914

I've put up a candidate patch that creates a Icelake model as a direct copy
of the existing SkylakeServer model - this would then allow iterative
improvements to address microarch differences such as this.

This landed at 9e2d14c

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@RKSimon
Copy link
Collaborator

RKSimon commented Dec 12, 2021

https://reviews.llvm.org/D115547 fixes some of the integer shuffles

shiltian pushed a commit to shiltian/llvm-project that referenced this issue Dec 15, 2021
…uilder

SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.

The issue could be observed in a scenario such as:

```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```

This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.

This fixes Bugzilla llvm#47454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87844

(cherry picked from commit 53d238a)
@RKSimon
Copy link
Collaborator

RKSimon commented Dec 19, 2021

https://reviews.llvm.org/D115752 fixes some of the fp shuffles

@RKSimon
Copy link
Collaborator

RKSimon commented Dec 19, 2021

Fixed by 74d1fc7 and 67cce1c

@RKSimon RKSimon closed this as completed Dec 19, 2021
@RKSimon RKSimon added the backend:X86 Scheduler Models Accuracy of X86 scheduler models label Apr 9, 2022
trevor-m pushed a commit to trevor-m/llvm-project that referenced this issue Apr 20, 2023
…uilder

SelectionDAGBuilder was inconsistently mangling values based on ABI
Calling Conventions when getting them through copyFromRegs in
SelectionDAGBuilder, causing duplicate value type convertions for
function arguments. The checking for the mangling requirement was based
on the value's originating instruction and was performed outside of, and
inspite of, the regular Calling Convention Lowering.

The issue could be observed in a scenario such as:

```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper convertion had taken place when lowering the return
; value of the first call.
```

This patch fixes the issue by disregarding the Calling Convention
information for such copyFromRegs, making sure the ABI mangling is
properly contanined in the Calling Convention Lowering.

This fixes Bugzilla llvm#47454.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87844
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 Scheduler Models Accuracy of X86 scheduler models backend:X86 bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

2 participants