Guard SYCL Graph implementation and fallback emulation #71

Merged
julianmi merged 7 commits into sycl-graph-develop from julianmi/graph-emulation-macro on Mar 30, 2023

Conversation

julianmi

This PR proposes an emulation mode for SYCL Graph as a fallback when SYCL Graph is not implemented. It provides the APIs of the sycl-graph-poc-v2 branch, but graph execution is emulated: every graph node is submitted again each time the graph is executed. This addresses #20.

This is the first step toward merging the SYCL Graph POC into the mainline. The proposed process can be found here: intel#7627 (comment)

This supersedes the previous PR #56.
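
To illustrate the emulation idea described above, here is a minimal sketch in plain C++ that does not use SYCL at all. The names `toy_queue`, `emulated_graph`, and `run_emulated_example` are hypothetical and are not taken from this PR; the sketch only assumes that nodes are stored in an order that already respects their dependencies and that executing the graph re-submits each node.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a queue: it runs each submitted task immediately
// and counts how many submissions it has seen.
struct toy_queue {
  std::size_t submissions = 0;
  void submit(const std::function<void()> &task) {
    ++submissions;
    task();
  }
};

// Hypothetical emulated graph. Nodes are kept in insertion order, which is
// assumed to already respect their dependencies.
class emulated_graph {
public:
  void add_node(std::function<void()> task) {
    nodes_.push_back(std::move(task));
  }

  // Emulated execution: nothing is pre-built or replayed. Every node is
  // submitted to the queue again on each call.
  void execute(toy_queue &q) const {
    for (const auto &node : nodes_)
      q.submit(node);
  }

private:
  std::vector<std::function<void()>> nodes_;
};

// Builds a two-node graph, executes it `repeats` times, and returns the
// value the nodes accumulated.
inline int run_emulated_example(int repeats, toy_queue &q) {
  int value = 0;
  emulated_graph g;
  g.add_node([&value] { value += 1; });
  g.add_node([&value] { value *= 2; });
  for (int i = 0; i < repeats; ++i)
    g.execute(q);
  return value;
}
```

With this sketch, executing the two-node graph three times results in six queue submissions, which is the cost the emulation accepts in exchange for offering the graph API where no native implementation is available.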

@julianmi julianmi requested review from EwanC, reble and Bensuo January 20, 2023 17:22
@EwanC EwanC added the Graph Implementation Related to DPC++ implementation and testing label Jan 23, 2023
@julianmi julianmi force-pushed the julianmi/graph-emulation-macro branch from e6ac5dd to b903791 Compare March 13, 2023 17:36

@EwanC EwanC left a comment

Happy with this change, my only remaining comment is whether we want to merge this into the sycl-graph-develop branch as well to avoid things diverging, and then document the --enable-sycl-graph flag to buildbot/configure.py in the landing page README.

@julianmi julianmi force-pushed the julianmi/graph-emulation-macro branch from b903791 to c5fe139 Compare March 22, 2023 14:37
@julianmi julianmi changed the base branch from sycl-graph-poc-v2 to sycl-graph-develop March 22, 2023 14:38
@julianmi

I have rebased the code to sycl-graph-develop and removed the guarded members. Please have another look.

@julianmi

> Happy with this change, my only remaining comment is whether we want to merge this into the sycl-graph-develop branch as well to avoid things diverging, and then document the --enable-sycl-graph flag to buildbot/configure.py in the landing page README.

I have cherry-picked @reble's readme changes and included instructions for building the compiler with SYCL Graph support.

@julianmi julianmi force-pushed the julianmi/graph-emulation-macro branch from 24186cd to e285e0a Compare March 28, 2023 17:04
@EwanC

EwanC commented Mar 29, 2023

Readme changes LGTM, I suspect this will cause conflicts with #100 but this has been open for a long time so happy to get it in first.

@julianmi julianmi merged commit ec71841 into sycl-graph-develop Mar 30, 2023
Bensuo added a commit that referenced this pull request May 2, 2023
commit 2348227
Author: Ben Tracy <[email protected]>
Date:   Wed Apr 19 14:48:17 2023 +0100

    [SYCL] Update graph constructor/finalize to current spec (#140)

    - Add device and context params to graph constructor
    - Remove context from finalize
    - Minor changes to graph_impl to support this
    - Update all examples to use updated API
    - Tidied up ordering of graph_impl declarations a little

commit 7e580c5
Author: Ben Tracy <[email protected]>
Date:   Wed Apr 19 13:46:52 2023 +0100

    [SYCL] Fix subgraphs, move sync points to exec graph (#134)

    * [SYCL] Fix subgraphs, move sync points to exec graph

    - Fixes subgraph support for command buffer graphs
    - Move sync points to executable graph instead of node
    - Removed unused graph impl from nodes
    - Kernel dims are now correctly reversed before submission with dims > 1
    - Remove unnecessary call to piEventCreate

commit 2f75c88
Author: Ewan Crawford <[email protected]>
Date:   Thu Apr 13 12:48:40 2023 +0100

    [SYCL] Replace lazy queue property with PI command-buffers. (#100)

    - Remove lazy queue property
    - Use command buffers inside graphs for execution
    - Separate executable graph impl from modifiable graph impl
    - Implement handler::depends_on for record and replay nodes
    - New test for finalizing different graphs from the same modifiable one
    - graph-record-dotp now uses handler::depends_on
    - Implement arg filtering before setting args
    - Make applyFuncOnFilteredArgs accessible from commands.hpp
    - Track dependencies through empty nodes in graphs
    - Guard reduction use in device mem example
    - Fix issues with empty node example
    - Guard command buffer behind SYCL_EXT_ONEAPI_GRAPH
    - Recreate simple submission in emulation mode

    ---------

    Co-authored-by: Ben Tracy <[email protected]>

commit 33d64f9
Author: Pablo Reble <[email protected]>
Date:   Fri Mar 31 12:46:04 2023 -0500

    [SYCL] Add empty node implementation (#112)

    Co-authored-by: Ben Tracy <[email protected]>

commit 187c9d0
Merge: ec71841 7d4e315
Author: Julian Miller <[email protected]>
Date:   Thu Mar 30 18:21:03 2023 +0200

    Merge pull request #115 from reble/julianmi/graph-testing-waits

    Graph Testing: Add missing waits and USM device tests

commit ec71841
Merge: 1efde99 9b95a70
Author: Julian Miller <[email protected]>
Date:   Thu Mar 30 18:20:44 2023 +0200

    Merge pull request #71 from reble/julianmi/graph-emulation-macro

    Guard SYCL Graph implementation and fallback emulation

commit 7d4e315
Author: Julian Miller <[email protected]>
Date:   Wed Mar 29 12:05:55 2023 +0200

    Add USM device graph test

commit 6b89b23
Author: Julian Miller <[email protected]>
Date:   Wed Mar 29 12:04:50 2023 +0200

    Add missing waits in graph tests

commit 9b95a70
Author: Julian Miller <[email protected]>
Date:   Tue Mar 28 19:15:55 2023 +0200

    Remove unneeded includes

commit e285e0a
Author: Julian Miller <[email protected]>
Date:   Wed Mar 22 17:43:00 2023 +0100

    Add compiler configuration instructions for SYCL Graph

commit 5f31bfa
Author: Pablo Reble <[email protected]>
Date:   Wed Mar 1 14:59:14 2023 -0600

    Update README.md

commit 7370c0b
Author: Pablo Reble <[email protected]>
Date:   Wed Mar 1 08:48:16 2023 -0600

    Update README.md

    add first draft of landing page

commit e5f4da8
Author: Julian Miller <[email protected]>
Date:   Tue Mar 21 17:27:01 2023 +0100

    Remove guarded members

commit 26b24a9
Author: Julian Miller <[email protected]>
Date:   Mon Mar 13 17:25:43 2023 +0100

    Add feature test macro

commit 152ccea
Author: Julian Miller <[email protected]>
Date:   Fri Jan 20 18:06:55 2023 +0100

    Guard SYCL Graph implementation and fallback emulation

commit 1efde99
Author: Ben Tracy <[email protected]>
Date:   Thu Mar 23 12:26:50 2023 +0000

    [SYCL] Remove CGF reuse in graph nodes

    - Note reductions are broken by this commit due to missing accessor support
    - Handler info is extracted and copied into nodes
    - Adding nodes in record and replay moved to finalize.
    - Workarounds for reduction wg sizes added.
    - Introduce `graph-record-temp-scope.cpp` test case which fails before this commit and passes afterwards.

    Instead of USM arguments, it is buffer accessors that should be used for
    edge detection. Fixes `graph-explicit-node-ordering.cpp` test ordering which is currently
    creating incorrect extra edges
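
As a rough illustration of accessor-based edge detection, the following plain C++ sketch derives edges from the buffers each node accesses. All names (`buffer_access`, `toy_node`, `conflicts`, `detect_edges`) are hypothetical, and the rule used here (two nodes depend on each other when they touch the same buffer and at least one of them writes) is an assumption made for the example, not the logic of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical description of one accessor requirement of a node.
struct buffer_access {
  int buffer_id;  // identifies the buffer the accessor refers to
  bool is_write;  // true for write or read-write access
};

// Hypothetical node: only its accessor requirements matter in this sketch.
struct toy_node {
  std::vector<buffer_access> accesses;
};

// True if both nodes touch a common buffer and at least one of them writes.
inline bool conflicts(const toy_node &a, const toy_node &b) {
  for (const auto &x : a.accesses)
    for (const auto &y : b.accesses)
      if (x.buffer_id == y.buffer_id && (x.is_write || y.is_write))
        return true;
  return false;
}

// Returns (from, to) edges between nodes, which are given in the order they
// were added to the graph. Earlier nodes are always the edge source.
inline std::vector<std::pair<std::size_t, std::size_t>>
detect_edges(const std::vector<toy_node> &nodes) {
  std::vector<std::pair<std::size_t, std::size_t>> edges;
  for (std::size_t to = 1; to < nodes.size(); ++to)
    for (std::size_t from = 0; from < to; ++from)
      if (conflicts(nodes[from], nodes[to]))
        edges.emplace_back(from, to);
  return edges;
}
```

Nodes that share no buffer, or that only read a shared buffer, get no edge in this sketch, which is the kind of behaviour needed to avoid creating extra edges between independent nodes.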

    Also added `graph-explicit-dotp-buffer.cpp` test for explicit API with accessor edges, we can use to see if this
    logic works once accessors are better supported.

    This change adds a new handler constructor which takes
    a graph, rather than creating a default temporary queue object
    to pass to the existing constructor.

    Co-authored-by: Ewan Crawford <[email protected]>

commit b7f17c8
Author: Ewan Crawford <[email protected]>
Date:   Tue Mar 21 08:15:57 2023 +0000

    [SYCL] Update record & replay tests

    Update the record & replay tests to match changes from
    #72 which were missed after
    merging the record and replay branch:

    * Remove unused headers
    * Use asserts instead of printing to stdout

commit d2ff468
Author: Julian Miller <[email protected]>
Date:   Thu Mar 16 10:08:27 2023 +0100

    [SYCL] Improve Graphs testing

    * Extend testing

    * Fix reduction test

    * Add test to verify node ordering

    * Update sycl include

    * Switch to assertions in graph tests

    * Formatting

commit 068dd95
Author: Pablo Reble <[email protected]>
Date:   Mon Mar 13 11:14:29 2023 -0500

    Resolving naming style mismatch (#86)

commit 66d1b6b
Author: Pablo Reble <[email protected]>
Date:   Thu Mar 2 23:54:48 2023 -0600

    Improve code location and replace shared ptr aliases (#82)

commit 62d6b15
Author: Ben Tracy <[email protected]>
Date:   Tue Feb 28 10:53:46 2023 +0000

    [SYCL][PI] Prototype command_buffer API in level zero

    - Adds a prototype of an explicit command buffer
    - Implemented only for level zero backend
    - Unit tests added which test new entry points.

commit d4c1ed3
Author: Ewan Crawford <[email protected]>
Date:   Mon Feb 27 08:48:23 2023 +0000

    [SYCL] Record & Replay Implementation

    Implementation of Record & Replay API with tests

    Co-authored-by: Ben Tracy <[email protected]>

commit 06c588f
Author: Pablo Reble <[email protected]>
Date:   Thu Feb 9 10:53:47 2023 -0600

    Apply suggestions from code review

    Co-authored-by: Steffen Larsen <[email protected]>

commit 0ac7a7e
Author: Pablo Reble <[email protected]>
Date:   Thu Jan 19 10:29:46 2023 -0600

    Adding new example using make edge function (#63)

    Co-authored-by: Ben Tracy <[email protected]>

commit 1249fbc
Author: Ewan Crawford <[email protected]>
Date:   Thu Jan 19 10:03:56 2023 +0000

    [SYCL] Pass property_list to APIs

    Adds the `sycl::property_list` to the constructor of
    `command_graph<modifiable>()` and `finalize()` to
    match spec change #67

commit 4a306ed
Author: Ben Tracy <[email protected]>
Date:   Wed Jan 11 10:53:16 2023 +0000

    [SYCL] Add unit tests for command graph POC

    - Add some unit tests for the command graph POC
    - Add missing specializations for lazy queue property

commit fb28d59
Author: Ben Tracy <[email protected]>
Date:   Mon Jan 9 11:10:26 2023 +0000

    [SYCL] Rename exec_graph to ext_oneapi_graph

    [SYCL] handler::ext_oneapi_graph

    Update to reflect changes from #65

    - In line with recent spec changes, rename handler and queue shortcut functions from exec_graph to ext_oneapi_graph
    - Also updated usage in the examples

    Co-authored-by: Ewan Crawford <[email protected]>

commit 1448cb5
Author: Ben Tracy <[email protected]>
Date:   Wed Dec 21 09:10:40 2022 +0000

    [SYCL] Enable submitting sub-graphs

    - Enable submitting a sub-graph as part of a larger command_graph
    - Flag added to queue_impl to enable graph to be aware it is a sub-graph and delay flush
    - Adds an example which uses a subgraph in the middle of a command_graph

commit c99bdca
Author: Ben Tracy <[email protected]>
Date:   Tue Dec 13 10:57:15 2022 +0000

    [SYCL] Fix reductions not working inside graph

    * Graph submission now properly creates a host visible event on the command list allowing auxiliary resources to be cleaned up

    * executeCommandList slightly modified to block execution only for command lists not allowed to be batched.

commit 3073cfc
Author: Ewan Crawford <[email protected]>
Date:   Fri Dec 2 10:47:32 2022 +0000

    [SYCL] Clean-up lazy queue PI changes

    * PI Minor version bump for new flag
    * Document new PI property as comments
    * Make value next consecutive bit `1 << 5`, rather
      than `1 << 11`.

commit 7bb11ce
Author: Ewan Crawford <[email protected]>
Date:   Wed Nov 30 13:14:50 2022 +0000

    [SYCL] Use handler to execute graph

    Update API to match the spec change from #26
    to execute a graph via the handler rather than queue submit.

    This spec update includes queue shortcut functions, which I've added
    a new test for.

commit 578692f
Author: Ewan Crawford <[email protected]>
Date:   Thu Nov 24 09:26:27 2022 +0000

    [SYCL] PIMPL refactor

    Refactor the command_graph and node classes so that
    we interface with the impl types rather than
    user exposed types, and just the interface lives in the
    public facing headers.

    This change also means we can use a `.cpp` file for implementation
    code rather than being header only.

    The motivation for these changes was trying to get graph submission
    through a handler, at which point only the `sycl::detail::queue_impl` class
    is available rather than `sycl::queue`

commit 9f127d7
Author: Ewan Crawford <[email protected]>
Date:   Fri Nov 18 16:27:54 2022 +0000

    [SYCL] Repro for reduction fail

    * Add RUN lines to tests so that tests are run by LIT
    * clang-format existing tests, and other minor cleanups
    * Add `graph-explicit-reduction.cpp` which shows fail from #24 by using the `sycl::ext::oneapi::property::queue::lazy_execution` property on a queue which uses a reduction outwith the graph building API

commit 2cf9d0f
Author: Pablo Reble <[email protected]>
Date:   Tue Nov 29 21:26:28 2022 -0600

    Cosmetic changes

commit df971e5
Author: Ben Tracy <[email protected]>
Date:   Thu Nov 24 08:46:12 2022 +0000

    [SYCL] Minor graph classes refactor (#36)

    - getSyclObjImpl and createSyclObjFromImpl support added
    - Minor renaming to enable this.
    - Adds basic results validation to dotp test
    - Minor fixes to address warnings etc.

commit f71ea49
Author: Ewan Crawford <[email protected]>
Date:   Mon Nov 21 12:25:44 2022 +0000

    Common changes from record & replay API (#32)

    Changes to common code from #6
    which has already been reviewed and merged into the
    `sycl-graph-record-replay` branch.

    This patch should not contain anything specific to the record and
    replay API.

commit 383459c
Author: Pablo Reble <[email protected]>
Date:   Tue Nov 1 13:35:42 2022 -0500

    Renaming variables

commit 4478390
Author: Pablo Reble <[email protected]>
Date:   Tue Nov 1 12:45:31 2022 -0500

    clang-format

commit fa58aa3
Author: Pablo Reble <[email protected]>
Date:   Wed Oct 19 20:16:21 2022 -0700

    renaming macro and bugfix

commit 38da3c6
Author: Pablo Reble <[email protected]>
Date:   Tue Oct 18 07:49:47 2022 -0700

    add basic tests

commit 7581915
Author: Pablo Reble <[email protected]>
Date:   Tue Oct 18 07:40:15 2022 -0700

    bugfix

commit fa7494d
Author: Pablo Reble <[email protected]>
Date:   Tue Oct 18 07:39:19 2022 -0700

    starting to rework lazy execution logic

commit 446ac53
Author: Pablo Reble <[email protected]>
Date:   Tue Oct 18 07:37:41 2022 -0700

    revert changes to level-zero plugin

commit 8850b18
Author: Pablo Reble <[email protected]>
Date:   Wed Oct 12 11:33:57 2022 -0700

    fix rebase issue

commit a3164de
Author: Pablo Reble <[email protected]>
Date:   Wed Oct 12 08:03:55 2022 -0700

    update API to recent proposal

commit 7917086
Author: Pablo Reble <[email protected]>
Date:   Tue May 10 11:25:51 2022 -0500

    fix formatting

commit 7d81618
Author: Pablo Reble <[email protected]>
Date:   Fri May 6 11:54:58 2022 -0500

    fix issue introd. by recent merge

commit 9b46c4b
Author: Pablo Reble <[email protected]>
Date:   Fri May 6 10:30:29 2022 -0500

    fix formatting issues

commit 50d49a1
Author: Julian Miller <[email protected]>
Date:   Tue May 3 11:29:34 2022 -0500

    Propagate lazy queue property

commit 0d8a5f4
Author: Pablo Reble <[email protected]>
Date:   Mon Mar 14 14:08:02 2022 +0100

    Apply suggestions from code review

    Co-authored-by: Ronan Keryell <[email protected]>

commit f957996
Author: Pablo Reble <[email protected]>
Date:   Mon May 2 21:06:42 2022 -0500

    fix typos and syntax issues

commit 047839b
Author: Pablo Reble <[email protected]>
Date:   Fri Mar 11 20:47:16 2022 +0100

    typo

commit 2b50af4
Author: Pablo Reble <[email protected]>
Date:   Fri Mar 11 16:42:43 2022 +0100

    update extension proposal started to incorporate feedback

commit a8b5b32
Author: Pablo Reble <[email protected]>
Date:   Tue Feb 22 10:46:54 2022 -0600

    Update pi_level_zero.cpp

    Fix merge conflict

commit 0bad787
Author: Pablo Reble <[email protected]>
Date:   Mon Feb 21 22:25:38 2022 -0600

    fix merge

commit 656f5c3
Author: Pablo Reble <[email protected]>
Date:   Tue Feb 15 17:18:32 2022 -0600

    Adding lazy execution property to queue

commit d286c71
Author: Pablo Reble <[email protected]>
Date:   Fri Feb 18 15:15:10 2022 -0600

    Adding initial sycl graph doc

commit 1acf57e
Author: Pablo Reble <[email protected]>
Date:   Fri Feb 18 15:16:27 2022 -0600

    Initial version of sycl graph prototype
@Bensuo Bensuo deleted the julianmi/graph-emulation-macro branch August 7, 2023 16:27
EwanC pushed a commit that referenced this pull request Nov 8, 2023
…vg (#70914)

This removes explicit invalidation of vg and svg that was done in
`GDBRemoteRegisterContext::AArch64Reconfigure`. This was in fact
covering up a bug elsewhere.

Register information says that a write to vg also invalidates svg (it
does not unless you are in streaming mode, but we decided to keep it
simple and say it always does).

This invalidation was not being applied until *after* AArch64Reconfigure
was called. This meant that without those manual invalidates this
happened:
* vg is written
* svg is not invalidated
* Reconfigure uses the written vg value
* Reconfigure uses the *old* svg value

I have moved the AArch64Reconfigure call to after we've processed the
invalidations caused by the register write, so we no longer need the
manual invalidates in AArch64Reconfigure.
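
The ordering problem can be sketched in plain C++ with a toy register cache. Everything here (`toy_target`, `toy_cache`, `read_svg`, `write_vg_then_reconfigure`) is hypothetical and not lldb code, and the sketch assumes a target state in which svg changes together with vg. It only shows why processing the invalidation before the reconfigure step makes the reconfigure see the new svg.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the remote target's real register values.
struct toy_target {
  std::uint64_t vg;
  std::uint64_t svg;
};

// Hypothetical client-side cache of the two registers.
struct toy_cache {
  std::uint64_t vg = 0;
  std::uint64_t svg = 0;
  bool vg_valid = false;
  bool svg_valid = false;
};

// Reads svg through the cache, refetching from the target only if the cached
// value has been invalidated.
inline std::uint64_t read_svg(toy_cache &c, const toy_target &t) {
  if (!c.svg_valid) {
    c.svg = t.svg;
    c.svg_valid = true;
  }
  return c.svg;
}

// Writes vg, then runs a stand-in "reconfigure" and returns the svg value it
// saw. `invalidate_first` selects the fixed ordering; false reproduces the
// ordering in which the invalidation arrives too late.
inline std::uint64_t write_vg_then_reconfigure(toy_cache &c, toy_target &t,
                                               std::uint64_t new_vg,
                                               bool invalidate_first) {
  t.vg = new_vg;
  t.svg = new_vg;  // assumption: the target's svg follows vg in this state
  c.vg = new_vg;
  c.vg_valid = true;

  if (invalidate_first)
    c.svg_valid = false;  // fixed order: invalidation precedes reconfigure

  const std::uint64_t seen_svg = read_svg(c, t);  // the reconfigure step

  if (!invalidate_first)
    c.svg_valid = false;  // old order: invalidation happens after the read
  return seen_svg;
}
```

In the old ordering the stand-in reconfigure returns the stale cached svg, while in the fixed ordering it refetches and returns the new value without any manual invalidation inside the reconfigure step.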

In addition I have changed the order in which expedited registers are
parsed. These registers come with a stop notification and include,
amongst others, vg and svg.

So now we:
* Parse them and update register values (including vg and svg)
* AArch64Reconfigure, which uses those values, and invalidates every
register, because offsets may have changed.
* Parse the expedited registers again, knowing that none of the values
will have changed due to the scaling.

This means we use the expedited registers during the reconfigure, but
the invalidate does not mean we throw all of them away.

The cost is we parse them twice client side, but this is cheap compared
to a network packet, and is limited to AArch64 targets only.

On a system with SVE and SME, these are the packets sent for a step:
```
(lldb) b-remote.async>  < 803> read packet:
$T05thread:p1f80.1f80;name:main.o;threads:1f80;thread-pcs:000000000040056c<...>a1:0800000000000000;d9:0400000000000000;reason:trace;#fc
intern-state     <  21> send packet: $xfffffffff200,200#5e
intern-state     < 516> read packet:
$e4f2ffffffff000000<...>#71
intern-state     <  15> send packet: $Z0,400568,4#4d
intern-state     <   6> read packet: $OK#9a
dbg.evt-handler  <  16> send packet: $jThreadsInfo#c1
dbg.evt-handler  < 224> read packet:
$[{"name":"main.o","reason":"trace","registers":{"161":"0800000000000000",<...>}],"signal":5,"tid":8064}]]#73
```

You can see there are no extra register reads which means we're using
the expedited registers.

For a write to vg:
```
(lldb) register write vg 4
lldb             <  37> send packet:
$Pa1=0400000000000000;thread:1f80;#4a
lldb             <   6> read packet: $OK#9a
lldb             <  20> send packet: $pa1;thread:1f80;#29
lldb             <  20> read packet: $0400000000000000#04
lldb             <  20> send packet: $pd9;thread:1f80;#34
lldb             <  20> read packet: $0400000000000000#04
```

There is the initial P write, and lldb correctly assumes that SVG is
invalidated by this also so we read back the new vg and svg values
afterwards.
EwanC pushed a commit that referenced this pull request Jan 3, 2024
… (#76167)

…396)"

This reverts commit 8773c9b.
Labels
Graph Implementation Related to DPC++ implementation and testing
4 participants