forked from intel/llvm
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Execute graph using handler #26
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Bensuo
approved these changes
Nov 17, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a minor suggestion.
reble
reviewed
Nov 22, 2022
reble
reviewed
Nov 29, 2022
reble
approved these changes
Nov 29, 2022
Change the `queue::submit(command_graph<executable>)` API for launching an executable graph to `handler::exec_graph(command_graph<executable>)`. See Issue #21 Using the handler is more in-keeping with the existing SYCL API and allows the execution of graph to depend on an arbitrary event through `handler::depends_on`. This design makes it easier for users to write the following code without have to block in host for waits. ```cpp auto ev1 = q.submit([&](handler& cgh){ cgh.exec_graph(g); }); // dest is some input to graph `g` auto ev2 = q.memcpy(dest, src, numBytes, ev1); auto ev3 = q.submit([&](handler& cgh){ cgh.depends_on(ev2); cgh.exec_graph(g); }); ``` Queue shortcut functions are also included, as is the case in the core SYCL spec for other handler functionality. This change should also enable the explicit API to capture nested sub-graph executions, which is not currently possible in the explicit API but is possible in the record & replay API. See issue #23 For example, a user can now do: ```cpp command_graph<executable> executable_graph; auto node = recordable_graph.add([&](handler& cgh){ cgh.exec_graph(executable_graph); }); ```
This was referenced Nov 30, 2022
EwanC
added a commit
that referenced
this pull request
Dec 2, 2022
Update API to match the spec change from #26 to execute a graph via the handler rather than queue submit. This spec update includes queue shortcut functions, which i've added a new test for.
EwanC
added a commit
that referenced
this pull request
Dec 2, 2022
Update API to match the spec change from #26 to execute a graph via the handler rather than queue submit. This spec update includes queue shortcut functions, which i've added a new test for.
Bensuo
added a commit
that referenced
this pull request
May 2, 2023
commit 2348227 Author: Ben Tracy <[email protected]> Date: Wed Apr 19 14:48:17 2023 +0100 [SYCL] Update graph constructor/finalize to current spec (#140) - Add device and context params to graph constructor - Remove context from finalize - Minor changes to graph_impl to support this - Update all examples to use updated API - Tidied up ordering of graph_impl declarations a little commit 7e580c5 Author: Ben Tracy <[email protected]> Date: Wed Apr 19 13:46:52 2023 +0100 [SYCL] Fix subgraphs, move sync points to exec graph (#134) * [SYCL] Fix subgraphs, move sync points to exec graph - Fixes subgraph support for command buffer graphs - Move sync points to executable graph instead of node - Removed unused graph impl from nodes - Kernel dims are now correctly reversed before submission with dims > 1 - Remove unnecessary call to piEventCreate commit 2f75c88 Author: Ewan Crawford <[email protected]> Date: Thu Apr 13 12:48:40 2023 +0100 [SYCL] Replace lazy queue property with PI command-buffers. (#100) - Remove lazy queue property - Use command buffers inside graphs for execution - Separate executable graph impl from modifiable graph impl - Implement handler::depends_on for record and replay nodes - New test for finalizing different graphs from the same modifiable one - graph-record-dotp now uses handler::depends_on - Implement arg filtering before setting args - Make applyFuncOnFilteredArgs accessible from commands.hpp - Track dependencies through empty nodes in graphs - Guard reduction use in device mem example - Fix issues with empty node example - Guard command buffer behind SYCL_EXT_ONEAPI_GRAPH - Recreate simple submission in emulation mode --------- Co-authored-by: Ben Tracy <[email protected]> commit 33d64f9 Author: Pablo Reble <[email protected]> Date: Fri Mar 31 12:46:04 2023 -0500 [SYCL] Add empty node implementation (#112) Co-authored-by: Ben Tracy <[email protected]> commit 187c9d0 Merge: ec71841 7d4e315 Author: Julian Miller <[email protected]> Date: Thu Mar 30 18:21:03 2023 +0200 Merge pull request #115 from reble/julianmi/graph-testing-waits Graph Testing: Add missing waits and USM device tests commit ec71841 Merge: 1efde99 9b95a70 Author: Julian Miller <[email protected]> Date: Thu Mar 30 18:20:44 2023 +0200 Merge pull request #71 from reble/julianmi/graph-emulation-macro Guard SYCL Graph implementation and fallback emulation commit 7d4e315 Author: Julian Miller <[email protected]> Date: Wed Mar 29 12:05:55 2023 +0200 Add USM device graph test commit 6b89b23 Author: Julian Miller <[email protected]> Date: Wed Mar 29 12:04:50 2023 +0200 Add missing waits in graph tests commit 9b95a70 Author: Julian Miller <[email protected]> Date: Tue Mar 28 19:15:55 2023 +0200 Remove unneeded includes commit e285e0a Author: Julian Miller <[email protected]> Date: Wed Mar 22 17:43:00 2023 +0100 Add compiler configuration instructions for SYCL Graph commit 5f31bfa Author: Pablo Reble <[email protected]> Date: Wed Mar 1 14:59:14 2023 -0600 Update README.md commit 7370c0b Author: Pablo Reble <[email protected]> Date: Wed Mar 1 08:48:16 2023 -0600 Update README.md add first draft of landing page commit e5f4da8 Author: Julian Miller <[email protected]> Date: Tue Mar 21 17:27:01 2023 +0100 Remove guarded members commit 26b24a9 Author: Julian Miller <[email protected]> Date: Mon Mar 13 17:25:43 2023 +0100 Add feature test macro commit 152ccea Author: Julian Miller <[email protected]> Date: Fri Jan 20 18:06:55 2023 +0100 Guard SYCL Graph implementation and fallback emulation commit 1efde99 Author: Ben Tracy <[email protected]> Date: Thu Mar 23 12:26:50 2023 +0000 [SYCL] Remove CGF reuse in graph nodes - Note reductions are broken by this commit due to missing accessor support - Handler info is extracted and copied into nodes - Adding nodes in record and replay moved to finalize. - Workarounds for reduction wg sizes added. - Introduce `graph-record-temp-scope.cpp` test case which fails before this commit and passes afterwards. Instead of USM arguments, it is buffer accessors that should be used for edge detection. Fixes `graph-explicit-node-ordering.cpp` test ordering which is currently creating incorrect extra edges Also added `graph-explicit-dotp-buffer.cpp` test for explicit API with accessor edges, we can use to see if this logic works once accessors are better supported. This change adds a new handler constructor which takes a graph, rather than creating a default temporary queue object to pass to the existing constructor. Co-authored-by: Ewan Crawford <[email protected]> commit b7f17c8 Author: Ewan Crawford <[email protected]> Date: Tue Mar 21 08:15:57 2023 +0000 [SYCL] Update record & replay tests Update the record & replay tests to match changes from #72 which were missed after merging the record and replay branch: * Remove unused headers * Uses asserts instead of printing to std out commit d2ff468 Author: Julian Miller <[email protected]> Date: Thu Mar 16 10:08:27 2023 +0100 [SYCL] Improve Graphs testing * Extend testing * Fix reduction test * Add test to verify node ordering * Update sycl include * Switch to assertions in graph tests * Formatting commit 068dd95 Author: Pablo Reble <[email protected]> Date: Mon Mar 13 11:14:29 2023 -0500 Resolving naming style mismatch (#86) commit 66d1b6b Author: Pablo Reble <[email protected]> Date: Thu Mar 2 23:54:48 2023 -0600 Improve code location and replace shared ptr aliases (#82) commit 62d6b15 Author: Ben Tracy <[email protected]> Date: Tue Feb 28 10:53:46 2023 +0000 [SYCL][PI] Prototype command_buffer API in level zero - Adds a prototype of an explicit command buffer - Implemented only for level zero backend - Unit tests added which test new entry points. commit d4c1ed3 Author: Ewan Crawford <[email protected]> Date: Mon Feb 27 08:48:23 2023 +0000 [SYCL] Record & Replay Implementation Implementation of Record & Replay API with tests Co-authored-by: Ben Tracy <[email protected]> commit 06c588f Author: Pablo Reble <[email protected]> Date: Thu Feb 9 10:53:47 2023 -0600 Apply suggestions from code review Co-authored-by: Steffen Larsen <[email protected]> commit 0ac7a7e Author: Pablo Reble <[email protected]> Date: Thu Jan 19 10:29:46 2023 -0600 Adding new example using make edge function (#63) Co-authored-by: Ben Tracy <[email protected]> commit 1249fbc Author: Ewan Crawford <[email protected]> Date: Thu Jan 19 10:03:56 2023 +0000 [SYCL] Pass property_list to APIs Adds the `sycl::property_list` to the constructor of `command_graph<modifiable>()` and `finalize()` to match spec change #67 commit 4a306ed Author: Ben Tracy <[email protected]> Date: Wed Jan 11 10:53:16 2023 +0000 [SYCL] Add unit tests for command graph POC - Add some unit tests for the command graph POC -Add missing specializations for lazy queue property commit fb28d59 Author: Ben Tracy <[email protected]> Date: Mon Jan 9 11:10:26 2023 +0000 [SYCL] Rename exec_graph to ext_oneapi_graph [SYCL] handler::ext_oneapi_graph Update to reflect changes from #65 - In line with recent spec changes, rename handler and queue shortcut functions from exec_graph to ext_oneapi_graph - Also updated usage in the examples Co-authored-by: Ewan Crawford <[email protected]> commit 1448cb5 Author: Ben Tracy <[email protected]> Date: Wed Dec 21 09:10:40 2022 +0000 [SYCL] Enable submitting sub-graphs - Enable submitting a sub-graph as part of a larger command_graph - Flag added to queue_impl to enable graph to be aware it is a sub-graph and delay flush - Adds an example whichuses a subgraph in the middle of a command_graph commit c99bdca Author: Ben Tracy <[email protected]> Date: Tue Dec 13 10:57:15 2022 +0000 [SYCL] Fix reductions not working inside graph * Graph submission now properly creates a host visible event on the command list allowing auxilliary resources to be cleaned up * executeCommandList slightly modified to block execution only for command lists not allowed to be batched. commit 3073cfc Author: Ewan Crawford <[email protected]> Date: Fri Dec 2 10:47:32 2022 +0000 [SYCL] Clean-up lazy queue PI changes * PI Minor version bump for new flag * Document new PI property as comments * Make value next consecutive bit `1 << 5`, rather than `1 << 11`. commit 7bb11ce Author: Ewan Crawford <[email protected]> Date: Wed Nov 30 13:14:50 2022 +0000 [SYCL] Use handler to execute graph Update API to match the spec change from #26 to execute a graph via the handler rather than queue submit. This spec update includes queue shortcut functions, which i've added a new test for. commit 578692f Author: Ewan Crawford <[email protected]> Date: Thu Nov 24 09:26:27 2022 +0000 [SYCL] PIMPL refactor Refactor the command_graph and node classes so that we interface with the impl types rather than user exposed types, and just the interface lives in the public facing headers. This change also means we can use a `.cpp` file for implementation code rather than being header only. The motivation for these changes was trying to get graph submission through a handler, at which point only the `sycl::detail::queue_impl` class is available rather than `sycl::queue` commit 9f127d7 Author: Ewan Crawford <[email protected]> Date: Fri Nov 18 16:27:54 2022 +0000 [SYCL] Repro for reduction fail * Add RUN lines to tests so that tests are run by LIT * clang-format existing tests, and other minor cleanups * Add `graph-explicit-reduction.cpp` which shows fail from #24 by using the `sycl::ext::oneapi::property::queue::lazy_execution` property on a queue which uses a reduction outwith the graph building API commit 2cf9d0f Author: Pablo Reble <[email protected]> Date: Tue Nov 29 21:26:28 2022 -0600 Cosmetic changes commit df971e5 Author: Ben Tracy <[email protected]> Date: Thu Nov 24 08:46:12 2022 +0000 [SYCL] Minor graph classes refactor (#36) - getSyclObjImpl and createSyclObjFromImpl support added - Minor renaming to enable this. - Adds basic results validation to dotp test - Minor fixes to address warnings etc. commit f71ea49 Author: Ewan Crawford <[email protected]> Date: Mon Nov 21 12:25:44 2022 +0000 Common changes from record & replay API (#32) Changes to common code from #6 which has already been reviewed and merged into the `sycl-graph-record-replay` branch. This patch should not contain anything specific to the record and replay API. commit 383459c Author: Pablo Reble <[email protected]> Date: Tue Nov 1 13:35:42 2022 -0500 Renaming variables commit 4478390 Author: Pablo Reble <[email protected]> Date: Tue Nov 1 12:45:31 2022 -0500 clang-format commit fa58aa3 Author: Pablo Reble <[email protected]> Date: Wed Oct 19 20:16:21 2022 -0700 renaming macro and bugfix commit 38da3c6 Author: Pablo Reble <[email protected]> Date: Tue Oct 18 07:49:47 2022 -0700 add basic tests commit 7581915 Author: Pablo Reble <[email protected]> Date: Tue Oct 18 07:40:15 2022 -0700 bugfix commit fa7494d Author: Pablo Reble <[email protected]> Date: Tue Oct 18 07:39:19 2022 -0700 starting to rework lazy execution logic commit 446ac53 Author: Pablo Reble <[email protected]> Date: Tue Oct 18 07:37:41 2022 -0700 revert changes to level-zero plugin commit 8850b18 Author: Pablo Reble <[email protected]> Date: Wed Oct 12 11:33:57 2022 -0700 fix rebase issue commit a3164de Author: Pablo Reble <[email protected]> Date: Wed Oct 12 08:03:55 2022 -0700 update API to recent proposal commit 7917086 Author: Pablo Reble <[email protected]> Date: Tue May 10 11:25:51 2022 -0500 fix formatting commit 7d81618 Author: Pablo Reble <[email protected]> Date: Fri May 6 11:54:58 2022 -0500 fix issue introd. by recent merge commit 9b46c4b Author: Pablo Reble <[email protected]> Date: Fri May 6 10:30:29 2022 -0500 fix formatting issues commit 50d49a1 Author: Julian Miller <[email protected]> Date: Tue May 3 11:29:34 2022 -0500 Propagate lazy queue property commit 0d8a5f4 Author: Pablo Reble <[email protected]> Date: Mon Mar 14 14:08:02 2022 +0100 Apply suggestions from code review Co-authored-by: Ronan Keryell <[email protected]> commit f957996 Author: Pablo Reble <[email protected]> Date: Mon May 2 21:06:42 2022 -0500 fix typos and syntax issues commit 047839b Author: Pablo Reble <[email protected]> Date: Fri Mar 11 20:47:16 2022 +0100 typo commit 2b50af4 Author: Pablo Reble <[email protected]> Date: Fri Mar 11 16:42:43 2022 +0100 update extension proposal started to incorporate feedback commit a8b5b32 Author: Pablo Reble <[email protected]> Date: Tue Feb 22 10:46:54 2022 -0600 Update pi_level_zero.cpp Fix merge conflict commit 0bad787 Author: Pablo Reble <[email protected]> Date: Mon Feb 21 22:25:38 2022 -0600 fix merge commit 656f5c3 Author: Pablo Reble <[email protected]> Date: Tue Feb 15 17:18:32 2022 -0600 Adding lazy execution property to queue commit d286c71 Author: Pablo Reble <[email protected]> Date: Fri Feb 18 15:15:10 2022 -0600 Adding initial sycl graph doc commit 1acf57e Author: Pablo Reble <[email protected]> Date: Fri Feb 18 15:16:27 2022 -0600 Inital version of sycl graph prototype
reble
pushed a commit
that referenced
this pull request
May 15, 2023
…callback The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`. In particular, `m_map`. `TypeSystemMap::ForEach` iterates through the entire `m_map` calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, there's nothing that guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This lead to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the swift plugin with following backtrace: ``` int main() { std::unique_ptr<int> up = std::make_unique<int>(5); volatile int val = *up; return val; } clang++ -std=c++2a -g -O1 main.cpp ./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b ``` ``` frame #4: std::lock_guard<std::mutex>::lock_guard frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2 frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage ... frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths frame #27: lldb_private::SwiftASTContext::LoadModule frame #30: swift::ModuleDecl::collectLinkLibraries frame #31: lldb_private::SwiftASTContext::LoadModule frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState frame #38: lldb_private::Target::GetPersistentSymbol frame #41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1 frame #42: lldb_private::Target::GetPersistentSymbol frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols frame #44: lldb_private::IRExecutionUnit::FindSymbol frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame #48: llvm::LinkingSymbolResolver::findSymbol frame #49: llvm::LegacyJITSymbolResolver::lookup frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols frame #51: llvm::RuntimeDyldImpl::resolveRelocations frame #52: llvm::MCJIT::finalizeLoadedModules frame #53: llvm::MCJIT::finalizeObject frame #54: lldb_private::IRExecutionUnit::ReportAllocations frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo frame #56: lldb_private::ClangExpressionParser::PrepareForExecution frame #57: lldb_private::ClangUserExpression::TryParse frame #58: lldb_private::ClangUserExpression::Parse ``` Our solution is to simply iterate over a local copy of `m_map`. **Testing** * Confirmed on manual reproducer (would reproduce 100% of the time before the patch) Differential Revision: https://reviews.llvm.org/D149949
EwanC
pushed a commit
that referenced
this pull request
Feb 27, 2024
…(#80904)" This reverts commit b1ac052. This commit breaks coroutine splitting for non-swift calling convention functions. In this example: ```ll ; ModuleID = 'repro.ll' source_filename = "stdlib/test/runtime/test_llcl.mojo" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @craSH to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 } define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 { ret void } define dso_local void @craSH(ptr %0) #0 { %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0) %3 = call ptr @llvm.coro.begin(token %2, ptr null) %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0 %5 = call ptr @llvm.coro.async.resume() store ptr %5, ptr %4, align 8 %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison) ret void } define dso_local ptr @ctxt_proj_fn(ptr %0) #0 { ret ptr %0 } ; Function Attrs: nomerge nounwind declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1 ; Function Attrs: nounwind declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2 ; Function Attrs: nounwind declare ptr @llvm.coro.begin(token, ptr writeonly) #2 ; Function Attrs: nomerge nounwind declare ptr @llvm.coro.async.resume() #1 attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" } attributes #1 = { nomerge nounwind } attributes #2 = { nounwind } ``` This verifier crashes after the `coro-split` pass with ``` cannot guarantee tail call due to mismatched parameter counts musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison) LLVM ERROR: Broken function PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: opt ../../../reduced.ll -O0 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20 #3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36 #4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10 #7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10 #8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6 #9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7 #10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40 #11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23 #12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69 #13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28 #14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70 #15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32 #16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41 #17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24 #18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7 #19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7 #20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7 #21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14 #22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12 #23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13 #24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12 #26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18 #28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 #30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74 #31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 #32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 #33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33 #34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17 #35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7 #36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7 #37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7 #38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10 #39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27 #40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33 #41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20 #43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5 #44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335) Aborted (core dumped)
EwanC
pushed a commit
that referenced
this pull request
Mar 19, 2024
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc on failure. And then the wrapped malloc appears to access thread local storage using global dynamic accesses, thus calling ___interceptor___tls_get_addr, before REAL(__tls_get_addr) has been set, so we get a crash inside ___interceptor___tls_get_addr. For example, this can happen when looking up __isoc23_scanf which might not exist in some libcs. Fix this by marking the thread local variable accessed inside the debug checks as "initial-exec", which does not require __tls_get_addr. This is probably a better alternative to llvm/llvm-project#83886. This fixes a different crash but is related to llvm/llvm-project#46204. Backtrace: ``` #0 0x0000000000000000 in ?? () #1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759 #2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218 #3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129 #4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167 #5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383 #6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207 #7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103 #8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39 #9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69 #10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198 #11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223 #12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666 #13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56 #14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so", fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157 #15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793 --Type <RET> for more, q to quit, c to continue without paging-- #16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146 #17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195 #18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40 #19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237 #20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256 #21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138 #22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54 #23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68 #24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42 #25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61 #26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315 ``` Reviewed By: vitalybuka, MaskRay Pull Request: llvm/llvm-project#83890
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Change the
queue::submit(command_graph<executable>)
API for launching an executable graph tohandler::exec_graph(command_graph<executable>)
. See Issue #21Using the handler is more in-keeping with the existing SYCL API and allows the execution of graph to depend on an arbitrary event through
handler::depends_on
. This design makes it easier for users to write the following code without have to block in host for waits.Queue shortcut functions are also included in the spec, as is the case in the core SYCL spec for other handler functionality.
This change also enables the explicit API to capture nested sub-graph executions, which is not currently possible in the explicit API but is possible in the record & replay API. See issue #23
For example, with this change a user can now do: