Dynamic dispatch when defining @device_override Base.max(x::Int64, y::Int64) #547

Open
christiangnrd opened this issue Feb 18, 2025 · 1 comment

christiangnrd commented Feb 18, 2025

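Defining @device_override Base.max(x::Int64, y::Int64) breaks kernel compilation: a broadcast inside the kernel ends up dynamically dispatched, and compilation fails with an InvalidIRError.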

MWE, run from a dev'ed Metal directory:

$ julia --proj -e'using LinearAlgebra, GPUArrays, Metal;Metal.@device_override Base.max(x::Int64, y::Int64)   = ccall("extern air.max.s.i64", llvmcall, Int64, (Int64, Int64), x, y); T=Float32; A, b = mtl(rand(T, 4, 4)), mtl(rand(T, 4)); c=Metal.zeros(T,4); GPUArrays.generic_matmatmul!(c,A,b, LinearAlgebra.MulAddMul(1,0))'
Stacktrace
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#gpu_matmatmul_kernel!#96"{Float32, LinearAlgebra.MulAddMul{true, true, Int64, Int64}})(::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, ::MtlDeviceVector{Float32, 1}, ::MtlDeviceMatrix{Float32, 1}, ::MtlDeviceVector{Float32, 1}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
 [1] malloc
   @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:85
 [2] gc_pool_alloc
   @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:116
 [3] copy
   @ ./broadcast.jl:1102
 [4] materialize
   @ ./broadcast.jl:872
 [5] macro expansion
   @ ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:345
 [6] gpu_matmatmul_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [7] gpu_matmatmul_kernel!
   @ ./none:0
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
  [1] malloc
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:85
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:180
  [3] macro expansion
    @ ./none:0
  [4] box
    @ ./none:0
  [5] box_int64
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:209
  [6] Val
    @ ./essentials.jl:1037
  [7] copy
    @ ./broadcast.jl:1102
  [8] materialize
    @ ./broadcast.jl:872
  [9] macro expansion
    @ ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:345
 [10] gpu_matmatmul_kernel!
    @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [11] gpu_matmatmul_kernel!
    @ ./none:0
Reason: unsupported call to an unknown function (call to jl_f_apply_type)
Stacktrace:
 [1] Val
   @ ./essentials.jl:1037
 [2] copy
   @ ./broadcast.jl:1102
 [3] materialize
   @ ./broadcast.jl:872
 [4] macro expansion
   @ ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:345
 [5] gpu_matmatmul_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [6] gpu_matmatmul_kernel!
   @ ./none:0
Reason: unsupported call to an unknown function (call to ijl_new_structv)
Stacktrace:
 [1] Val
   @ ./essentials.jl:1035
 [2] Val
   @ ./essentials.jl:1037
 [3] copy
   @ ./broadcast.jl:1102
 [4] materialize
   @ ./broadcast.jl:872
 [5] macro expansion
   @ ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:345
 [6] gpu_matmatmul_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [7] gpu_matmatmul_kernel!
   @ ./none:0
Reason: unsupported dynamic function invocation (call to ntuple)
Stacktrace:
 [1] copy
   @ ./broadcast.jl:1102
 [2] materialize
   @ ./broadcast.jl:872
 [3] macro expansion
   @ ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:345
 [4] gpu_matmatmul_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [5] gpu_matmatmul_kernel!
   @ ./none:0
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/validation.jl:167
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:382 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:381 [inlined]
  [5] emit_llvm(job::GPUCompiler.CompilerJob; toplevel::Bool, libraries::Bool, optimize::Bool, cleanup::Bool, validate::Bool, only_entry::Bool)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/utils.jl:110
  [6] codegen(output::Symbol, job::GPUCompiler.CompilerJob; toplevel::Bool, libraries::Bool, optimize::Bool, cleanup::Bool, validate::Bool, strip::Bool, only_entry::Bool, parent_job::Nothing)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:100
  [7] codegen
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:82 [inlined]
  [8] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:79
  [9] compile
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:74 [inlined]
 [10] (::Metal.var"#155#163"{GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}})(ctx::LLVM.Context)
    @ Metal ~/.julia/dev/Metal/src/compiler/compilation.jl:108
 [11] JuliaContext(f::Metal.var"#155#163"{GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:34
 [12] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:25
 [13] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/compilation.jl:107 [inlined]
 [14] macro expansion
    @ ~/.julia/packages/ObjectiveC/TgrW6/src/os.jl:264 [inlined]
 [15] compile(job::GPUCompiler.CompilerJob)
    @ Metal ~/.julia/dev/Metal/src/compiler/compilation.jl:105
 [16] actual_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, compiler::typeof(Metal.compile), linker::typeof(Metal.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/execution.jl:237
 [17] cached_compilation(cache::Dict{Any, Any}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/execution.jl:151
 [18] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/execution.jl:189 [inlined]
 [19] macro expansion
    @ ./lock.jl:273 [inlined]
 [20] mtlfunction(f::GPUArrays.var"#gpu_matmatmul_kernel!#96"{Float32, LinearAlgebra.MulAddMul{true, true, Int64, Int64}}, tt::Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, MtlDeviceVector{Float32, 1}, MtlDeviceMatrix{Float32, 1}, MtlDeviceVector{Float32, 1}}}; name::Nothing, kwargs::@Kwargs{})
    @ Metal ~/.julia/dev/Metal/src/compiler/execution.jl:184
 [21] mtlfunction(f::GPUArrays.var"#gpu_matmatmul_kernel!#96"{Float32, LinearAlgebra.MulAddMul{true, true, Int64, Int64}}, tt::Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, MtlDeviceVector{Float32, 1}, MtlDeviceMatrix{Float32, 1}, MtlDeviceVector{Float32, 1}}})
    @ Metal ~/.julia/dev/Metal/src/compiler/execution.jl:182
 [22] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/execution.jl:85 [inlined]
 [23] (::KernelAbstractions.Kernel{MetalBackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_matmatmul_kernel!#96"{Float32, LinearAlgebra.MulAddMul{true, true, Int64, Int64}}})(::MtlVector{Float32, Metal.PrivateStorage}, ::Vararg{Any}; ndrange::Tuple{Int64}, workgroupsize::Nothing)
    @ Metal.MetalKernels ~/.julia/dev/Metal/src/MetalKernels.jl:110
 [24] generic_matmatmul!(C::MtlVector{Float32, Metal.PrivateStorage}, A::MtlMatrix{Float32, Metal.PrivateStorage}, B::MtlVector{Float32, Metal.PrivateStorage}, add::LinearAlgebra.MulAddMul{true, true, Int64, Int64})
    @ GPUArrays ~/.julia/packages/GPUArrays/uiVyU/src/host/linalg.jl:358
 [25] top-level scope
    @ none:1
christiangnrd commented Feb 18, 2025

New MWE.

repro.jl:

using KernelAbstractions, Metal, LLVM.Interop
Metal.@device_override Base.max(x::Int64, y::Int64) = ccall("extern air.max.s.i64", llvmcall, Int64, (Int64, Int64), x, y)

c = Metal.zeros(4)

# Modified from GPUArrays.generic_matmatmul!
function reproducer!(C::AbstractArray{R}) where {R}
    @kernel function repro_kernel!(C)
        assume.(size(C) .> 0)
    end
    repro_kernel!(get_backend(C))(C; ndrange = size(C))
end
reproducer!(c)
Stacktrace
ERROR: LoadError: InvalidIRError: compiling MethodInstance for (::var"#gpu_repro_kernel!#2")(::KernelAbstractions.CompilerMetadata{…}, ::MtlDeviceVector{…}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
 [1] malloc
   @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:85
 [2] gc_pool_alloc
   @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:116
 [3] copy
   @ ./broadcast.jl:1102
 [4] materialize
   @ ./broadcast.jl:872
 [5] gpu_repro_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [6] gpu_repro_kernel!
   @ ./none:0
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
  [1] malloc
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:85
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:180
  [3] macro expansion
    @ ./none:0
  [4] box
    @ ./none:0
  [5] box_int64
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/runtime.jl:209
  [6] Val
    @ ./essentials.jl:1037
  [7] copy
    @ ./broadcast.jl:1102
  [8] materialize
    @ ./broadcast.jl:872
  [9] gpu_repro_kernel!
    @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [10] gpu_repro_kernel!
    @ ./none:0
Reason: unsupported call to an unknown function (call to jl_f_apply_type)
Stacktrace:
 [1] Val
   @ ./essentials.jl:1037
 [2] copy
   @ ./broadcast.jl:1102
 [3] materialize
   @ ./broadcast.jl:872
 [4] gpu_repro_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [5] gpu_repro_kernel!
   @ ./none:0
Reason: unsupported call to an unknown function (call to ijl_new_structv)
Stacktrace:
 [1] Val
   @ ./essentials.jl:1035
 [2] Val
   @ ./essentials.jl:1037
 [3] copy
   @ ./broadcast.jl:1102
 [4] materialize
   @ ./broadcast.jl:872
 [5] gpu_repro_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [6] gpu_repro_kernel!
   @ ./none:0
Reason: unsupported dynamic function invocation (call to ntuple)
Stacktrace:
 [1] copy
   @ ./broadcast.jl:1102
 [2] materialize
   @ ./broadcast.jl:872
 [3] gpu_repro_kernel!
   @ ~/.julia/packages/KernelAbstractions/sWSE0/src/macros.jl:322
 [4] gpu_repro_kernel!
   @ ./none:0
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/validation.jl:167
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:382 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/6KVfH/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:381 [inlined]
  [5] emit_llvm
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/utils.jl:110
  [6] codegen
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:100
  [7] codegen
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:82 [inlined]
  [8] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:79
  [9] compile
    @ ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:74 [inlined]
 [10] (::Metal.var"#155#163"{GPUCompiler.CompilerJob{GPUCompiler.MetalCompilerTarget, Metal.MetalCompilerParams}})(ctx::LLVM.Context)
    @ Metal ~/.julia/dev/Metal/src/compiler/compilation.jl:108
 [11] JuliaContext(f::Metal.var"#155#163"{GPUCompiler.CompilerJob{…}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:34
 [12] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/driver.jl:25
 [13] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/compilation.jl:107 [inlined]
 [14] macro expansion
    @ ~/.julia/packages/ObjectiveC/TgrW6/src/os.jl:264 [inlined]
 [15] compile(job::GPUCompiler.CompilerJob)
    @ Metal ~/.julia/dev/Metal/src/compiler/compilation.jl:105
 [16] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(Metal.compile), linker::typeof(Metal.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/execution.jl:237
 [17] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/OGnEB/src/execution.jl:151
 [18] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/execution.jl:189 [inlined]
 [19] macro expansion
    @ ./lock.jl:273 [inlined]
 [20] mtlfunction(f::var"#gpu_repro_kernel!#2", tt::Type{Tuple{…}}; name::Nothing, kwargs::@Kwargs{})
    @ Metal ~/.julia/dev/Metal/src/compiler/execution.jl:184
 [21] mtlfunction
    @ ~/.julia/dev/Metal/src/compiler/execution.jl:182 [inlined]
 [22] macro expansion
    @ ~/.julia/dev/Metal/src/compiler/execution.jl:85 [inlined]
 [23] (::KernelAbstractions.Kernel{…})(args::MtlVector{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
    @ Metal.MetalKernels ~/.julia/dev/Metal/src/MetalKernels.jl:110
 [24] reproducer!(C::MtlVector{Float32, Metal.PrivateStorage})
    @ Main ~/.julia/dev/Metal/repro.jl:11
 [25] top-level scope
    @ ~/.julia/dev/Metal/repro.jl:13
 [26] include(fname::String)
    @ Main ./sysimg.jl:38
 [27] top-level scope
    @ REPL[1]:1
in expression starting at /Users/christian/.julia/dev/Metal/repro.jl:13
Some type information was truncated. Use `show(err)` to see complete types.
