Extremely slow BLAS muladd when α,β are Int #253
Note that the following gives an indication of what's going on:

julia> @time mul!(C, A, B, 1, 1); # calls native BLAS
  0.021381 seconds (1 allocation: 32 bytes)

julia> @time muladd!(1.0, A, B, 1.0, C); # also calls native BLAS
  0.021018 seconds

julia> @time ArrayLayouts.tiled_blasmul!(ts, 1, A, B, 1, C); # slower Julia code
  0.307682 seconds (9 allocations: 30.609 KiB)

julia> @time muladd!(1, A, B, 1, C); # should call native BLAS but dispatches to tiled_blasmul! because of the Int scalars 🤦♂️
  0.315244 seconds (9 allocations: 30.609 KiB)
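Until the scalars are promoted automatically, one workaround suggested by the timings above is to convert α and β to the destination's element type before the call, which keeps dispatch on the BLAS path (a sketch; the matrices are assumed to be `Matrix{Float64}`, and the sizes here are made up):

```julia
using LinearAlgebra

A = rand(500, 500); B = rand(500, 500); C = zeros(500, 500)  # sizes are assumptions

# Promote the Int scalars to the element type of C before calling mul!,
# so the Float64 BLAS gemm path is selected rather than a generic fallback.
α, β = 1, 1
mul!(C, A, B, convert(eltype(C), α), convert(eltype(C), β))
```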
I see, thanks for the explanation! So, let me check whether I understand: this happens because I used integer scalars for α and β? Could you specify the conditions under which the native BLAS path is taken?
I think we should promote the scalars to float in such cases. This is what `mul!` does.
I think this is a good idea - the current behavior is very unexpected and confusing, especially as it is not consistent with what `LinearAlgebra.mul!` does.
It should be an easy fix
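A sketch of what such a fix could look like (the function name and structure here are assumptions for illustration, not ArrayLayouts' actual code): promote integer α and β to a common floating-point-capable type before dispatching, so the call lands on the BLAS layout path:

```julia
# Hypothetical promotion shim, not the actual ArrayLayouts implementation:
# widen the scalars to the promoted element type before dispatching,
# so Int scalars no longer force the generic tiled fallback.
function promoted_muladd!(α, A, B, β, C)
    T = promote_type(typeof(α), typeof(β), eltype(A), eltype(B), eltype(C))
    muladd!(convert(T, α), A, B, convert(T, β), C)
end
```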
Hi, perhaps I do not understand something about how ArrayLayouts is supposed to work and be used, but it seems that its BLAS matrix multiplication (both default_blasmul! and tiled_blasmul!) is very slow in comparison to the 5-argument LinearAlgebra.mul!. Below is a reproducible example.

This of course also affects operations using the MulAdd wrapper. What am I doing wrong? Or is it really a bug?

Version info

PKG status
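The example itself did not survive the page formatting; a minimal reproduction along the lines described in the report (matrix sizes and the benchmarking form are assumptions) would be:

```julia
using LinearAlgebra, ArrayLayouts, BenchmarkTools

A = rand(1000, 1000); B = rand(1000, 1000); C = zeros(1000, 1000)

@btime mul!($C, $A, $B, 1, 1)     # fast: LinearAlgebra routes this to native BLAS
@btime muladd!(1, $A, $B, 1, $C)  # slow: ArrayLayouts falls back to tiled_blasmul! for Int scalars
```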