-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mono] iOS - Size reduction: optimize corelib runtime type checks in generic functions #54849
Comments
I wrote a toy implementation of an intrinsic that allows some type equality comparisons to be evaluated during AOT compilation here: imhameed@4d747b0#diff-ce66e94ce2cdb8e54003a124a4057ce5d5dc5325ad32c9601a4662f8fa39367aR1808 I've disabled partial generic sharing in this exploratory branch to get genuinely monomorphic instantiations of generic functions. Shared generic instantiations fetch auxiliary data through some sort of generic context. I don't completely understand this yet, but there is some documentation covering shared generics here. For posterity, here is my understanding of what partial generic sharing (as distinct from generic sharing in general) is meant to accomplish, as explained to me by Zoltan: Suppose you have a generic function that takes one type parameter. Call it This is fine when the function is parametric over its type parameters but can be problematic otherwise. Examples: Both cause lots of branchy runtime type checking code to be generated. For these examples, we'd ideally generate specialized monomorphic code for each supported primitive type (ideally this would just wrap a single instruction, maybe with a safepoint poll on function entry), plus a single function that does nothing but immediately raise an exception, shared across all other invalid types. Some preliminary size results from full AOT compilation of System.Private.CoreLib.dll on amd64 Linux:
This is a reduction of ~985.66 KiB (the result is ~89% as large as the baseline). Improvements are more modest on a linked version of the stdlib. Here are some local numbers for size-comparison/MySingleView, with a linked/trimmed copy of System.Private.CoreLib.dll-llvm.o compiled by hand:
The same for System.Private.CoreLib.dll.o, compiled by hand:
This is an improvement of ~191.32 KiB (the result is ~92% as large as the baseline). We have a dead code elimination pass and some constant folding passes that operate on mini IR so I'm not sure why the |
I guess the motivation for partial generic sharing is "we want some valuetype sharing, but we don't want gsharedvt where we can't even copy values to a registers". But it doesn't seem like it's finding a lot of opportunities to pay off and in the case of these helper functions that use In the long term, we will want more control over AOT specialization (particularly for numerics heavy code using the static virtual interface methods feature), so perhaps we would want some version of partial generic sharing (perhaps more aggressive - so not only ints and enums are shared, but maybe wrappers around vector types). But I think for now it makes sense to take the new intrinsic and to flip the default for partial generic sharing to off, and provide an option to turn it back on through command line options. |
The reason this type of sharing is used is mainly to support enums, i.e. for a method Sort, a Sort<T_INT> instance could handle all enums whose base type is int. So this is not really an optimization but required to be able to handle enum and generics efficiently. The problem in this case is that optimizing this code requires higher level optimizations than the ones currently implemented in the mono JIT. I.e.
In theory, the JIT could determine that the result of ldtoken !!T is a type which is either int, or an enum whose base type is int, and eliminate the comparison, but it requires the lowering of ldtoken instruction to happen later, i.e. the JIT IR would need to include an LDTOKEN opcode, and a JIT pass which operates on them. |
In the specific case I'm calling out here ( |
Comparisons of the form ``` internal static void ThrowForUnsupportedIntrinsicsVectorBaseType<T>() where T : struct { if (typeof(T) != typeof(byte) && typeof(T) != typeof(sbyte) && ... } ``` are very common in BCL code. They are compiled to: ``` IL_0001: ldtoken !!T IL_0006: call [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle([System.Runtime]System.RuntimeTypeHandle) IL_000b: ldtoken [System.Runtime]System.Byte IL_0010: call [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle([System.Runtime]System.RuntimeTypeHandle) IL_0015: call bool [System.Runtime]System.Type::op_Inequality([System.Runtime]System.Type, [System.Runtime]System.Type) IL_001a: brfalse IL_015a ``` In gshared code, !!T is usually constrained to be a reference, or to have a primitive base type. Use this fact to eliminate some of the Type::op_Equality/op_Inequality () calls. Fixes dotnet#54849.
Seen while investigating size differences between legacy Mono and netcore Mono when used for iOS. Functions that call
ThrowHelper.ThrowForUnsupportedNumericVectorBaseType
are a great example of what goes wrong; with both mini and LLVM we fail to elide impossible branches and we end up repeatedly generating kilobytes of noisy dead code across several functions.The text was updated successfully, but these errors were encountered: