Implement NarrowUtf16ToAscii for AArch64 #70080

Merged


SwapnilGaikwad
Contributor

Fixes #41292 partially.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jun 1, 2022
@ghost

ghost commented Jun 1, 2022

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

@dnfadmin

dnfadmin commented Jun 1, 2022

CLA assistant check
All CLA requirements met.

@SwapnilGaikwad
Contributor Author

Hi @kunalspathak, you might want to take a look at this PR.

@kunalspathak
Member

@dotnet/jit-contrib

@kunalspathak
Member

Thanks @SwapnilGaikwad for your contribution. Could you also share some performance numbers? You can start with https://github.com/dotnet/performance/blob/d7dac8a7ca12a28d099192f8a901cf8e30361384/src/benchmarks/micro/libraries/System.Text.Encoding/Perf.Encoding.cs.

@SwapnilGaikwad
Contributor Author

> Thanks @SwapnilGaikwad for your contribution. Could you also share some performance numbers? You can start with https://github.com/dotnet/performance/blob/d7dac8a7ca12a28d099192f8a901cf8e30361384/src/benchmarks/micro/libraries/System.Text.Encoding/Perf.Encoding.cs.

Hi Kunal, I did a quick test on an A72 of the GetBytes method from the System.Text.Tests.Perf_Encoding class. The patch executes about 6% faster (for ascii) and 3% faster (for utf-8) on strings of size 2048 relative to main.
However, I had to comment out the debug asserts to avoid emitting them in the assembly. We are inspecting the assembly further to spot suboptimal instruction sequences (I'll work on this next week because of public holidays).

@kunalspathak
Member

@SwapnilGaikwad - Did you get a chance to make any progress?

@SwapnilGaikwad
Contributor Author

Hi @kunalspathak, we made progress. Currently benchmarking the version that combines SSE2 and ASIMD implementations along with the above comments. I hope to get it ready by tomorrow.

@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch 2 times, most recently from ef51903 to daf52d6 Compare June 15, 2022 16:34
@SwapnilGaikwad
Contributor Author

Hi @kunalspathak, the patch now makes more use of the vector API. For ASCII strings of 512 size, it executes in about 0.92x on AArch64 and 0.86x on x86 compared to the execution time for the HEAD.
The generic vector implementation (in the Vector.IsHardwareAccelerated block) performs chunked reads, so the improvement is not as significant as one would expect when moving from a scalar to a SIMD version.
Do you have any recommendations for extracting the assembly for the intrinsic? Couldn't get it using --disassm option from the docs.
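
For readers following along, here is a minimal sketch of the kind of cross-platform narrowing step being discussed. It is not the PR's code: the class `NarrowSketch`, the method body, and the 16-element chunk size are assumptions made for illustration, and it relies only on the `Vector128` helpers that shipped with .NET 7.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static class NarrowSketch
{
    // Hypothetical helper, not the PR's implementation: narrows UTF-16 to ASCII
    // and returns how many leading elements were converted. It stops at the
    // first non-ASCII char.
    public static int NarrowUtf16ToAscii(ReadOnlySpan<char> utf16, Span<byte> ascii)
    {
        int length = Math.Min(utf16.Length, ascii.Length);
        ref ushort src = ref Unsafe.As<char, ushort>(ref MemoryMarshal.GetReference(utf16));
        ref byte dst = ref MemoryMarshal.GetReference(ascii);
        int i = 0;

        if (Vector128.IsHardwareAccelerated)
        {
            // Any bit in 0xFF80 being set means the code unit is above 0x7F.
            Vector128<ushort> nonAsciiMask = Vector128.Create((ushort)0xFF80);

            // Each iteration reads 16 chars (two vectors) and writes 16 bytes (one vector).
            for (; i <= length - 16; i += 16)
            {
                Vector128<ushort> first = Vector128.LoadUnsafe(ref src, (nuint)i);
                Vector128<ushort> second = Vector128.LoadUnsafe(ref src, (nuint)(i + 8));

                if (((first | second) & nonAsciiMask) != Vector128<ushort>.Zero)
                {
                    break; // non-ASCII data in this chunk; the scalar loop locates it
                }

                Vector128.Narrow(first, second).StoreUnsafe(ref dst, (nuint)i);
            }
        }

        // Scalar loop: handles the tail and pinpoints the first non-ASCII char.
        for (; i < length && utf16[i] <= 0x7F; i++)
        {
            ascii[i] = (byte)utf16[i];
        }

        return i;
    }

    static void Main()
    {
        byte[] buffer = new byte[32];
        Console.WriteLine(NarrowUtf16ToAscii("Hello, AArch64 and x64!", buffer));      // 23
        Console.WriteLine(NarrowUtf16ToAscii("0123456789abcdefg\u00e9xyz", buffer));   // 17
    }
}
```

The same source compiles for both architectures; which instructions the JIT selects for `Narrow` and the stores is exactly what the disassembly questions in this thread are about.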

@kunalspathak
Member

kunalspathak commented Jun 15, 2022

> For ASCII strings of 512 size, it executes in about 0.92x on AArch64 and 0.86x on x86 compared to the execution time for the HEAD.

That sounds great. Do you mind posting the actual numbers like done in #70654 (comment)?

> Couldn't get it using --disassm option from the docs.

I think it has a typo. Can you try with --disasm? You can also set COMPlus_JitDisasm=<methodName> to see the disassembly. Note that it will only work on debug/checked clrjit.

@SwapnilGaikwad
Contributor Author

Hi @kunalspathak, here are the numbers.

On AArch64 (Altra)

|   Method |        Job |                                                                                              Toolchain | size | encName |      Mean |    Error |   StdDev |    Median |       Min |       Max | Ratio | MannWhitney(2%) |  Gen 0 | Allocated | Alloc Ratio |
|--------- |----------- |------------------------------------------------------------------------------------------------------- |----- |-------- |----------:|---------:|---------:|----------:|----------:|----------:|------:|---------------- |-------:|----------:|------------:|
| GetBytes | Job-RILZMR |     /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun   |   16 |   ascii |  30.58 ns | 0.051 ns | 0.048 ns |  30.59 ns |  30.43 ns |  30.63 ns |  1.00 |            Base | 0.0764 |      40 B |        1.00 |
| GetBytes | Job-UMNKCC | /runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun  |   16 |   ascii |  31.76 ns | 0.023 ns | 0.019 ns |  31.77 ns |  31.74 ns |  31.80 ns |  1.04 |          Slower | 0.0765 |      40 B |        1.00 |
|          |            |                                                                                                        |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-RILZMR |     /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun   |   16 |   utf-8 |  33.70 ns | 0.044 ns | 0.034 ns |  33.70 ns |  33.63 ns |  33.75 ns |  1.00 |            Base | 0.0764 |      40 B |        1.00 |
| GetBytes | Job-UMNKCC | /runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun  |   16 |   utf-8 |  31.55 ns | 0.087 ns | 0.073 ns |  31.56 ns |  31.41 ns |  31.66 ns |  0.94 |          Faster | 0.0765 |      40 B |        1.00 |
|          |            |                                                                                                        |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-RILZMR |     /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun   |  512 |   ascii | 181.52 ns | 0.199 ns | 0.186 ns | 181.42 ns | 181.25 ns | 181.97 ns |  1.00 |            Base | 1.0243 |     536 B |        1.00 |
| GetBytes | Job-UMNKCC | /runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun  |  512 |   ascii | 164.45 ns | 2.145 ns | 2.006 ns | 163.97 ns | 162.13 ns | 168.03 ns |  0.91 |          Faster | 1.0244 |     536 B |        1.00 |
|          |            |                                                                                                        |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-RILZMR |     /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun   |  512 |   utf-8 | 229.04 ns | 0.175 ns | 0.137 ns | 229.02 ns | 228.82 ns | 229.31 ns |  1.00 |            Base | 1.0240 |     536 B |        1.00 |
| GetBytes | Job-UMNKCC | /runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun  |  512 |   utf-8 | 229.45 ns | 0.770 ns | 0.720 ns | 229.49 ns | 228.42 ns | 231.02 ns |  1.00 |            Same | 1.0240 |     536 B |        1.00 |

On x86 (Xeon Gold 5120T)

|   Method |        Job |                                                                                                        Toolchain | size | encName |      Mean |    Error |   StdDev |    Median |       Min |       Max | Ratio | MannWhitney(2%) |  Gen 0 | Allocated | Alloc Ratio |
|--------- |----------- |----------------------------------------------------------------------------------------------------------------- |----- |-------- |----------:|---------:|---------:|----------:|----------:|----------:|------:|---------------- |-------:|----------:|------------:|
| GetBytes | Job-HGFNCZ |         /runtime_HEAD/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun |   16 |   ascii |  24.13 ns | 0.091 ns | 0.076 ns |  24.11 ns |  24.06 ns |  24.31 ns |  1.00 |            Base | 0.0039 |      40 B |        1.00 |
| GetBytes | Job-RVIYCK | /runtime_intrinsic/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun    |   16 |   ascii |  25.86 ns | 0.229 ns | 0.203 ns |  25.84 ns |  25.58 ns |  26.33 ns |  1.07 |          Slower | 0.0039 |      40 B |        1.00 |
|          |            |                                                                                                                  |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-HGFNCZ |         /runtime_HEAD/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun |   16 |   utf-8 |  25.83 ns | 0.102 ns | 0.085 ns |  25.79 ns |  25.74 ns |  26.03 ns |  1.00 |            Base | 0.0039 |      40 B |        1.00 |
| GetBytes | Job-RVIYCK | /runtime_intrinsic/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun    |   16 |   utf-8 |  23.94 ns | 0.293 ns | 0.260 ns |  23.80 ns |  23.75 ns |  24.59 ns |  0.93 |          Faster | 0.0040 |      40 B |        1.00 |
|          |            |                                                                                                                  |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-HGFNCZ |         /runtime_HEAD/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun |  512 |   ascii | 103.24 ns | 0.175 ns | 0.137 ns | 103.21 ns | 103.11 ns | 103.60 ns |  1.00 |            Base | 0.0532 |     536 B |        1.00 |
| GetBytes | Job-RVIYCK | /runtime_intrinsic/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun    |  512 |   ascii |  88.31 ns | 0.355 ns | 0.296 ns |  88.25 ns |  87.88 ns |  89.00 ns |  0.86 |          Faster | 0.0531 |     536 B |        1.00 |
|          |            |                                                                                                                  |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-HGFNCZ |         /runtime_HEAD/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun |  512 |   utf-8 | 124.10 ns | 0.469 ns | 0.392 ns | 123.95 ns | 123.82 ns | 125.06 ns |  1.00 |            Base | 0.0531 |     536 B |        1.00 |
| GetBytes | Job-RVIYCK | /runtime_intrinsic/artifacts/bin/testhost/net7.0-Linux-Release-x64/shared/Microsoft.NETCore.App/7.0.0/corerun    |  512 |   utf-8 | 119.51 ns | 0.834 ns | 0.696 ns | 119.62 ns | 118.29 ns | 120.67 ns |  0.96 |          Faster | 0.0531 |     536 B |        1.00 |

The --disasm flag reports that disassembly is not supported on AArch64; it gives the message "Arm64 is not supported (Iced library limitation)". I used the COMPlus_JitDisasm=<methodName> option earlier, but it dumped the assembly for the asserts, so I commented them out. To be sure about the sequence, I tried with the release version but couldn't get the assembly (as noted, that's not possible). I am not sure whether the assembly from the checked/debug version without asserts is as optimal as (or can be assumed to match) the release version. Would you recommend using it for reference?

@kunalspathak
Member

You can build everything release - coreclr/libraries and then drop a checked clrjit to make that environment variable work.

@kunalspathak
Member

> Hi @kunalspathak, here are the numbers.

Thanks for sharing the numbers. It seems the ascii-16 bytes is slightly slower. Do you know why?

Member

@kunalspathak kunalspathak left a comment

Changes look good overall. I would like to see the disasm code difference of SSE2 as well as the disasm for AdvSimd. @tannergooding - do you mind taking a look as well?

Sse2.StoreScalar((ulong*)pAsciiBuffer, asciiVector.AsUInt64()); // ulong* calculated here is UNALIGNED

Vector128<byte> asciiVector = ExtractAsciiVector(utf16VectorFirst);
asciiVector.GetLower().StoreUnsafe(ref asciiBuffer);
Member

I would be curious to see the assembly difference for SSE2.

Member

👍. Vector64<T> is a bit odd on x86/x64 since it's treated as a regular struct { ulong _value; }, so I'd expect promotion/etc. to do the right thing here, but it might not be as efficient as extracting the lowest UInt64 scalar (if it does differ, it probably could be made as efficient with some tweaks).
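
To make the two alternatives in this comment concrete, the sketch below stores the low 8 bytes of a narrowed vector in both ways. The class and method names (`LowerHalfStoreSketch`, `PackLow8`, `StoreViaVector64`, `StoreViaScalar`) are hypothetical and are not the PR's helpers; only the `Vector128`, `Vector64`, and `Unsafe` calls are real .NET 7 APIs. Both variants write the same bytes, and the sketch makes no claim about which one the JIT compiles better.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Text;

static class LowerHalfStoreSketch
{
    // Hypothetical helper: packs eight UTF-16 code units (assumed to be ASCII
    // already) into the low 8 bytes of a Vector128<byte>.
    static Vector128<byte> PackLow8(Vector128<ushort> utf16) =>
        Vector128.Narrow(utf16, utf16);

    // Variant 1: take the lower half as a Vector64<byte> and store it.
    public static void StoreViaVector64(Vector128<ushort> utf16, ref byte destination) =>
        PackLow8(utf16).GetLower().StoreUnsafe(ref destination);

    // Variant 2: extract the lowest UInt64 scalar and write it unaligned.
    public static void StoreViaScalar(Vector128<ushort> utf16, ref byte destination) =>
        Unsafe.WriteUnaligned(ref destination, PackLow8(utf16).AsUInt64().ToScalar());

    static void Main()
    {
        Vector128<ushort> utf16 = Vector128.Create(
            (ushort)'a', (ushort)'b', (ushort)'c', (ushort)'d',
            (ushort)'e', (ushort)'f', (ushort)'g', (ushort)'h');

        byte[] viaVector64 = new byte[8];
        byte[] viaScalar = new byte[8];
        StoreViaVector64(utf16, ref viaVector64[0]);
        StoreViaScalar(utf16, ref viaScalar[0]);

        Console.WriteLine(Encoding.ASCII.GetString(viaVector64)); // abcdefgh
        Console.WriteLine(Encoding.ASCII.GetString(viaScalar));   // abcdefgh
    }
}
```

Because the observable result is identical, any difference between the two only shows up in the generated assembly, which is why the reviewers ask for disassembly on both architectures.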

Contributor Author

It seems that switching to StoreUnsafe introduced a longer assembly sequence with a few unnecessary movs.
On HEAD, we get the following assembly for the extract and store.

    vpackuswb xmm0, xmm0, xmm0
    vmovq    qword ptr [r15], xmm0
    mov      r12d, 8
    test     r15b, 8

With the above change, the extract and store looks as follows.

    mov      r12, rbx
    vpackuswb xmm0, xmm0, xmm0
    vmovapd  xmmword ptr [rbp-80H], xmm0
    mov      rax, qword ptr [rbp-80H]
    mov      qword ptr [rbp-58H], rax
    mov      rax, qword ptr [rbp-58H]
    mov      qword ptr [r12], rax
    mov      r13d, 8
    test     bl, 8

Would you suggest sticking to the Vector128 API in this case?
Full assembly dumps are available here - HEAD, PR

Member

I meant - can you share the arm64 versions? You can also turn off the address displays using set COMPlus_JitDiffableDasm=1.

Contributor Author

@SwapnilGaikwad SwapnilGaikwad Jun 21, 2022

Assembly dumps without Debug.Assert() for x86: HEAD, PR.
Having the asserts introduced an additional stack mov. With the debug asserts commented out, the assembly may closely resemble that of the release builds.

Contributor Author

> I meant - can you share the arm64 versions? You can also turn off the address displays using set COMPlus_JitDiffableDasm=1.

Here they are: HEAD, PR

Member

They are identical.

Contributor Author

Here is the updated one. For the PR, assembly for NarrowUtf16ToAscii_Intrinsified should have been dumped instead of NarrowUtf16ToAscii.

Member

I believe this one uses StoreUnsafe because I don't see AV related code generated as you pointed out in #70080 (comment)?

Contributor Author

Yup, this one uses StoreUnsafe. The aligned stores are only performed in a loop, now changing them to StoreUnsafe.

@ghost ghost added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jun 16, 2022
@SwapnilGaikwad
Contributor Author

> > Hi @kunalspathak, here are the numbers.
>
> Thanks for sharing the numbers. It seems the ascii-16 bytes is slightly slower. Do you know why?

It seems the loop-peeling logic that brings the destination pointer to an 8-byte-aligned address adds a slight overhead for smaller strings.
However, for longer strings the loop peeling pays off. If we remove the 8-byte-aligned write logic, we see negligible overhead compared to the current HEAD [1].

[1] Microbenchmark results after removing the aligned write logic:
release_runtime_intrinsic: The current PR
release_without_peel: The current PR without aligned write (commenting lines 1551 to 1570)
runtime_HEAD: Current HEAD

|   Method |        Job |                                                                                                     Toolchain | size | encName |      Mean |    Error |   StdDev |    Median |       Min |       Max | Ratio | MannWhitney(2%) |  Gen 0 | Allocated | Alloc Ratio |
|--------- |----------- |-------------------------------------------------------------------------------------------------------------- |----- |-------- |----------:|---------:|---------:|----------:|----------:|----------:|------:|---------------- |-------:|----------:|------------:|
| GetBytes | Job-TLYBOZ | /release_runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |   16 |   ascii |  34.72 ns | 0.039 ns | 0.037 ns |  34.72 ns |  34.66 ns |  34.78 ns |  1.07 |          Slower | 0.0765 |      40 B |        1.00 |
| GetBytes | Job-NLAVLI |      /release_without_peel/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |   16 |   ascii |  32.07 ns | 0.058 ns | 0.051 ns |  32.06 ns |  31.95 ns |  32.15 ns |  0.99 |            Same | 0.0764 |      40 B |        1.00 |
| GetBytes | Job-AMSBUC |              /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |   16 |   ascii |  32.48 ns | 0.046 ns | 0.043 ns |  32.49 ns |  32.41 ns |  32.54 ns |  1.00 |            Base | 0.0765 |      40 B |        1.00 |
|          |            |                                                                                                               |      |         |           |          |          |           |           |           |       |                 |        |           |             |
| GetBytes | Job-YCMYKR | /release_runtime_intrinsic/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |  512 |   ascii | 158.93 ns | 0.265 ns | 0.221 ns | 158.93 ns | 158.52 ns | 159.43 ns |  0.90 |          Faster | 1.0241 |     536 B |        1.00 |
| GetBytes | Job-NLAVLI |      /release_without_peel/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |  512 |   ascii | 168.20 ns | 1.472 ns | 1.377 ns | 168.33 ns | 166.01 ns | 170.06 ns |  0.96 |          Faster | 1.0246 |     536 B |        1.00 |
| GetBytes | Job-AMSBUC |              /runtime_HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun |  512 |   ascii | 175.96 ns | 0.719 ns | 0.672 ns | 175.88 ns | 174.48 ns | 177.05 ns |  1.00 |            Base | 1.0239 |     536 B |        1.00 |
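
As a rough illustration of the trade-off measured above, the sketch below peels a few leading elements so that the main loop's stores land on an aligned destination. It is a simplified stand-in, not the PR's logic: the class `PeelSketch`, the 16-byte boundary (one `Vector128` store), the scalar peel, and the assumption that the input is already ASCII are all choices made for this example.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Text;

static class PeelSketch
{
    // Hypothetical helper: narrows ASCII-only UTF-16 text into `ascii` and
    // returns how many leading elements were peeled off before the vector loop.
    public static int NarrowWithPeel(string utf16, byte[] ascii)
    {
        // Pin the destination so its address stays stable while we reason about alignment.
        GCHandle pin = GCHandle.Alloc(ascii, GCHandleType.Pinned);
        try
        {
            nuint address = (nuint)(nint)pin.AddrOfPinnedObject();
            int length = Math.Min(utf16.Length, ascii.Length);

            // Bytes needed to reach the next 16-byte boundary (0 if already aligned).
            int misalignment = (int)(address & 15);
            int peel = Math.Min(length, (16 - misalignment) & 15);

            ref ushort src = ref Unsafe.As<char, ushort>(ref MemoryMarshal.GetReference(utf16.AsSpan()));
            ref byte dst = ref MemoryMarshal.GetArrayDataReference(ascii);
            int i = 0;

            // Peel: scalar writes until &ascii[i] sits on a 16-byte boundary.
            // This is the fixed cost that short inputs cannot amortize.
            for (; i < peel; i++)
            {
                ascii[i] = (byte)utf16[i];
            }

            // Main loop: every 16-byte store targets an aligned address; reads may be unaligned.
            for (; i <= length - 16; i += 16)
            {
                Vector128<ushort> first = Vector128.LoadUnsafe(ref src, (nuint)i);
                Vector128<ushort> second = Vector128.LoadUnsafe(ref src, (nuint)(i + 8));
                Vector128.Narrow(first, second).StoreUnsafe(ref dst, (nuint)i);
            }

            // Tail: whatever is left after the last full vector.
            for (; i < length; i++)
            {
                ascii[i] = (byte)utf16[i];
            }

            return peel;
        }
        finally
        {
            pin.Free();
        }
    }

    static void Main()
    {
        string text = new string('x', 20) + "0123456789abcdefghijklmnopqrstuvwxyz";
        byte[] buffer = new byte[text.Length];
        int peeled = NarrowWithPeel(text, buffer);
        Console.WriteLine($"peeled {peeled} element(s)"); // depends on where the array was allocated
        Console.WriteLine(Encoding.ASCII.GetString(buffer) == text); // True
    }
}
```

For a 16-element input the peel and tail can cover most of the work, which is consistent with the small-size slowdown reported above; for 512 elements the aligned main loop dominates.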

@ghost ghost removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Jun 16, 2022
@SwapnilGaikwad
Contributor Author

How can I dump the assembly for NarrowUtf16ToAscii_Intrinsified to ensure that the constants are not emitted? The COMPlus_JitDump option doesn't dump the assembly for the narrowing method while executing the micro-benchmarks, even with the checked+debug build, although it dumps other methods. I used the following command.

COMPlus_JitDump="*" dotnet run -c Release -f net7.0 --filter "System.Text.Tests.Perf_Encoding.GetBytes" --corerun "$HEAD/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun" "$PATCH/bin/testhost/net7.0-Linux-Release-arm64/shared/Microsoft.NETCore.App/7.0.0/corerun" --statisticalTest 2%

Also, rebuilding with ./build.sh clr -rc checked -lc release doesn't update the corerun binaries. A clean build avoids this issue but takes much longer. Alternatively, I created a console app, extracted the NarrowUtf16ToAscii_Intrinsified method, and executed it using the skeleton used by ASCIIUtilityTests.cs.
COMPlus_JitDump works fine there. However, it dumps quite suboptimal assembly, including logic to throw PlatformNotSupportedException().

@kunalspathak @tannergooding Do you guys have any better ways to extract the assembly reliably? 🤔

@kunalspathak
Member

> @kunalspathak @tannergooding Do you guys have any better ways to extract the assembly reliably? 🤔

Follow #70080 (comment). To remove unnecessary asserts, you will need release version of SPC. Did you try #70080 (comment)?

@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch from daf52d6 to a5bb5cf Compare June 17, 2022 13:10
@SwapnilGaikwad
Contributor Author

> > @kunalspathak @tannergooding Do you guys have any better ways to extract the assembly reliably? 🤔
>
> Follow #70080 (comment). To remove unnecessary asserts, you will need release version of SPC. Did you try #70080 (comment)?

Thanks! I can now extract the assembly. The missing piece was that the build wasn't checked.

@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch from a5bb5cf to 6968765 Compare June 20, 2022 14:52
@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch 2 times, most recently from 9fc032e to 5ae8594 Compare June 22, 2022 14:37
@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch from 5ae8594 to 3b775f3 Compare June 24, 2022 15:18
@SwapnilGaikwad SwapnilGaikwad force-pushed the github-narrowUtf16ToAscii-intrinsic branch from 3b775f3 to e6f6cb9 Compare June 28, 2022 10:46
Member

@kunalspathak kunalspathak left a comment

Need to update a comment before we can go ahead with the merge.

Also, I see that when the review comments are addressed, they are squashed in previous commits. It's not something that is common on this repo. As a reviewer, I have to review entire changes instead of just the updates that were done as part of review comment. A better approach would be to not squash the commit and so those can be reviewed as a standalone change, and it would make my role as reviewer a lot easier. Is there some benefit to the approach you are using?

@ghost ghost added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jun 28, 2022
@ghost ghost removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Jun 29, 2022
@SwapnilGaikwad
Contributor Author

> Need to update a comment before we can go ahead with the merge.
>
> Also, I see that when the review comments are addressed, they are squashed in previous commits. It's not something that is common on this repo. As a reviewer, I have to review entire changes instead of just the updates that were done as part of review comment. A better approach would be to not squash the commit and so those can be reviewed as a standalone change, and it would make my role as reviewer a lot easier. Is there some benefit to the approach you are using?

Sure. Apologies for the inconvenience. I'll add separate commits going forward.

Member

@kunalspathak kunalspathak left a comment

LGTM. Thank you for your contribution!

@kunalspathak kunalspathak merged commit 0402550 into dotnet:main Jun 30, 2022
@SwapnilGaikwad SwapnilGaikwad deleted the github-narrowUtf16ToAscii-intrinsic branch June 30, 2022 10:19
@SwapnilGaikwad SwapnilGaikwad restored the github-narrowUtf16ToAscii-intrinsic branch July 28, 2022 16:32
@ghost ghost locked as resolved and limited conversation to collaborators Aug 27, 2022
@SwapnilGaikwad SwapnilGaikwad deleted the github-narrowUtf16ToAscii-intrinsic branch September 28, 2022 10:42
Labels
area-System.Text.Encoding community-contribution Indicates that the PR has been added by a community member
Development

Successfully merging this pull request may close these issues.

Optimize System.Text.ASCIIUtility for arm64 using cross-platform intrinsics
7 participants