-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use UTF-8 Encoding by Default On Windows On Non English Languages #29755
Conversation
Does this affect the emoji that
|
@KalleOlaviNiemitalo That's a really good point. I was considering setting the page to UTF-8. It does indeed break the emoji in certain languages. UTF-8, meanwhile, is fine, though it's somewhat concerning in legacy scenarios. |
I suspect any Console.OutputEncoding change should be limited to Windows only. Is the original problem reproducible on other operating systems? |
Not sure, but you're right in that InputEncoding doesn't work for certain |
I mean, if I'm using Linux via a terminal from the 1980s and have set LANG=fi_FI.ISO-8859-1 to make the tty I/O compatible with that, then I'd like the dotnet CLI to keep the same encoding even if some other environment variable overrides the language. So if the original issues do not reproduce on Linux then it's not worth breaking this scenario just to match Windows. |
Yes, this does not appear to repro on Linux. I'll check other platforms I can access in a bit, but limiting it to Windows for now 👍 |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
… appears to be rather difficult to get the changed encoding if it happens mid process. We would probably need to write more code for this. Let's see if we want to do that.
…cause it appears to be rather difficult to get the changed encoding if it happens mid process. We would probably need to write more code for this. Let's see if we want to do that." This reverts commit eada426.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but I don't know glob/loc details such as the different between CurrentCulture and CurrentUICulture, so we should get someone who does to review this.
Thanks, do you have any contacts for loc? dotnet/dotnet-api-docs#8767 The docs are kinda confusing so I filed an issue here to ask. |
@tarekgh can probably review this or point us to someone who can. |
src/Cli/dotnet/UILanguageOverride.cs
Outdated
if (OperatingSystem.IsWindows() && Environment.OSVersion.Version.Major >= 10) // Encoding is only an issue on Windows, UTF-8 is only officially supported on 10+. | ||
{ | ||
Console.OutputEncoding = DefaultMultilingualEncoding; | ||
Console.InputEncoding = DefaultMultilingualEncoding; // Setting both encodings causes a change in the CHCP, making it so we dont need to P-Invoke ourselves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamsitnik would setting UTF-8 encoding to the console be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this, https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page, after the May 2019 Windows 10 update UTF-8 is officially supported. Since this only happens in Windows 10+, and when using languages besides English, which already have a bad encoding without this change, I think we can take this change for 8.
cc @dsplaisted for potential approval.
… dotnet to use OS language, but still change the encoding in any case where the language isn't en
/azp run |
Pull request contains merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
…8503) Fixes #1596 Changes Made SetConsoleUI now calls into a helper which sets the encoding to support non-en languages and checks if an environment variable exists to change the language to. Testing Setting DOTNET_CLI_UI_LANGUAGE=ja now changes msbuild correctly: image Doing a complicated build (aka building MSBuild) to use multiple threads shows other threads seem to use the same UI culture: image See that chcp remains the same after execution: image (Was set to 65001 temporarily but back to the original page before execution.) Notes Much of this code is a port of this code: dotnet/sdk#29755 There are some details about the code here. [!] In addition, it will introduce a breaking change for msbuild just like the SDK. The break is documented here for the sdk: dotnet/docs#34250
Fix for: #8833, #24757, #29151, and @rawieo's remark: microsoft/vstest#821
Before:

After:

The main fix is setting the CHCP and OutputEncoding to UTF8 by default on Windows, if a custom language is set.
Things to know:
dotnet new
duplicated a lot of the old code... but it's also fixed because this runs throughMain
in the CLI. We could consider removing that code, but I didn't to lower risk of breaks.dotnet --debug
will have broken encoding, because it runs before the encoding is set. Otherwise the debugger would not be able to attach to the encoding changing code. We don't expect many people to be impacted by this, probably not worth fixing.