South Asian Languages characters are incorrect in size & spacing #9490
Comments
This isn't the first issue about complex scripting in Unicode. Maybe /dup #8000 ?
That might address it in the long term, but in the shorter term it is probably worth investigating why these characters have different widths, which is what appears to be the problem here.
I believe these scripts are all abugidas, so many of the glyphs are probably made up of a combination of characters. So for example you can have two code points rendering as a single glyph, and thus occupying two screen cells, so that's probably what we're seeing here. You can get the same effect with Latin text using combining diacritics. For example if you write out an The point of issue #8000, as skyline75489 mentioned, is to improve the rendering of these sorts of things. I think the problem is going to be in getting that to work in a way that is compatible with wcwidth, and I have my doubts about that being feasible. Personally I would have preferred these writing systems were handled with a separate mode, the same way some of the DEC terminals worked, but it's not something I feel strongly about.
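To make the "two code points, one glyph" point concrete, here is a minimal Python sketch. It is only an illustration using the standard `unicodedata` module; the sample strings and the `describe` helper are my own assumptions, not anything taken from the Terminal code base.

```python
import unicodedata

def describe(text):
    """Return (code point, name, general category) for each code point in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]

# Latin 'e' followed by COMBINING ACUTE ACCENT: one visible glyph, two code points.
latin = "e\u0301"
# Devanagari KA followed by VOWEL SIGN I: also one visible cluster, two code points.
hindi = "\u0915\u093F"

for sample in (latin, hindi):
    print(sample, len(sample), describe(sample))

# The Latin pair happens to have a precomposed form; Indic clusters generally do not,
# so normalization alone cannot collapse them to a single code point.
print(len(unicodedata.normalize("NFC", latin)), len(unicodedata.normalize("NFC", hindi)))
```

The second code point in each sample is a combining mark (category `Mn` or `Mc`), which is why a measurement that counts code points one at a time disagrees with what the font actually draws.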
Yeah -- this is an unfortunate outcome of us summing up the number of cells covered by the codepoints comprising a character and then trying to render its glyph in those cells. #8000 will be a crack at making us support glyphs that take more than 2 cells. Measurement is left for a future work item. That future work item is #1472, which tracks getting us capable of properly measuring the space occupied by a stream of codepoints and rendering it. I too wish that there were a separate mode for letting the Terminal figure out how wide things are without having to disagree on wcwidth/wcswidth, but I think that ship's sailed. /dup #1472
Hi! We've identified this issue as a duplicate of another one that already exists on this Issue Tracker. This specific instance is being closed in favor of tracking the concern over on the referenced thread. Thanks for your report!
The issue is that we rely on the "East Asian Width" spec by Unicode, like pretty much all terminals: https://www.unicode.org/reports/tr11/
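As a rough sketch of the per-code-point measurement discussed in this thread, the Python below sums a width for each code point from its East Asian Width property. This is a deliberately naive model and an assumption on my part: `naive_cell_count` is a hypothetical helper, not the Terminal's actual implementation, and unlike a typical wcwidth it does not give combining marks a width of zero.

```python
import unicodedata

def naive_cell_count(text):
    """Sum a cell width per code point: 2 for Wide/Fullwidth, otherwise 1.

    Hypothetical model for illustration only; it ignores grapheme clusters
    and treats every combining mark as occupying its own cell.
    """
    cells = 0
    for ch in text:
        cells += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cells

print(naive_cell_count("a"))             # plain Latin letter
print(naive_cell_count("\u4e2d"))        # CJK ideograph, East Asian Width "W"
print(naive_cell_count("e\u0301"))       # 'e' plus combining acute accent
print(naive_cell_count("\u0915\u093F"))  # Devanagari KA plus vowel sign I
```

Under this model a single rendered cluster such as KA plus vowel sign I is allotted two cells regardless of how wide the font draws it, which matches the mismatch described above.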
Environment
Steps to reproduce
Type or print any South Asian language characters in the terminal.
Expected behavior
All characters should be the correct size and correctly spaced.
Actual behavior
In this example, the first language is Bengali, the second is Hindi and the third is Tamil.
This is how the characters should be rendered, with correct size and spacing:
[Image: Screenshot 2021-03-14 132718]
Instead, the terminal prints them as shown below: some characters are normal and some are too small, and their spacing is incorrect too.