Two-Character Unicode Support #2672

Open
queejie opened this issue Apr 21, 2021 · 8 comments

queejie commented Apr 21, 2021

Is your feature request related to a problem? Please describe.
I'm almost certain that two-character Unicode symbols cannot be rendered using any of the TeX commands supported in MathJax. Examples are the code pairs used for national flags. In every case, the two-character sequence is rendered as two separate elements rather than as a single symbol. E.g.,
image

Describe the solution you'd like
It would be great if \verb, \unicode, or some other TeX command could render the two-character sequence as a single symbol. E.g., 🇦🇩, or U+1F1E6 U+1F1E9.
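
For readers unfamiliar with how these flags are encoded, here is a minimal sketch in plain JavaScript (no MathJax involved). It shows that a flag is a pair of regional indicator symbols, each its own code point, which is why per-character processing splits it. The helper names `codePoints`, `toUPlus`, and `flagFromCountryCode` are made up for this illustration.

```javascript
// Split a string into Unicode code points (not UTF-16 code units).
function codePoints(str) {
  return Array.from(str, (ch) => ch.codePointAt(0));
}

// Format a code point in the usual U+XXXX notation.
function toUPlus(cp) {
  return 'U+' + cp.toString(16).toUpperCase().padStart(4, '0');
}

// Map the letters A-Z of a two-letter country code onto the regional
// indicator symbols U+1F1E6..U+1F1FF and join them into one string.
function flagFromCountryCode(code) {
  const offset = 0x1F1E6 - 'A'.charCodeAt(0);
  const points = Array.from(code.toUpperCase(), (c) => c.charCodeAt(0) + offset);
  return String.fromCodePoint(...points);
}

const flag = flagFromCountryCode('AD');
const flagCodePoints = codePoints(flag).map(toUPlus);

console.log(flagCodePoints.join(' ')); // U+1F1E6 U+1F1E9
console.log(flag.length);              // 4 (two surrogate pairs in UTF-16)
```

Whether the pair is displayed as one flag glyph depends on the font and the renderer seeing both code points next to each other.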

Describe alternatives you've considered
I am building a browser-based formula editor on top of MathJax, and it uses TeX as the basis for constructing elements. I had hoped there was a TeX command that might work, but after two days of reading and experimentation, I have been unable to find one.

Additional context
The output format used is SVG, and the TeX is converted to MathML using MathJax.

Thank you!

pkra (Contributor) commented Apr 22, 2021

I think #2595 and mathjax/MathJax-src#676 should help.

@dqjauthentrics

Thank you very much! Most of the discussion was over my head, but if I got the gist I can now use \symbol{🇦🇩, }, and that this capability is in a new version that is not part of a CDN distribution. Do I have that right? If so, would I need to follow the instructions in the manual for "Hosting Your Own Copy", using the version in the pull request?

dpvc (Member) commented Apr 22, 2021

if I got the gist I can now use \symbol{🇦🇩, }, and that this capability is in a new version that is not part of a CDN distribution.

Not quite. There is no \symbol macro, but there are ones for the different variant forms, like \symbf for bold, \symsfit for sans-serif-italic, and so on. The reason Peter pointed to this is that these macros also will group multiple characters into a single MathML element, whereas each character normally is put into its own MathML element. So ab is usually translated as <mi>a</mi><mi>b</mi>, but \symbf{ab} would produce <mi mathvariant="bold">ab</mi> with both letters in the same <mi>.

Unfortunately, that does not solve the problem. The multi-character flags seem to be handled as ligatures, which means the two characters have to be next to each other in the same DOM node in the final output. When MathJax encounters characters that are not in its math fonts (like these), it places each one in a separate <mjx-utext> element (for CHTML output) or <text> element (for SVG output), so they are not next to each other in the DOM and are not combined into a single glyph.
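
To make the difference concrete, here is a toy model of the two layouts described above. It is only an illustration: `perCharacterText` and `combinedText` are hypothetical functions, not MathJax code, and the real SVG output carries attributes that are omitted here.

```javascript
// Toy model only: these helpers are hypothetical and do not reproduce
// MathJax's actual output code.

// One <text> element per code point, as described for unknown characters.
function perCharacterText(str) {
  return Array.from(str, (ch) => `<text>${ch}</text>`).join('');
}

// All code points inside a single <text> element.
function combinedText(str) {
  return `<text>${str}</text>`;
}

// Count how many <text> elements a markup string contains.
function countTextElements(markup) {
  return (markup.match(/<text>/g) || []).length;
}

const andorra = '\u{1F1E6}\u{1F1E9}'; // the two regional indicators for AD

const separated = perCharacterText(andorra);
const combined = combinedText(andorra);

console.log(countTextElements(separated)); // 2: the pair is split across nodes
console.log(countTextElements(combined));  // 1: the pair stays adjacent
```

Only in the second layout do the two regional indicators sit side by side in one text node, which is what the browser needs in order to draw them as a flag.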

Fortunately, there is a way to get what you want. That is to use \text{🇦🇩} and to configure MathJax to use either the surrounding font for text elements (via the mtextInheritFont: true property), or to use a specific font for text elements (via mtextFont: 'Times' for example). When one of these properties is set, MathJax will render <mtext> MathML elements as a single DOM node, and that will allow the two characters to be combined by the browser. So

MathJax = {
  svg: {
    mtextInheritFont: true
  }
}

together with \text{🇦🇩} in the TeX input should provide the result you are looking for in SVG output (change svg to chtml if you are using CommonHTML output).
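
The variants mentioned above (svg or chtml output, inherited font or a named font) can be sketched with a small helper. The function `textFontConfig` is hypothetical and only assembles a plain configuration object; the option names `mtextInheritFont` and `mtextFont` and the `svg`/`chtml` blocks are the ones named in this comment.

```javascript
// Hypothetical helper, not part of MathJax: builds a configuration object
// that sets the text-font option for the chosen output format.
function textFontConfig(output, font) {
  if (output !== 'svg' && output !== 'chtml') {
    throw new Error(`Unsupported output format: ${output}`);
  }
  // With no font given, inherit the surrounding font; otherwise name one.
  const options = font ? { mtextFont: font } : { mtextInheritFont: true };
  return { [output]: options };
}

const svgConfig = textFontConfig('svg');            // { svg: { mtextInheritFont: true } }
const chtmlConfig = textFontConfig('chtml', 'Times'); // { chtml: { mtextFont: 'Times' } }

console.log(JSON.stringify(svgConfig));
console.log(JSON.stringify(chtmlConfig));
```

In a page, the resulting object would be assigned to the global MathJax variable before the MathJax script is loaded, exactly as with the literal configuration shown above.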

The current behavior of putting each unknown character into its own DOM element could probably be improved to combine them into a single one, in which case Peter's suggestion would have worked. But that's not the case right now.

As for v3.1.3, it will be released either later today or tomorrow, so no need to install your own.

queejie (Author) commented Apr 22, 2021 via email

dqjauthentrics commented Apr 27, 2021 via email

@dqjauthentrics

Oops! My bad. Setting mtextInheritFont: true worked for SVG as well. Please consider this closed.

dpvc (Member) commented Apr 27, 2021

Your original configuration with mtextFont also works for me.

BTW, there was no attached screen image. Also, you don't need to load the [tex]/... packages explicitly because they are already included in input/tex-full.

I'm going to leave this open for now, to remind me to look into whether combining adjacent unknown characters into the same container element would be feasible.

@dqjauthentrics

Thank you, @dpvc. My image was in an email reply, so it probably didn't make it. Here it is on GitHub:
image

dpvc self-assigned this Feb 3, 2022
dpvc added this to the 3.3.0 milestone Aug 6, 2022
dpvc removed their assignment Aug 6, 2022
dpvc self-assigned this Jan 31, 2023
dpvc added a commit to mathjax/MathJax-src that referenced this issue Feb 15, 2023
dpvc removed their assignment Feb 15, 2023
dpvc added a commit to mathjax/MathJax-src that referenced this issue Mar 11, 2023
    Combine unknown characters into a common <text> element. (mathjax/MathJax#2672)
dpvc added the Merged label (merged into develop branch) and removed the Ready for Review label Mar 11, 2023