Auto handle named entities in divs #743

Open
wants to merge 1 commit into
base: master

Conversation

arthurattwell
Member

Until now, named entities like `&nbsp;` could not be used in tables inside figures, so we've had to change them manually to numeric, XML-valid alternatives. Our process couldn't re-encode these named entities as numeric ones, because kramdown doesn't 'reach' inside HTML islands. We need these entities to be valid in XML for EPUB output, and we need processing tools like Cheerio not to break entities in PDF and EPUB output.

So this PR adds a step to our output process that replaces named entities with numeric entities after Jekyll runs, before other processing.

We act on the HTML as a string and do not try to parse it, because tools that parse the HTML into an AST, like parse5, jsdom, or htmlparser2, break XML validity and double-encode entities when they render the HTML back to us.
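
As a rough illustration of the string-based approach described above, here is a minimal sketch. It is not the code in this PR: the function name `namedToNumericEntities` and the lookup table `NAMED_TO_CODEPOINT` are hypothetical, and the table holds only a handful of entities where a real one would cover the full HTML list. It assumes that the five entities predefined by XML should be left as they are.

```typescript
// Hypothetical lookup table: a small subset of HTML named entities
// mapped to their Unicode code points. Illustrative only.
const NAMED_TO_CODEPOINT = new Map<string, number>([
  ["nbsp", 160],
  ["copy", 169],
  ["eacute", 233],
  ["ndash", 8211],
  ["mdash", 8212],
  ["hellip", 8230],
]);

// The five entities predefined by XML are already valid there,
// so this sketch leaves them untouched.
const XML_PREDEFINED = new Set<string>(["amp", "lt", "gt", "quot", "apos"]);

// Replace named entities with numeric ones, treating the HTML as a
// plain string. Nothing is parsed into a tree, so markup is not
// re-serialised and existing entities are not double-encoded.
// Names missing from the table are left as they are.
function namedToNumericEntities(html: string): string {
  return html.replace(
    /&([A-Za-z][A-Za-z0-9]*);/g,
    (match: string, name: string): string => {
      if (XML_PREDEFINED.has(name)) {
        return match;
      }
      const codePoint = NAMED_TO_CODEPOINT.get(name);
      return codePoint === undefined ? match : `&#${codePoint};`;
    }
  );
}

// Example: a table cell inside an HTML island.
const cell = "<td>10&nbsp;km &amp; more</td>";
console.log(namedToNumericEntities(cell));
// prints "<td>10&#160;km &amp; more</td>"
```

Using a `Map` rather than a plain object avoids accidental matches on inherited property names. Leaving unknown names alone means a missing table entry surfaces later as an XML validation error instead of silently producing wrong output.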

This processing is not necessary for web and app outputs.

This is currently being tested in a book project at EBW, and should only be merged once it's been field tested in that project.


netlify bot commented Jan 14, 2025

Deploy Preview for electric-book ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 947317a |
| 🔍 Latest deploy log | https://app.netlify.com/sites/electric-book/deploys/6786945d9da40900087fc3e7 |
| 😎 Deploy Preview | https://deploy-preview-743--electric-book.netlify.app |
