[api-minor] Remove the disableCombineTextItems option #16234

Merged


Snuffleupagus
Collaborator

Please note: This parameter has never been used within the PDF.js library/viewer itself, and it was only ever added for backwards compatibility reasons.

This parameter was added in PR #7475, over six years ago, to try and optionally maintain the previous default text-extraction behaviour.
However, as part of the general text-extraction improvements in PR #13257, almost two years ago, the disableCombineTextItems functionality was accidentally "broken" in various ways. Note how the only (very basic) unit-test was updated in a way that doesn't really make sense, since generally speaking you'd expect that using the option should result in more (or at least the same number of) text-items. Furthermore, there's also the recent issue #16209 (comment), where the option causes almost all textContent to be concatenated together.

Hence this patch proposes that we simply remove the disableCombineTextItems option, since it's essentially unused/untested functionality, as evidenced by the fact that it took almost two years for someone to notice that it's broken.
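
For third-party code that still passes the option, a minimal migration sketch might look like the following. Both helper names (`withoutRemovedOption`, `itemsToLines`) are hypothetical and not part of PDF.js; the sketch only assumes that text items expose the `str` and `hasEOL` fields, and that any merging the caller wants is now done on the caller's side.

```javascript
// Hypothetical helper (not part of PDF.js): drop the removed option before
// forwarding caller-supplied parameters to `page.getTextContent(...)`.
function withoutRemovedOption(params = {}) {
  const { disableCombineTextItems, ...rest } = params;
  return rest;
}

// Hypothetical helper (not part of PDF.js): rebuild lines of text from the
// returned items, assuming each text item has `str` and `hasEOL` fields.
function itemsToLines(items) {
  const lines = [];
  let current = "";
  for (const item of items) {
    // Skip entries without a string payload (e.g. marked-content markers).
    if (typeof item.str !== "string") {
      continue;
    }
    current += item.str;
    if (item.hasEOL) {
      lines.push(current);
      current = "";
    }
  }
  // Flush a trailing line that did not end with an EOL marker.
  if (current !== "") {
    lines.push(current);
  }
  return lines;
}

// Intended usage inside an async function, with `page` being a page object
// obtained from PDF.js (left as a comment so the sketch stays self-contained):
//   const textContent = await page.getTextContent(withoutRemovedOption(oldParams));
//   const lines = itemsToLines(textContent.items);
```

Stripping the option explicitly is not strictly required, but it keeps stale configuration from lingering in calling code after the removal.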

@Snuffleupagus
Collaborator Author

/botio test

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/223a73243de31f0/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/3fe24ddeea37c22/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/223a73243de31f0/output.txt

Total script time: 27.08 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 10
  different first/second rendering: 2

Image differences available at: http://54.241.84.105:8877/223a73243de31f0/reftest-analyzer.html#web=eq.log

@pdfjsbot

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/3fe24ddeea37c22/output.txt

Total script time: 32.37 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 33

Image differences available at: http://54.193.163.58:8877/3fe24ddeea37c22/reftest-analyzer.html#web=eq.log

@timvandermeij timvandermeij merged commit a9af0a6 into mozilla:master Apr 1, 2023
@timvandermeij
Contributor

I agree that this is best here; thanks!

@Snuffleupagus Snuffleupagus deleted the rm-disableCombineTextItems branch April 1, 2023 13:00