[BUG]: All page lines are merged into a single line #16209
Comments
That version is no longer supported, please find the latest releases at https://mozilla.github.io/pdf.js/getting_started/#download
Please see https://github.com/mozilla/pdf.js/blob/master/.github/CONTRIBUTING.md (emphasis mine):
I checked a couple of different pages in that document with the browser dev-tools, calling the
That's a limitation of the PDF format itself, since in most cases glyphs are absolutely positioned and there's simply no concept of "lines" in the majority of all PDF documents.
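Since glyphs are absolutely positioned, one way to approximate "lines" is to cluster text items by their baseline position. The following is only a minimal sketch, not part of pdf.js: `groupItemsIntoLines` and its `tolerance` parameter are hypothetical names, and it assumes unrotated, left-to-right text whose items carry a `transform` array with x at index 4 and y at index 5.

```javascript
// Sketch: cluster text items into visual lines by their baseline y-coordinate.
// Assumption: unrotated text, so transform[4] is x and transform[5] is y.
function groupItemsIntoLines(items, tolerance = 2) {
  const lines = [];
  for (const item of items) {
    if (item.str === "") continue; // skip empty marker items
    const x = item.transform[4];
    const y = item.transform[5];
    // Reuse an existing line whose baseline is within `tolerance` units.
    let line = lines.find((l) => Math.abs(l.y - y) <= tolerance);
    if (!line) {
      line = { y, parts: [] };
      lines.push(line);
    }
    line.parts.push({ x, str: item.str });
  }
  // PDF user space has y growing upward, so a larger y is higher on the page.
  lines.sort((a, b) => b.y - a.y);
  // Within a line, order the pieces left to right and concatenate them as-is.
  return lines.map((l) =>
    l.parts
      .sort((a, b) => a.x - b.x)
      .map((p) => p.str)
      .join("")
  );
}
```

The pieces are concatenated without inserting spaces, on the assumption that any spaces are already present in the item strings. Multi-column layouts or rotated text would need a more careful approach.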
All of the pages, actually. For example, on page 1 (just after the title page), I should receive: Instead, I receive only 2 lines. The first one has a huge width of 6559px: So it doesn't work. I've upgraded pdfjs to the latest version:
Opening your PDF document with https://mozilla.github.io/pdf.js/web/viewer.html (using e.g. drag-and-drop) and executing
{
"items": [
{
"str": "INT. FITTS HOUSE - RICKY'S BEDROOM - NIGHT",
"dir": "ltr",
"width": 294.5279999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
711
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "On VIDEO: JANE BURNHAM lays in bed, wearing a tank top. She's",
"dir": "ltr",
"width": 427.67999999999984,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
687
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "sixteen, with dark, intense eyes.",
"dir": "ltr",
"width": 231.45599999999988,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
675
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "JANE",
"dir": "ltr",
"width": 28.04399999999997,
"height": 12,
"transform": [
12,
0,
0,
12,
252,
651
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "I need a father who's a role model,",
"dir": "ltr",
"width": 245.47199999999978,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
639
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "not some horny geek-boy who's gonna",
"dir": "ltr",
"width": 245.47199999999978,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
627
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "spray his shorts whenever I bring a",
"dir": "ltr",
"width": 245.47199999999978,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
615
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "girlfriend home from school.",
"dir": "ltr",
"width": 196.41599999999977,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
603
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "(snorts)",
"dir": "ltr",
"width": 56.08799999999994,
"height": 12,
"transform": [
12,
0,
0,
12,
209.004,
591
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "What a lame-o. Somebody really",
"dir": "ltr",
"width": 210.4319999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
579
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "should put him out of his misery.",
"dir": "ltr",
"width": 231.45599999999973,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
567
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Her mind wanders for a beat.",
"dir": "ltr",
"width": 196.41599999999985,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
543
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "RICKY (O.S.)",
"dir": "ltr",
"width": 84.15599999999989,
"height": 12,
"transform": [
12,
0,
0,
12,
252,
519
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Want me to kill him for you?",
"dir": "ltr",
"width": 196.41599999999977,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
507
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Jane looks at us and sits up.",
"dir": "ltr",
"width": 203.42399999999986,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
483
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "JANE",
"dir": "ltr",
"width": 28.04399999999997,
"height": 12,
"transform": [
12,
0,
0,
12,
252,
459
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "(deadpan)",
"dir": "ltr",
"width": 63.07199999999993,
"height": 12,
"transform": [
12,
0,
0,
12,
209.004,
447
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Yeah, would you?",
"dir": "ltr",
"width": 112.13999999999986,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
435
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "FADE TO BLACK:",
"dir": "ltr",
"width": 98.14799999999997,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
411
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "FADE IN:",
"dir": "ltr",
"width": 56.08799999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
387
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "EXT. ROBIN HOOD TRAIL - EARLY MORNING",
"dir": "ltr",
"width": 259.48799999999983,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
351
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "We're FLYING above suburban America, DESCENDING SLOWLY toward",
"dir": "ltr",
"width": 427.67999999999995,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
327
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "a tree-lined street.",
"dir": "ltr",
"width": 140.1239999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
315
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "LESTER (V.O.)",
"dir": "ltr",
"width": 91.15199999999989,
"height": 12,
"transform": [
12,
0,
0,
12,
252,
291
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "My name is Lester Burnham. This is",
"dir": "ltr",
"width": 238.46399999999977,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
279
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "my neighborhood. This is my street.",
"dir": "ltr",
"width": 245.4719999999997,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
267
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "This... is my life. I'm forty-two",
"dir": "ltr",
"width": 231.4559999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
255
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "years old. In less than a year,",
"dir": "ltr",
"width": 217.43999999999977,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
243
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "I'll be dead.",
"dir": "ltr",
"width": 91.15199999999993,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
231
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "INT. BURNHAM HOUSE - MASTER BEDROOM - CONTINUOUS",
"dir": "ltr",
"width": 336.5759999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
195
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "We're looking down at a king-sized BED from OVERHEAD:",
"dir": "ltr",
"width": 371.61599999999976,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
171
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "LESTER BURNHAM lies sleeping amidst expensive bed linens,",
"dir": "ltr",
"width": 399.6479999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
147
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "face down, wearing PAJAMAS. An irritating ALARM CLOCK RINGS.",
"dir": "ltr",
"width": 420.67199999999997,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
135
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Lester gropes blindly to shut it off.",
"dir": "ltr",
"width": 259.4879999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
108,
123
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "LESTER (V.O.)",
"dir": "ltr",
"width": 91.15199999999989,
"height": 12,
"transform": [
12,
0,
0,
12,
252,
99
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "Of course, I don't know that yet.",
"dir": "ltr",
"width": 231.4559999999998,
"height": 12,
"transform": [
12,
0,
0,
12,
180,
87
],
"fontName": "g_d1_f3",
"hasEOL": false
},
{
"str": "",
"dir": "ltr",
"width": 0,
"height": 0,
"transform": [
12,
0,
0,
12,
463,
39
],
"fontName": "g_d1_f3",
"hasEOL": true
},
{
"str": "(CONTINUED)",
"dir": "ltr",
"width": 77.16000000000007,
"height": 12,
"transform": [
12,
0,
0,
12,
463,
39
],
"fontName": "g_d1_f3",
"hasEOL": false
}
],
"styles": {
"g_d1_f3": {
"fontFamily": "monospace",
"ascent": 0.79150390625,
"descent": -0.21630859375,
"vertical": false
}
}
}
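Given output shaped like the above, the `hasEOL` flag on each item can be used to rebuild line-separated text. This is a rough sketch under that assumption; `itemsToLines` is a hypothetical helper, not a pdf.js function, and it relies only on the `str` and `hasEOL` fields shown in the output.

```javascript
// Sketch: rebuild line-separated text from items using the hasEOL flag.
// Assumption: each item has the `str` and `hasEOL` fields shown above.
function itemsToLines(items) {
  const lines = [];
  let current = "";
  for (const item of items) {
    current += item.str;
    if (item.hasEOL) {
      // An end-of-line marker closes the line accumulated so far.
      lines.push(current);
      current = "";
    }
  }
  // Keep trailing text that was not followed by an end-of-line marker.
  if (current !== "") lines.push(current);
  return lines;
}
```

Note that an empty item with `hasEOL: true`, like the one before "(CONTINUED)" above, simply closes the preceding line in this sketch.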
OK, the error came from the option "disableCombineTextItems: true". Thank you very much for your help.
Attach (recommended) or Link to PDF file here:
pdf example link
Configuration
macOS 10.13.4
"pdfjs-dist": "^3.2.146",
Steps to reproduce the problem
What is the expected behavior?
Each line of the PDF should be returned as a separate line.
What went wrong?
All lines of the page are merged into one or two lines (instead of 40+ lines). The line breaks are not properly taken into account. Please note that it doesn't always happen; it depends on the PDF. Sometimes it works. PDF.js parsing is inconsistent.