Closed Bug 1854094 Opened 8 months ago Closed 12 days ago

[macOS] Some characters are not displayed correctly printing to PDF a pdf document

Categories

(Core :: Printing: Output, defect)

Tracking

RESOLVED FIXED
127 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox-esr115 --- wontfix
firefox117 --- wontfix
firefox118 --- wontfix
firefox119 --- wontfix
firefox120 --- wontfix
firefox126 --- wontfix
firefox127 --- fixed

People

(Reporter: zstimi, Unassigned)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Attached image save to pdf.png

Found in

  • Firefox 115.3.0esr

Affected versions

  • Firefox 115.3.0esr
  • Firefox 118.0b9
  • Firefox 118.0
  • Firefox 119.0a1

Tested platforms

  • Affected platforms: macOS 12
  • Unaffected platforms: Windows 10, Ubuntu 22

Steps to reproduce

  1. Launch Firefox
  2. Access this PDF sample
  3. Print the edited page to PDF

Expected result

  • The page is displayed correctly after saving it to PDF and opening the saved file in a new tab.

Actual result

  • Some characters are not displayed correctly; see the attached screenshot.

Regression range

  • I will come back with regression range ASAP.

Additional notes

  • Not reproduced using Chrome.

:zstimi, if you think that's a regression, could you try to find a regression range using for example mozregression?

Setting Regressed by field after analyzing regression range found by mozregression in comment #2.

Regressed by: 1789482

Set release status flags based on info from the regressing bug 1789482

:jfkthame, since you are the author of the regressor, bug 1789482, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(jfkthame)

Interestingly, the Save-to-PDF output looks fine if viewed in Preview.app, but if viewed in Adobe Reader or in Firefox (via pdf.js), the breakage appears.

The "broken" characters I'm seeing correspond to the characters that would be alphabetically first, when sorted by glyph name, in the embedded subset fonts: the "A" character in the main fonts, and the "m" in the Semibold-Italic face that is used only for the word "must".

So my guess is that this is somehow related to re-encoding and subsetting embedded fonts, where the glyph that ends up in slot 0 fails to render because of a hard-coded assumption somewhere that 0 is the .notdef missing-character glyph.
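
To make that guess concrete, here is a minimal Python sketch. It assumes the subset font assigns glyph slots alphabetically by glyph name without reserving slot 0 for .notdef, that each letter's glyph name is the letter itself, and that the viewer hard-codes glyph 0 as the missing-character glyph. The names `build_subset` and `render` are hypothetical and do not correspond to any CoreGraphics, cairo, or pdf.js API.

```python
# Illustrative model only: not CoreGraphics, cairo, or pdf.js code.

def build_subset(text):
    """Assign glyph slots by sorting the glyph names used in `text`.

    Assumption: each letter's glyph name is the letter itself, and the
    subsetter does not reserve slot 0 for .notdef.
    """
    names = sorted({ch for ch in text if ch.isalpha()})
    return {name: slot for slot, name in enumerate(names)}


def render(text, subset):
    """Model a viewer that treats glyph 0 as the missing-glyph box."""
    out = []
    for ch in text:
        if ch.isalpha() and subset[ch] == 0:
            out.append("\u25a1")  # slot 0 is treated as .notdef
        else:
            out.append(ch)
    return "".join(out)


must = build_subset("must")
print(must)                  # {'m': 0, 's': 1, 't': 2, 'u': 3}
print(render("must", must))  # □ust
```

Under these assumptions the "m" of "must" lands in slot 0 and is the one that disappears. In a face that also contains uppercase letters, "A" sorts first and would be the one lost, which matches the pattern of broken characters described above.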

It would be interesting to try and reproduce a similar issue with a standalone CoreGraphics-based test, generating a PDF with text that renders differently in Preview.app vs Adobe Reader; that would indicate a bug in CG. But another possibility is that it's somewhere in the cairo layers, or even in how pdf.js manages the embedded fonts.

Flags: needinfo?(jfkthame)

Could you check if this works properly when printing to an actual printer?

Flags: needinfo?(tzsoldos)
Attached file the original PDF

(In reply to Timea Zsoldos [:zstimi/tzsoldos], Desktop QA from comment #0)

Steps to reproduce

  1. Launch Firefox
  2. Access this PDF sample

For archival purposes, here's a copy of that original PDF that the STR starts with.

(In reply to Jonathan Kew [:jfkthame] from comment #5)

Interestingly, the Save-to-PDF output looks fine if viewed in Preview.app, but if viewed in Adobe Reader or in Firefox (via pdf.js), the breakage appears.

Firefox's save-to-PDF output (the resulting PDF) also looks broken when viewed in Chrome (i.e. PDFium or whatever their pdf.js equivalent is called).

(In reply to Daniel Holbert [:dholbert] from comment #7)

Could you check if this works properly when printing to an actual printer?

Answering my own question: I just tested locally, and confirmed it's fine when printing to an actual printer. Though perhaps it depends a bit on the printer (i.e. whether the printer reliably interprets the characters in the same way that Preview.app does).

So: for now it seems like this is only an issue when folks print from a PDF to a PDF, which should theoretically work, but which we think is not a super common use-case (since there's no real reason to print to PDF if you've already got a PDF; users would hopefully prefer to just directly use "Save"/"Download" instead of "Print|Save-to-PDF" in this circumstance).

Flags: needinfo?(tzsoldos)
Attachment #9357710 - Attachment description: save to PDF output (the result of the STR, broken when viewed in PDF.js, Adobe, Chrome, etc) → save to PDF output (the result of the STR, broken when viewed in PDF.js, evince, Adobe, Chrome, etc)

Jonathan, is there anything else that can be done here?
Marking 119/120 as wontfix, given that it is an older regression and an S4.

Yes, I don't think we have any immediate solution here, but fortunately it's a fairly rare case (I think), only affecting certain PDF files depending on how fonts have been re-encoded and embedded, and then only for a very specific workflow.

The only "fix" I know of at the moment would be to revert bug 1789482, but that would regress the rendering quality for a lot of PDFs (e.g. bug 1772225 would come back). Aside from that, it might be possible to work around the issue somehow in pdf.js but I don't think we have a good enough understanding yet of exactly what's happening, or where the real bug lies.

Flags: needinfo?(jfkthame)
See Also: → 1896915

This came back across my radar since recently-filed bug 1896915 seems like possibly the same as this one.

And in re-testing this one, it seems to have been recently fixed!
2024-05-06 is "bad" (produces save-to-PDF output that looks like the screenshot when viewed in current Firefox Nightly)
2024-05-08 is "good" (produces save-to-PDF output that looks fine when viewed in current Firefox Nightly, no missing letters).

Given that range, I think we can safely say this was fixed by the cairo update in bug 1892913.
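
For anyone narrowing a similar good/bad window by hand, the search is a plain bisection over nightly dates. mozregression automates this kind of search; the sketch below only illustrates the idea and is not mozregression's code. The function name `narrow_fix_window` and the `is_fixed` callback (standing in for "save to PDF with that nightly and check the output") are hypothetical, and the dates in the usage lines are synthetic and unrelated to this bug.

```python
from datetime import date, timedelta


def narrow_fix_window(last_bad, first_good, is_fixed):
    """Bisect nightly dates down to adjacent (bad, good) builds.

    Assumes the build at `last_bad` is broken, the build at `first_good`
    is fixed, and the behaviour flips exactly once in between.
    """
    lo, hi = last_bad, first_good
    while (hi - lo).days > 1:
        mid = lo + timedelta(days=(hi - lo).days // 2)
        if is_fixed(mid):
            hi = mid  # fix is already present: look earlier
        else:
            lo = mid  # still broken: look later
    return lo, hi


# Synthetic example: pretend the fix first shipped in the 2024-01-20 nightly.
window = narrow_fix_window(
    date(2024, 1, 1), date(2024, 1, 31),
    lambda d: d >= date(2024, 1, 20),
)
print(window)
```

Once the window is down to adjacent nightlies, the changes that landed between those two builds are the candidates for the fix.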

Status: NEW → RESOLVED
Closed: 12 days ago
Depends on: 1892913
Resolution: --- → FIXED

In the output pdf, the encoding is wrong: if you copy & paste the text then it won't be what you expect.
The font (a type1C) used to render the A is a subset of Whitney-Semibold:

  • at position 0, there is .notdef;
  • at position 1, there is A;
  • the charset for this font is [.notdef, .notdef, ...] which means that .notdef is mapped on glyphs 0 and 1.
  • the encoding for this font is MacRomanEncoding and char 0 is mapped on "".

So on the pdf.js side, a potential fix could be to map char "" to the second .notdef at position 1 rather than the usual position 0.
But it needs to be tested, because I have a feeling that it could be easy to break something...
That said, Jonathan, if you've a better idea or if there's something wrong with my understanding of that stuff, please tell me.
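
As a rough illustration of the idea above, here is a small Python sketch. It is not pdf.js code: the function names are hypothetical, the charset is modelled as a plain list of glyph names, and only the two entries described in the list above are included (the rest of the charset is left out).

```python
# Illustrative model only: not pdf.js code.

def glyph_index_first(charset, name):
    """Usual lookup: the first slot whose charset name matches."""
    return charset.index(name)


def glyph_index_prefer_later_notdef(charset, name):
    """Hypothetical workaround: if .notdef is duplicated, use a later slot.

    Slot 0 is skipped only when another .notdef entry exists; every other
    name, and a font with a single .notdef, resolves as usual.
    """
    if name == ".notdef":
        for gid in range(1, len(charset)):
            if charset[gid] == ".notdef":
                return gid
    return charset.index(name)


# Charset as described above: slot 1 holds the "A" outline but is named .notdef.
broken = [".notdef", ".notdef"]
print(glyph_index_first(broken, ".notdef"))                # 0
print(glyph_index_prefer_later_notdef(broken, ".notdef"))  # 1
```

With a well-formed charset such as `[".notdef", "A"]` both lookups agree. The risk is a font that legitimately carries several .notdef entries, where preferring a later slot could draw something that was meant to stay blank.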

I'm not entirely sure it's worth trying to fix our rendering of the broken output PDF (on the PDF.js side) at this point.

The output PDF looks similarly-broken in virtually all PDF viewers -- in Firefox, Chromium, Acrobat Reader, and evince (testing on multiple platforms), as well as the Dropbox and Google Drive PDF viewers on Android.

The only program I've tested that renders it nicely is Preview.app on macOS. (And I'm not sure to-what-extent this is a pure enhancement in Preview.app vs. something that potentially makes them render content where they shouldn't, in related edge cases where a PDF might have some content that it expects not-to-be-rendered for whatever reason.)

If you do want to improve things for our rendering of the broken output PDF here, though, that's probably worth spinning off to a new bug.

Target Milestone: --- → 127 Branch