Open Bug 1318272 Opened 8 years ago Updated 1 year ago

Crash when trying to pretty print 170MiB XML file

Categories

(Core :: XML, defect, P3)

x86
Windows 7
defect

Tracking

()

People

(Reporter: cbook, Unassigned)

References

()

Details

(Keywords: crash, Whiteboard: [MemShrink:P3])

Crash Data

Attachments

(2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-06382a59-c4a3-430a-9fd0-151b22161117.
=============================================================


Found via bughunter and reproduced via nightly opt builds

STR:
-> Load https://sportaffiliatefeeds.blob.core.windows.net/affiliates/guts_prelive_en.xml 
--> wait
---> Crash
3 	xul.dll 	txMozillaXMLOutput::closePrevious(bool) 	dom/xslt/xslt/txMozillaXMLOutput.cpp:597
4 	xul.dll 	txMozillaXMLOutput::endElement() 	dom/xslt/xslt/txMozillaXMLOutput.cpp:273
5 	xul.dll 	txEndElement::execute(txExecutionState&) 	dom/xslt/xslt/txInstructions.cpp:422
6 	xul.dll 	txXSLTProcessor::execute(txExecutionState&) 	dom/xslt/xslt/txXSLTProcessor.cpp:49
(In reply to Carsten Book [:Tomcat] from comment #0)
> https://sportaffiliatefeeds.blob.core.windows.net/affiliates/guts_prelive_en.
> xml 
> --> wait
> ---> Crash

9.4 MB XML file. Can we do something better than the current memory growth?
Component: DOM → XML
Flags: needinfo?(erahm)
When I load the page, it just spins and spins and doesn't show anything. I didn't leave it open for more than like 10 seconds, but the memory usage had already climbed to 13GB for Firefox.
This chokes Chrome up as well, although it renders stuff and runs for a good 7 minutes while pegging a core. The file itself is pretty bizarre, it looks like totally gibberish (maybe some sort of binary-as-unicode thing going on). It's not even clear to me how xslt is getting invoked here.
Flags: needinfo?(erahm)
Ah right, that's gzipped. It's actually a 170MiB file
Okay so we're going through the "pretty print the xml" path which invokes our xst pretty printer [1] with a 170MiB file and we eventually OOM. In this case it was creating a new node (size 64 bytes). In theory we could try to make things fallible but our new is infallible by default so you'd have to go through all the xslt code and make us use a custom fallible new. That sounds unpleasant.

Perhaps a simpler solution would be to not pretty print large XML files.

Peter what do you think?
Flags: needinfo?(peterv)
Missing footnote:

[1] https://hg.mozilla.org/mozilla-central/annotate/79feeed42933/dom/xml/nsXMLPrettyPrinter.cpp#l122
Summary: Crash in OOM | small → Crash when trying to pretty print 170MiB XML file
Priority: -- → P2
This adds a variable to track the total text size of the parsed XML document.
If the size is greater than 10MiB we bypass attempting to perform an XSL
transformation to pretty print the document.

I looked into using the the content-length reported by the document's channel
instead, but that was just the compressed size of the XML file which, in this
case, was less useful (<10MiB).

The 10MiB limit was somewhat arbitrarily chosen but seems reasonable. With
this patch I'm eventually able to load the XML file from comment 0 and scroll
through it on a Linux64 debug build. It's possible on a 32-bit build we'd
still see an OOM but I'd expect a fallible allocation to fail first leading to
an XML error being shown rather than a crash.

MozReview-Commit-ID: Sadwxjr8b7
Attachment #8812316 - Flags: review?(peterv)
Assignee: nobody → erahm
Status: NEW → ASSIGNED
Whiteboard: [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P3]
Moving to p3 because no activity for at least 1 year(s).
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Comment on attachment 8812316 [details] [diff] [review]
Add size limit to pretty printing XML

I'm going to revive this in phabricator.
Attachment #8812316 - Attachment is obsolete: true
Flags: needinfo?(peterv)
Attachment #8812316 - Flags: review?(peterv)

This adds a variable to track the total text size of the parsed XML document.
If the size is greater than 10MiB we bypass attempting to perform an XSL
transformation to pretty print the document.

Peter, we've had this patch sitting around for 3 years or so, would you mind reviewing?

Flags: needinfo?(peterv)

I commented in phabricator.

Flags: needinfo?(peterv)

I'm not actively working on this.

Assignee: erahm → nobody
Status: ASSIGNED → NEW
Attachment #9070385 - Attachment is obsolete: true
Severity: critical → S2
Severity: S2 → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: