The target.cppunittest.tests fails to hook QueryDosDeviceW from kernelbase.dll for TestDllInterceptor.exe
Categories
(Core :: mozglue, defect)
Tracking
()
People
(Reporter: atrif, Assigned: toshi)
Details
(Keywords: regression, regressionwindow-wanted)
Attachments
(5 files)
Affected versions
- 75.0b10 (20200326191140)
Affected platforms
- Windows 10 x32
Steps to reproduce
- Download target.cppunittest.tests.zip and target.zip from https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=c6d1ad75d7ca1ab66cf4b269ef73e3020a4abc90 (Fx75.0b10 - Windows 2012 shippable opt Bpgo-B).
- Unzip the content.
- Copy mozglue.dll from target.zip in the unzipped file from above.
- Open the PowerShell window in the folder the content was extracted in the previous step.
- Type
.\TestDllInterceptor.exe
and hit Enter.
Expected result
- All checks passed.
Actual result
TEST-UNEXPECTED-FAIL | WindowsDllInterceptor | Failed to hook QueryDosDeviceW from kernelbase.dll First 5 bytes of function: 6A 7C 68 D0 F0
Regression Range
- Reproducible with Firefox 75.0b8, Firefox 75.0b9 and Firefox 75.0b1. Will search for one ASAP.
Notes
- Attached a screenshot with the issue.
Comment 1•4 years ago
|
||
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Updated•4 years ago
|
Comment 2•4 years ago
|
||
Toshi, David, can you take a look?
Comment 3•4 years ago
|
||
What happens if you try the test in a regular console (instead of PowerShell)?
Comment 4•4 years ago
|
||
FYI, I am asking because of Bug 1601071.
Reporter | ||
Comment 5•4 years ago
|
||
(In reply to David Parks (dparks) (he/him) [:handyman] from comment #3)
What happens if you try the test in a regular console (instead of PowerShell)?
Same error with cmd too.
TestDllInterceptorCrossProcess.exe
test pass.
Updated•4 years ago
|
Assignee | ||
Comment 6•4 years ago
|
||
I think I reproduced this issue. When our detour meets the pattern of 6A 7C 68 D0 F0, it can process the first instruction 6a 7c (= push 7c
). After that, its head needs to jump to the 3rd byte 68, which is push imm32
, but it jumped to the 4th byte D0. Probably something wrong happened in CountPrefixBytes
. Let me continue investigation and find out the root cause..
Assignee | ||
Comment 7•4 years ago
|
||
I got the root cause and came up with a simple patch. I'll b e able to start a review by EOD tomorrow.
Assignee | ||
Comment 8•4 years ago
|
||
When our detour processes instructions, we pass ReadOnlyTargetFunction
to
CountPrefixBytes
to determine whether a lock prefix exists or not.
In that case, we don't need to pass both ReadOnlyTargetFunction
and an offset
as a parameter because ReadOnlyTargetFunction
has an offset as a member.
Updated•4 years ago
|
Comment 9•4 years ago
|
||
What's the likely effect of this issue beyond the failed test? Is it bad enough that we should get the fix in 75 (we're in RC week)?
Assignee | ||
Comment 10•4 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #9)
What's the likely effect of this issue beyond the failed test? Is it bad enough that we should get the fix in 75 (we're in RC week)?
An immediate impact is our detour fails to hook QueryDosDeviceW
here, so ChromiumCDMAdapter
may not function correctly. I'm not sure how it emerges, though. Media team can answer that..?
Updated•4 years ago
|
I believe the reason we hook this function is so the GMP doesn't need to use the system version, which means we can make the sandbox more restrictive. So I think if we fail to hook we'll run into the sandbox when we're trying to playback premium media (Netflix, Amazon prime, etc.). My preference from the media side would be try and uplift fixes if possible.
Updated•4 years ago
|
Comment 12•4 years ago
|
||
It should be possible to check on a Windows 10 32-bit environment (which doesn't sound like its common) to see if those (Netflix, Amazon Prime, ...) work or not. I'm guessing that will define our urgency in uplifting.
Note that this was found by our own tests, if this had actually been broken in beta this might've been reported?
Comment 13•4 years ago
|
||
Also I'm not clear what regressed this, or if this had been broken in previous versions?
Comment 14•4 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #12)
It should be possible to check on a Windows 10 32-bit environment (which doesn't sound like its common) to see if those (Netflix, Amazon Prime, ...) work or not. I'm guessing that will define our urgency in uplifting.
Alexandru, can you test this?
Note that this was found by our own tests, if this had actually been broken in beta this might've been reported?
I'd hope so, but probably better to make sure... Thanks :)
Wonder if this is the cause of bug 1496607.
Assignee | ||
Comment 16•4 years ago
|
||
(In reply to Bryce Seager van Dyk (:bryce) from comment #15)
Wonder if this is the cause of bug 1496607.
I believe it is. As shown below (that's from my local system where no repro), KernelBase!QueryDosDeviceW
starts with push 7Ch
, followed by a code pushing an immediate value. This detour failure occurs depending on a number of that immediate value, which means this issue could easily disappear/reappear release by release.
KernelBase!QueryDosDeviceW:
1012e5e0 6a7c push 7Ch
1012e5e2 6898321b10 push offset KernelBase!HviIsAnyHypervisorPresent+0xcdc (101b3298)
My proposed patch affects only 32bit binaries (i.e. 32bit Windows or WOW64), and the change is simple. The regression risk from uplift is low.
Reporter | ||
Comment 17•4 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #14)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #12)
It should be possible to check on a Windows 10 32-bit environment (which doesn't sound like its common) to see if those (Netflix, Amazon Prime, ...) work or not. I'm guessing that will define our urgency in uplifting.
Alexandru, can you test this?
I get the attached console errors when trying to play videos on Netflix or amazon prime. Tested with two machines. On one machine Netflix was working after trying to update the widevine plugin (even if no updates were installed) and playing another video (I could not check the other machine due to boot errors) but amazon prime was not at all on neither on the machines. On the other machine, Netflix was not working. These errors were presented in Firefox 75.0 (20200331175109) win 10x86. If more information is needed please let me know.
Reporter | ||
Comment 18•4 years ago
|
||
So I retested on another win10x86 machine. It seems that Netflix is working after a browser restart.
Comment 19•4 years ago
|
||
Tested on Amazon Prime too and from the first try, an error message is shown that the stream cannot be played. After restarting the PC, and accessing Amazon Prime again, movies/TV series loaded as expected. This happened on a Win 10 x86 system with Firefox 75.0.
Comment 20•4 years ago
|
||
Pushed by btara@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bc31229fd23f No need to pass an offset to CountPrefixBytes. r=handyman
Comment 21•4 years ago
|
||
bugherder |
Comment 22•4 years ago
•
|
||
Given that this is Windows 10 32-bit only (which is likely not very common?), restarting can make it work and it's been there for a while, it probably does not make sense to try to uplift it aggressively to a dot release.
Comment 23•4 years ago
|
||
Verified the fix on 76 (beta - treeherder) under Windows 10 32-bit and all tests are passed.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 24•4 years ago
|
||
Part of me thinks we should take this on ESR68 since this breaks potentially-important functionality, but the other part of me says we've made it this far into it's life cycle without anybody complaining and I wonder who will notice. Calling this wontfix for now but I'm open to arguments for why we should if someone feels strongly about it.
Description
•