Closed Bug 1598650 Opened 4 years ago Closed 4 years ago

TestDllInterceptor.exe test failed on Windows 8x64

Categories

(Firefox Build System :: General, defect)

71 Branch
x86_64
Windows 8
defect
Not set
normal

Tracking

(firefox-esr68 unaffected, firefox71 wontfix, firefox72 fixed, firefox73 fixed)

RESOLVED FIXED
mozilla73
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- wontfix
firefox72 --- fixed
firefox73 --- fixed

People

(Reporter: mboldan, Assigned: handyman)

Details

(Keywords: regression, regressionwindow-wanted)

Attachments

(2 files)

Attached file dllInterceptor.txt

[Affected versions]:

  • Firefox 71.0b12

[Affected platforms]:

  • Windows 8 64bit

[Steps to reproduce]:

  1. Download target.cppunittest.tests.zip from https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&revision=a5ce9b31b8253c9758e486b5476ddaf5f7e1d08f (Fx71.0b12 - Windows 2012 x64 shippable opt Bpgo-B).
  2. Unzip the content.
  3. Copy mozglue.dll from target.zip in the unziped file from above.
  4. Open CMD in the folder the content was extracted in the previous step.
  5. Type 'TestDllInterceptor.exe' and hit Enter.

[Expected result]:

  • Bunch of tests run in CMD without issues

[Actual result]:

  • TEST-UNEXPECTED-FAIL | WindowsDllInterceptor | Failed to hook CreateFileA from kernel32.dll

[Regression range]:

  • This issue was not reproducible on Firefox 71.0b3.
  • Will return with the regression ASAP.

Notes:

  • Didn't encountered this issue on Windows 10x86, Windows 8.1 x86/64, or on Windows 7x64.
  • The full result from the test cand be found in the attachment.
Has Regression Range: --- → no
Has STR: --- → yes

Aaron any idea why this test would fail?

Flags: needinfo?(aklotz)

Mihai, can you provide the regression range please? Thanks

Flags: needinfo?(aklotz) → needinfo?(davidp99)

It would have to be Win 8 only when I already have a VM set up for Win 8.1. sigh.

I'll take a look but, from the output, it looks like this is a rip-relative jump with 32-byte offset + nops that only give us 12 bytes of room total -- the 13th being a REX.W for the next instruction. If that is the case, I'll make it a 10-byte patch on win8 like in bug 1489391.

Assignee: nobody → davidp99
Flags: needinfo?(davidp99)

Some updated info:

  • Windows 8 VMs are not available for download from Microsoft (only 8.1 is). I was able to get what I needed from the #servicedesk.

  • When I grab the files from the build mentioned in comment 0 and run them on my Win 8.0 x64 VM (winver 6.2 build 9200), CreateFileA is hooked and run fine:

      TEST-PASS | WindowsDllInterceptor | Could hook CreateFileA from kernel32.dll
      TEST-PASS | WindowsDllInterceptor | Executed hooked function CreateFileA from kernel32.dll
    
  • CloseHandle does fail to hook. This should be fixed by bug 1489391.

  • AcquireCredentialsHandleA is producing something that crashes when it runs in the test (!). I am looking into this. For the record, this is the relevant preamble, which includes some rsp access that should probably have failed to patch:
    0x0000000000000000: FF F3 push rbx
    0x0000000000000002: 56 push rsi
    0x0000000000000003: 57 push rdi
    0x0000000000000004: 48 83 EC 40 sub rsp, 0x40
    0x0000000000000008: 48 8B 05 51 CA 00 00 mov rax, qword ptr [rip + 0xca51]

FYI, the given build does not get to AcquireCredentialsHandleA because TestDllInterceptor quits on the first failure -- CloseHandle. I hacked up a special build to see that failure.

I screwed that up -- the code attributed to AcquireCredentialsHandleA's was actually for something else. This is AcquireCredentialsHandleA's beginning:

0x0000000000000000:  48 83 EC 58                sub rsp, 0x58
0x0000000000000004:  48 8B 84 24 A0 00 00 00    mov rax, qword ptr [rsp + 0xa0]

This is the beginning of what PatcherDetour comes up with (value is specific to this run):

0x0000000000000000:  48 B8 88 50 BE 0B F7 07 00 00    movabs rax, 0x7f70bbe5088

I can see that PatcherDetour is confusing an RSP-relative mov for a RIP-relative one, which is bad enough, but... where did the sub go?

To further explain how the mov is being misread, this is the breakdown of the ModR/M and SIB bytes:

MOD R/M: 10 000 100 (0x84) | mod = 10, reg = RAX (dest reg), rm = SP (=> [SIB + disp32])
SIB: 00 100 100 (0x24) | scale = 0, index = no index, base = SIB

Here [1], we think we are handling a rip-relative instruction but we are not. CountModRmSib's special return value is wrong in this case.

[1] https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/mozglue/misc/interceptor/PatcherDetour.h#1013

Nope. None of that was it. It handles that case fine. Where it fails is the next instruction -- I needed to disassemble more than 16 bytes.

This is the entire relevant beginning of the function:
0x0000000000000000: 48 83 EC 58 sub rsp, 0x58
0x0000000000000004: 48 8B 84 24 A0 00 00 00 mov rax, qword ptr [rsp + 0xa0]
0x000000000000000c: C6 44 24 48 00 mov byte ptr [rsp + 0x48], 0

The last mov starts at byte 13 so we need to include it. This is what the stub looks like:

0x0000000000000000:  48 83 EC 58                sub rsp, 0x58
0x0000000000000004:  48 8B 84 24 A0 00 00 00    mov rax, qword ptr [rsp + 0xa0]
0x000000000000000c:  C6 44 24 48 FF             mov byte ptr [rsp + 0x48], 0xff
0x0000000000000011:  25 00 00 00 00             and eax, 0

Not hard to see what happened. PatcherDetour is leaving off the '0' at the end of the mov byte. The following jump that starts 'ff 25 ...' is then mistakenly read as part of the mov. Hilarity ensues.

The mov byte instruction is handled here [1]. This should be an easy fix at this point.

[1] https://searchfox.org/mozilla-central/rev/04d8e7629354bab9e6a285183e763410860c5006/mozglue/misc/interceptor/PatcherDetour.h#1287

mov byte ptr support was added in bug 1382251 but did not properly count the instruction size. It was missing the 1-byte operand, which causes the rest of the trampoline to be garbage.

Pushed by daparks@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2cab2519da71
Copy operand for mov byte ptr, imm8 in DLL interceptor r=aklotz
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Is this worth uplifting?

Flags: needinfo?(davidp99)

The value of uplift is minor but I think I'll do it anyway. The potential value is to people running Flash plugins that do SSL communication on Win 8.0 x64 -- the expected behavior would not happen correctly because of this bug in that case. We don't have any bugs about that case -- just this bug about the test failure -- so I'm not sure what it looks like in the wild. But I guess the uplift should not be controversial.

OTOH, I'm going to wait until tomorrow so that I can locally run the nightly-build test on Win 8.0 x64 to confirm the fix. The problem today is that bug 1489391 needs to land first since it currently causes TestDllInterceptor to quit before it gets to the failure in this bug.

Leaving the NI open to keep this in mind.

Comment on attachment 9113339 [details]
Bug 1598650: Copy operand for mov byte ptr, imm8 in DLL interceptor r=aklotz!

Beta/Release Uplift Approval Request

  • User impact if declined: Not completely clear. The bug is reported as a test failure but it shows that we are failing to intercept this Win32 function, so the behavior would be wrong. This function is associated with Flash plugins doing SSL communication but we don't have any bug reports specifically about that. The test failure, and therefore any SSL bug, would be limited to Windows 8.0 (not 8.1), 64-bit. Note that we don't test on Win 8.0 so this test failure does not show in automation. I've confirmed the fix with the latest nightly build.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This fixes a clear failure in the DLL interceptor code's recognition of mov byte ptr, imm8 instructions.
  • String changes made/needed: N/A
Flags: needinfo?(davidp99)
Attachment #9113339 - Flags: approval-mozilla-beta?

Comment on attachment 9113339 [details]
Bug 1598650: Copy operand for mov byte ptr, imm8 in DLL interceptor r=aklotz!

dll interceptor fix, approved for 72.0b5

Attachment #9113339 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify-

I'm still seeing this FAIL on two machines Win 8 64bit and Win 8.1 64bit using the target.cppunittest.tests.zip from 72.0b11. Any thoughts?

Flags: needinfo?(davidp99)

We do get reports of failures of this test that we are unable to reproduce. There are a number of potential causes for this but we haven't investigated deeply enough to know for sure. I'd mention bug 1601071 in particular as a cause that could be common in QA and dev setups. Are there any tools you are running that you think could be altering the behavior of the application? Beyond consoles, suspects include antivirus software, profilers and debuggers. If so, would it be reasonable to retry the tests with them disabled? Also, do you have the console output from the failures? Maybe we can at least eliminate a few theories.

Flags: needinfo?(davidp99) → needinfo?(bogdan.maris)

Ok, so I think that the issue may be present only on that isolated case of Windows 8 x64. Tried the tests with latest beta available from treeherder (20200115154857) under Windows 8 x86 (all tests passed) and under Windows 8.1 x64 on two different systems (all tests passed).

On Windows 8 x64 with both x86 and x64 builds, the tests failed. Kaspersky antivirus was installed on all of them.

Please let me know if I can help with any further investigations or informations.

Flags: needinfo?(bogdan.maris)

(In reply to David Parks (dparks) [:handyman] from comment #18)

Are there any tools you are running that you think could be altering the behavior of the application? Beyond consoles, suspects include antivirus software, profilers and debuggers. If so, would it be reasonable to retry the tests with them disabled? Also, do you have the console output from the failures? Maybe we can at least eliminate a few theories.

We've tested target.cppunittest.tests for Beta 74.0b1 on the same Windows 8 Pro x64 machine, as mentioned in comment 19 and comment 0, and it seems that the test [1] is still failing when running TestDllInterceptorCrossProcess.exe. Note that no antivirus has been installed on the machine.

[1] TEST-UNEXPECTED-FAIL | WindowsDllInterceptor | Failed to hook CreateFileA from kernel32.dll

David, would you mind letting us know how do you want to proceed with this, should we file a folllow-up bug?

Flags: needinfo?(davidp99)

I had not updated my Windows 8.0 Pro x64 VM -- I've just installed 1G of updates to it. Now, I am seeing the same TestDllInterceptor.exe failure to hook CreateFileA as in comment 20.
I should now be able to track this down. Sorry for the confusion. I've created bug 1615752 for this.

Flags: needinfo?(davidp99)
You need to log in before you can comment on or make changes to this bug.