Closed Bug 1302944 Opened 8 years ago Closed 8 years ago

Newly reimaged WinXP slaves have the mouse pointer in the wrong place

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

References

Details

After the bug 1302863 reimage of every WinXP slave, some if not all of them have the mouse pointer in the wrong place, over the window, where it causes test failures by unexpectedly hovering things.

Maybe it's documented somewhere what you have to do or avoid doing to not wind up in this state, dunno. I remember hearing people yell in the past about things like not connecting to a slave with some technology and connecting with some other instead, and/or not having a keyboard attached during reimaging, or starting it up in a particular way, or... dunno.

If you look at the reftest analyzer (https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-win32/1473877807/autoland_xp_ix_test-reftest-bm112-tests1-windows-build114.txt.gz&only_show_unexpected=1) for a failed run like https://treeherder.mozilla.org/logviewer.html#?job_id=3524923&repo=autoland and click on the blue links on the left side and then go back and forth between the radio buttons on the top for "Image 1" and "Image 2", you'll see that the difference (for the ones where it's clearly visible) is something like a button which in one of the two has the orangish WinXP hover state, because the mouse is over the reftest window instead of wherever else it is that it's supposed to be.
Q fixed the issue and trees will reopen
Severity: blocker → major
Summary: All trees closed: Newly reimaged WinXP slaves have the mouse pointer in the wrong place → Newly reimaged WinXP slaves have the mouse pointer in the wrong place
One possibility that occurs to me is that we've lost a GPO to shut off Snap To (https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/win_pointer_snapto.mspx?mfr=true), and so what I'm disabling are slaves that have opened a dialog box in the top-left during one test run, and then run reftests before their next reboot.
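One way to test the lost-GPO theory on a given slave would be to read the per-user Snap To setting directly. The following is only a sketch: it assumes the setting is stored as the `SnapToDefaultButton` string value under `HKEY_CURRENT_USER\Control Panel\Mouse`, and it uses the Python 3 `winreg` module (on the Python 2 installs typical of XP the module is named `_winreg`). The function names are made up for illustration.

```python
import sys


def interpret_snap_to(raw_value):
    """Map the raw registry string to a boolean ("1" means Snap To is on)."""
    return str(raw_value).strip() == "1"


def snap_to_enabled():
    """Return True/False for the current user's Snap To setting.

    Returns None when the setting cannot be read (non-Windows host,
    or the value is missing from the registry).
    """
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only module, imported lazily on purpose

    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r"Control Panel\Mouse") as key:
            raw_value, _value_type = winreg.QueryValueEx(key, "SnapToDefaultButton")
    except OSError:
        return None
    return interpret_snap_to(raw_value)


if __name__ == "__main__":
    print("Snap To enabled:", snap_to_enabled())
```

If this reports `True` on a slave that produces the hover failures and `False` on one that does not, that would support the theory; it would not prove it.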
If anyone would like to add WinXP support for controlling the mouse position, have a look at modifying this script and its configuration file:
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/external_tools/mouse_and_screen_resolution.py
https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/external_tools/machine-configuration.json
Is there any correlation between this and the issue in Bug 1302943?
Some overlap, but not very much.

Or, a horrible possibility, either involving comment 2 or not: because the easiest place to catch the firewall dialog is in m-oth, I ran hundreds of iterations of it. It's possible either that a test in m-oth opens a dialog and thus allows Snap To to move the pointer, or that a test in m-oth moves the pointer and nothing moves it back, so the actual problem could be (but probably isn't) "the WinXP slaves can't run reftests after m-oth without a reboot in between."

But glancing at recent history of the slaves disabled for this, not only do they not uniformly have a m-oth run before the bad reftest run, but there's one completely incomprehensible one which ran a green reftest run on aurora, and then the next two jobs were failed reftest runs on central from my retriggers to catch this. The only theory I have to make *that* work would be that it rebooted between the good and bad reftest runs, that a freshly rebooted slave cannot successfully run reftests, and that there is one or more other test suites which move the pointer for us to make a slave that can then run reftests until the next reboot.
Added the following to the login.bat that runs before talos on XP:

nircmdc setcursor 1600 1200

This puts the cursor in the extreme lower-right corner, below the clock.

Let's see if this helps.
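For anyone who wants the same cursor parking from Python rather than from nircmdc (for example, from the mozharness script mentioned earlier), here is a rough sketch. It assumes the Win32 `SetCursorPos` call reached through `ctypes` is an acceptable stand-in for `nircmdc setcursor`; it is not what login.bat actually runs. The helper names and the 1024x768 screen size in the example are hypothetical.

```python
import ctypes
import sys


def clamp_to_screen(x, y, width, height):
    """Clamp (x, y) to the last addressable pixel of a width x height screen."""
    clamped_x = max(0, min(int(x), width - 1))
    clamped_y = max(0, min(int(y), height - 1))
    return (clamped_x, clamped_y)


def park_cursor(x, y):
    """Move the pointer to (x, y).

    Returns True if the Win32 call reported success, and False if the
    call failed or this is not a Windows host.
    """
    if sys.platform != "win32":
        return False  # ctypes.windll only exists on Windows
    return bool(ctypes.windll.user32.SetCursorPos(int(x), int(y)))


if __name__ == "__main__":
    # 1600 1200 are the values passed to nircmdc above; the screen size
    # used for clamping is an illustrative assumption, not a measured one.
    target = clamp_to_screen(1600, 1200, 1024, 768)
    print("target:", target, "moved:", park_cursor(*target))
```

Clamping is included only so the requested position is explicit; whether the real slaves need it depends on their configured resolution, which this bug does not state.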
I re-enabled and rebooted 004 and 019.
Of the above-mentioned machines, one test came back with warnings; the rest came back green.

Philor: Do you think we are safe to re-enable and reboot the other machines associated with this bug?
Flags: needinfo?(philringnalda)
You can't really tell anything by just looking at whether an undifferentiated "jobs" result is green or not. 004 hasn't done the affected job yet, so for it we've learned nothing so far, but 019 has done two reftest jobs and passed them both, so let's just say it's fixed and reenable the rest and see what happens.
Flags: needinfo?(philringnalda)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard