Closed Bug 1125887 Opened 9 years ago Closed 9 years ago

We have machines with 4GB of RAM that need 8GB

Categories

(Infrastructure & Operations :: DCOps, task)

Hardware: x86_64
OS: Windows Server 2008
Type: task
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: q, Unassigned)

So far identified:
B-2008-IX-0001 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0002 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0003 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0004 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0005 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0007 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0009 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0010 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0011 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0012 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0013 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0014 ~ Total Physical Memory:     4,087 MB
According to sources, we recently decommissioned some hardware, so the current hope is that these machines can have their RAM upgraded from existing stores.
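For anyone who wants to re-check a list like the one above without reading it by eye, here is a minimal sketch that parses lines in the same `host ~ Total Physical Memory: N MB` shape and flags hosts that report well under 8 GB. The function names, the 8192 MB target, and the 5% slack are assumptions for illustration (the slack is there because the 4 GB hosts report 4,087 MB rather than 4,096 MB); this is not a tool used in this bug.

```python
import re

# Matches lines such as "B-2008-IX-0001 ~ Total Physical Memory:     4,087 MB".
LINE_RE = re.compile(
    r"^(?P<host>\S+)\s*~\s*Total Physical Memory:\s*(?P<mb>[\d,]+)\s*MB\s*$",
    re.IGNORECASE,
)

# Assumption: an 8 GB host reports slightly under 8192 MB, the same way the
# 4 GB hosts here report 4,087 MB rather than 4,096 MB, so allow 5% slack.
EXPECTED_MB = 8192
SLACK = 0.05


def parse_report(text):
    """Return {lowercased hostname: reported MB} for every matching line."""
    hosts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            # Drop the thousands separator before converting ("4,087" -> 4087).
            hosts[match.group("host").lower()] = int(match.group("mb").replace(",", ""))
    return hosts


def under_provisioned(hosts, expected_mb=EXPECTED_MB, slack=SLACK):
    """Return sorted hostnames reporting less than expected_mb minus the slack."""
    floor = expected_mb * (1 - slack)
    return sorted(host for host, mb in hosts.items() if mb < floor)


if __name__ == "__main__":
    sample = """\
B-2008-IX-0001 ~ Total Physical Memory:     4,087 MB
B-2008-IX-0002 ~ Total Physical Memory:     4,087 MB
"""
    for host in under_provisioned(parse_report(sample)):
        print(host)
```

Hostnames are lowercased on the way in because the comments in this bug mix `B-2008-IX-…` and `b-2008-ix-…` for the same machines.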
Blocks: 1122975
Summary: We have machines with 4gb of RAM that need 8GB → We have machines with 4GB of RAM that need 8GB
colo-trip: --- → scl3
b-2008-ix-0006 ~ Total Physical Memory:     4,087 MB

also has 4gb
Just to be clear: we have some linux builders that were recently decommed, and we are hoping to transplant that RAM if possible, assuming these machines truly only have 4gb physically installed. Does anyone in DCOPS  know if this is logistically possible?
Flags: needinfo?(server-ops-dcops)
>Does anyone in DCOPS  know if this is logistically possible?

:q, i'll be back on Thursday at the data centers and can check for you then. :arr had told us to hold off, so i wasn't sure if you guys had other plans for the retired nodes or were just waiting for a full list. i am unable to check remotely as the decommed machines have been turned off. since these are different models and manufacturers, i'd like to test one host by upgrading it to 8G to make sure we don't get any POST errors or CRC mismatches.
Flags: needinfo?(server-ops-dcops)
van: I just want to scavenge RAM from the decommed linux nodes for these windows builders if we need to do so (if the windows boxes didn't get upgraded before they moved like they should have).
i have upgraded b-2008-ix-0001.winbuild.releng.scl3.mozilla.com to 8gb of memory and the host is online. if it stays up over the weekend, i'll upgrade the rest on Monday if time permits.
:van: so the machines were physically missing the rest of the RAM?
>:van: so the machines were physically missing the rest of the RAM?

yah, 2 of the memory banks were unpopulated.
Hm, these were supposed to have been upgraded before they left mtv1. I wonder what happened to the RAM that was slated for them. Well, in any case, please do upgrade the rest Monday. We're pushing more jobs to the 2008 builders now that we're doing w64 betas, so we should get these back online asap.
According to collectd stats, here is every machine with less than 8gb:

b-2008-ix-0002
b-2008-ix-0003
b-2008-ix-0004
b-2008-ix-0005
b-2008-ix-0006
b-2008-ix-0007
b-2008-ix-0009
b-2008-ix-0010
b-2008-ix-0011
b-2008-ix-0012
b-2008-ix-0013
b-2008-ix-0014
b-2008-ix-0051
b-2008-ix-0081
b-2008-ix-0172
b-2008-ix-0180
b-2008-ix-test
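
The collectd list above does not line up exactly with the hosts identified earlier in this bug, so here is a small sketch of how the two could be diffed. The function names are made up for illustration, and the host numbers are copied from the comments in this bug. Names are lowercased first because the earlier list uses upper-case hostnames.

```python
def normalize(names):
    """Lowercase and strip hostnames so B-2008-IX-0001 equals b-2008-ix-0001."""
    return {name.strip().lower() for name in names if name.strip()}


def compare(first_pass, collectd):
    """Return (hosts only in first_pass, hosts only in collectd), each sorted."""
    a, b = normalize(first_pass), normalize(collectd)
    return sorted(a - b), sorted(b - a)


# Hosts reported with 4,087 MB earlier in this bug (0006 was added in a
# follow-up comment).
FIRST_PASS = ["B-2008-IX-%04d" % n
              for n in (1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14)]

# Hosts collectd reports with less than 8gb.
COLLECTD = ["b-2008-ix-%04d" % n
            for n in (2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 51, 81, 172, 180)]
COLLECTD.append("b-2008-ix-test")

if __name__ == "__main__":
    no_longer_listed, newly_listed = compare(FIRST_PASS, COLLECTD)
    print("only in the earlier list:", no_longer_listed)
    print("only in the collectd list:", newly_listed)
```

With the data in this bug, the only host missing from the collectd list is b-2008-ix-0001, which had already been upgraded, and the hosts that are new in the collectd list are 0051, 0081, 0172, 0180 and b-2008-ix-test.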
the hosts in this list have been upgraded to 8gb except for the following hosts:

b-2008-ix-0010 - upgraded to 8gb but host is not pingable. i see that it's up per console redirect but something is not working right. will need to take a 2nd look when i get back next week or if vinh has time this week. 

b-2008-ix-0180 - decom per 1024869, bad eth0.

b-2008-ix-test - can't find this host in inventory. can you give me the fqdn or the inventory link for it?
vinh is going to take a look at b-2008-ix-0010 for me. do you have any info on the other host?
b-2008-ix-test wasn't a physical host, so that can be ignored.
b-2008-ix-test was a cname and the machine was moved. No action needed
b-2008-ix-0180 - Decom status is correct no action needed. 

Thanks for the quick work on this!
b-2008-ix-0010.winbuild.releng.scl3.mozilla.com is showing 8gb RAM installed.
all hosts should be at 8g now. please let me know if there are any issues.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
(In reply to Van Le [:van] from comment #17)
> all hosts should be at 8g now. please let me know if there are any issues.

thanks! I'll enable them in the morning so I can keep an eye on them.
I've re-opened and disabled: B-2008-IX-0014 and b-2008-ix-0051

B-2008-IX-0014 started failing jobs in a similar OOM fashion. Looks like it still has 4gb:
cltbld@B-2008-IX-0014 ~
$ systeminfo | grep "Total Physical Memory"
Total Physical Memory:     4,087 MB

I'm not sure what's going on with b-2008-ix-0051 but its last few jobs are not looking optimal. This could be due to bad code on try but the MemoryError in the log snippet below makes me think it would be worth running hardware diagnostics on it:
05:17:03     INFO -  js\src\ctypes\libffi> config.status: executing include commands
05:17:03     INFO -  js\src\ctypes\libffi> config.status: executing src commands
06:45:46     INFO -  Traceback (most recent call last):
06:45:46     INFO -    File "c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/build/subconfigure.py", line 422, in <module>
06:45:46     INFO -      sys.exit(main(sys.argv[1:]))
06:45:46     INFO -    File "c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/build/subconfigure.py", line 406, in main
06:45:46     INFO -      return subconfigure(args)
06:45:46     INFO -    File "c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/build/subconfigure.py", line 393, in subconfigure
06:45:46     INFO -      pool.imap_unordered(run, subconfigures):
06:45:46     INFO -    File "c:\mozilla-build\python27\Lib\multiprocessing\pool.py", line 626, in next
06:45:46     INFO -      raise value
06:45:46     INFO -  MemoryError
06:45:46     INFO -  *** Fix above errors and then restart with\
06:45:46     INFO -                 "c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/mozmake.EXE -f client.mk build"
06:45:46     INFO -  c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/src/client.mk:361: recipe for target 'configure' failed
06:45:46     INFO -  mozmake.EXE[2]: *** [configure] Error 1
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
B-2008-IX-0013 also still only has 4gb

cltbld@B-2008-IX-0013 ~
$ systeminfo | grep "Total"
Total Physical Memory:     4,087 MB
13, 14, and 51 have had all of their memory replaced. i've confirmed BIOS shows 8gb and the OS system properties show 8gb as well.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED