John Craig wrote:
Hi,
I have a dual Opteron file server with 4 GB RAM, and just "upgraded" to SLES9 x86-64 (it was previously running SUSE 9.3 x86-64 without problems). The system keeps crashing, and I am starting to suspect mtrr.
Some other posts on the subject indicate that mtrr is used to remap the memory that is hidden by the PCI bus to a region above the existing physical memory, so it can be accessible.
The system on boot always reports something like this:
mtrr: 0xfd000000,0x800000 overlaps existing 0xfd000000,0x400000
mtrr: 0xfd000000,0x800000 overlaps existing 0xfd000000,0x400000
And after running a while it suddenly locks up, requiring a cold start.
Something else that I thought funny is that there are 2 PCI-X cards in it that work fine, but attempting to put in another (an Intel dual GigE card) prevents the system from booting or even passing POST (even with the other cards removed). The same card works in another dual Opteron machine (different motherboard) running SLES9 without problem.
So, if erroneous memory mapping by mtrr is causing the lockups, how can I remap or disable memory mapping?
Thanks for any clues
I'm investigating random lockups as well, on a SuperMicro motherboard running SuSE 10.0. I get the exact same message, but I'm not convinced the mtrr module is the culprit. I was about to try booting with mtrr disabled when I decided to go home and try it Monday. There's a boot command argument to disable mtrr. Also, I just read that the xorg.conf file has an option ("Option mtrr = off") under the "Device" section that pertains to the graphics driver. I think the problem is graphics related. Sometimes the system runs for days before locking up, sometimes only for a few hours. The screen saver has been disabled too.
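For what it's worth, the numbers in that boot message can be decoded by hand: each pair is a base address and a size. Below is a rough Python sketch of the arithmetic, just to show what the two ranges are and that they do intersect. The helper names (ranges_overlap, describe) are made up for illustration; this is not the kernel's own check and it does not say anything about whether the overlap is harmful.

```python
# Illustrative only: decode the (base, size) pairs from the boot message
# "mtrr: 0xfd000000,0x800000 overlaps existing 0xfd000000,0x400000".
# The function names are hypothetical helpers, not a kernel or library API.

def ranges_overlap(base_a, size_a, base_b, size_b):
    """True if [base_a, base_a + size_a) intersects [base_b, base_b + size_b)."""
    return base_a < base_b + size_b and base_b < base_a + size_a

def describe(base, size):
    """Format a range as first address, last address, and size in MB."""
    last = base + size - 1
    return f"0x{base:08x}-0x{last:08x} ({size // (1024 * 1024)} MB)"

requested = (0xfd000000, 0x800000)  # the range being added
existing = (0xfd000000, 0x400000)   # the range already registered

print("requested:", describe(*requested))
print("existing: ", describe(*existing))
print("overlap:  ", ranges_overlap(*requested, *existing))
```

So the message is about an 8 MB request starting at the same base as an already registered 4 MB region. On a running system, the currently registered regions can be compared against this by reading /proc/mtrr.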