A bit more than four hours ago we removed two of the six memory modules, leaving only 4G of physically addressable memory. The server now runs without any swap usage at all, and is more stable than it was earlier today before the removal.

Best regards,
Arjen

Kees Hoekzema wrote:
Hello list,
We have a strange problem with one of our Opteron servers. It was very unstable when we first put it into production, but after some tweaks we got it to run for more than 20 days without crashing (the previous record was 2 hours).
Until today. We had to reboot the server, and after it came back up it was very unstable again. It crashed every half hour: it stayed pingable and a running vmstat kept running, but no new commands or shells could be started, effectively freezing the server. This is the same behavior we saw when we first started using the server. Back then we were unable to determine what exactly was wrong; it simply became stable after the xx'th reboot, for no reason we could identify.
We are currently using only 4G of memory (6G is installed in total). This makes the server stable but very slow. It almost looks as if the kernel's memory management still thinks it has 6G of RAM and uses 2G of swap to compensate for the missing 2G (we are booting with mem=4g, as we don't have physical access to the server right now). Also, when running in 4G mode, kswapd does strange things roughly every 5 minutes: a snapshot of top shows it using a huge amount of CPU and memory, and it blocks the other processes completely, making the server unavailable for several minutes.
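For anyone wanting to put numbers on the swap usage described above, here is a minimal sketch in Python. It is not something we ran on this server; the function name swap_usage_kb is made up for illustration, and it assumes /proc/meminfo contains SwapTotal and SwapFree lines with values in kB.

```python
def swap_usage_kb(meminfo_text):
    """Return (swap_total_kb, swap_used_kb) from /proc/meminfo-style text.

    Hypothetical helper: assumes "SwapTotal:" and "SwapFree:" lines in kB.
    Lines that do not look like "Name: <number> ..." are ignored.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # no "Name:" prefix on this line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free


# Example use on a live system:
#     with open("/proc/meminfo") as f:
#         total, used = swap_usage_kb(f.read())
#     print("swap used: %d kB of %d kB" % (used, total))
```

Running this before and after a change (such as booting with mem=4g or pulling DIMMs) gives a rough comparison of how much swap is actually in use.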
[snip]
A kinda nasty piece of work from.. kswapd! What is wrong with kswapd? It crashes our server if we use 6G and slows it to a crawl when we use 4G (kernel option mem=4g). Is this the BIOS bug mentioned earlier, or an undocumented AMD 'feature'? We are going to take two DIMMs out of the server so that it physically runs with 4G (maybe we'll slap it a bit too for the trouble it is giving us ;)).
But is there anything else we can do? It won't boot a vanilla kernel (it crashes during boot on.. kswapd!), though it boots the latest 2.4.21-149 SUSE kernel without problems.
-kees