Suse 9 Pro on dual opteron + 6GB mem crashes/panics

Hi list,
We're running Suse 9 pro (latest suse smp kernel) on a dual Opteron 242 with 6GB memory, LSI megaraid 1600 + 2 disks in raid 0 and 4 in raid 5 on a Tyan Thunder K8S (with latest bios).
Uname -a: Linux apollo 2.4.21-149-smp #1 SMP Thu Nov 13 23:24:40 UTC 2003 x86_64 x86_64 x86_64 GNU/Linux
The machine is running a heavy mysql database.
It can stay up for about half an hour, before it crashes, leaving this in the warn-log:
Dec 20 09:59:14 apollo kernel: Unable to handle kernel paging request at virtual address 00000103c003a644
Dec 20 09:06:43 apollo last message repeated 3 times
Dec 20 09:59:14 apollo kernel: printing rip:
Dec 20 09:59:14 apollo kernel: ffffffff80148b29
Dec 20 09:59:14 apollo kernel: PML4 8063 PGD 0
Once it crashed with a more complete (and different) oops/panic:
Dec 20 20:28:02 apollo kernel: Unable to handle kernel paging request at virtual address 0000007f804537e0
Dec 20 20:28:02 apollo kernel: printing rip:
Dec 20 20:28:02 apollo kernel: ffffffff801494f7
Dec 20 20:28:02 apollo kernel: PML4 1048b1067 PGD 0
Dec 20 20:28:02 apollo kernel: Oops: 0000
Dec 20 20:28:02 apollo kernel: CPU 1
Dec 20 20:28:02 apollo kernel: Pid: 7, comm: kswapd Not tainted
Dec 20 20:28:02 apollo kernel: RIP: 0010:[kmem_cache_reap+343/880]{kmem_cache_reap+343}
Dec 20 20:28:02 apollo kernel: RIP: 0010:[<ffffffff801494f7>]{kmem_cache_reap+343}
Dec 20 20:28:02 apollo kernel: RSP: 0000:0000010100009df8 EFLAGS: 00010016
Dec 20 20:28:02 apollo kernel: RAX: 000ffffff0000000 RBX: 0000000000000003 RCX: 0000000000000019
Dec 20 20:28:02 apollo kernel: RDX: 0000007fffff8000 RSI: 0000000000000000 RDI: 00000100e78f3b10
Dec 20 20:28:02 apollo kernel: RBP: 00000100e78f4080 R08: 0000000000000033 R09: 00000100e78f3b30
Dec 20 20:28:02 apollo kernel: R10: 0000010102c44c30 R11: 0000010102c44c00 R12: 0000000000000058
Dec 20 20:28:02 apollo kernel: R13: 0000000000000002 R14: ffffffff7fffffff R15: 0000000080000000
Dec 20 20:28:02 apollo kernel: FS: 0000000000560b00(0000) GS:ffffffff804bbb00(0000) knlGS:0000000000000000
Dec 20 20:28:02 apollo kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Dec 20 20:28:02 apollo kernel: CR2: 0000007f804537e0 CR3: 00000000e7902000 CR4: 00000000000006e0
Dec 20 20:28:02 apollo kernel: Process kswapd (pid: 7, stackpage=10100009000)
Dec 20 20:28:02 apollo kernel: Stack: 0000010100009df8 0000000000000000 ffffffff8014ae20 00000100e78f3b20
Dec 20 20:28:02 apollo kernel: 0000000200000000 0000010001000048 0000000000000020 00000000000001d0
Dec 20 20:28:02 apollo kernel: 00000101000003c0 0000010100009e84 0000000000000000 0000000000000000
Dec 20 20:28:02 apollo kernel: Call Trace: [shrink_cache+1104/1184]{shrink_cache+1104}
Dec 20 20:28:02 apollo kernel: Call Trace: [<ffffffff8014ae20>]{shrink_cache+1104}
Dec 20 20:28:02 apollo kernel: [shrink_caches+41/128]{shrink_caches+41} [try_to_free_pages_zone+98/272]{try_to_free_pages_zone+98}
Dec 20 20:28:02 apollo kernel: [<ffffffff8014b0f9>]{shrink_caches+41} [<ffffffff8014b1b2>]{try_to_free_pages_zone+98}
Dec 20 20:28:02 apollo kernel: [kswapd_balance_pgdat+113/224]{kswapd_balance_pgdat+113} [kswapd_balance+28/64]{kswapd_balance+28}
Dec 20 20:28:02 apollo kernel: [<ffffffff8014b3d1>]{kswapd_balance_pgdat+113} [<ffffffff8014b45c>]{kswapd_balance+28}
Dec 20 20:28:02 apollo kernel: [kswapd+168/195]{kswapd+168} [child_rip+8/16]{child_rip+8}
Dec 20 20:28:02 apollo kernel: [<ffffffff8014b5b8>]{kswapd+168} [<ffffffff80110ae4>]{child_rip+8}
Dec 20 20:28:02 apollo kernel: [kswapd+0/195]{kswapd+0} [child_rip+0/16]{child_rip+0}
Dec 20 20:28:02 apollo kernel: [<ffffffff8014b510>]{kswapd+0} [<ffffffff80110adc>]{child_rip+0}
Dec 20 20:28:02 apollo kernel:
Dec 20 20:28:02 apollo kernel:
Dec 20 20:28:02 apollo kernel: Code: 48 0f b6 92 e0 b7 45 80 48 8b 14 d5 00 b6 45 80 48 8b 8a c8
I hope you guys are able to help,
Best regards,
Arjen van der Meijden
Sysadmin Tweakers.net

A small update.
The machine uses about 5.5G of memory, which is OK, but it also uses about 1.5G of swap, which is very odd...
A similar machine (with 'only' 4G memory) runs just fine with 0 swap usage using SLES 8 beta9 (or 8?) or so.
Regards,
Arjen
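
For comparing swap usage between the two machines, the numbers can be read from /proc/meminfo. The following is only a minimal sketch, assuming the file exposes `SwapTotal` and `SwapFree` lines in kB; the helper name `swap_used_kib` and the values in `SAMPLE` are hypothetical and were not reported in this thread.

```python
def swap_used_kib(meminfo_text):
    """Return (swap_total, swap_used) in KiB from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        # Keep only lines whose first value is a plain number.
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields["SwapTotal"]
    return total, total - fields["SwapFree"]


# Hypothetical sample, used only when /proc/meminfo is not available.
SAMPLE = """\
MemTotal:      6100000 kB
MemFree:         40000 kB
Cached:        3000000 kB
SwapTotal:     2097144 kB
SwapFree:       524288 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE
    total, used = swap_used_kib(text)
    print("swap used: %d of %d KiB" % (used, total))
```

Sampling this periodically alongside `Cached` would show whether swap grows while plenty of disk cache is still available, which is the oddity described above.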

On Sat, 20 Dec 2003 22:24:43 +0100 Arjen van der Meijden <acm@tweakers.net> wrote:
Once it crashed with a more complete (and different) oops/panic:
Dec 20 20:28:02 apollo kernel: Unable to handle kernel paging request at virtual address 0000007f804537e0 Dec 20 20:28:02 apollo kernel: printing rip: Dec 20 20:28:02 apollo kernel: ffffffff801494f7 Dec 20 20:28:02 apollo kernel: PML4 1048b1067 PGD 0 Dec 20 20:28:02 apollo kernel: Oops: 0000 Dec 20 20:28:02 apollo kernel: CPU 1 Dec 20 20:28:02 apollo kernel: Pid: 7, comm: kswapd Not tainted Dec 20 20:28:02 apollo kernel: RIP: 0010:[kmem_cache_reap+343/880]{kmem_cache_reap+343}
I would suspect bad memory here.
-Andi
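
The numbers in the second oops can be cross-checked with a little arithmetic. The sketch below assumes that the first bytes of the "Code:" line are the faulting instruction and that they decode as a byte load from a 32-bit displacement plus RDX (roughly `movzbq disp32(%rdx),%rdx`); that decoding is an assumption and was not stated by anyone in this thread. Under that assumption, RDX plus the sign-extended displacement, truncated to 64 bits, should equal the fault address in CR2.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32(value):
    """Sign-extend a 32-bit two's-complement value to 64 bits."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


# First eight bytes of the "Code:" line from the oops.
code = bytes.fromhex("48 0f b6 92 e0 b7 45 80")

# Assumed layout: REX prefix, two opcode bytes, ModRM, then a
# little-endian 32-bit displacement in bytes 4..7.
disp32 = int.from_bytes(code[4:8], "little")

rdx = 0x0000007FFFFF8000  # RDX from the register dump
cr2 = 0x0000007F804537E0  # CR2, the faulting virtual address

effective = (rdx + sign_extend_32(disp32)) & MASK64
print("disp32    = %#x" % disp32)
print("effective = %#018x" % effective)
print("matches CR2:", effective == cr2)
```

If the match holds, the kernel was indexing a table with an implausibly large value in RDX, which would be consistent with corrupted data. It does not by itself say whether the corruption came from RAM, a driver, or something else.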

Andi Kleen wrote:
On Sat, 20 Dec 2003 22:24:43 +0100 Arjen van der Meijden <acm@tweakers.net> wrote:
Once it crashed with a more complete (and different) oops/panic:
Dec 20 20:28:02 apollo kernel: Unable to handle kernel paging request at virtual address 0000007f804537e0 Dec 20 20:28:02 apollo kernel: printing rip: Dec 20 20:28:02 apollo kernel: ffffffff801494f7 Dec 20 20:28:02 apollo kernel: PML4 1048b1067 PGD 0 Dec 20 20:28:02 apollo kernel: Oops: 0000 Dec 20 20:28:02 apollo kernel: CPU 1 Dec 20 20:28:02 apollo kernel: Pid: 7, comm: kswapd Not tainted Dec 20 20:28:02 apollo kernel: RIP: 0010:[kmem_cache_reap+343/880]{kmem_cache_reap+343}
I would suspect bad memory here.
-Andi
First of all, does that explain the 1.5G swap usage? As in: would it use 1.5G of swap if the memory is broken, even if there is plenty of disk cache to remove? I'm no kernel expert, so I don't know the answer to that and I hope you do :)
But there is some more news from our front.
- I've adjusted the kernel boot parameters to read iommu=fullflush (we noticed your comments on the 2.6.0-amd64-patchpack about the iommu being forced on an io-device that doesn't support it that well)
- Changed our 32-bit mysql to use less than 2G of memory instead of more (mysql (actually, innodb) used to crash itself when it was configured with more than 2G of memory available to its buffers and such, due to issues with glibc or so).
And now it has already been running for about 5 hours and 52 minutes without a hiccup, on exactly the same type of load as before (using the full 6G of memory), when it previously didn't get past 2 hours.
The question now is: is our problem solved? And if so: what solved it?
When it hits the 24-hour mark, we'll probably try changing a few steps back, like booting without iommu=fullflush and such things.
Best regards,
Arjen

On Sun, 21 Dec 2003 21:11:13 +0100 Arjen van der Meijden <acm@tweakers.net> wrote:
First of all, does that explain the 1.5G swapusage? As in: would it use 1.5G of swap if the memory is broken, even if there is plenty of diskcache to remove?
Broken memory has nothing to do with swapping. I'm just commenting on what the oops looks like. I don't know what causes the swapping. It could be memory corruption from a bad driver too, but usually bad memory is the first guess.
- Changed our 32bits mysql to use less than 2G of memory instead of more (mysql (actually, innodb) used to crash itself when it was configured with more than 2G of memory available to its buffers and such, due to issues with glibc orso).
Best would be to run some memory and IO checker independent of mysql just to verify that both IO and memory work reliably on your system. e.g. run http://people.redhat.com/dledford/memtest.html for some time.
And now it is already running for about 5 hours en 52 minutes, without a hick on exactly the same type of load as before (using the full 6G of memory), when it didn't get past the 2 hours.
The question is now: Is our problem solved now? And if so: What did solve it?
When it hits the 24hour mark, we'll probably try a few steps changing back, like booting without the iommu=fullflush and such things.
You need a newer kernel for that option; it is not in the 9.0 kernel. Try something like the kernel in ftp://ftp.suse.com/pub/people/ak/test12/*
It will only do anything if your IO device uses the IOMMU. This would generally only happen if it's IDE based of some sort; most other IO devices are not crippled and can access the full address space without assistance.
-Andi
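
To see which options were actually passed at boot, /proc/cmdline can be inspected. This is a minimal sketch; the helper name `parse_cmdline` and the fallback command line are hypothetical. Note that finding `iommu=fullflush` on the command line only shows the option was passed, not that the running kernel understands it, which is exactly the caveat above.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of option -> value.

    Options without "=" map to None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            text = fh.read()
    except OSError:
        # Hypothetical command line, used only when /proc is unavailable.
        text = "root=/dev/sda1 iommu=fullflush"
    params = parse_cmdline(text)
    print("iommu option:", params.get("iommu", "(not set)"))
```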

Andi Kleen wrote:
On Sun, 21 Dec 2003 21:11:13 +0100 Arjen van der Meijden <acm@tweakers.net> wrote:
First of all, does that explain the 1.5G swapusage? As in: would it use 1.5G of swap if the memory is broken, even if there is plenty of diskcache to remove?
Broken memory has nothing to do with swapping. I'm just commenting on how the oops looks like. I don't know what causes the swapping.
It could be memory corruption from a bad driver too, but usually bad memory is the first guess.
Well, the machine has been running stably since my previous email (over three days and eight hours now). So one of the software changes has resulted in the stability.
- Changed our 32bits mysql to use less than 2G of memory instead of more (mysql (actually, innodb) used to crash itself when it was configured with more than 2G of memory available to its buffers and such, due to issues with glibc orso).
Best would be to run some memory and IO checker independent of mysql just to verify that both IO and memory work reliably on your system.
e.g. run http://people.redhat.com/dledford/memtest.html for some time.
We ran quite a few tests (including a few heavy mysql tests and that memory test, or a similar tool) before putting the machine into production use, so we were very surprised by its instability once it was in production.
You need a new kernel for that option, it is not in the 9.0 kernel. like the kernel in ftp://ftp.suse.com/pub/people/ak/test12/* It will only do anything if your IO device uses the IOMMU. This would generally only happen if it's IDE based of some sort, most other IO devices are not crippled and can access the full address space without assistance.
It becomes quite hard to track/deduce what caused the instabilities now :)
If we find out what did, we'll let you know.
Best regards and a merry christmas,
Arjen van der Meijden
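
For readers unfamiliar with what a memory checker does, the basic idea is to write known bit patterns and read them back. The following is only a toy illustration of that idea, with a hypothetical helper name `pattern_test`; it touches just the pages the allocator happens to hand to one process and is in no way a substitute for the memory test linked above or for a boot-time tester.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill a buffer with each pattern, read it back, and return a list of
    (offset, expected, actual) tuples for every mismatching byte."""
    buf = bytearray(size_bytes)
    errors = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected  # write the pattern over the whole buffer
        if buf != expected:  # fast path: compare the whole buffer first
            errors.extend(
                (i, pattern, buf[i])
                for i in range(size_bytes)
                if buf[i] != pattern
            )
    return errors


if __name__ == "__main__":
    bad = pattern_test(16 * 1024 * 1024)  # 16 MiB, an arbitrary size
    print("mismatches:", len(bad))
```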

I don't think it's a function of the kernel or anything in the software...
Could this be with the Tyan 2880? We've had the same problems. I've spoken with Tyan about it, even went to their office with our server, and they claim it's an error in the CPU, that they can reproduce the problem on a B-stepping CPU, but the problem was corrected by the C-stepping CPU (we're using the 246).
Then I spoke with my uncle, who came to visit from Taiwan, where he is one of the guys at Phoenix BIOS, who writes the BIOS for Tyan (and almost everyone else). He says that it's a design flaw in the mobo, and that there is no fix for it, other than to run 4GB, which is what we're doing, and it has been stable for a while now running 64-bit MySQL.
Kris.

On Wed, 24 Dec 2003 16:42:54 -0800 Kris Ongbongan <kris@vfrogs.com> wrote:
Could this be with the Tyan 2880? we've had the same problems. I've spoken with Tyan about it, even went to their office with our server, and they claim it's an error in the CPU, that they can reproduce the problem on a B-stepping CPU, but the problem was corrected by the C-stepping CPU (we're using the 246).
Maybe that's referring to Errata #86. See the AMD Opteron Specification Update .pdf on the AMD website. The BIOS can fix that though and probably does.
Then I spoke with my uncle, who came to visit from Taiwan, where he is one the guys at Phoenix BIOS, who writes the BIOS for Tyan(and almost everyone else). He says that it's a design flaw in the Mobo, and that
Judging from all the reports we get on this mailing list about Tyan problems, this would not surprise me at all. People hardly ever report memory problems with other motherboards, just with a few Tyan mobos.
-Andi
participants (3)
- Andi Kleen
- Arjen van der Meijden
- Kris Ongbongan