Random segmentation faults --> Software, hardware???
Hello SuSE folkz,

I've noticed random segfaults when compiling some applications from source. Sometimes an application compiles without problems; sometimes the same build exits with a segmentation fault. I also get random segmentation faults when I'm just changing directories with cd or viewing files with less.

Could somebody please point me to the possible source of this problem? Filesystem? I'm using ReiserFS. Hardware: memory, hard drives? This computer is built on a Tyan Tiger S2460 motherboard with 3GB of SDRAM and dual Athlon MP 1500+ CPUs. I recently replaced a failed IBM Deskstar 40GB drive with a newer revision of the same model. Or something else?

Many thanks in advance for any thoughts or ideas.

Alex
On Thu, 27 Feb 2003 08:35:51 -0800 Alex Daniloff <alex@daniloff.com> wrote:
> Hello SuSE folkz, I've noticed random segfaults when compiling some applications from source. Sometimes an application compiles without problems; sometimes the same build exits with a segmentation fault. I also get random segmentation faults when I'm just changing directories with cd or viewing files with less. Could somebody please point me to the possible source of this problem? Filesystem? I'm using ReiserFS. Hardware: memory, hard drives? This computer is built on a Tyan Tiger S2460 motherboard with 3GB of SDRAM and dual Athlon MP 1500+ CPUs. I recently replaced a failed IBM Deskstar 40GB drive with a newer revision of the same model. Or something else? Many thanks in advance for any thoughts or ideas.
It could be some glitch in your libraries. Was there ever a time when it ran well? Can you get the backups from that point in time and copy the libs back in? Do you have room on a spare partition to install a fresh copy of SuSE and see if that fixes it?

--
use Perl; #powerful programmable prestidigitation
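One way to tell a library or software glitch from flaky hardware is to repeat exactly the same job many times: broken software tends to fail the same way every time, while failing RAM or CPU tends to give a mix of outcomes for identical input. The following is only a minimal sketch of that idea, not something anyone in this thread ran. The names `describe`, `repeat_command` and `DEFAULT_CMD` are made up for illustration, and the placeholder command would be replaced by the build that segfaults.

```python
import signal
import subprocess
import sys
from collections import Counter

# Harmless placeholder command (an assumption for this sketch);
# swap in the real build, e.g. ["make", "clean", "all"].
DEFAULT_CMD = [sys.executable, "-c", "pass"]


def describe(returncode):
    """Turn a subprocess return code into a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exit %d" % returncode


def repeat_command(cmd, runs=20):
    """Run the same command `runs` times and count how each run ended."""
    outcomes = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        outcomes[describe(result.returncode)] += 1
    return outcomes
```

Calling `repeat_command(DEFAULT_CMD)` on a healthy machine should report only "ok". If a real build gives a mix of "ok" and failures with nothing changed between runs, that points toward hardware rather than the libraries. Note that when the compiler itself segfaults, make usually reports a plain non-zero exit rather than a signal.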
Thank you for your suggestion. However, it's not an option in my case. I can say that initially this system was very stable with exactly the same Linux installation. The only things that changed are the replaced hard drive and the SDRAM, which was increased to 3GB. So maybe this really is a hardware memory issue?

Thanks. Alex
On Thu, 27 Feb 2003 12:18:28 -0800 Alex Daniloff <alex@daniloff.com> wrote:
> Thank you for your suggestion. However, it's not an option in my case. I can say that initially this system was very stable with exactly the same Linux installation. The only things that changed are the replaced hard drive and the SDRAM, which was increased to 3GB. So maybe this really is a hardware memory issue?
Oooh, 3 gig of RAM, my dream. Maybe run memtest overnight. First I would pull out the RAM, clean the contacts and reseat the modules. Maybe the memory timing needs adjusting in the BIOS?

--
use Perl; #powerful programmable prestidigitation
On Thu, 27 Feb 2003 20:17:50 -0500 zentara <zentara@zentara.net> wrote:
> Oooh, 3 gig of RAM, my dream. Maybe run memtest overnight.
> First I would pull out the RAM, clean the contacts and reseat the modules.
> Maybe the memory timing needs adjusting in the BIOS?
An afterthought occurred to me: do you think your power supply has enough juice to power all that RAM? Do you have a 400 W supply?

--
use Perl; #powerful programmable prestidigitation
On Thursday 27 February 2003 07:39 pm, zentara wrote:
> On Thu, 27 Feb 2003 20:17:50 -0500
> zentara <zentara@zentara.net> wrote:
> > Oooh, 3 gig of RAM, my dream. Maybe run memtest overnight.
> > First I would pull out the RAM, clean the contacts and reseat the modules.
> > Maybe the memory timing needs adjusting in the BIOS?
> An afterthought occurred to me: do you think your power supply has enough juice to power all that RAM? Do you have a 400 W supply?
Usually when I get random segfaults, or segfaults in "simple" applications like less and cd, it's my memory or processor. I love Linux because it refuses to run on anything less than perfect hardware, while I've seen MS chew through garbage memory and processors, silently corrupting your programs and files instead of properly warning you.

In short, try removing the RAM one module at a time until the problem disappears. Running memtest overnight is another good solution. Lastly, some BIOSes have a memory drive strength setting. With so much RAM you might need it to drive the signals harder. Then again, memory can fail if it is not performance RAM and the timing is set too aggressively.

Random segfaults, especially in "simple" programs, in my experience mean that one of your two most critical components is failing: CPU or RAM. Good luck.

--
#------------------------
#Eric Bambach
#Eric@CISU.net
#------------------------
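For a quick sanity check between pulling modules, the write-then-read-back idea behind memtest can be sketched in user space. This is only an illustration under stated assumptions: the function name `pattern_test` and its defaults are invented here, a normal process can only touch whatever memory the kernel hands it, and CPU caches may hide a bad cell. It is no substitute for an overnight memtest run from boot.

```python
def pattern_test(size_mb=64, patterns=(0x00, 0xFF, 0x55, 0xAA), passes=1):
    """Write each byte pattern into a buffer, read it back, count mismatches.

    Returns the total number of bytes that did not read back as written.
    """
    size = size_mb * 1024 * 1024
    errors = 0
    for _ in range(passes):
        for pattern in patterns:
            # Write: allocate a buffer filled with the pattern byte.
            buf = bytearray([pattern]) * size
            # Read back: every byte should still equal the pattern.
            errors += size - buf.count(pattern)
    return errors
```

A healthy machine should return 0 from `pattern_test()`. A non-zero result would be a strong hint of bad memory, but a zero result proves very little, since the buffer covers only a small slice of a 3GB system.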
I have a 430W power supply in this box. I hope it's enough to power the system. Anyway, I'll check the memory settings in the BIOS.

Thanks again. Alex
On Thu, Feb 27, 2003 at 06:54:59PM -0800, Alex Daniloff wrote:
> I have a 430W power supply in this box. I hope it's enough to power the system. Anyway, I'll check the memory settings in the BIOS.
If you are not using ECC memory, there is no way to detect a memory error; usually it just causes some mysterious segfaults.

Regards,
-Kastus
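To illustrate why ECC makes a difference, here is a toy Hamming(7,4) code, which stores three check bits alongside four data bits and can locate and fix any single flipped bit. This is a teaching sketch only: real ECC modules use a wider code over 64-bit words implemented in hardware, and the function names `encode` and `check` are invented for this example.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def check(codeword):
    """Return (data_bits, error_position); position 0 means no error seen.

    A single flipped bit is corrected before the data bits are extracted.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # syndrome = 1-based index of bad bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position
```

Without the check bits, a flipped data bit is indistinguishable from valid data, which is why non-ECC memory errors tend to surface only as crashes or silent corruption.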
Hello,

Thank you for your response. However, it looks like this segfault problem has gone away, at least for now. I opened the box and removed all the memory modules. Then I rubbed their contacts with a soft eraser and afterwards cleaned them with alcohol. After this procedure I ran a couple of code compilations and everything was very stable.

Only one thing still bothers me. Even though all of my 3GB of PC2100 DDR memory modules are ECC, I can't enable the ECC setting in the BIOS. When I do, the system doesn't POST on reboot and I have to reset the BIOS. For now the BIOS memory options are set to non-ECC.

Thanks. Alex
Dear Alex,

Failing to POST is an indicator of a memory failure. Check with your vendor for warranty information. Always use a good 'contact lubricant' on electronic connections; it helps prevent oxidation and "stress-corrosion". Any electronics supply house should carry it for a few dollars; do not fall for the $15+ audiophile TWEEK fluid, it is just the cheap stuff repackaged.

Good luck,
PeterB

Try http://allelectronics.com

On Friday 28 February 2003 10:24 am, Alex Daniloff wrote:
> Hello, Thank you for your response. However, it looks like this segfault problem has gone away, at least for now. I opened the box and removed all the memory modules. Then I rubbed their contacts with a soft eraser and afterwards cleaned them with alcohol. After this procedure I ran a couple of code compilations and everything was very stable. Only one thing still bothers me. Even though all of my 3GB of PC2100 DDR memory modules are ECC, I can't enable the ECC setting in the BIOS. When I do, the system doesn't POST on reboot and I have to reset the BIOS. For now the BIOS memory options are set to non-ECC.
> Thanks. Alex

--
Proud to be a SuSE Linux User since 5.2
participants (5)
- Alex Daniloff
- Eric
- Konstantin (Kastus) Shchuka
- Peter B Van Campen
- zentara