Hi guys, I'm having problems with a windows server that resets spontaneously. Normally I would suspect windows, but seeing as it is a fresh install, the hard drive seems to be the obvious culprit. Is there a tool, preferably one that I can run off a live CD, that will test the hard drive below filesystem level? Thanks Hans
It could be other hardware as well. Swap the RAM and see what happens.
-----Original Message----- From: Hans du Plooy [mailto:hansdp-lists@sagacit.com] Sent: Monday, October 31, 2005 12:16 PM To: suse-linux-e@suse.com Subject: [SLE] Test HDD
-- Check the headers for your unsubscription address For additional commands send e-mail to suse-linux-e-help@suse.com Also check the archives at http://lists.suse.com Please read the FAQs: suse-linux-e-faq@suse.com
Hi guys,
I'm having problems with a windows server that resets spontaneously. Normally I would suspect windows, but seeing as it is a fresh install, the hard drive seems to be the obvious culprit.
First test the RAM. memtest86 is on the boot menu of every SUSE CD.
Is there a tool, preferably one that I can run off a live CD, that will test the hard drive below filesystem level?
Check the drive manufacturer's website. All of them have tools to check a disk, booting from a floppy or CD.
Thanks Hans
-- L. de Braal BraHa Systems NL - Terneuzen T +31 115 649333 F +31 115 649444
On Monday 31 October 2005 13:09, Leen de Braal wrote:
First test the RAM. memtest86 is on the boot menu of every SUSE CD.
That's the first thing I did - did four passes so far without any errors.
Check the drive manufacturer's website. All of them have tools to check a disk, booting from a floppy or CD.
Thanks, I'll do that.
Thanks Hans
On Mon, 2005-10-31 at 14:00 +0200, Hans du Plooy wrote:
That's the first thing I did - did four passes so far without any errors.
Hi Hans,
Four passes is not enough. Leave it running for about 24 hours if you can afford the downtime. I sometimes see strange behaviour only after several hours, each time at a different address.
Furthermore, have a look at the temperature. You can probably do that with several tools. I run /usr/bin/acpi -t from crontab and write the output to a log every 5 minutes with a time stamp (the log wraps after 24 hours). If the system halts, you can see when the last entry was written.
Hans
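The logging described above could look roughly like the sketch below. It is not Hans's actual setup: the choice of Python, the log format, the file paths, and the reading of "wraps 24 hours" as "keep only the last 24 hours of entries" are all assumptions. Only the /usr/bin/acpi -t call and the 5-minute interval come from the post.

```python
import os
import subprocess
import sys
from datetime import datetime, timedelta

STAMP = "%Y-%m-%d %H:%M:%S"
KEEP = timedelta(hours=24)  # assumption: "wraps 24 hours" means keep the last 24 hours


def format_entry(now, reading):
    """Return one log line: time stamp, a tab, then the reading on a single line."""
    parts = [part.strip() for part in reading.splitlines() if part.strip()]
    return now.strftime(STAMP) + "\t" + "; ".join(parts)


def prune(lines, now, keep=KEEP):
    """Keep only entries whose time stamp parses and is at most `keep` old."""
    fresh = []
    for line in lines:
        try:
            when = datetime.strptime(line.split("\t", 1)[0], STAMP)
        except ValueError:
            continue  # not one of our entries
        if now - when <= keep:
            fresh.append(line)
    return fresh


def main(log_path):
    now = datetime.now()
    # The command is the one quoted in the post; its output is logged unparsed.
    reading = subprocess.run(["/usr/bin/acpi", "-t"],
                             capture_output=True, text=True).stdout
    try:
        with open(log_path) as log:
            old = log.read().splitlines()
    except FileNotFoundError:
        old = []
    # Write to a temporary file first so a reset mid-write cannot truncate the log.
    tmp_path = log_path + ".tmp"
    with open(tmp_path, "w") as log:
        log.write("\n".join(prune(old, now) + [format_entry(now, reading)]) + "\n")
    os.replace(tmp_path, log_path)


# Runs only when a log path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

A crontab entry along these lines would run it every 5 minutes (the script and log paths are hypothetical):

    */5 * * * * /usr/bin/python3 /usr/local/bin/temp_log.py /var/log/acpi-temp.log

After a spontaneous reset, the last line of the log shows roughly when the machine went down and what the temperature was shortly before.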
The Monday 2005-10-31 at 12:15 +0200, Hans du Plooy wrote:
Is there a tool, preferably one that I can run off a live CD, that will test the hard drive below filesystem level?
Usually the HD manufacturers provide such tools, booting from a floppy. Seagate has a good one, for example.
-- Cheers, Carlos Robinson
Carlos E. R. wrote:
The Monday 2005-10-31 at 12:15 +0200, Hans du Plooy wrote:
Is there a tool, preferably one that I can run off a live CD, that will test the hard drive below filesystem level?
Usually the HD manufacturers provide such tools, booting from a floppy. Seagate has a good one, for example.
Or I can heartily recommend the Ultimate Boot CD: http://www.ultimatebootcd.com/ It is a bootable CD with several different manufacturers' utilities plus a whole lot more.
-- Joe Morris New Tribes Mission Email Address: Joe_Morris@ntm.org Registered Linux user 231871
I agree that it's not likely to be your drive that's at fault, but since it appears nobody else mentioned it, the tool you probably want is smartctl. It drives the self-tests that are built into the majority of drives these days. (I think that SATA is not supported, but normal IDE and SCSI are.)
Run something like:
    smartctl -a /dev/hda
and you'll see what the drive currently thinks of itself. Then run:
    smartctl -t long /dev/hda
and wait the hour or two that the output tells you while the drive does an extensive self-test, then run the first command (smartctl -a ...) again to see what it found out.
But, as they said, the drive is not likely to be your problem. That said, SMART-capable drives also keep track of their own internal temperatures, storing max and min over their lifetimes. That information might help point out whether you have a generalized heat problem in your case (as opposed to a localized one on the CPU or just the memory area).
Cheers, Simon
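The smartctl sequence from Simon's message can be scripted, for example as in the Python sketch below. The three commands and /dev/hda are taken from the message. The helper names are made up for illustration, and the temperature lookup assumes the ten-column attribute table that smartctl -a prints for ATA drives, with an attribute named Temperature_Celsius. Not every drive reports that attribute, so treat a result of None as "not reported" rather than as a fault.

```python
import subprocess
import sys


def smart_commands(device):
    """Return the smartctl invocations from the message, in the order they are run."""
    return [
        ["smartctl", "-a", device],          # what the drive currently reports
        ["smartctl", "-t", "long", device],  # start the extended self-test
        ["smartctl", "-a", device],          # read the results once the test is done
    ]


def find_temperature(report):
    """Return the raw value of the Temperature_Celsius attribute, or None.

    Assumes the attribute table layout of `smartctl -a` for ATA drives,
    where the raw value starts in the tenth column.
    """
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Temperature_Celsius":
            return " ".join(fields[9:])
    return None


def main(device):
    first, start_test, again = smart_commands(device)
    # The exit status is not checked: smartctl uses it as a bit mask for
    # drive status, so a non-zero value is not necessarily a failed command.
    report = subprocess.run(first, capture_output=True, text=True).stdout
    print(report)
    print("Temperature (raw value):", find_temperature(report))
    # The self-test runs inside the drive; this call returns immediately
    # and prints the estimated completion time.
    print(subprocess.run(start_test, capture_output=True, text=True).stdout)
    input("Press Enter once the estimated test time has passed... ")
    print(subprocess.run(again, capture_output=True, text=True).stdout)


# Runs only when a device is given on the command line, e.g. /dev/hda.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

The script needs root privileges, like smartctl itself. The pause is manual on purpose: the estimated duration is printed by the drive, and guessing it in code would be less reliable than reading it.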
"You can tell whether a man is clever by his answers. You can tell whether a man is wise by his questions." Naguib Mahfouz
Hans du Plooy wrote:
Hi guys,
I'm having problems with a windows server that resets spontaneously. Normally I would suspect windows, but seeing as it is a fresh install, the hard drive seems to be the obvious culprit.
Is there a tool, preferably one that I can run off a live CD, that will test the hard drive below filesystem level?
The hard drive is not the problem. Even if memtest does not fail, try changing the memory anyway, with a different brand.
Look in the BIOS at the memory timings and choose "By SPD".
Another cause could be a bad driver.
Investigating the hard disk is most probably a waste of time. Some bad blocks on the disk don't manifest as spontaneous resets.
But if you really feel you need to do it, boot the "Rescue System" and use the badblocks command. 99.99999% it will tell you the hard disk is just fine and peachy.
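For anyone who does want to run the badblocks check mentioned above, here is a small Python sketch around it. The message names only the badblocks command; the flags, the device name, and the assumption that Python is available in the rescue environment are mine. If it is not, the command can simply be typed by hand. Without -w or -n, badblocks does a read-only scan, which is what you want on a disk that holds data.

```python
import subprocess
import sys


def badblocks_command(device):
    """Build a read-only badblocks scan: -s shows progress, -v is verbose."""
    # Deliberately no -w: write mode overwrites the device and destroys its data.
    return ["badblocks", "-s", "-v", device]


def count_bad_blocks(listing):
    """Count block numbers in badblocks' standard output, one per line."""
    return sum(1 for line in listing.splitlines() if line.strip())


def main(device):
    # Only stdout (the list of bad blocks) is captured; stderr stays attached
    # to the terminal, where badblocks normally prints its progress.
    result = subprocess.run(badblocks_command(device),
                            stdout=subprocess.PIPE, text=True)
    print(f"{count_bad_blocks(result.stdout)} bad block(s) reported on {device}")


# Runs only when a device is given on the command line, e.g. /dev/hda.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

A count of zero matches the "fine and peachy" outcome predicted above. Any other number is a reason to run the manufacturer's diagnostics as well.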
Can you bench test the power supply (supplies)? I had a system that would just do random reboots and it turned out to be the supply. It was fine when just sitting there, but do the slightest thing and it would reboot.
John
Investigating the hard-disk is most probably a waste of time. Some bad blocks on the disk don't manifest with spontaneous resets.
I would usually agree wholeheartedly with this statement, but... I recently had some issues with my computer that traced back to some borked data on the hard drive, or a bad sector. Basically it would run along fine until I touched anything to do with Oracle 10g. It turned out that there was some bad data or a bad sector in that particular segment of the hard drive. Every time the OS needed to do something with Oracle, it would reboot the computer. If I tried to delete a file in that particular directory: reboot. Start up Oracle: reboot. It was only corrected after I ran a Reiser fsck.
That said, I have to agree with the advice here: memory is almost always the culprit. Even if it tests OK after several passes of Memtest, just try swapping it.
C.
Hans du Plooy wrote:
I'm having problems with a windows server that resets spontaneously. Normally I would suspect windows, but seeing as it is a fresh install, the hard drive seems to be the obvious culprit.
How about a heat problem? How is the CPU cooling, and what are the thresholds set in the BIOS? That would be the first thing I would check.
-- Joe Morris New Tribes Mission Email Address: Joe_Morris@ntm.org Registered Linux user 231871
On Monday 31 October 2005 16:15, Silviu Marin-Caea wrote:
Investigating the hard-disk is most probably a waste of time. Some bad blocks on the disk don't manifest with spontaneous resets.
It does in Windows 2000 (dunno about the others). I've seen that a few times. Actually it throws up a BSOD and then resets, but the BSOD flashes by so quickly on this box that you don't really see it if you're not paying attention.
However, as most of you suggested, the discs were not at fault. I downloaded the Seatools CD and ran all the diagnostics, and everything checks out fine. The problem seems to be an update to the onboard RAID controller (this is an Intel server board, but as usual with IDE/SATA, they have cheap-arse components on it), delivered via Windows Update, which changed the low-level format of the RAID set. Bizarre. I figured this out when I gave up and wanted to reload: the setup told me the discs were blank.
/rant on
Speaking of Intel, I'm not impressed with the boards they've produced lately. The last two years' worth of server boards I've dealt with all had at least one issue that I don't expect from magnificently expensive server hardware: cheap onboard components like Promise and Sil_Image RAID controllers (they have their own hardware RAID processors, why don't they use those?), buggy ACPI (so bad I have to disable it to get SUSE to boot at all, and this is not a new board), and poor performance considering what I can get for half the price.
I have one server at a client whose BIOS doesn't have the "Power on after power failure" option. It has "Soft Off" and "Last state". The box runs MS SQL, so there's no way I can just let it run out of power; I have to have the UPS software shut it down. That means "Last state" is always "Off", i.e. the box doesn't power on when the power comes back. There's no BIOS update to fix that either.
I'm getting better results from Gigabyte desktop boards that sometimes cost, CPU included, less than half of what the server board they replace cost.
/rant off
But if you really feel you need to do it, boot the "Rescue System" and use the badblocks command. 99.99999% it will tell you the hard-disk is just fine and peachy.
On which CD is this - Windows or SUSE?
Thanks for all the replies Hans
I've had problems like this with a failing power supply. Can you tell if its fan is still pulling air through the case?
Don't know of a tool, off-hand, but this is the sort of thing a heat problem will cause. Check that all your fans are running, and that there's no dirt build-up in a screen, etc. I have been running XP just about 24/7 for over a year, and I find it's extremely reliable, although some app (I forget what) did give it some trouble. This is by all odds the best Windows they've ever made.
participants (12)
- Carlos E. R.
- Clayton
- Doug McGarrett
- Gerald Humphreys
- Hans du Plooy
- Hans Witvliet
- Joe Morris (NTM)
- John Scott
- Leen de Braal
- Mark
- Silviu Marin-Caea
- Simon Roberts