I spent a day watching it rebuild, killed off most services, and made
backups of family photos and such during this time. (I also pulled the log
files.) It took ~20 hrs to rebuild, and it didn't crash during this time (the
previous 2 times it died in about an hour). The system appeared good and
stable, but went down the next day.
The box is currently off. Hopefully this Saturday I will be able to follow the
directions on here to add the hot spare.
I just don't understand how something under /exports/array1/ with no system
files could take down the box. I'm going to comb through logs again to see
if there is anything else, but so far only the 1 drive reported errors.
Thanks again
-Cody
On Sat, Feb 4, 2012 at 5:39 PM, John Andersen wrote:
On 2/4/2012 3:28 PM, Cody Nelson wrote:
I have a software RAID5 with 4 1TB drives. I first found that the system was hung and couldn't get anything on the screen, so I rebooted and did some poking around before it went down again. I do have a spare drive, but I'm not sure which drive it is. I think I read a drive number giving some errors. Right now that PC is off until I get some information and a plan of action.
Each time the server is on, it's up for about 30 min, then becomes unresponsive. Below is some of the output. I'd appreciate any help; I'm not used to software RAID issues.
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Nov 3 01:30:34 2008
     Raid Level : raid5
     Array Size : 2930285184 (2794.54 GiB 3000.61 GB)
  Used Dev Size : 976761728 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Jan 13 18:28:23 2012
          State : active, resyncing
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 1% complete

           UUID : 9327adce:9242bb1a:5ddee0bc:b536709a
         Events : 0.11

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
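For anyone reading the output above, the sizes are at least self-consistent: a RAID5 array gives up one member's worth of space to parity, so the usable size should be (devices - 1) times the per-device size. A small sketch of that arithmetic, assuming mdadm reports both figures in KiB (the function name is made up for illustration):

```python
def raid5_usable_kib(num_devices, used_dev_size_kib):
    """Usable RAID5 capacity: one device's worth of space holds parity."""
    return (num_devices - 1) * used_dev_size_kib


# Figures taken from the mdadm output above.
size_kib = raid5_usable_kib(4, 976761728)

print(size_kib)                             # 2930285184, matches "Array Size"
print(round(size_kib / 1024**2, 2))         # 2794.54 (GiB)
print(round(size_kib * 1024 / 1000**3, 2))  # 3000.61 (GB)
```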
Oh yeah, this is openSUSE 11.1 on an Intel Celeron 600 MHz with 512 MB RAM, 3ware 8 SATA in JBOD mode.
Can I turn off the auto fixing to see if the system becomes stable enough to work on?
Thanks, Cody
No, I wouldn't turn off auto fixing. It's corrupt, and who knows what you will end up with if you don't let it rebuild.
It's in a rebuild state. Leave it alone and see if it can recover, would be my advice.
Put up a top display to make sure it is still rebuilding, then leave it alone and keep users off until it digs itself out. It's a Celeron for Christ's sake!!!
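Besides top, the kernel reports rebuild progress in /proc/mdstat. Here is a minimal sketch of pulling the percentage out of it, assuming the usual mdstat layout (an "md0 : ..." line followed by indented detail lines). The function names are made up for illustration, and this is not something anyone in the thread actually ran:

```python
import re

# Matches progress lines such as "resync =  1.0%" or "recovery = 42.3%".
PROGRESS_RE = re.compile(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")


def rebuild_progress(mdstat_text, array="md0"):
    """Return (operation, percent) for `array`, or None if nothing is running."""
    in_array = False
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            in_array = True
            continue
        if in_array:
            # Detail lines are indented; anything else ends this array's stanza.
            if not line[:1].isspace():
                break
            match = PROGRESS_RE.search(line)
            if match:
                return match.group(1), float(match.group(2))
    return None


def read_progress(array="md0", path="/proc/mdstat"):
    """Read the live mdstat file and report progress for `array`."""
    with open(path) as f:
        return rebuild_progress(f.read(), array)
```

Calling read_progress() every few minutes (or just running cat /proc/mdstat by hand) should show whether the percentage is still climbing or has stalled.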
Maybe go buy some RAM?
You might want to add the spare to the array now, so that if mdadm decides a drive is toast it can swap in the new one. You may also want to tail /var/log/warn for a while.
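A rough sketch of those two steps, for reference. The device name /dev/sde1 is only a placeholder (the thread never identifies which device the spare is), the helper names are made up, and dry_run defaults to True so that nothing touches the array until the device name has been double-checked:

```python
import subprocess


def add_spare_command(array, spare):
    """Build the mdadm command that adds `spare` to `array`."""
    return ["mdadm", "--manage", array, "--add", spare]


def follow_log_command(path="/var/log/warn"):
    """Build the command that follows the warning log as it grows."""
    return ["tail", "-f", path]


def run(cmd, dry_run=True):
    """Print the command; only execute it when dry_run is False (needs root)."""
    print(" ".join(cmd))
    if dry_run:
        return None
    return subprocess.run(cmd, check=True)


# /dev/sde1 is a hypothetical device name, not taken from the thread.
run(add_spare_command("/dev/md0", "/dev/sde1"))
run(follow_log_command())
```

On an array with no failed members, a device added this way should sit as a spare until mdadm needs it.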
(This is why I never put root on raid5, you can always re-install the OS on a fresh drive, but rebuilding a raid 5 can be really painful when it includes / and the system itself.)
-- _____________________________________ ---This space for rent--- -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org