Mailinglist Archive: opensuse (1306 mails)

Re: [opensuse] Software RAID dying, help/suggestions please
 I spent a day watching it rebuild, killed off most services, and made
 backups of family photos and such during that time. (I also pulled the log
 files.) The rebuild took ~20 hours, and the box didn't crash during that time
 (the previous two times it died in about an hour). The system appeared good
 and stable, but it went down the next day.

 The box is currently off. Hopefully this Saturday I will be able to follow the
 directions here to add the hot spare.

 I just don't understand how something under /exports/array1/, with no system
 files on it, could take down the box. I'm going to comb through the logs again
 to see if there is anything else, but so far only one drive has reported errors.
Thanks again
-Cody

On Sat, Feb 4, 2012 at 5:39 PM, John Andersen <jsamyth@xxxxxxxxx> wrote:

On 2/4/2012 3:28 PM, Cody Nelson wrote:

I have a software RAID5 with four 1TB drives. I first found that the
system was hung and I couldn't get anything on the screen; I rebooted and
did some poking around before it went down again. I do have a spare
drive but am not sure which drive it is. I think I read a drive number
giving some errors. Right now that PC is off until I get some
information and a plan of action.

 Each time the server is on, it's up for about 30 min, then becomes
unresponsive. Below is some of the output. I'd appreciate any help;
I'm not used to software RAID issues.

 /dev/md0:
        Version : 00.90.03
  Creation Time : Mon Nov  3 01:30:34 2008
     Raid Level : raid5
     Array Size : 2930285184 (2794.54 GiB 3000.61 GB)
  Used Dev Size : 976761728 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 4
 Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Fri Jan 13 18:28:23 2012
          State : active, resyncing
 Active Devices : 4
 Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 1% complete

           UUID : 9327adce:9242bb1a:5ddee0bc:b536709a
         Events : 0.11

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
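
As a quick sanity check on the sizes reported above: RAID5 gives up one member's worth of space to parity, so the usable capacity should be (members - 1) times the per-device size. A minimal Python sketch, assuming mdadm reports both sizes in 1 KiB blocks (the function name is only for illustration):

```python
def raid5_usable_kib(raid_devices: int, used_dev_size_kib: int) -> int:
    """Usable RAID5 capacity: one member's worth of space goes to parity."""
    if raid_devices < 3:
        raise ValueError("RAID5 needs at least 3 member devices")
    return (raid_devices - 1) * used_dev_size_kib


# Values copied from the mdadm output above.
usable = raid5_usable_kib(4, 976761728)
print(usable)                    # compare with the "Array Size" line
print(round(usable / 2**20, 2))  # same figure expressed in GiB
```

The result agrees with the "Array Size" line, so the array geometry itself looks consistent.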

Oh yeah, this is openSUSE 11.1 on an Intel Celeron 600MHz with 512MB RAM and
a 3ware 8 SATA in JBOD mode.

Can I turn off the auto fixing to see if the system becomes stable
enough to work on?

Thanks,
Cody


No, I wouldn't turn off auto fixing. It's corrupt, and who knows what you will
end up with if you don't let it rebuild.

It's in a rebuild state. My advice would be to leave it alone and see if it
can recover.

Put up a top display to make sure it is still rebuilding, then leave it alone
and keep users off till it digs itself out. It's a Celeron, for Christ's sake!!!
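
Besides watching top, the rebuild progress can be read from the array status itself. The sketch below is one possible way to pull the "State" and "Rebuild Status" fields out of `mdadm --detail` style text; the helper name and the regular expressions are my own assumptions, written against the output format quoted in this thread.

```python
import re


def rebuild_status(detail_text: str):
    """Return (state, percent) parsed from `mdadm --detail` style text.

    percent is None when no "Rebuild Status" line is present.
    """
    state = re.search(r"^\s*State\s*:\s*(.+?)\s*$", detail_text, re.MULTILINE)
    pct = re.search(r"^\s*Rebuild Status\s*:\s*(\d+)%", detail_text, re.MULTILINE)
    return (state.group(1) if state else None,
            int(pct.group(1)) if pct else None)


# Lines copied from the mdadm output quoted in this thread.
sample = """\
          State : active, resyncing
 Rebuild Status : 1% complete
"""
print(rebuild_status(sample))
```

On the live box the text would come from running `mdadm --detail /dev/md0` as root; `cat /proc/mdstat` shows the same progress in a more compact form.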

Maybe go buy some RAM?

You might want to add the spare to the array now, so that if mdadm decides a
drive is toast it can swap in the new one. You may also want to tail
/var/log/warn for a while.
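
For reference, adding a device to an array whose members are all active makes it a hot spare. The sketch below only builds the usual `mdadm <array> --add <device>` command and, by default, does not run it. `/dev/sdX1` is a placeholder, since the spare's real device name is not given in this thread, and the helper name is hypothetical.

```python
import subprocess


def add_spare(array: str, device: str, dry_run: bool = True):
    """Build (and optionally run) the mdadm command that adds a device to an array."""
    cmd = ["mdadm", array, "--add", device]
    if not dry_run:
        # Needs root; raises CalledProcessError if mdadm reports failure.
        subprocess.run(cmd, check=True)
    return cmd


# /dev/sdX1 is a placeholder: substitute the partition on the actual spare drive.
print(" ".join(add_spare("/dev/md0", "/dev/sdX1")))
```

Double-check the device name before passing `dry_run=False`, and keep `tail -f /var/log/warn` running in another terminal while the array works.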

(This is why I never put root on RAID5. You can always re-install the OS on a
fresh drive, but rebuilding a RAID5 can be really painful when it includes /
and the system itself.)


--
_____________________________________
---This space for rent---
--
To unsubscribe, e-mail: opensuse+unsubscribe@xxxxxxxxxxxx
To contact the owner, e-mail: opensuse+owner@xxxxxxxxxxxx
