Hey Chris,

Well, I am not an expert... but what I can tell you is this. When a file system is mounted, it is dirty. If you have any data in the buffer cache, which is almost unavoidable, then the file system cannot be 100% consistent. This is why, when you try to run fsck on a mounted/active partition, you get warnings about potential data loss.

As for mdadm reporting three devices: have you removed md0 from raidtab (if you had one)? Have you rebooted the machine since you mucked with the raid devices? It's possible the third device is just a shadow of your old md0.

Are you seeing any sort of file corruption or errors in your logs/dmesg? If not, then I would think you are being overly concerned about something that really isn't an issue.

- Herman

Chris Roubekas wrote:
----- Original Message -----
From: Chris Roubekas
To: suse-linux-e@suse.com
Sent: Saturday, October 09, 2004 11:47 PM
Subject: Ok Raid-1 is beginning to scare me BIG TIME!!
Dear friends,
I recently posted a message asking for help on why my RAID-1 reports a state of "dirty, no-errors" when I issue mdadm --detail /dev/md0. A friend on the list said that when he changed from reiserfs to ext3 it stopped being dirty; he had been through the same problems I was having, where running reiserfsck on /dev/md0 found errors, and running reiserfsck --fix-fixable did not fix them. Having been through that same tunnel of problems, I thought I should follow his steps and convert the entire md0 to ext3 to see if I could avoid the problems I am faced with.
Well, I thought of switching md0 to md1, recreating it with ext3 instead of the reiserfs it had before.
Since my RAID-1 on md0 is 2x200GB drives and I have no other drive to back up to, I removed one of the disks from md0, created a new md1 with mdadm, and formatted it as ext3. Then I copied all the data from md0 to md1, and finally I removed the second drive from md0 and added it to md1.
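The migration described above can be sketched roughly as follows. This is a dry-run sketch: `run` only echoes each command, the device names are taken from the message, the mount points are examples, and the exact option spellings for a 2004-era mdadm are an assumption. The --zero-superblock step is an addition: wiping md0's old superblock before reusing the disk is one way to avoid leaving a "shadow" device behind.

```shell
#!/bin/sh
# Dry-run sketch of the md0 (reiserfs) -> md1 (ext3) migration.
# `run` only echoes; drop the echo to perform the steps for real.
run() { echo "+ $*"; }

run mdadm /dev/md0 --fail /dev/hdc1 --remove /dev/hdc1            # take one mirror out of md0
run mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hdc1 missing  # degraded md1
run mkfs.ext3 /dev/md1                                            # new ext3 filesystem
run mount /dev/md1 /mnt/new                                       # mount point is an example
run cp -a /storage/. /mnt/new/                                    # copy everything across
run mdadm --stop /dev/md0                                         # retire the old array
run mdadm --zero-superblock /dev/hdb1  # erase md0's superblock first (if your mdadm has this option)
run mdadm /dev/md1 --add /dev/hdb1                                # second mirror; resync starts
```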
After the reconstruction was over, I restarted my machine and issued cat /proc/mdstat, which reported:
server:/ # cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md1 : active raid1 hdc1[1] hdb1[0]
      199125568 blocks [2/2] [UU]
unused devices: <none>
Which made me very happy, as I saw that after the hard-drive lights stopped glowing things appeared just great! But then I issued mdadm --detail /dev/md1, and see what I get:
server:/ # mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90.00
  Creation Time : Sat Oct  9 15:44:51 2004
     Raid Level : raid1
     Array Size : 199125568 (189.90 GiB 203.95 GB)
    Device Size : 199125568 (189.90 GiB 203.95 GB)
   Raid Devices : 2
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Oct 10 02:09:02 2004
          State : dirty, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0

    Number   Major   Minor   RaidDevice State
       0       3       65        0      active sync   /dev/hdb1
       1      22        1        1      active sync   /dev/hdc1
           UUID : 22a79613:35bf4980:215dbc1e:910acdea
I am totally confused by what I see here for the following reasons:
a) How is it possible that it says Total Devices : 3 when there are only 2??
b) Why am I still seeing State : dirty, no-errors???
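One way to dig into question (a) is to compare the "Raid Devices" and "Total Devices" counters that mdadm reports: when Total exceeds Raid, the extra entry is often a stale superblock left over from an old array. A minimal sketch, using a saved copy of the output above as sample text (in real use you would pipe `mdadm --detail /dev/md1` in instead):

```shell
#!/bin/sh
# Sketch: flag a possible stale/shadow member by comparing the
# "Raid Devices" and "Total Devices" counters from `mdadm --detail`.
# The sample text below is the output quoted in this message.
detail='Raid Devices : 2
Total Devices : 3
Failed Devices : 1'

raid=$(printf '%s\n' "$detail"  | awk -F: '/Raid Devices/  {gsub(/ /,"",$2); print $2}')
total=$(printf '%s\n' "$detail" | awk -F: '/Total Devices/ {gsub(/ /,"",$2); print $2}')

if [ "$total" -gt "$raid" ]; then
    echo "possible stale superblock: $total known devices, $raid in the array"
fi
# prints: possible stale superblock: 3 known devices, 2 in the array
```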
Then I copied a file from hda (which holds the root filesystem; my raid is purely storage and nothing more, as it is shared via Samba) to the directory that /dev/md1 is mounted on. The file is there, but when I mount /dev/hdb1 or /dev/hdc1 directly, I notice that the file is not there. I tried issuing "sync" to see what would happen, and I got nothing!! The file still doesn't appear on the drives!
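For what it's worth, mounting /dev/hdb1 or /dev/hdc1 while md1 is still active is itself likely to produce that stale view: the member mount and the array mount cache blocks independently, so the direct mount can show an old state of the filesystem. A safer check is to stop the array first; a dry-run sketch (commands are echoed, not executed, and the file path is just an example):

```shell
#!/bin/sh
# Dry-run sketch: verify that a file really reached a mirror WITHOUT
# mounting a member of a live array.  `run` only echoes each command.
run() { echo "+ $*"; }

run md5sum /raid/somefile.txt          # checksum through the array (path is an example)
run umount /dev/md1
run mdadm --stop /dev/md1              # stop the array first!
run mount -o ro /dev/hdb1 /mnt/check   # now the member can be read safely
run md5sum /mnt/check/somefile.txt     # should match the first checksum
run umount /mnt/check
run mdadm --assemble /dev/md1 /dev/hdb1 /dev/hdc1
```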
I am so confused and at the same time very, very scared, since I do not know if the data that the users put on that raid is actually being raid-ed (if there is such a word...).
I tried to see what dmesg looks like, and among other messages I got the following ones about my md:
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
 [events: 0000000e]
 [events: 0000000e]
md: autorun ...
md: considering hdc1 ...
md:  adding hdc1 ...
md:  adding hdb1 ...
md: created md1
md: bind<hdb1>
md: bind<hdc1>
md: running: <hdc1><hdb1>
md: hdc1's event counter: 0000000e
md: hdb1's event counter: 0000000e
md: RAID level 1 does not need chunksize! Continuing anyway.
kmod: failed to exec /sbin/modprobe -s -k md-personality-3, errno = 2
md: personality 3 is not loaded!
md :do_md_run() returned -22
md: md1 stopped.
md: unbind<hdc1>
md: export_rdev(hdc1)
md: unbind<hdb1>
md: export_rdev(hdb1)
md: ... autorun DONE.

Then a little below this point I get the following messages:

hdd: bad special flag: 0x03
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
(recovery.c, 254): journal_recover: JBD: recovery, exit status 0, recovered transactions 4810 to 4908
(recovery.c, 256): journal_recover: JBD: Replayed 113 and revoked 0/1 blocks
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.18, 14 May 2002 on ide1(22,1), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
(recovery.c, 254): journal_recover: JBD: recovery, exit status 0, recovered transactions 4810 to 4921
(recovery.c, 256): journal_recover: JBD: Replayed 133 and revoked 1/2 blocks
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.18, 14 May 2002 on ide0(3,65), internal journal
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.

It looks like md is operating on the one hand, but is having a near-impossible day trying to bring in the hdc1 and hdb1 partitions.... ARRRGGG!!!! This is driving me crazy!
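The telling line in that log is "md: personality 3 is not loaded!": autorun failed because the raid1 module was not available at that moment (the array evidently got started later in the boot sequence). On a SuSE 8.x system the usual remedy is to load raid1 by hand and make sure it ends up in the initrd; a dry-run sketch, assuming the SuSE 8.x conventions of /etc/sysconfig/kernel's INITRD_MODULES and mk_initrd (commands are echoed, not executed):

```shell
#!/bin/sh
# Dry-run sketch: make the raid1 personality available at boot so
# md autodetection does not fail with "personality 3 is not loaded".
run() { echo "+ $*"; }

run modprobe raid1                               # load the RAID-1 personality now
run grep INITRD_MODULES /etc/sysconfig/kernel    # check what the initrd will load
# then add raid1 to the list, e.g. INITRD_MODULES="... raid1", and:
run mk_initrd                                    # rebuild the initrd (SuSE 8.x tool)
```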
Please, please, please try to give me a hand, as I am about to lose my mind forever!
I am running a SuSE 8.1 box with three hard drives: two are 200GB in RAID-1, and the third is a small 8GB drive for Linux and the root filesystem.
What do you think?? Thank you for your reply! Chris