Mailinglist Archive: opensuse (1599 mails)

Re: [opensuse] Yast2 partitioner new features
  • From: Sinisa <sinisa@xxxxxxx>
  • Date: Tue, 27 Oct 2009 15:54:27 +0100
  • Message-id: <4AE709A3.4050804@xxxxxxx>
On 10/27/09 12:54, Per Jessen wrote:
> Sinisa wrote:
>
>> Agree, but with current disk prices, it will become more and more
>> common to have at least raid1 (or raid10 on two disks, something that
>> seems to be specific to Linux, but has a significant speed increase on
>> reads over raid1)
> Yep, except for plain back-office PCs, I always have RAID1.
>
>> BTW, what about my second question, about being able to specify
>> layout for raid10?
> It wasn't quite clear to me what you meant - I think it's possible to
> set up a RAID10 today, isn't it?

From "man mdadm":

---------------------------------- 8< ------------------------------------------
-p, --layout=
This option configures the fine details of data layout for raid5, and raid10 arrays, and controls the failure modes for faulty.

.... skipped raid5, nobody uses it anyway ....

Finally, the layout options for RAID10 are one of 'n', 'o' or 'f' followed by a small number. The default is 'n2'.

n signals 'near' copies. Multiple copies of one data block are at similar offsets in different devices.

o signals 'offset' copies. Rather than the chunks being duplicated within a stripe, whole stripes are duplicated but are rotated by one device so duplicate blocks are
on different devices. Thus subsequent copies of a block are in the next drive, and are one chunk further down.

f signals 'far' copies (multiple copies have very different offsets). See md(4) for more detail about 'near' and 'far'.

The number is the number of copies of each datablock. 2 is normal, 3 can be useful. This number can be at most equal to the number of devices in the array. It does
not need to divide evenly into that number (e.g. it is perfectly legal to have an 'n2' layout for an array with an odd number of devices).

---------------------------------- 8< ------------------------------------------

Layout "far" gives the best performance (almost twice the read speed of layout "near" or raid1, with writes being the same, even for 2 disks) and has no disadvantages that I know of, so it could simply be made the default.
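
To make the difference between 'near' and 'far' more concrete, here is a rough sketch of where the chunks end up. It is a simplified model based on the description in md(4), not the kernel code; the function names near_layout and far_layout and the section size of 100 rows are made up for illustration.

```shell
#!/bin/sh
# Simplified model of md raid10 chunk placement (assumption: follows the
# description in md(4); this is not the kernel implementation).
# Each output line says on which device and at which row a copy is stored.

# near: the copies of a chunk are stored one after another within a stripe.
near_layout() {  # usage: near_layout DEVICES COPIES CHUNKS
    devices=$1; copies=$2; chunks=$3
    c=0
    while [ "$c" -lt "$chunks" ]; do
        k=0
        while [ "$k" -lt "$copies" ]; do
            pos=$(( c * copies + k ))
            echo "chunk=$c copy=$k dev=$(( pos % devices )) row=$(( pos / devices ))"
            k=$(( k + 1 ))
        done
        c=$(( c + 1 ))
    done
}

# far: the first copies are striped raid0-style over the first section of all
# devices; copy k is stored in section k, shifted by k devices.
# SECTION is the (hypothetical) number of rows in one section.
far_layout() {  # usage: far_layout DEVICES COPIES CHUNKS SECTION
    devices=$1; copies=$2; chunks=$3; section=$4
    c=0
    while [ "$c" -lt "$chunks" ]; do
        k=0
        while [ "$k" -lt "$copies" ]; do
            echo "chunk=$c copy=$k dev=$(( (c + k) % devices )) row=$(( c / devices + k * section ))"
            k=$(( k + 1 ))
        done
        c=$(( c + 1 ))
    done
}

echo "n2 on 2 devices:"; near_layout 2 2 4
echo "f2 on 2 devices:"; far_layout 2 2 4 100
```

With 2 devices, 'near' puts both copies of every chunk at the same row on both disks, just like raid1, while 'far' stripes the first copies across both disks like raid0 and keeps the second copies in a later section on the other disk. That would explain why a sequential read from a 'far' array can use both disks at once.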


PS. Here is a small test I did today with 4 Seagate 1.5TB SATA drives. I know that there is much more to performance than the raw read speed reported by hdparm, but this is enough to make one think:

1. Created 4 50GB partitions on each drive (so that all of them have almost the same speed, because they are at the beginning of the drive, and are small enough to resync fast)

2. Created 6 md arrays:

2 disk raid10 with layout near
sa:~ # mdadm -C -n 2 -c 512 -l 10 -p n2 /dev/md0 /dev/sdc1 /dev/sdd1

2 disk raid10 with layout far
sa:~ # mdadm -C -n 2 -c 512 -l 10 -p f2 /dev/md1 /dev/sde1 /dev/sdf1

2 disk raid1
sa:~ # mdadm -C -n 2 -c 512 -l 1 /dev/md2 /dev/sdc2 /dev/sdd2

2 disk raid0
sa:~ # mdadm -C -n 2 -c 512 -l 0 /dev/md3 /dev/sde2 /dev/sdf2

3 disk raid10 with layout far (and a hot spare)
sa:~ # mdadm -C -n 3 -c 512 -l 10 -p f2 -x 1 /dev/md4 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf3

4 disk raid10 with layout far
sa:~ # mdadm -C -n 4 -c 512 -l 10 -p f2 /dev/md5 /dev/sdc4 /dev/sdd4 /dev/sde4 /dev/sdf4

2a. Result looks like this:

sa:~ # cat /proc/mdstat
Personalities : [raid10] [raid1] [raid0]
md5 : active raid10 sdf4[3] sde4[2] sdd4[1] sdc4[0]
104871936 blocks 512K chunks 2 far-copies [4/4] [UUUU]

md4 : active raid10 sdf3[3](S) sde3[2] sdd3[1] sdc3[0]
78653952 blocks 512K chunks 2 far-copies [3/3] [UUU]

md3 : active raid0 sdf2[1] sde2[0]
104871936 blocks 512k chunks

md2 : active raid1 sdd2[1] sdc2[0]
52436096 blocks [2/2] [UU]

md1 : active raid10 sdf1[1] sde1[0]
52435968 blocks 512K chunks 2 far-copies [2/2] [UU]

md0 : active raid10 sdd1[1] sdc1[0]
52435968 blocks 2 near-copies [2/2] [UU]
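
To check quickly which layout each array ended up with, the mdstat text can be condensed to one line per array. This is only a sketch: the function name summarize_mdstat is made up, and it assumes active arrays whose status line directly follows the "mdN :" line, as in the output above.

```shell
#!/bin/sh
# Condense /proc/mdstat-style text (read from stdin) to "name level layout".
# Assumption: every "mdN : active LEVEL ..." line is followed directly by its
# status line; arrays without a "-copies" field get "-" as layout.
summarize_mdstat() {
    awk '
        /^md[0-9]+ :/ { printf "%s %s", $1, $4; pending = 1; next }
        pending {
            layout = "-"
            for (i = 1; i < NF; i++)
                if ($(i + 1) ~ /-copies$/)
                    layout = $(i + 1) "=" $i
            printf " %s\n", layout
            pending = 0
        }
    '
}

# Demo with two of the arrays shown above.
summarize_mdstat <<'EOF'
md1 : active raid10 sdf1[1] sde1[0]
52435968 blocks 512K chunks 2 far-copies [2/2] [UU]

md0 : active raid10 sdd1[1] sdc1[0]
52435968 blocks 2 near-copies [2/2] [UU]
EOF
```

On a live system it would be called as: summarize_mdstat < /proc/mdstat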

3. Ran some timing tests:

sa:~ # hdparm -tT /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 /dev/md5

Timing cached reads: 2284 MB in 2.00 seconds = 1141.54 MB/sec
Timing buffered disk reads: 374 MB in 3.00 seconds = 124.52 MB/sec # 2 disk raid10 with layout near

Timing cached reads: 2316 MB in 2.00 seconds = 1157.40 MB/sec
Timing buffered disk reads: 714 MB in 3.00 seconds = 237.76 MB/sec # 2 disk raid10 with layout far, 2x faster than md0

Timing cached reads: 2282 MB in 2.00 seconds = 1141.15 MB/sec
Timing buffered disk reads: 386 MB in 3.00 seconds = 128.46 MB/sec # 2 disk raid1, same speed as raid10 / near

Timing cached reads: 2234 MB in 2.00 seconds = 1116.92 MB/sec
Timing buffered disk reads: 726 MB in 3.01 seconds = 241.58 MB/sec # 2 disk raid0, not really RAID, just for speed comparison

Timing cached reads: 2296 MB in 2.00 seconds = 1148.16 MB/sec
Timing buffered disk reads: 1014 MB in 3.00 seconds = 337.72 MB/sec # 3 disk raid10 with layout far

Timing cached reads: 2296 MB in 2.00 seconds = 1147.79 MB/sec
Timing buffered disk reads: 1308 MB in 3.00 seconds = 435.95 MB/sec # 4 disk raid10 with layout far

Read speed increases linearly with the addition of drives. It is of course limited by bus speed, but I tested earlier with 10 disks, and all of them were able to deliver 100+ MB/s simultaneously (1 GB/s aggregate transfer).
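
The scaling is easier to see when the buffered read rates are divided by the number of active disks. A small sketch using the hdparm figures from above (the helper name per_disk is made up for illustration):

```shell
#!/bin/sh
# Divide a measured array read rate by the number of active disks.
per_disk() {  # usage: per_disk LABEL MB_PER_SEC ACTIVE_DISKS
    awk -v label="$1" -v rate="$2" -v disks="$3" \
        'BEGIN { printf "%s: %.1f MB/s per disk\n", label, rate / disks }'
}

# Buffered disk read rates taken from the hdparm output above.
per_disk "raid10 f2, 2 disks" 237.76 2
per_disk "raid10 f2, 3 disks" 337.72 3
per_disk "raid10 f2, 4 disks" 435.95 4
```

This prints 118.9, 112.6 and 109.0 MB/s per disk, so each added drive contributes nearly its full single-drive speed, with a slight drop as the array grows.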

Later today I will test read and write with something like this, using a count large enough to eliminate the effects of caching (6GB RAM):
read: dd if=/dev/mdX of=/dev/null bs=8192 count=<some large number>
write: dd if=/dev/zero of=/dev/mdX bs=8192 count=<same large number>
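
One way to pick the count is to transfer a multiple of the RAM size. The sketch below assumes that twice the RAM is enough to push the page cache out of the picture; that factor and the helper name dd_count are my assumptions, not something I have measured. It only prints the dd commands, it does not run them, and /dev/mdX is a placeholder as above.

```shell
#!/bin/sh
# Number of blocks of BLOCK_SIZE bytes needed to transfer FACTOR times the RAM.
dd_count() {  # usage: dd_count RAM_GIB BLOCK_SIZE FACTOR
    echo $(( $1 * 1024 * 1024 * 1024 * $3 / $2 ))
}

# 6 GiB RAM, bs=8192, transfer 2x RAM (the factor 2 is an assumption).
count=$(dd_count 6 8192 2)

# Print the commands only; the write test would overwrite the array contents.
echo "read:  dd if=/dev/mdX of=/dev/null bs=8192 count=$count"
echo "write: dd if=/dev/zero of=/dev/mdX bs=8192 count=$count"
```

For 6 GiB of RAM this gives count=1572864, i.e. 12 GiB per run, which still fits inside the smallest (about 50 GB) test arrays.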
