Mailinglist Archive: opensuse (1986 mails)

< Previous Next >
Re: [opensuse] wait times excessive high {Suse 10.3 & 11}
  • From: "Ciro Iriarte" <cyruspy@xxxxxxxxx>
  • Date: Fri, 21 Nov 2008 14:13:29 -0300
  • Message-id: <a998a0140811210913i79c21954veb4487904812412@xxxxxxxxxxxxxx>
2008/11/19 James D. Parra <Jamesp@xxxxxxxxxxxxxxxx>:
On Wednesday, 2008-11-19 at 10:43 -0800, Randall R Schulz wrote:

Why not, have you benchmarked it?

Me, no; he did. On one computer it took 50 minutes, 10 on another
identical system. That difference is not normal.

But if he has a slow disk, it is worth investigating why, because the
machine seems to be a good one.

~~~~~~~~~~~~~~~~~~~~
Hello,


Thank you for all of your feed back. I have three machines that I will
compare.
Machines 1 & 2 have no problems. Machine 3, however, is slow whether it is
writing to a single disk or a RAID50 with 24 drives.
File System on all machines is XFS. Machine 3 has a fresh OS install on new
equipment.


***Machine 1*** -- 8 cpu's -- test done on single drive. -- No
problems here.
model name : Intel(R) Xeon(R) CPU E5310 @ 1.60GHz
stepping : 11
cpu MHz : 1600.049
cache size : 4096 KB

Using 'cp' to copy a 17GB file over the network.


top - 11:52:47 up 20 days, 3:53, 2 users, load average: 1.42, 0.89, 0.38
Tasks: 176 total, 1 running, 175 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu1 : 0.0%us, 1.3%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu2 : 0.0%us, 1.0%sy, 0.0%ni, 87.7%id, 0.0%wa, 0.3%hi, 11.0%si,
0.0%st
Cpu3 : 0.0%us, 8.0%sy, 0.0%ni, 13.7%id, 78.3%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu5 : 0.3%us, 0.7%sy, 0.0%ni, 98.0%id, 0.0%wa, 0.0%hi, 1.0%si,
0.0%st
Cpu6 : 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu7 : 0.0%us, 0.7%sy, 0.0%ni, 98.0%id, 0.0%wa, 0.0%hi, 1.3%si,
0.0%st
Mem: 16460944k total, 16377168k used, 83776k free, 52k buffers
Swap: 4200988k total, 100k used, 4200888k free, 15902292k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11682 root 18 0 9132 748 596 D 8 0.0 0:28.50 cp
11367 root 15 0 0 0 0 S 2 0.0 0:02.98 pdflush
255 root 10 -5 0 0 0 S 1 0.0 2:08.64 kswapd0
4256 root 10 -5 0 0 0 S 1 0.0 0:13.39 rpciod/2
661 root 10 -5 0 0 0 S 0 0.0 0:00.98 xfsdatad/5
6424 postgres 16 0 18532 2392 416 S 0 0.0 10:53.34 postgres
11724 root 15 0 8632 1180 828 R 0 0.0 0:00.04 top
1 root 15 0 812 304 244 S 0 0.0 0:02.80 init
2 root 11 -5 0 0 0 S 0 0.0 0:00.25 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
4 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/1
6 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
7 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/2
8 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/2
9 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/3
10 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/3
11 root RT -5 0 0 0 S 0 0.0 0:00.30 migration/4
12 root 34 19 0 0 0 S 0 0.0 0:00.06 ksoftirqd/4
<snip>

time cp /home/backup/server/17GB_file /tmp/

real 7m41.027s
user 0m0.160s
sys 0m41.119s
<snip>

Trying 'dd';
dd if=/tmp/17GB_file of=/tmp/17GB_file2 bs=1M

top - 12:27:11 up 20 days, 4:27, 2 users, load average: 0.66, 0.19, 0.22
Tasks: 176 total, 1 running, 175 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu3 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu4 : 0.0%us, 1.7%sy, 0.0%ni, 36.3%id, 60.3%wa, 0.0%hi, 1.7%si,
0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu6 : 0.0%us, 3.0%sy, 0.0%ni, 65.7%id, 29.0%wa, 1.3%hi, 1.0%si,
0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 16460944k total, 16374520k used, 86424k free, 52k buffers
Swap: 4200988k total, 100k used, 4200888k free, 15903112k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11884 root 18 0 10244 1792 644 D 4 0.0 0:02.81 dd
255 root 10 -5 0 0 0 S 1 0.0 2:18.29 kswapd0
11367 root 15 0 0 0 0 S 1 0.0 0:05.36 pdflush
660 root 10 -5 0 0 0 S 0 0.0 1:03.30 xfsdatad/4
662 root 10 -5 0 0 0 S 0 0.0 0:59.59 xfsdatad/6
11724 root 15 0 8632 1184 828 R 0 0.0 0:04.79 top
1 root 15 0 812 304 244 S 0 0.0 0:02.80 init
2 root 11 -5 0 0 0 S 0 0.0 0:00.25 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
4 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/1
6 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
7 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/2
8 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/2
9 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/3
10 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/3
11 root RT -5 0 0 0 S 0 0.0 0:00.30 migration/4
12 root 34 19 0 0 0 S 0 0.0 0:00.06 ksoftirqd/4
13 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/5
14 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/5
<snip>

Notice that 'cp' and 'dd' are on the top of the list.

***Machine 2*** -- 8 cpu's -- test done on mirrored scsi drive -- No
problems here.
model name : Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
stepping : 6
cpu MHz : 2000.137

Using 'cp' to copy a 17GB file over the network. (this is also in
production)

top - 12:37:15 up 22 days, 5:03, 2 users, load average: 1.05, 0.32, 0.16
Tasks: 175 total, 2 running, 173 sleeping, 0 stopped, 0 zombie
Cpu0 : 99.3%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.3%hi, 0.0%si,
0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni, 99.0%id, 1.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu3 : 0.0%us, 3.6%sy, 0.0%ni, 68.5%id, 26.5%wa, 0.7%hi, 0.7%si,
0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu5 : 0.0%us, 5.3%sy, 0.0%ni, 0.0%id, 94.7%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu6 : 0.0%us, 2.7%sy, 0.0%ni, 81.1%id, 0.0%wa, 0.7%hi, 15.6%si,
0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 33012372k total, 32851424k used, 160948k free, 12k buffers
Swap: 10490436k total, 364k used, 10490072k free, 32062816k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10731 postgres 25 0 657m 548m 227m R 100 1.7 4:33.18 postgres
18818 root 18 0 9136 752 596 D 6 0.0 0:03.76 cp
4634 root 10 -5 0 0 0 S 3 0.0 4:57.23 rpciod/6
257 root 10 -5 0 0 0 S 2 0.0 12:18.33 kswapd0
30402 root 15 0 0 0 0 S 1 0.0 0:00.23 pdflush
18875 root 15 0 8628 1184 828 R 0 0.0 0:00.04 top
1 root 18 0 808 304 244 S 0 0.0 0:02.99 init
2 root 12 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/0
4 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.02 migration/1
6 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
7 root RT -5 0 0 0 S 0 0.0 0:00.10 migration/2
8 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/2
9 root RT -5 0 0 0 S 0 0.0 0:00.04 migration/3
10 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/3
11 root RT -5 0 0 0 S 0 0.0 0:04.15 migration/4
12 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/4
13 root RT -5 0 0 0 S 0 0.0 0:03.98 migration/5
14 root 34 19 0 0 0 S 0 0.0 0:00.00 ksoftirqd/5
15 root RT -5 0 0 0 S 0 0.0 0:00.28 migration/6
16 root 34 19 0 0 0 S 0 0.0 0:00.01 ksoftirqd/6
17 root RT -5 0 0 0 S 0 0.0 0:00.14 migration/7
18 root 34 19 0 0 0 S 0 0.0 0:00.01 ksoftirqd/7
19 root 10 -5 0 0 0 S 0 0.0 0:00.00 events/0
20 root 10 -5 0 0 0 S 0 0.0 0:00.00 events/1
21 root 10 -5 0 0 0 S 0 0.0 0:00.06 events/2
22 root 10 -5 0 0 0 S 0 0.0 0:00.00 events/3
23 root 10 -5 0 0 0 S 0 0.0 0:00.34 events/4
<snip>

time cp /home/backup/server/17GB_file /tmp/

real 7m19.508s
user 0m0.252s
sys 0m29.618s
<snip>

Output from iostat;

avg-cpu: %user %nice %system %iowait %steal %idle
0.03 0.00 0.62 11.51 0.00 87.84

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
avgqu-sz await svctm %util
sda 0.00 0.25 0.00 1011.00 0.00 127730.75 126.34
128.49 127.16 0.99 100.20
<snip>

As you can see, this is pretty fast.

***Machine 3*** -- 8 cpu's -- test done on mirrored scsi drive --
I/O falls apart on this machine.
model name : Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
stepping : 6
cpu MHz : 2000.076

Using 'cp' to copy a 17GB file over the network.

top - 13:01:11 up 2:08, 3 users, load average: 4.77, 1.27, 0.58
Tasks: 162 total, 1 running, 161 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 1.1%sy, 0.0%ni, 53.2%id, 45.7%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu1 : 0.0%us, 0.6%sy, 0.0%ni, 0.0%id, 99.4%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu2 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si,
0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni, 0.0%id,100.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 33020488k total, 7822156k used, 25198332k free, 316k buffers
Swap: 8393952k total, 0k used, 8393952k free, 7587420k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21 root 15 -5 0 0 0 S 1 0.0 0:15.42 events/2
4203 root 20 0 8296 724 584 D 1 0.0 0:03.62 cp
1 root 20 0 864 328 272 S 0 0.0 0:01.28 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/0
4 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/1
6 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
7 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/2
8 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/2
9 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/3
10 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/3
11 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/4
12 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/4
13 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/5
14 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/5
15 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/6
16 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/6
17 root RT -5 0 0 0 S 0 0.0 0:00.00 migration/7
18 root 15 -5 0 0 0 S 0 0.0 0:00.00 ksoftirqd/7
19 root 15 -5 0 0 0 S 0 0.0 0:00.20 events/0
20 root 15 -5 0 0 0 S 0 0.0 0:15.18 events/1
22 root 15 -5 0 0 0 S 0 0.0 0:00.08 events/3
23 root 15 -5 0 0 0 S 0 0.0 0:00.02 events/4
24 root 15 -5 0 0 0 S 0 0.0 0:00.02 events/5
25 root 15 -5 0 0 0 S 0 0.0 0:00.04 events/6
26 root 15 -5 0 0 0 S 0 0.0 0:00.10 events/7
27 root 15 -5 0 0 0 S 0 0.0 0:00.00 khelper
<snip>

time cp /home/backup/server/17GB_file /tmp/

real 36m43.241s
user 0m0.120s
sys 0m33.954s

Here is the output of vmstat;

procs -----------memory---------- ---swap-- -----io---- -system--
----cpu----
r b swpd free buff cache si so bi bo in cs us sy id
wa
0 3 52 8189540 120 24411432 0 0 1 277 29 36 0 0
87 12
0 3 52 8189276 120 24411512 0 0 0 6436 108 270 0 0
76 24
0 4 52 8185060 120 24415604 0 0 0 6128 103 170 0 0
51 49
0 6 52 8176620 120 24423796 0 0 0 6208 110 327 0 0
83 16
0 9 52 8163972 120 24436084 0 0 0 6296 113 386 0 0
68 32
0 9 52 8159756 120 24440180 0 0 0 6088 102 387 0 0
68 32
0 9 52 8159764 120 24440180 0 0 0 6120 107 336 0 0
40 60
0 9 52 8147240 120 24452468 0 0 0 6504 105 420 0 0
72 27
0 9 52 8138808 120 24460656 0 0 0 5835 95 359 0 0
55 44
0 1 52 8138808 120 24460656 0 0 0 4100 128 331 0 0
69 31
0 5 52 8126340 120 24472948 0 0 0 8219 95 130 0 0
76 23
0 5 52 8122072 120 24476868 0 0 0 6064 90 134 0 0
80 20
0 4 52 8113892 120 24485324 0 0 0 6245 130 331 0 0
79 21
0 6 52 8109428 120 24489328 0 0 0 6856 113 285 0 0
75 24
0 8 52 8101252 120 24497644 0 0 0 6088 103 276 0 0
77 23
0 8 52 8092820 120 24505836 0 0 0 6096 103 375 0 0
79 21
0 8 52 8088476 120 24509668 0 0 0 6088 115 369 0 0
69 31
0 8 52 8080132 120 24517936 0 0 0 6088 109 359 0 0
77 23
0 8 52 8075832 120 24522148 0 0 4 8136 112 413 0 0
35 64
0 9 52 8067400 120 24530344 0 0 0 5975 110 418 0 0
73 27
<snip>

and from iostat;

avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.19 31.95 0.00 67.86

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
avgqu-sz await svctm %util
sda 0.00 677.25 0.00 98.50 0.00 12496.00 126.86
142.08 1441.98 10.15 100.00
<snip>

Something is causing a significant await times (1441.98 compared to 127.16
on Machine 2). Any thoughts on discovering what it might be?

Thank you,

James


What about the disks?, are you using FC?, MPIO?, try with "hdparm -tT
/dev/[your_disk]", did you check /var/log/message ? any issues there?

Regards,
Ciro
--
To unsubscribe, e-mail: opensuse+unsubscribe@xxxxxxxxxxxx
For additional commands, e-mail: opensuse+help@xxxxxxxxxxxx

< Previous Next >
References