Comment # 17 on bug 1030310 from
ext4 looks good

pgioperfbench ext4
                              4.11.0-rc5            4.11.0-rc5
                                 vanilla         transact-v1r1
Min         commit           13.10 (  0.00%)           11.00 ( 16.03%)
Min         read           1053.00 (  0.00%)         1077.30 ( -2.31%)
Min         wal               0.00 (  0.00%)            0.00 (  0.00%)
Max-95%     commit         6154.80 (  0.00%)           78.90 ( 98.72%)
Max-95%     read           1473.40 (  0.00%)         1097.10 ( 25.54%)
Max-95%     wal            6359.70 (  0.00%)            0.10 (100.00%)
Max-99%     commit         8933.20 (  0.00%)          382.50 ( 95.72%)
Max-99%     read           1696.80 (  0.00%)         1097.10 ( 35.34%)
Max-99%     wal            7013.60 (  0.00%)            0.20 (100.00%)
Max         commit        10651.00 (  0.00%)         3090.50 ( 70.98%)
Max         read           1696.80 (  0.00%)         1097.10 ( 35.34%)
Max         wal           76206.20 (  0.00%)           41.40 ( 99.95%)
Mean        commit          828.89 (  0.00%)           57.06 ( 93.12%)
Mean        read           1111.46 (  0.00%)         1088.66 (  2.05%)
Mean        wal            1241.19 (  0.00%)            0.08 ( 99.99%)

This is a limited view of the report but it's fairly obvious it's good. Max wal
latency is down from 76 seconds to 41 ms, read latencies are very similar and
commit times are way down. However, there appears to be some read starvation
going on because the number of read samples is far lower (not in the report).
A manual check shows 416 read samples with the vanilla kernel and 80 with the
patches.
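
As a sanity check on the percentage columns, the gain appears to be computed as
(vanilla - patched) / vanilla, so a positive value means lower latency with the
patches. A minimal Python sketch, assuming that formula (it is inferred from
the numbers above, not taken from the reporting scripts, and gain_percent is a
hypothetical helper name):

```python
def gain_percent(baseline, patched):
    """Relative reduction versus the baseline, in percent.

    Assumed formula, inferred from the table: a positive result means the
    patched kernel recorded a lower (better) latency than vanilla.
    """
    if baseline == 0:
        # Avoid dividing by zero; the table prints 0.00% for 0.00 vs 0.00.
        return 0.0
    return (baseline - patched) / baseline * 100.0


# (vanilla, patched) values copied from the ext4 table above.
rows = {
    "Min commit": (13.10, 11.00),
    "Min read": (1053.00, 1077.30),
    "Max wal": (76206.20, 41.40),
    "Mean commit": (828.89, 57.06),
}

for name, (vanilla, patched) in rows.items():
    print(f"{name:12s} {gain_percent(vanilla, patched):7.2f}%")
```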

The story is much more severe for ext3

                              4.11.0-rc5            4.11.0-rc5
                                 vanilla         transact-v1r1
Min         commit           12.40 (  0.00%)            9.80 ( 20.97%)
Min         read           1046.90 (  0.00%)
Min         wal               0.00 (  0.00%)            0.00 (  0.00%)
Max-95%     commit         4156.80 (  0.00%)          101.40 ( 97.56%)
Max-95%     read           1296.10 (  0.00%)                 (100.00%)
Max-95%     wal            4623.20 (  0.00%)            0.10 (100.00%)
Max-99%     commit         6352.20 (  0.00%)          521.90 ( 91.78%)
Max-99%     read           1296.20 (  0.00%)                 (100.00%)
Max-99%     wal            5346.10 (  0.00%)            0.20 (100.00%)
Max         commit        36643.40 (  0.00%)         2212.40 ( 93.96%)
Max         read           1296.20 (  0.00%)                 (100.00%)
Max         wal           45138.40 (  0.00%)          124.40 ( 99.72%)

Those blank entries for read are somewhat of a reporting bug but occur because
no samples were recorded. A manual check confirms this: 304 samples with the
vanilla kernel and 0 with the patches applied.

The per-sample graphs (not presented) show that commit and wal times are
consistently very low but the lack of reads is of concern.

A partially manual rerun to see what readers were doing was not particularly
revealing. It's stuck in read as you'd expect:

delboy:~ # cat /proc/3177/stack
[<ffffffff810b5876>] io_schedule+0x16/0x40
[<ffffffff811a8466>] wait_on_page_bit_common+0x116/0x1c0
[<ffffffff811ab367>] generic_file_read_iter+0x157/0x8b0
[<ffffffffa01afd9a>] ext4_file_read_iter+0x4a/0xd0 [ext4]
[<ffffffff8123b27e>] __vfs_read+0xbe/0x130
[<ffffffff8123c13e>] vfs_read+0x9e/0x170
[<ffffffff8123d666>] SyS_read+0x46/0xa0
[<ffffffff810039ae>] do_syscall_64+0x6e/0x180
[<ffffffff8176be2f>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
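
A single stack dump only shows one instant. A rough way to see where a reader
spends its time is to sample /proc/<pid>/stack repeatedly and count the
innermost frame. A minimal Python sketch, assuming root access and a kernel
that exposes /proc/<pid>/stack. The helper names top_frame and sample_stack
and the sampling interval are illustrative and not part of pgioperf:

```python
import collections
import re
import time

# Matches lines such as "[<ffffffff810b5876>] io_schedule+0x16/0x40".
FRAME_RE = re.compile(r"^\[<[0-9a-fx]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def top_frame(stack_text):
    """Return the symbol of the first (innermost) frame, or None."""
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            return match.group(1)
    return None


def sample_stack(pid, samples=100, interval=0.1):
    """Count which innermost frame the task is in across several samples."""
    counts = collections.Counter()
    for _ in range(samples):
        with open(f"/proc/{pid}/stack") as f:
            counts[top_frame(f.read())] += 1
        time.sleep(interval)
    return counts
```

For the reader above this would be called as sample_stack(3177) while the
benchmark is running. A count dominated by io_schedule would suggest the task
is mostly waiting on I/O.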

They're not completely stalled because tracing one of the readers shows that
reads are completing, but apparently not enough of them to meet the threshold
where pgioperf reports something. It could be another flaw in the benchmark,
with fewer reads being recorded simply because writes are not being stalled,
but it's worth checking out.

One major observation supporting that it's a basic timing issue is that the
time the benchmark takes to complete is way reduced.

            4.11.0-rc5     4.11.0-rc5
               vanilla  transact-v1r1
User             14.51           8.91
System          188.92          97.41
Elapsed        4432.20        2660.06

That's way faster and this may all be down to timing. Hence, there may be no
problem with the patches here as such and what is needed is to adjust the
benchmark to report stall times more frequently and increase the number of
samples it takes before the benchmark is considered complete.
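
To put a number on "way faster", the relative reduction can be worked out from
the timing table above. A small Python sketch; the values are copied from that
table and the helper name reduction_percent is illustrative:

```python
def reduction_percent(vanilla, patched):
    """Relative reduction of the patched value versus vanilla, in percent."""
    return (vanilla - patched) / vanilla * 100.0


# (vanilla, patched) values copied from the timing table.
timings = {
    "User": (14.51, 8.91),
    "System": (188.92, 97.41),
    "Elapsed": (4432.20, 2660.06),
}

for name, (vanilla, patched) in timings.items():
    print(f"{name:8s} {reduction_percent(vanilla, patched):6.2f}% lower")
```

Elapsed time drops by roughly 40% with the patches applied.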

