Comment # 2 on bug 1030310 from Jan Kara

OK, so I've been testing with SLE12 SP3 kernels on ives.arch.suse.de. One thing
I've noticed is that the writeback-throttling (wbt) patches have been backported
from upstream to SLE12-SP3, and they screw this workload badly. Read numbers
look all nice and dandy, like:

read[21335]: avg: 6.3 msec; max: 303.8 msec
read[21335]: avg: 6.4 msec; max: 224.6 msec
read[21335]: avg: 6.2 msec; max: 286.5 msec
read[21335]: avg: 6.2 msec; max: 264.8 msec
read[21335]: avg: 6.4 msec; max: 258.2 msec
read[21335]: avg: 6.4 msec; max: 280.4 msec

However, the writer struggles to make any progress at all; usually
pgioperf.log contains only entries like:

wal[27108]: avg: 0.0 msec; max: 6.1 msec
wal[27108]: avg: 0.0 msec; max: 10.9 msec
commit[27108]: avg: 2.6 msec; max: 2361.9 msec
wal[27108]: avg: 0.0 msec; max: 16.5 msec

These are from the time before the readers actually managed to start. After
that, the writers are blocked until the whole benchmark completes.
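
For comparing runs, a small script can condense pgioperf.log into per-tag
numbers. This is only a sketch: it assumes every line has exactly the
"tag[pid]: avg: X msec; max: Y msec" shape shown above, and the names
summarize and LINE_RE are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as "commit[27108]: avg: 2.6 msec; max: 2361.9 msec".
# Assumption: every log line follows the format quoted in this comment.
LINE_RE = re.compile(r"^(\w+)\[(\d+)\]: avg: ([\d.]+) msec; max: ([\d.]+) msec$")

def summarize(lines):
    """Return {tag: (samples, mean of the avg column, worst max)}."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a latency line
        tag, avg, worst = m.group(1), float(m.group(3)), float(m.group(4))
        s = stats[tag]
        s[0] += 1                  # number of samples seen for this tag
        s[1] += avg                # running sum of the avg column
        s[2] = max(s[2], worst)    # worst max latency so far
    return {tag: (n, total / n, mx) for tag, (n, total, mx) in stats.items()}
```

Something like summarize(open("pgioperf.log")) would then give one tuple per
tag; a very small sample count for wal and commit next to a large one for read
is the symptom described above.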

Analysis of blktrace data has shown that the wbt logic interacts badly with
CFQ. As a result of the wbt logic, CFQ always sees just one write request at a
time, so when such a request is seen, the async queue gets scheduled,
eventually gets its time slot, and submits that one write request. Once that
completes, readers are scheduled again since the async queue has no more IO. As
a result we complete about 1 write per couple of seconds, which is far too low:
a single writeback pass through the data file has more writes than we can
complete during the whole benchmark run.
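
The effect can be illustrated with a toy model. This is not how CFQ is
implemented and none of the numbers are measured; it only assumes that readers
hold the device for a fixed phase and that the async queue then dispatches
whatever writes the scheduler can see at that moment. All names and parameters
are hypothetical.

```python
def writes_completed(total_ms, reader_phase_ms, write_service_ms, visible_writes):
    """Toy model: count writes completed within total_ms.

    Assumptions (illustrative only): readers always have IO queued and run
    for reader_phase_ms before the async queue gets a slot; in that slot the
    async queue dispatches only the writes currently visible to the scheduler.
    """
    t = 0
    done = 0
    while t < total_ms:
        t += reader_phase_ms                    # sync (reader) queues hold the device
        if t >= total_ms:
            break
        done += visible_writes                  # async queue gets its time slot
        t += visible_writes * write_service_ms  # time spent servicing those writes
    return done
```

With visible_writes=1, the result is one write per reader phase no matter how
fast the write itself is, which matches the "1 write per couple of seconds"
pattern above. Letting the scheduler see more writes per slot raises the count
roughly proportionally.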

TODO item: Disable writeback throttling for non-multiqueue devices by default.
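
Until the default changes, a userspace workaround could look roughly like the
sketch below. It assumes the backport exposes the same sysfs knob as upstream
(/sys/block/<dev>/queue/wbt_lat_usec, where writing 0 disables wbt) and that
multiqueue devices can be recognized by their /sys/block/<dev>/mq directory.
I have not verified either on SLE12-SP3, and the function names are made up.

```python
import os

def is_multiqueue(dev, sysfs_root="/sys/block"):
    """blk-mq devices expose an 'mq' directory under /sys/block/<dev>/."""
    return os.path.isdir(os.path.join(sysfs_root, dev, "mq"))

def disable_wbt_on_legacy_devices(sysfs_root="/sys/block"):
    """Write 0 to wbt_lat_usec on every non-multiqueue device.

    Returns the names of the devices that were changed. Devices without the
    knob (assumed to mean wbt is not available) are skipped.
    """
    changed = []
    for dev in sorted(os.listdir(sysfs_root)):
        knob = os.path.join(sysfs_root, dev, "queue", "wbt_lat_usec")
        if is_multiqueue(dev, sysfs_root) or not os.path.exists(knob):
            continue
        with open(knob, "w") as f:
            f.write("0\n")  # upstream semantics: 0 disables writeback throttling
        changed.append(dev)
    return changed
```

Calling disable_wbt_on_legacy_devices() as root would touch the real sysfs
tree; the sysfs_root parameter is there only so the logic can be tried against
a scratch directory first.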