Comment # 29 on bug 1169774 from
So I was thinking more about this. In the end I decided to verify that I
understand correctly what's going on, and added some more trace points to
show why transaction starts are being blocked, how big transaction handles
are, and why the resulting commits are so small in dioread_nolock mode.
That revealed that there are usually only ~1500 reserved credits (out of 64k
total credits in a transaction), which made it even clearer that the theory
about reserved credits causing premature transaction commits was not correct
and that there must be something else going on: this amount of reserved credits
could cause a regression of a few percent but not really 20%. After some more
debugging I found out that when we reserve a transaction handle but then don't
use it, we do not properly return the reserved credits (we remove them from the
reserved amount but forget to also remove them from the total number of
credits tracked in the transaction). This results in the transaction having
lots of leaked credits, which then force an early transaction commit because we
think the transaction is full (although in the end it is not). Fixing this leak
also fixes the fsmark performance for me:

                                      fsmark                 fsmark
                                        lock        nolock-fixunrsv
Min       1-files/sec    46974.80 (   0.00%)    47322.70 (   0.74%)
1st-qrtle 1-files/sec    49614.60 (   0.00%)    49663.50 (   0.10%)
2nd-qrtle 1-files/sec    48644.50 (   0.00%)    49259.20 (   1.26%)
3rd-qrtle 1-files/sec    47583.90 (   0.00%)    47966.80 (   0.80%)
Max-1     1-files/sec    50754.30 (   0.00%)    51919.40 (   2.30%)
Max-5     1-files/sec    50754.30 (   0.00%)    51919.40 (   2.30%)
Max-10    1-files/sec    50754.30 (   0.00%)    51919.40 (   2.30%)
Max-90    1-files/sec    47356.50 (   0.00%)    47473.90 (   0.25%)
Max-95    1-files/sec    47356.50 (   0.00%)    47473.90 (   0.25%)
Max-99    1-files/sec    47356.50 (   0.00%)    47473.90 (   0.25%)
Max       1-files/sec    50754.30 (   0.00%)    51919.40 (   2.30%)
Hmean     1-files/sec    48540.95 (   0.00%)    48884.32 (   0.71%)
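
To illustrate the leak described above, here is a toy model of the credit
accounting. It is only a sketch: all names (toy_txn, toy_reserve, etc.) are
made up for illustration and are not the jbd2 structures or functions, and the
8-credit handle size is an arbitrary assumption. It just shows how repeatedly
reserving and dropping an unused handle fills up the transaction when the
credits are returned to only one of the two counters.

```c
/* Toy model only: hypothetical names, not the real jbd2 code. */
struct toy_txn {
    int total_credits;     /* credits counted against transaction capacity */
    int reserved_credits;  /* part of total_credits held by reserved handles */
    int max_credits;       /* capacity; 65536 mirrors the ~64k mentioned above */
};

/* Reserve credits for a handle that may or may not be used later. */
static void toy_reserve(struct toy_txn *t, int credits)
{
    t->total_credits += credits;
    t->reserved_credits += credits;
}

/* Leaky release of an unused reservation: only the reserved count drops. */
static void toy_unreserve_leaky(struct toy_txn *t, int credits)
{
    t->reserved_credits -= credits;
}

/* Fixed release: the credits are returned to both counters. */
static void toy_unreserve_fixed(struct toy_txn *t, int credits)
{
    t->reserved_credits -= credits;
    t->total_credits -= credits;
}

/* The transaction looks full once a new request would exceed capacity. */
static int toy_must_commit(const struct toy_txn *t, int request)
{
    return t->total_credits + request > t->max_credits;
}

/*
 * Count how many reserve/unreserve cycles of an unused handle fit into one
 * transaction before a commit is forced, giving up after 'limit' cycles.
 */
static int toy_cycles_until_commit(int fixed, int credits, int limit)
{
    struct toy_txn t = { 0, 0, 65536 };
    int cycles = 0;

    while (cycles < limit && !toy_must_commit(&t, credits)) {
        toy_reserve(&t, credits);
        if (fixed)
            toy_unreserve_fixed(&t, credits);
        else
            toy_unreserve_leaky(&t, credits);
        cycles++;
    }
    return cycles;
}
```

In this model the leaky variant hits the "transaction full" condition after a
bounded number of unused handles even though nothing was actually journalled,
while the fixed variant never does.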

Then there was another revelation for me: in this workload ext4 actually
starts lots of reserved transaction handles that go unused. This is due to the
way the ext4 writepages code works: it starts a transaction, then inspects the
page cache and writes one extent if found. Then it starts a transaction again
and checks whether there's more to write. So for single-extent files we always
start a transaction twice, the second time only to find there's nothing more to
write. This probably also deserves to be fixed, but a simple fix I made seems
to break page writeback, so I need to dig more into it, and it doesn't seem to
be a pressing issue.
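
The loop shape described above can be sketched roughly like this. Again this
is a toy model with made-up names (toy_file, toy_writepages, etc.), not the
actual ext4 writepages code; it only counts how many handles get started for a
file with a given number of dirty extents.

```c
/* Toy model only: hypothetical names, not the real ext4 writepages code. */
struct toy_file {
    int dirty_extents;   /* extents still waiting to be written back */
};

/* Inspect the "page cache" and write one extent; return 1 if one was found. */
static int toy_write_one_extent(struct toy_file *f)
{
    if (f->dirty_extents == 0)
        return 0;
    f->dirty_extents--;
    return 1;
}

/* Return the number of handles started while writing back the file. */
static int toy_writepages(struct toy_file *f)
{
    int started = 0;

    for (;;) {
        started++;                     /* handle is started before we know
                                          whether there is anything to write */
        if (!toy_write_one_extent(f))
            break;                     /* nothing found: handle went unused */
    }
    return started;
}

/* Convenience wrapper: handles started for a file with 'extents' extents. */
static int toy_handles_for(int extents)
{
    struct toy_file f = { extents };

    return toy_writepages(&f);
}
```

In this model a file with N dirty extents starts N + 1 handles, and the last
one is always unused; for a single-extent file that is two starts for one
write.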

I'll push the jbd2 fix upstream.

