Re: "background idle" syncing of filesystems

24 May 2023

      On Tuesday 2023-05-23 09:18, Andrei Borzenkov wrote:
...
Date: Tue, 23 May 2023 09:18:19
From: Andrei Borzenkov <arvidjaar@gmail.com>
To: users@lists.opensuse.org
Subject: Re: "background idle" syncing of filesystems
On Tue, May 23, 2023 at 8:53 AM Paul Neuwirth <mail@paul-neuwirth.nl> wrote:
...
On Tuesday 2023-05-23 06:35, Andrei Borzenkov wrote:
...
Date: Tue, 23 May 2023 06:35:45
From: Andrei Borzenkov <arvidjaar@gmail.com>
To: users@lists.opensuse.org
Subject: Re: "background idle" syncing of filesystems
On 23.05.2023 07:02, Paul Neuwirth via openSUSE Users wrote:
...
and I could recreate the issue, after 6 days uptime with high I/O.
from /proc/vmstat:
 nr_dirty 571064
That is about 22GiB (4K pages)
...
nr_dirty_threshold 24894434
 nr_dirty_background_threshold 12432019
from top:
 MiB Mem : 515896.2+total, 6047.250 free, 15834.02+used,
 501647.4+buff/cache
and finally:
 # date; sync; date
 Tue May 23 05:49:06 AM CEST 2023
 Tue May 23 05:59:44 AM CEST 2023
->  syncing took > 10 minutes.
This is appr. 32MiB/s. Assuming small random IO (which is also duplicated due
Sorry, I miscalculated. It is 3.5MiB/s. It sounds small, but the usual
sustained random IO rate for SATA disks is around 50 - 70 IOPS (IO
operations per second), so it depends on IO size.
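
The corrected figure can be checked from the numbers quoted above. The following is only a rough sketch: it assumes 4 KiB pages (as stated in the thread) and that the sync wrote exactly the pages counted in nr_dirty, which ignores anything dirtied while the sync was running. The variable names are illustrative.

```python
# Figures copied from the thread; PAGE_SIZE is the "4K pages" assumption.
PAGE_SIZE = 4096                 # bytes per page (assumed)
nr_dirty = 571064                # from /proc/vmstat
sync_seconds = 10 * 60 + 38      # 05:49:06 -> 05:59:44

dirty_bytes = nr_dirty * PAGE_SIZE
dirty_gib = dirty_bytes / 2**30                   # dirty data in GiB
rate_mib_s = dirty_bytes / sync_seconds / 2**20   # average drain rate in MiB/s

print(f"dirty data: {dirty_gib:.2f} GiB")
print(f"sync rate:  {rate_mib_s:.1f} MiB/s")
```

Under these assumptions the dirty data comes to roughly 2.2 GiB, which is consistent with the 3.5MiB/s figure rather than the earlier 32MiB/s.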
Maybe this represents my real problem. I triggered a sync and watched
iostat. The I/O rate is around 1.2MB/s, at between 40 and 46 tps (transactions
per second?)
Harddisks are SAS 2TB (SPL-3), IBM-XIV, ST2000NM0023
Controller 02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS 2008 
[Falcon] (rev 03)
SMART data is fine, except for elevated temperature for one HDD (60°C)

Is that really what is to be expected?

btw. I/O wait increased a lot during sync (which just finished after >
20 minutes). Some applications became totally unresponsive, X froze for
several minutes, and the load average went from ~1 to 35 (with 32 CPUs). Not
nice :-/
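
For reference, iostat's tps column counts transfers (I/O requests) per second issued to the device, so throughput divided by tps gives the average request size. A minimal sketch using the figures above, assuming 1 MB = 1000 kB (with 1024 the result shifts by about 2%); the function name is hypothetical:

```python
def avg_transfer_kb(rate_mb_s: float, tps: float) -> float:
    """Average size of one transfer in kB, given throughput and transfers/s."""
    return rate_mb_s * 1000 / tps

rate_mb_s = 1.2            # observed throughput from iostat
for tps in (40, 46):       # observed range of transfers per second
    size = avg_transfer_kb(rate_mb_s, tps)
    print(f"{tps} tps -> about {size:.0f} kB per transfer")
```

That works out to somewhere between 26 and 30 kB per transfer, i.e. fairly small requests at a rate close to the 50 - 70 IOPS mentioned earlier in the thread.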
...
to mirroring) I would not call it something extraordinary with two SATA disks.
If this data is on btrfs, you have additional metadata which adds to IO
randomness (and the amount of metadata could be quite large as well).
So it is quite normal for your workload. The only way to mitigate it is to
lower thresholds to force syncing of dirty pages earlier.
What you observe is typical buffer bloat. If you let data accumulate,
you need to wait for it to drain.
thanks for figuring this out.
As part of my original post, would lowering
/proc/sys/vm/dirty_background_bytes or
/proc/sys/vm/dirty_background_ratio affect system performance
noticeably?
Define "system performance". Yes, lowering the dirty background
threshold may help to reduce sync time during shutdown. But you may
suddenly find that the drives are now kept busier, which impacts
normal application performance.
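
To put the thresholds from /proc/vmstat into perspective, they can be converted from pages to GiB. The sketch below parses the counters quoted earlier in the thread, again assuming 4 KiB pages. The helper background_bytes_for() is purely hypothetical: it sizes a candidate value for /proc/sys/vm/dirty_background_bytes as "observed drain rate times an acceptable flush time", which is a rough starting point for experiments, not a recommendation made in this thread.

```python
PAGE_SIZE = 4096  # bytes per page (assumed, "4K pages")

# Counters as quoted in the thread.
SAMPLE = """\
nr_dirty 571064
nr_dirty_threshold 24894434
nr_dirty_background_threshold 12432019
"""

def parse_vmstat(text: str) -> dict:
    """Parse 'name value' lines as found in /proc/vmstat."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def pages_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    return pages * page_size / 2**30

def background_bytes_for(drain_mib_s: float, max_seconds: float) -> int:
    """Hypothetical sizing rule: data the disks can drain in max_seconds."""
    return int(drain_mib_s * max_seconds * 2**20)

counters = parse_vmstat(SAMPLE)
for name in ("nr_dirty", "nr_dirty_background_threshold", "nr_dirty_threshold"):
    print(f"{name}: {pages_to_gib(counters[name]):.1f} GiB")

# Example: 3.5 MiB/s drain rate, about one minute of backlog.
print("candidate dirty_background_bytes:", background_bytes_for(3.5, 60))
```

On a live system the same parser can be fed open("/proc/vmstat").read(), and os.sysconf("SC_PAGE_SIZE") gives the real page size. Note that setting dirty_background_bytes makes dirty_background_ratio read as 0 and vice versa, since only one of the two is in effect at a time.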
Paul Neuwirth

PaNe Foto
Paul Neuwirth
Postfach 45 04 54
80904 MÜNCHEN
DEUTSCHLAND

Fax: +49 89 35819624
https://www.swabian.net/

UST-IdNr. (VAT):
DE314867715