Why can't the process "sync" be killed?

Carlos E. R.

11 Apr 2021

22:30

It is just a fact, it can not.

Situation.

The system is running the process "texpire", which is part of the leafnode package, an NNTP proxy server. texpire simply looks at every post and deletes those that are old according to certain rules. This is very intensive on the disk, looking at 1.2 million files. It runs for half an hour.

If during that process I issue a "sync", it does not succeed till texpire ends. It can not be killed.

Telcontar:~ # time sync
^C^C^C

On another terminal:

Telcontar:~ # killall sync
Telcontar:~ # killall sync
Telcontar:~ # killall -9 sync
Telcontar:~ # killall -9 sync
Telcontar:~ #

It has to run till completion.

--
Cheers
Carlos E. R.
(from 15.2 x86_64 at Telcontar)


David Haller

12 Apr

01:47

Hello, On Mon, 12 Apr 2021, Carlos E. R. wrote:

...

It is just a fact, it can not.

Situation.

The system is running the process "texpire", which is part of leafnode package, an nntp proxy server. texpire simply looks at every posts and deletes those that are old according to certain rules. This is very intensive on the disk, looking at 1.2 million files. It runs for half an hour.

If during that process I issue a "sync", it does not succeed till texpire ends. It can not be killed.

You have to kill texpire, as that's continuously creating new stuff to be synced.

And you can consider 'sync' to be half-dead already (just not yet done reporting back (or not usually, other than exiting with status 0)), as it's basically just issuing a 'sync', 'syncfs', 'fsync' or 'fdatasync' syscall (see the manpages in section 2) and reporting back. The source for sync is rather straightforward and a measly 5537 bytes in coreutils version 8.32.

==== comments by me after //DNH: ====
[..]
int
main (int argc, char **argv)
{
[..]
  if (! args_specified || (arg_file_system && ! HAVE_SYNCFS))
    mode = MODE_SYNC;
  else if (arg_file_system)
    mode = MODE_FILE_SYSTEM;        //DNH: will call syncfs(2) if available
  else if (! arg_data)
    mode = MODE_FILE;               //DNH: will call fsync(2)
  else
    mode = MODE_DATA;               //DNH: will call fdatasync(2)

  if (mode == MODE_SYNC)
    sync ();                        //DNH: well, duh ;)
[..]
  return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
====

So, the default is to just call 'sync(2)', unless you call 'sync(1)' with options or arguments, in which case it branches to a function and calls fsync(2) or fdatasync(2) or syncfs(2) depending on 'mode'. And:

==== man 2 sync ====
        sync() causes all pending modifications to filesystem metadata and
        cached file data to be written to the underlying filesystems.
====

Anyway: all those functions (sync, fsync, fdatasync, syncfs) are syscalls and, as they're writing to storage, the caller sits in uninterruptible sleep state ('D' in ps/top etc.), and thus is not "killable" from userland.

And if sync(1) is "in" the call to these functions, you can not kill it either until that function returns control to the sync(1) process.

So, your only option is to kill anything that still causes more stuff to be synced and wait. Or shut the machine off hard[1] via sysrq+b, sysrq+o, the reset or power button ...

HTH,
-dnh

[1] as the kernel is already syncing ...

--
Sigmonster was here!

Carlos E. R.

07:50

On 12/04/2021 03.47, David Haller wrote:

...

Hello,

On Mon, 12 Apr 2021, Carlos E. R. wrote:

...
It is just a fact, it can not.

Situation.

The system is running the process "texpire", which is part of leafnode package, an nntp proxy server. texpire simply looks at every posts and deletes those that are old according to certain rules. This is very intensive on the disk, looking at 1.2 million files. It runs for half an hour.

If during that process I issue a "sync", it does not succeed till texpire ends. It can not be killed.

You have to kill texpire, as that's continuously creating new stuff to be synced.

But I don't want to. In this case, I want sync to abandon the attempt because I made a mistake calling "sync". And I wonder why it can not.

...

And you can consider 'sync' to be halfdead already (just not yet done reporting back (or not usually, other than exiting with status 0)), as it's basically just issuing a 'sync', 'syncfs', 'fsync' or fdatasync' syscall (see the manpages in section 2) and reporting back. The source for sync is rather straightforward and a measly 5537 bytes in coreutils version 8.32.

==== comments by me after //DNH: ==== [..] int main (int argc, char **argv) { [..] if (! args_specified || (arg_file_system && ! HAVE_SYNCFS)) mode = MODE_SYNC; else if (arg_file_system) mode = MODE_FILE_SYSTEM; //DNH: will call syncfs(2) if available else if (! arg_data) mode = MODE_FILE; //DNH: will call fsync(2) else mode = MODE_DATA; //DNH: will call fdatasync(2)

if (mode == MODE_SYNC) sync (); //DNH: well, duh ;) [..] return ok ? EXIT_SUCCESS : EXIT_FAILURE; } ====

So, default is to just call 'sync(2)', unless you call 'sync(1)' with options or arguments in which case it branches to a function and calls fsync(2) or fdatasync(2) or syncfs(2) depeding on 'mode'. And:

==== man 2 sync ==== sync() causes all pending modifications to filesystem metadata and cached file data to be written to the underlying filesystems. ====

Anyway: all those functions (sync, fsync, fdatasync, syncfs) are syscalls and, as they're writing to storage, in uninterruptible sleep-state ('D' in ps/top etc.), and thus not "killable" from userland.

Ah. This is what I feared.

...

And if sync(1) is "in" the call to these functions, you can not kill it as well until that function returns control to the sync(1) process.

So, your only option is to kill anything that still causes more stuff to be synced and wait. Or shut the machine off hard[1] via sysrq+b, sysrq+o, the reset- or powerbutton ...

HTH, -dnh

[1] as the kernel is already syncing ...

The next (philosophical) question is: why are those functions uninterruptible?

It could write an item of the cache, then another item, then the next... I see no "philosophical" reason why that can not be aborted. There must be a reason out there that I don't know.

Now, don't go investigating the code for me, it is just a curiosity :-)

If you are curious, I have been investigating a problem I have with hibernation: sometimes it does not succeed, it stalls. When this happens, I can not power off the machine; in the end I have to switch the power off.

What I have found is that the machine is, those times, trying to sync the filesystem and failing, precisely because texpire is running. Probably leaving the machine there for half an hour would succeed - but obviously, sometimes one is in a hurry to hibernate or power off (battery running out, say), and I had no idea that waiting for half an hour might work.

I have found out more. In my case, there is a partition dedicated to /var/spool/news/ (formatted as reiserfs), and this partition goes 100% busy during texpire. If I run "sync" fifteen minutes after texpire finishes, it takes a minute to complete. A sync at another point in the day takes about half a minute.

The partition was mounted "relatime,lazytime". If I take out the "lazytime" parameter, the sync completes in a second (except if texpire is running).

My conclusion is that lazytime is broken in the case of reiserfs (or in the case of news). The writing to disk is not delayed to an appropriate time, it is delayed for ever.

leafnode and texpire do a lot of timestamp changing, compared to other tools.

--
Cheers / Saludos,
Carlos E. R.
(from 15.2 x86_64 at Telcontar)

Simon Lees

10:52

On 4/12/21 5:20 PM, Carlos E. R. wrote:

...

On 12/04/2021 03.47, David Haller wrote:

...
Hello,

On Mon, 12 Apr 2021, Carlos E. R. wrote:

...
It is just a fact, it can not.

Situation.

The system is running the process "texpire", which is part of leafnode package, an nntp proxy server. texpire simply looks at every posts and deletes those that are old according to certain rules. This is very intensive on the disk, looking at 1.2 million files. It runs for half an hour.

If during that process I issue a "sync", it does not succeed till texpire ends. It can not be killed.

You have to kill texpire, as that's continuously creating new stuff to be synced.

But I don't want to. In this case, I want sync to abandon the attempt because I did a mistake calling "sync". And I wonder why it can not.

...
And you can consider 'sync' to be halfdead already (just not yet done reporting back (or not usually, other than exiting with status 0)), as it's basically just issuing a 'sync', 'syncfs', 'fsync' or fdatasync' syscall (see the manpages in section 2) and reporting back. The source for sync is rather straightforward and a measly 5537 bytes in coreutils version 8.32.

==== comments by me after //DNH: ==== [..] int main (int argc, char **argv) { [..]    if (! args_specified || (arg_file_system && ! HAVE_SYNCFS))      mode = MODE_SYNC;    else if (arg_file_system)      mode = MODE_FILE_SYSTEM;        //DNH: will call syncfs(2) if available    else if (! arg_data)      mode = MODE_FILE;               //DNH: will call fsync(2)    else      mode = MODE_DATA;               //DNH: will call fdatasync(2)

   if (mode == MODE_SYNC)      sync ();                        //DNH: well, duh ;) [..]    return ok ? EXIT_SUCCESS : EXIT_FAILURE; } ====

So, default is to just call 'sync(2)', unless you call 'sync(1)' with options or arguments in which case it branches to a function and calls fsync(2) or fdatasync(2) or syncfs(2) depeding on 'mode'. And:

==== man 2 sync ====         sync() causes all pending modifications to filesystem metadata and         cached file data to be written to the underlying filesystems. ====

Anyway: all those functions (sync, fsync, fdatasync, syncfs) are syscalls and, as they're writing to storage, in uninterruptible sleep-state ('D' in ps/top etc.), and thus not "killable" from userland.

Ah. This is what I feared.

...
And if sync(1) is "in" the call to these functions, you can not kill it as well until that function returns control to the sync(1) process.

So, your only option is to kill anything that still causes more stuff to be synced and wait. Or shut the machine off hard[1] via sysrq+b, sysrq+o, the reset- or powerbutton ...

HTH, -dnh

[1] as the kernel is already syncing ...

The next (philosophical) question is why are those functions uninterruptible?

It could write an item of the cache, then another item, then the next... I see no "philosophical" reason why that can not be aborted. There must be a reason out there that I don't know.

Now, don't go investigating the code for me, it is just a curiosity :-)

Without investigating the code, I suspect it comes down to how you define an item and whether the sync command actually knows about said items. The sync command basically says: look at everything in cache that has yet to be written to disk and write it to disk, because I want to have the disk back in a consistent state (because I may want to eject a USB stick or power off a machine).

In all likelihood, all the sync command knows is which blocks of cache haven't been written to disk and where those blocks should be written on disk. At that level it should have no concept of the data that's actually in a block of cache, so it can't say "I know that if I write the following 4 blocks to disk, myfile.txt will be fully written and it might be safe to stop at this point". To further complicate this, say that when you saved myfile.txt it put the data across 8 blocks of cache; something may have decided, even before you called sync, that there was some spare time and had already written 3 of those 8 blocks to disk. So really the only way the OS can be sure that data isn't corrupted by the sync process is to allow it to run to completion (just as the only way you can be sure that data on your USB stick won't be corrupted when you pull it out is to have something call sync, which is what pressing the eject button does).

...

If you are curious, I have been investigating a problem I have with hibernation: sometimes it does not succeed, it stalls. When this happens, I can not poweroff the machine, in the end I have to switch the power off.

What I have found, is that the machine is, those times, is trying to sync the filesystem and failing, precisely because texpire is running. Probably leaving the machine there for half an hour would succeed - but obviously, sometimes one is in a hurry to hibernate of poweroff (battery running out, say), and I had no idea that waiting for half an hour might work.

I have found out more. In my case, there is a dedicated partition to /var/spool/news/ (formatted as reiserfs), and this partition goes 100% busy during texpire. If I run "sync" fifteen minutes after texpire finishes, it takes a minute to complete. A sync at other point in the day, takes about half a minute.

The partition was mounted "relatime,lazytime". If I take out the "lazytime" parameter, the sync completes in a second (except if texpire is running).

My conclusion is that lazytime is broken in the case of reiserfs (or in the case of news). The writing to disk is not delayed to an appropriate time, it is delayed for ever.

I guess it might depend on what you call an appropriate time; it's quite possible that while texpire is running and causing significant disk usage, it's deciding the appropriate time is after it has finished. You have to remember this is mostly a power-saving feature for laptops, so that rather than spending lots of energy constantly spinning up a rotating hard drive to write small amounts of data, it waits until you're done (or the cache is full) and spins up the drive once to write out all the data. By mounting with lazytime you're saying "I don't care when in the future this happens"; on the other hand, sometimes when you hibernate your laptop you do care.

--
Simon Lees (Simotek) http://simotek.net
Emergency Update Team keybase.io/simotek
SUSE Linux Adelaide Australia, UTC+10:30
GPG Fingerprint: 5B87 DB9D 88DC F606 E489 CEC5 0922 C246 02F0 014B

Carlos E. R.

11:08

On 12/04/2021 12.52, Simon Lees wrote:

...

On 4/12/21 5:20 PM, Carlos E. R. wrote:

...
On 12/04/2021 03.47, David Haller wrote:

...
Hello,

...

...
The next (philosophical) question is why are those functions uninterruptible?

It could write an item of the cache, then another item, then the next... I see no "philosophical" reason why that can not be aborted. There must be a reason out there that I don't know.

Now, don't go investigating the code for me, it is just a curiosity :-)

Without investigating the code I suspect it comes down to how you define an item and whether the sync command actually knows about said items. The sync command basically says look at everything in cache that has yet to be written to disk and write it to disk because I want to have the disk back in a consistent state (because I may want eject a USB or power off a machine).

In all likely hood all the sync command knows is the blocks of cache that haven't been written to disk and where those blocks should be written on disk. At that level it should have no concept of the data thats actually in that block of cache so it can't go I know if I write the following 4 blocks to cache myfile.txt will be fully written to disk and it might be safe to stop at this point. To further complicate this say when you saved myfile.txt it put the data across 8 blocks of cache, something may have decided even before you called sync that there was some spare time and had already written 3 of those 8 blocks to cache. So really the only way the OS can be sure that data isn't corrupted by the sync process is to allow it to run completely (like the only way you can be that data on your USB stick won't be corrupted when you pull it out is to have something call sync which is what pressing the eject button does).

Understood, that's good enough for me, thanks :-)

...

...
If you are curious, I have been investigating a problem I have with hibernation: sometimes it does not succeed, it stalls. When this happens, I can not poweroff the machine, in the end I have to switch the power off.

What I have found, is that the machine is, those times, is trying to sync the filesystem and failing, precisely because texpire is running. Probably leaving the machine there for half an hour would succeed - but obviously, sometimes one is in a hurry to hibernate of poweroff (battery running out, say), and I had no idea that waiting for half an hour might work.

I have found out more. In my case, there is a dedicated partition to /var/spool/news/ (formatted as reiserfs), and this partition goes 100% busy during texpire. If I run "sync" fifteen minutes after texpire finishes, it takes a minute to complete. A sync at other point in the day, takes about half a minute.

The partition was mounted "relatime,lazytime". If I take out the "lazytime" parameter, the sync completes in a second (except if texpire is running).

My conclusion is that lazytime is broken in the case of reiserfs (or in the case of news). The writing to disk is not delayed to an appropriate time, it is delayed for ever.

I guess it might depend on what you call an appropriate time, its quite possible that while texpire is running and causing significant disk usage its deciding the appropriate time is after it has finished. You have to remember this is mostly a powersaving feature for laptops so that rather then spending lots of energy constantly spinning up a rotating hard drive to write small amounts of data it somewhat waits until your done (or the cache is full) and spins up the drive once to write out all the data. By mounting with lazytime your saying I don't care when in the future this happens on the other hand sometimes when you hibernate your laptop you do care.

Well, I read the documentation again and the current version says it can delay as much as 24 hours. I think that's way too much; at least it should be tunable. When the machine is idling, waiting an hour is more than enough. (If it is possible to sync only a partition, then a cron job would do.)

When texpire is running the problem is different: the machine can not be suspended or halted. It can take half an hour to complete, during which time, if it is a laptop, or a server on UPS during a power failure, it should be treated as an emergency and the situation handled somehow. If it is a manual command I know I have to kill texpire, but if not, the battery will die sooner.

--
Cheers / Saludos,
Carlos E. R.
(from 15.2 x86_64 at Telcontar)

Dave Howorth

11:04

On Mon, 12 Apr 2021 09:50:03 +0200 "Carlos E. R." <robin.listas@telefonica.net> wrote:

...

The partition was mounted "relatime,lazytime".

What does that even mean? Aren't they mutually contradictory?

Carlos E. R.

11:12

On 12/04/2021 13.04, Dave Howorth wrote:

...

On Mon, 12 Apr 2021 09:50:03 +0200 "Carlos E. R." wrote:

...
The partition was mounted "relatime,lazytime".

What does that even mean? Aren't they mutually contradictory?

Nope.

(relatime is a default option)

relatime
       Update inode access times relative to modify or change time. Access time is only updated if the previous access time was earlier than the current modify or change time. (Similar to noatime, but it doesn't break mutt or other applications that need to know if a file has been read since the last time it was modified.)

       Since Linux 2.6.30, the kernel defaults to the behavior provided by this option (unless noatime was specified), and the strictatime option is required to obtain traditional semantics. In addition, since Linux 2.6.30, the file's last access time is always updated if it is more than 1 day old.

lazytime
       Only update times (atime, mtime, ctime) on the in-memory version of the file inode.

       This mount option significantly reduces writes to the inode table for workloads that perform frequent random writes to preallocated files.

       The on-disk timestamps are updated only when:

       - the inode needs to be updated for some change unrelated to file timestamps

       - the application employs fsync(2), syncfs(2), or sync(2)

       - an undeleted inode is evicted from memory

       - more than 24 hours have passed since the inode was written to disk.

--
Cheers / Saludos,
Carlos E. R.
(from 15.2 x86_64 at Telcontar)

David Haller

13 Apr

14:43

Hello, On Mon, 12 Apr 2021, Carlos E. R. wrote:

...

On 12/04/2021 03.47, David Haller wrote:

...
On Mon, 12 Apr 2021, Carlos E. R. wrote:

...
It is just a fact, it can not.

Situation.

The system is running the process "texpire", which is part of leafnode package, an nntp proxy server. texpire simply looks at every posts and deletes those that are old according to certain rules. This is very intensive on the disk, looking at 1.2 million files. It runs for half an hour.

If during that process I issue a "sync", it does not succeed till texpire ends. It can not be killed.

You have to kill texpire, as that's continuously creating new stuff to be synced.

But I don't want to. In this case, I want sync to abandon the attempt because I did a mistake calling "sync". And I wonder why it can not.

See below. I just thought that it might help to trigger an emergency sync via sysrq+s ... [..]

...

...
Anyway: all those functions (sync, fsync, fdatasync, syncfs) are syscalls and, as they're writing to storage, in uninterruptible sleep-state ('D' in ps/top etc.), and thus not "killable" from userland.

Ah. This is what I feared.

...
And if sync(1) is "in" the call to these functions, you can not kill it as well until that function returns control to the sync(1) process.

So, your only option is to kill anything that still causes more stuff to be synced and wait. Or shut the machine off hard[1] via sysrq+b, sysrq+o, the reset- or powerbutton ...

HTH, -dnh

[1] as the kernel is already syncing ...

The next (philosophical) question is why are those functions uninterruptible?

It could write an item of the cache, then another item, then the next... I see no "philosophical" reason why that can not be aborted. There must be a reason out there that I don't know.

I guess so that the filesystems are in a defined and consistent state. See fs/sync.c and mm/filemap.c for details.

...

Now, don't go investigating the code for me, it is just a curiosity :-)

Too late ;)

...

If you are curious, I have been investigating a problem I have with hibernation: sometimes it does not succeed, it stalls. When this happens, I can not poweroff the machine, in the end I have to switch the power off.

sysrq+r sysrq+e sysrq+i sysrq+s sysrq+u sysrq+o might help.

...

What I have found, is that the machine is, those times, is trying to sync the filesystem and failing, precisely because texpire is running. Probably leaving the machine there for half an hour would succeed - but obviously,

Yep.

...

sometimes one is in a hurry to hibernate of poweroff (battery running out, say), and I had no idea that waiting for half an hour might work.

Sure.

...

I have found out more. In my case, there is a dedicated partition to /var/spool/news/ (formatted as reiserfs), and this partition goes 100% busy during texpire. If I run "sync" fifteen minutes after texpire finishes, it takes a minute to complete. A sync at other point in the day, takes about half a minute.

I use a loop-mounted reiserfs image as news-spool for leafnode:

$ df -h /var/spool/news/
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop0      8.0G  3.3G  4.8G  41% /var/spool/news

I could probably shrink that a bit ;) This works nicely, and sync(1)-ing while texpire is running is done in a couple of seconds (on rotating rust).

...

The partition was mounted "relatime,lazytime". If I take out the "lazytime" parameter, the sync completes in a second (except if texpire is running).

I just use 'rw,acl,user_xattr,barrier=flush'...

...

My conclusion is that lazytime is broken in the case of reiserfs (or in the case of news). The writing to disk is not delayed to an appropriate time, it is delayed for ever. leafnode, texpire, do a lot of timestamp changing, compared to other tools.

Probably. Reiserfs was always special ;)

HTH,
-dnh

--
Linux is not a desktop OS for people whose VCRs are still flashing "12:00". -- Paul Tomblin

Carlos E. R.

22:38

On 12/04/2021 09.50, Carlos E. R. wrote:

...

On 12/04/2021 03.47, David Haller wrote:

...
Hello,

On Mon, 12 Apr 2021, Carlos E. R. wrote:

...
It is just a fact, it can not.

Situation.

...

If you are curious, I have been investigating a problem I have with hibernation: sometimes it does not succeed, it stalls. When this happens, I can not poweroff the machine, in the end I have to switch the power off.

What I have found, is that the machine is, those times, is trying to sync the filesystem and failing, precisely because texpire is running. Probably leaving the machine there for half an hour would succeed - but obviously, sometimes one is in a hurry to hibernate of poweroff (battery running out, say), and I had no idea that waiting for half an hour might work.

I have found out more. In my case, there is a dedicated partition to /var/spool/news/ (formatted as reiserfs), and this partition goes 100% busy during texpire. If I run "sync" fifteen minutes after texpire finishes, it takes a minute to complete. A sync at other point in the day, takes about half a minute.

The partition was mounted "relatime,lazytime". If I take out the "lazytime" parameter, the sync completes in a second (except if texpire is running).

My conclusion is that lazytime is broken in the case of reiserfs (or in the case of news). The writing to disk is not delayed to an appropriate time, it is delayed for ever.

leafnode, texpire, do a lot of timestamp changing, compared to other tools.

More data points: if the partition is mounted "nolazytime", texpire takes 45 minutes to run instead of 30. The performance impact of "lazytime" is huge with this particular load (/var/spool/news).

Next question. As the current "lazytime" implementation delays writes for up to 24 hours, I wonder if I can "sync" just a partition. I know I have asked this before, I just can't remember what we found out about it in the past... (if you have a better memory than me, or better google-fu skills, and know the link to the thread, just post it ;-) )

--
Cheers / Saludos,
Carlos E. R.
(from 15.2 x86_64 at Telcontar)

Andrei Borzenkov

14 Apr

05:10

On 14.04.2021 01:38, Carlos E. R. wrote:

...

As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition.

man 1 sync

Carlos E. R.

10:34

On 14/04/2021 07.10, Andrei Borzenkov wrote:

...

On 14.04.2021 01:38, Carlos E. R. wrote:

...
As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition.

man 1 sync

{flush} -- Cheers / Saludos, Carlos E. R. (from 15.2 x86_64 at Telcontar)

Carlos E. R.

25 May

01:19

On Wednesday, 2021-04-14 at 12:34 +0200, Carlos E. R. wrote:

...

On 14/04/2021 07.10, Andrei Borzenkov wrote:

...
On 14.04.2021 01:38, Carlos E. R. wrote:

...
As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition.

man 1 sync

{flush}

Well... I modified the cron job to sync and time things. logger -p cron.warning -t texpire "CER: running leafnode's texpire without ionice" logger -p cron.warning -t texpire "CER: Full sync prior to running texpire" sync logger -p cron.warning -t texpire "CER: full sync done, now running texpire" test -x /usr/sbin/texpire && su - news -c "/usr/sbin/texpire" logger -p cron.warning -t texpire "CER: finished" sync /var/spool/news/ logger -p cron.warning -t texpire "CER: and synced" sync /dev/sdc9 logger -p cron.warning -t texpire "CER: and synced /dev/sdc9" sync /dev/sdc logger -p cron.warning -t texpire "CER: and synced /dev/sdc" sync logger -p cron.warning -t texpire "CER: and full synced" Now, see what the log contains: <9.4> 2021-05-23T23:45:01.892831+02:00 Telcontar texpire - - - CER: running leafnode's texpire without ionice <9.4> 2021-05-23T23:45:01.927445+02:00 Telcontar texpire - - - CER: Full sync prior to running texpire <9.4> 2021-05-23T23:47:44.861249+02:00 Telcontar texpire - - - CER: full sync done, now running texpire <7.3> 2021-05-23T23:48:39.654533+02:00 Telcontar fetchnews 25746 - - Cannot obtain lock file, aborting. <7.3> 2021-05-23T23:52:22.666498+02:00 Telcontar fetchnews 25885 - - Cannot obtain lock file, aborting. <7.3> 2021-05-23T23:56:50.578464+02:00 Telcontar fetchnews 26013 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:00:43.538555+02:00 Telcontar fetchnews 26229 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:04:50.044474+02:00 Telcontar fetchnews 26691 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:08:37.679295+02:00 Telcontar fetchnews 26814 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:12:58.919793+02:00 Telcontar fetchnews 26939 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:16:27.510675+02:00 Telcontar fetchnews 27123 - - Cannot obtain lock file, aborting. <7.3> 2021-05-24T00:20:46.999665+02:00 Telcontar fetchnews 27255 - - Cannot obtain lock file, aborting. 
<7.3> 2021-05-24T00:24:42.812881+02:00 Telcontar fetchnews 27386 - - Cannot obtain lock file, aborting.
<9.4> 2021-05-24T00:27:23.661874+02:00 Telcontar texpire - - - CER: finished
<9.4> 2021-05-24T00:27:23.693867+02:00 Telcontar texpire - - - CER: and synced
<9.4> 2021-05-24T00:27:57.036451+02:00 Telcontar texpire - - - CER: and synced /dev/sdc9
<9.4> 2021-05-24T00:27:57.204751+02:00 Telcontar texpire - - - CER: and synced /dev/sdc
<9.4> 2021-05-24T00:29:03.812514+02:00 Telcontar texpire - - - CER: and full synced

The machine was otherwise idle; I was not present.

The full initial sync takes almost 3 seconds. This would sync everything done during the whole day (per the lazytime mount option).
The job itself takes about 45 minutes.
The sync of the mount takes about 32 milliseconds, so I assume it does nothing.
The sync of the partition takes 34 seconds. Why the difference? They should be the same thing, at least that is what I understand from the manual. Maybe not.
But the full sync takes one minute more. Either texpire exercised some other partitions, which I very much doubt, or the sync of /dev/sdc9 doesn't sync the entire partition.

I will repeat the experiment tomorrow, with a change, to see the effect of "-f" on sync:

test -x /usr/sbin/texpire && su - news -c "/usr/sbin/texpire"
logger -p cron.warning -t texpire "CER: finished"
sync /var/spool/news/
logger -p cron.warning -t texpire "CER: and synced"
sync -f /var/spool/news/
logger -p cron.warning -t texpire "CER: and synced -f"
sync /dev/sdc9
[...]

<9.4> 2021-05-24T23:45:01.071842+02:00 Telcontar texpire - - - CER: running leafnode's texpire without ionice
<9.4> 2021-05-24T23:45:01.099237+02:00 Telcontar texpire - - - CER: Full sync prior to running texpire
<9.4> 2021-05-24T23:45:07.163680+02:00 Telcontar texpire - - - CER: full sync done, now running texpire
<7.3> 2021-05-24T23:48:09.182450+02:00 Telcontar fetchnews 10367 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-24T23:52:09.386434+02:00 Telcontar fetchnews 10552 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-24T23:56:09.767140+02:00 Telcontar fetchnews 10751 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:00:09.952507+02:00 Telcontar fetchnews 10995 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:04:22.073284+02:00 Telcontar fetchnews 11470 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:08:27.606159+02:00 Telcontar fetchnews 11628 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:12:40.215851+02:00 Telcontar fetchnews 11770 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:16:36.626945+02:00 Telcontar fetchnews 11979 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:20:14.068475+02:00 Telcontar fetchnews 12153 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:24:52.732133+02:00 Telcontar fetchnews 12305 - - Cannot obtain lock file, aborting.
<7.3> 2021-05-25T00:28:49.968242+02:00 Telcontar fetchnews 12495 - - Cannot obtain lock file, aborting.
<9.4> 2021-05-25T00:31:57.558573+02:00 Telcontar texpire - - - CER: finished
<9.4> 2021-05-25T00:31:57.591806+02:00 Telcontar texpire - - - CER: and synced
<9.4> 2021-05-25T00:34:25.794323+02:00 Telcontar texpire - - - CER: and synced -f
<9.4> 2021-05-25T00:34:27.029827+02:00 Telcontar texpire - - - CER: and synced /dev/sdc9
<9.4> 2021-05-25T00:34:28.414938+02:00 Telcontar texpire - - - CER: and synced /dev/sdc
<9.4> 2021-05-25T00:34:38.613719+02:00 Telcontar texpire - - - CER: and full synced

Pre-job sync: 6 seconds.
Job: about 45 minutes.
The sync of the mount takes about 33 milliseconds.
The sync -f of the mount takes about 2.5 minutes (148 seconds)! This is the grunt sync job.
The sync of the partition takes just over 1 second. Was something missing from the previous command?
The sync of everything takes 10 seconds. Again, something was not synced. Or the computer was busy with other things while texpire was running.

-- Cheers, Carlos E. R. (from openSUSE 15.2 x86_64 at Telcontar)
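
The durations discussed in the message above can be recomputed from the syslog timestamps instead of being read off by eye. A minimal sketch, assuming bash and GNU date (its -d parser and the %N nanosecond format are GNU extensions); elapsed_ms is a hypothetical helper name, not part of any package mentioned in this thread:

```shell
#!/bin/bash
# elapsed_ms START END
# Print the time between two RFC 3339 timestamps (as written by syslog)
# in milliseconds, rounded to the nearest millisecond.
# Assumes GNU date and that END is not earlier than START.
elapsed_ms() {
    local start_ns end_ns
    # %s%N yields nanoseconds since the epoch as one integer.
    start_ns=$(date -d "$1" +%s%N) || return 1
    end_ns=$(date -d "$2" +%s%N) || return 1
    echo $(( (end_ns - start_ns + 500000) / 1000000 ))
}

# Example: "CER: finished" to "CER: and synced" in the first run of the log.
elapsed_ms 2021-05-24T00:27:23.661874+02:00 2021-05-24T00:27:23.693867+02:00   # prints 32
```

Because the timestamps carry an explicit UTC offset, the result does not depend on the local time zone, and intervals that cross midnight need no special handling.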

David Haller

14 Apr

05:46

Hello, On Wed, 14 Apr 2021, Carlos E. R. wrote: [*yawn* uhm *yawn* me tired]

...

As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition.

More precisely: you can sync just one filesystem ;)

$ sync --help
  -f, --file-system      sync the file systems that contain the files

cue the manpages of the aforementioned syscalls ;)

So: sync -f /var/spool/news/message.id

Works even on a /var/spool/news/ as bind-mounted reiserfs-image from a chroot with a sync(1) that knows '-f' while the sync(1) of the host does not yet know the option, but the running kernel has the matching syscall 'syncfs(2)' :) I.e.:

root@${on_host}# mount -o loop /foo/news_reiserfs.img /var/spool/news
root@${on_host}# mount --bind /var/spool/news/ ${mnt_chroot}/var/spool/news
[.. chroot ${mnt_chroot} ..]
root@${in_chroot}# strace sync -f /var/spool/news/message.id
[..]
open("/var/spool/news/message.id", O_RDONLY|O_NONBLOCK) = 3
[..]
syncfs(3) = 0
close(3) = 0
[..]

Mind: the 'syncfs(3)' here means: call syncfs(2) with the integer '3' as its argument, which happened to be returned by the preceding 'open(2)' syscall and is the "fd" or "file descriptor" argument of the function[0].

I already gave you pointers to where to find the real work done by 'syncfs(2)'.

-dnh

[0] int FD; FD = open("foo", ...); syncfs(FD); close(FD);

in strace, the return value comes after the '=', so the strace rewritten:

open("/var/spool/news/message.id", O_RDONLY|O_NONBLOCK) = FD
syncfs(FD) = OK
close(FD) = OK

--
I still get this recurrent vision of people descending on Redmond one morning and - armed with nothing more^H^H^H^Hless than a Windows CD each, holding them aloft to reflect the rays of the morning sun, thus burning M$ headquarters to a crisp. No doubt M$ would blame this on "global warming" (Tanuki)

Carlos E. R.

10:49

On 14/04/2021 07.46, David Haller wrote:

...

Hello,

On Wed, 14 Apr 2021, Carlos E. R. wrote: [*yawn* uhm *yawn* me tired]

...
As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition.

More precisely: you can sync just one filesystem ;)

$ sync --help
  -f, --file-system      sync the file systems that contain the files

Yeah, I had the distinct recollection™ of "sync" not accepting parameters, so I did not even think of looking at the man page. Funny memory. Although this man page is new, feb 2018. You can even ask to sync a file.

...

cue the manpages of the aforementioned syscalls ;)

So: sync -f /var/spool/news/message.id

Works even on a /var/spool/news/ as bind-mounted reiserfs-image from a chroot with a sync(1) that knows '-f' while the sync(1) of the host does not yet know the option, but the running kernel has the matching syscall 'syncfs(2)' :) I.e.:

root@${on_host}# mount -o loop /foo/news_reiserfs.img /var/spool/news
root@${on_host}# mount --bind /var/spool/news/ ${mnt_chroot}/var/spool/news
[.. chroot ${mnt_chroot} ..]
root@${in_chroot}# strace sync -f /var/spool/news/message.id
[..]
open("/var/spool/news/message.id", O_RDONLY|O_NONBLOCK) = 3
[..]
syncfs(3) = 0
close(3) = 0
[..]

Mind: the 'syncfs(3)' here means: call syncfs(2) with the integer '3' as its argument, which happened to be returned by the preceding 'open(2)' syscall and is the "fd" or "file descriptor" argument of the function[0].

Yes, I have seen that methodology on trace dumps :-) [...] Heh, you mention traces below.

...

I already gave you pointers to where to find the real work done by 'syncfs(2)'.

Yes, thanks, I have material to read.

...

-dnh

[0] int FD; FD = open("foo", ...); syncfs(FD); close(FD);

in strace, the return value comes after the '=', so the strace rewritten:

open("/var/spool/news/message.id", O_RDONLY|O_NONBLOCK) = FD
syncfs(FD) = OK
close(FD) = OK

-- Cheers / Saludos, Carlos E. R. (from 15.2 x86_64 at Telcontar)

Adam Majer

09:59

On Wed, Apr 14, 2021 at 12:38:59AM +0200, Carlos E. R. wrote:

...

On 12/04/2021 09.50, Carlos E. R. wrote: As "lazytime" current implementation delays write for 24 hours (or up to), I wonder if I can "sync" just a partition. I know I have asked this before, I just can't remember what we found out about it in the past...

https://blog.confirm.ch/lazyupdate-feature-4-0-kernel/

It looks like all the access times that are in cache are getting flushed to disk. So there are two options you can pass to sync. Maybe experiment with just flushing the filedata instead of everything so that the time modifications are not flushed as well?

But no, you can't just flush a partition. What you can do is sync particular files with `fsync`.

- Adam
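
The experiment suggested here (flushing only file data versus flushing everything) could be timed along the following lines. This is a minimal sketch, assuming bash, GNU date, and a coreutils sync(1) recent enough to accept file arguments together with -d (fdatasync) and -f (syncfs). The time_sync helper and the scratch file are hypothetical and only for illustration:

```shell
#!/bin/bash
# time_sync LABEL [SYNC_ARGS...]
# Run sync(1) with the given arguments and print "LABEL: N ms".
# Assumes GNU date (%N) for sub-second timestamps.
time_sync() {
    local label=$1 start_ns end_ns status
    shift
    start_ns=$(date +%s%N)
    sync "$@"
    status=$?
    end_ns=$(date +%s%N)
    echo "$label: $(( (end_ns - start_ns) / 1000000 )) ms"
    return "$status"
}

# Hypothetical scratch file; on the real system this would be a file
# on the filesystem of interest, e.g. somewhere under the news spool.
scratch=$(mktemp)
echo "some data" > "$scratch"

time_sync "file data only (sync -d)" -d "$scratch"     # fdatasync(2)
time_sync "file and metadata (sync)" "$scratch"        # fsync(2)
time_sync "whole filesystem (sync -f)" -f "$scratch"   # syncfs(2)

rm -f "$scratch"
```

On an idle machine all three will likely return almost at once; the differences only become visible when run right after a job that dirties a lot of data and metadata.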

L A Walsh

12 Apr

22:19

New subject: why does dropcaches take so long? (Re: Why can't the process "sync" be killed?)

On 2021/04/11 15:30, Carlos E. R. wrote:

...

Telcontar:~ # killall sync
Telcontar:~ # killall sync
Telcontar:~ # killall -9 sync
Telcontar:~ # killall -9 sync
Telcontar:~ #

---
Often 'sync' can be in io-wait state -- that can't be killed.

Not sure what you were trying to do w/sync, but it's usually pointless as syncs occur in seconds, at most. Lately, I've noticed more problems with tasks waiting on memory even though most memory should be "free" (used as readcache for files). Thing I don't understand is why "dropcaches" takes so long -- often near 10s, but occasionally 30-40s. I have it set up as an unprivileged (in that 'sudo' is 'built-in') shell script on my system. Because it often takes so long, I always run it with time in front:

...

cat dropcaches
#!/bin/bash
function dropcaches () {
  echo -n "3"|sudo dd status=none of=/proc/sys/vm/drop_caches
}
time dropcaches

-----

...

dropcaches 12.10sec 0.01usr 11.62sys (96.10% cpu)

But the memory it releases -- most of it is supposedly read-cached files.
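
Whether the cache really is mostly clean read cache can be checked before dropping it, since drop_caches only frees clean objects and leaves dirty ones in place until they have been written back. A minimal sketch, assuming Linux's /proc/meminfo with its usual "Name: value kB" layout; meminfo_kb is a hypothetical helper:

```shell
#!/bin/bash
# meminfo_kb FIELD
# Print the value (in kB) of one /proc/meminfo field, e.g. Cached or Dirty.
meminfo_kb() {
    awk -v field="$1:" '$1 == field { print $2 }' /proc/meminfo
}

echo "Cached:    $(meminfo_kb Cached) kB"
echo "Dirty:     $(meminfo_kb Dirty) kB"
echo "Writeback: $(meminfo_kb Writeback) kB"
```

Comparing these values before and after a run of dropcaches shows how much was actually released and how much was still waiting to be written.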

Carlos E. R.

13 Apr

00:49

New subject: why does dropcaches take so long? (Re: Why can't the process "sync" be killed?)

On 13/04/2021 00.19, L A Walsh wrote:

...

On 2021/04/11 15:30, Carlos E. R. wrote:

...
Telcontar:~ # killall sync
Telcontar:~ # killall sync
Telcontar:~ # killall -9 sync
Telcontar:~ # killall -9 sync
Telcontar:~ #

---
Often 'sync' can be in io-wait state -- that can't be killed.

Not sure what you were trying to do w/sync, but it's usually pointless as syncs occur in seconds, at most.

If you read the rest of the thread, I describe one situation that takes 1 minute on my machine, and another that takes half an hour. When I was trying to kill it above, it was going to take half an hour.

-- Cheers / Saludos, Carlos E. R. (from 15.2 x86_64 at Telcontar)
