NetworkManager dispatcher script does not finish when going to sleep
Hi *, starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe. Any idea, what is happening here, and how to avoid it? TIA. Bye. Michael.
On 13.07.2024 17:11, mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script
On which distribution?
does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Why do you believe it? How dispatcher script is different from any other running program that will continue to run across suspend/resume? Besides, quoting manual page: Dispatcher scripts are run one at a time ... once a script is queued, it will always be run, even if a later event renders it obsolete. Your script may not even begin execution before resume because some other scripts have been running.
Any idea, what is happening here, and how to avoid it?
TIA.
Bye. Michael.
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:11, mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script On which distribution?
openSUSE Tumbleweed 20240711, sry
does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Why do you believe it? How dispatcher script is different from any other running program that will continue to run across suspend/resume?
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!? I already had that situation, but the actual behavior is different.
Besides, quoting manual page:
Dispatcher scripts are run one at a time ... once a script is queued, it will always be run, even if a later event renders it obsolete.
Your script may not even begin execution before resume because some other scripts have been running.
It is the only dispatcher script. And the same happens, if it is linked into no-wait.d.
Any idea, what is happening here, and how to avoid it?
TIA.
Bye. Michael.
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:11, mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script On which distribution?
openSUSE Tumbleweed 20240711, sry
does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Why do you believe it? How dispatcher script is different from any other running program that will continue to run across suspend/resume?
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!?
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
I already had that situation, but the actual behavior is different.
Besides, quoting manual page:
Dispatcher scripts are run one at a time ... once a script is queued, it will always be run, even if a later event renders it obsolete.
Your script may not even begin execution before resume because some other scripts have been running.
It is the only dispatcher script. And the same happens, if it is linked into no-wait.d.
Any idea, what is happening here, and how to avoid it?
TIA.
Bye. Michael.
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!? ^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
On 13.07.2024 18:16, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!? ^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
OK, but again - we have no idea when script had been started, how long it has been running etc. We have really zero information and only have you conclusions that we cannot verify without facts.
On Samstag, 13. Juli 2024 17:35:33 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:16, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!?
^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
OK, but again - we have no idea when script had been started, how long it has been running etc. We have really zero information and only have you conclusions that we cannot verify without facts.
As I wrote: My script only needs 1-2 seconds according to it's own log and according to the journal entries. What further facts can I provide?
On 13.07.2024 18:42, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:35:33 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:16, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!?
^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
OK, but again - we have no idea when script had been started, how long it has been running etc. We have really zero information and only have you conclusions that we cannot verify without facts.
As I wrote: My script only needs 1-2 seconds according to it's own log and according to the journal entries. What further facts can I provide?
You can explain the problem so that others also understand it. If you suspend your system 1 second after script has been started it will run 1 more second after resume. I still do not understand the problem.
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:35:33 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:16, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
If it would have been killed before, because of running into the dispatcher timeout, I'd suspect it would be running afterwards!?!?
^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
OK, but again - we have no idea when script had been started, how long it has been running etc. We have really zero information and only have you conclusions that we cannot verify without facts.
As I wrote: My script only needs 1-2 seconds according to it's own log and according to the journal entries. What further facts can I provide?
You can explain the problem so that others also understand it. If you suspend your system 1 second after script has been started it will run 1 more second after resume. I still do not understand the problem.
Sry, I am not a native speaker, so sometimes it is not easy to find the correct wording. But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards. And whats more: It worked for at least a few years without any problems. As I also wrote, the problems started with the before last NM version. My script didn't change in the last months. My conclusion: Something must have changed in NM.
On 13.07.2024 19:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:35:33 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:16, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:55:13 MESZ Andrei Borzenkov wrote:
On 13.07.2024 17:39, mh@mike.franken.de wrote: > On Samstag, 13. Juli 2024 16:25:16 MESZ Andrei Borzenkov wrote: [...]
> If it would have been killed before, because of running into the > dispatcher timeout, I'd suspect it would be running > afterwards!?!? > ^ not
I am sorry, I do not understand what you are saying. I suspect "not" is missing somewhere, just not sure where.
Missing "not" indeed.
OK, but again - we have no idea when script had been started, how long it has been running etc. We have really zero information and only have you conclusions that we cannot verify without facts.
As I wrote: My script only needs 1-2 seconds according to it's own log and according to the journal entries. What further facts can I provide?
You can explain the problem so that others also understand it. If you suspend your system 1 second after script has been started it will run 1 more second after resume. I still do not understand the problem.
Sry, I am not a native speaker, so sometimes it is not easy to find the correct wording. But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards.
User processes get frozen rather early.
And whats more: It worked for at least a few years without any problems. As I also wrote, the problems started with the before last NM version. My script didn't change in the last months. My conclusion: Something must have changed in NM.
It is possible. I am not sure whether NM blocks suspend until all scripts are completed. Are you sure script simply is not started again on resume?
On Samstag, 13. Juli 2024 19:06:31 MESZ Andrei Borzenkov wrote:
On 13.07.2024 19:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote: [...] But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards.
User processes get frozen rather early.
Is NetworkManager-dispatcher really a user process?
And whats more: It worked for at least a few years without any problems. As I also wrote, the problems started with the before last NM version. My script didn't change in the last months. My conclusion: Something must have changed in NM.
It is possible. I am not sure whether NM blocks suspend until all scripts are completed. Are you sure script simply is not started again on resume?
Yes, in my log I call date at start an end: 01-mh_dispatch-enp62s0-pre-down.log: + /bin/date Fri Jul 12 00:33:09 CEST 2024 ... ... + /bin/date Fri Jul 12 10:27:42 CEST 2024
On 2024-07-13 20:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 19:06:31 MESZ Andrei Borzenkov wrote:
On 13.07.2024 19:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote: [...] But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards.
User processes get frozen rather early.
Is NetworkManager-dispatcher really a user process?
It is not a kernel process. -- Cheers / Saludos, Carlos E. R. (from 15.5 x86_64 at Telcontar)
On Samstag, 13. Juli 2024 22:29:32 MESZ Carlos E. R. wrote:
On 2024-07-13 20:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 19:06:31 MESZ Andrei Borzenkov wrote:
On 13.07.2024 19:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote:
[...]
But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards.
User processes get frozen rather early.
Is NetworkManager-dispatcher really a user process?
It is not a kernel process.
Ok. So is it possible, that the problem is neither my dispatcher script nor NetworkManager itself, but the fact, that NetworkManager also gets frozen too early? The follwing journal excerpt looks a bit strange to me, because it seems, that not only my dispatcher script, but also NM itself continues with disconnecting eth0 on resume: Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.2284] manager: sleep: sleep requested (sleeping: no enabled: yes) Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.4159] manager: NetworkManager state is now ASLEEP Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.4163] device (enp62s0): state change: activated -> deactivating (reason 'sleeping', sys-iface-state: 'managed') Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.4163] dhcp4 (enp62s0): canceled DHCP transaction Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.4163] dhcp4 (enp62s0): activation: beginning transaction (timeout in 45 seconds) Jul 13 22:24:41 client NetworkManager[2539]: <info> [1720902281.4164] dhcp4 (enp62s0): state changed no lease Jul 13 22:27:21 client NetworkManager[2539]: <info> [1720902441.2287] manager: sleep: wake requested (sleeping: yes enabled: yes) Jul 13 22:27:21 client NetworkManager[2539]: <info> [1720902441.2320] manager: NetworkManager state is now DISCONNECTING Jul 13 22:27:23 client NetworkManager[2539]: <info> [1720902443.0419] device (enp62s0): carrier: link connected Jul 13 22:28:41 client NetworkManager[2539]: <info> [1720902521.2619] device (enp62s0): state change: deactivating -> disconnected (reason 'sleeping', sys-iface-state: 'managed') Jul 13 22:28:41 client NetworkManager[2539]: <info> [1720902521.3118] manager: NetworkManager state is now DISCONNECTED Jul 13 22:28:41 client NetworkManager[2539]: <info> [1720902521.3127] policy: auto-activating connection 'eth0_(LAN)' (25b6d1be-8a6b-4762-88da-cd27eb0a0c89) Jul 13 22:28:41 client NetworkManager[2539]: <info> [1720902521.3131] device (enp62s0): state change: disconnected -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed') Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0586] device (enp62s0): carrier: link connected Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0591] device (enp62s0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed') Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0623] policy: auto-activating connection 'eth0_(LAN)' (25b6d1be-8a6b-4762-88da-cd27eb0a0c89) Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0630] device (enp62s0): Activation: starting connection 'eth0_(LAN)' (25b6d1be-8a6b-4762-88da-cd27eb0a0c89) Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0631] device (enp62s0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.0634] manager: NetworkManager state is now CONNECTING Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.1110] device (enp62s0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.1434] device (enp62s0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:45 client NetworkManager[2539]: <info> [1720902525.1452] dhcp4 (enp62s0): activation: beginning transaction (timeout in 45 seconds) Jul 13 22:28:48 client NetworkManager[2539]: <info> [1720902528.8286] device (enp62s0): carrier: link connected Jul 13 22:28:49 client NetworkManager[2539]: <info> [1720902529.8293] dhcp4 (enp62s0): state changed new lease, address=192.168.1.42, acd pending Jul 13 22:28:49 client NetworkManager[2539]: <info> [1720902529.9641] dhcp4 (enp62s0): state changed new lease, address=192.168.1.42 Jul 13 22:28:49 client NetworkManager[2539]: <info> [1720902529.9649] policy: set 'eth0_(LAN)' (enp62s0) as default for IPv4 routing and DNS Jul 13 22:28:50 client NetworkManager[2539]: <info> [1720902530.2659] device (enp62s0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:51 client NetworkManager[2539]: <info> [1720902531.3682] device (enp62s0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:51 client NetworkManager[2539]: <info> [1720902531.3684] device (enp62s0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed') Jul 13 22:28:51 client NetworkManager[2539]: <info> [1720902531.3688] manager: NetworkManager state is now CONNECTED_SITE Jul 13 22:28:51 client NetworkManager[2539]: <info> [1720902531.3692] device (enp62s0): Activation: successful, device activated. Jul 13 22:28:51 client NetworkManager[2539]: <info> [1720902531.7302] manager: NetworkManager state is now CONNECTED_GLOBAL
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:29:32 MESZ Carlos E. R. wrote:
On 2024-07-13 20:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 19:06:31 MESZ Andrei Borzenkov wrote:
On 13.07.2024 19:25, mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 17:44:54 MESZ Andrei Borzenkov wrote:
On 13.07.2024 18:42, mh@mike.franken.de wrote: [...]
But: No, the system takes another about 50 seconds to finally enter sleep state. Even systemd scripts in /usr/lib/systemd/system-sleep/ are started and finished afterwards.
User processes get frozen rather early.
Is NetworkManager-dispatcher really a user process?
It is not a kernel process.
Ok. So is it possible, that the problem is neither my dispatcher script nor NetworkManager itself, but the fact, that NetworkManager also gets frozen too early? The follwing journal excerpt looks a bit strange to me, because it seems, that not only my dispatcher script, but also NM itself continues with disconnecting eth0 on resume:
It seems, that NM indeed gets frozen, before it could deactivate the network card, and continues after resume. Without my dispatcher script, NM can finish the deactivation before sleep state has been reached. Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2150] manager: sleep: sleep requested (sleeping: no enabled: yes) Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2166] manager: NetworkManager state is now ASLEEP Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2168] device (enp62s0): state change: activated -> deactivating (reason 'sleeping', sys-iface-state: 'managed') Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2169] dhcp4 (enp62s0): canceled DHCP transaction Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2169] dhcp4 (enp62s0): activation: beginning transaction (timeout in 45 seconds) Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.2169] dhcp4 (enp62s0): state changed no lease Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.4575] device (enp62s0): state change: deactivating -> disconnected (reason 'sleeping', sys-iface-state: 'managed') Jul 15 12:35:16 client NetworkManager[2542]: <info> [1721039716.5085] device (enp62s0): state change: disconnected -> unmanaged (reason 'unmanaged-sleeping', sys-iface-state: 'managed') Jul 15 12:37:10 client NetworkManager[2542]: <info> [1721039830.2101] manager: sleep: wake requested (sleeping: yes enabled: yes) Jul 15 12:37:10 client NetworkManager[2542]: <info> [1721039830.2106] device (enp62s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external') Jul 15 12:37:10 client NetworkManager[2542]: <info> [1721039830.2426] manager: NetworkManager state is now DISCONNECTEDJul 15 12:37:13 client NetworkManager[2542]: <info> [1721039833.3308] device (enp62s0): carrier: link connected Jul 15 12:37:13 client NetworkManager[2542]: <info> [1721039833.3313] device (enp62s0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed') The question is why? My dispatcher script takes only 1-2 seconds - can this really be too long?
On Montag, 15. Juli 2024 12:45:54 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de wrote: [...]
The question is why? My dispatcher script takes only 1-2 seconds - can this really be too long?
It has something to do with nfs, as it seems. Commenting out the part, where nfs shares should be umounted, NM finishes as expected before entering sleep state - even if I add an additional "wait 3" in the dispatcher script.
On Tuesday, 16 July 2024 02:10:43 ACST mh@mike.franken.de wrote:
On Montag, 15. Juli 2024 12:45:54 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de wrote: [...]
The question is why? My dispatcher script takes only 1-2 seconds - can this really be too long?
It has something to do with nfs, as it seems. Commenting out the part, where nfs shares should be umounted, NM finishes as expected before entering sleep state - even if I add an additional "wait 3" in the dispatcher script.
Have you tried doing on-demand nfs mounts with systemd-mountd instead? I have that working here and it is seamless. It mounts the nfs shares when file/ folder access is requested by a user, and then unmounts them after a configurable period of time when no files are open (or when the system is shutting down). The same can be done with remote samba shares (I have both types of shares mounted that way). If users' home directories are mounted via nfs, it should still work because it will (or at least should) auto-mount at the appropriate time. It needs the appropriate .mount and .automount files created in /etc/systemd/ system, but the syntax of those is straightforward and it seems to "Just Work". It might be worth trying. -- ========================================================================================================== Rodney Baker rodney.baker@outlook.com.au ==========================================================================================================
On Dienstag, 16. Juli 2024 02:14:24 MESZ Rodney Baker wrote:
On Tuesday, 16 July 2024 02:10:43 ACST mh@mike.franken.de wrote:
On Montag, 15. Juli 2024 12:45:54 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de wrote: [...]
The question is why? My dispatcher script takes only 1-2 seconds - can this really be too long?
It has something to do with nfs, as it seems. Commenting out the part, where nfs shares should be umounted, NM finishes as expected before entering sleep state - even if I add an additional "wait 3" in the dispatcher script.
Have you tried doing on-demand nfs mounts with systemd-mountd instead? I have that working here and it is seamless. It mounts the nfs shares when file/ folder access is requested by a user, and then unmounts them after a configurable period of time when no files are open (or when the system is shutting down). The same can be done with remote samba shares (I have both types of shares mounted that way).
Yep, that is, what I am testing at the moment - but not with self created automount files. Instead I modified the entries in fstab and let systemd create the appropriate mount files. Found the solution in ArchWiki. Nevertheless I wonder, what caused my setup to fail, though it had worked for a long time, i.e. what changed to cause the problem.
If users' home directories are mounted via nfs, it should still work because it will (or at least should) auto-mount at the appropriate time.
My home dir isn't on nfs, because this is a notebook, which is not always connected to the network, where my nfs server lives.
It needs the appropriate .mount and .automount files created in /etc/systemd/ system, but the syntax of those is straightforward and it seems to "Just Work".
It might be worth trying.
See above. Thx and bye. Michael.
On Mittwoch, 17. Juli 2024 15:11:28 MESZ mh@mike.franken.de wrote:
On Dienstag, 16. Juli 2024 02:14:24 MESZ Rodney Baker wrote:
On Tuesday, 16 July 2024 02:10:43 ACST mh@mike.franken.de wrote:
On Montag, 15. Juli 2024 12:45:54 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de [...]
Have you tried doing on-demand nfs mounts with systemd-mountd instead? I have that working here and it is seamless. It mounts the nfs shares when file/ folder access is requested by a user, and then unmounts them after a configurable period of time when no files are open (or when the system is shutting down). The same can be done with remote samba shares (I have both types of shares mounted that way). Yep, that is, what I am testing at the moment - but not with self created automount files. Instead I modified the entries in fstab and let systemd create the appropriate mount files. Found the solution in ArchWiki.
Nevertheless I wonder, what caused my setup to fail, though it had worked for a long time, i.e. what changed to cause the problem.
If users' home directories are mounted via nfs, it should still work because it will (or at least should) auto-mount at the appropriate time.
My home dir isn't on nfs, because this is a notebook, which is not always connected to the network, where my nfs server lives.
It needs the appropriate .mount and .automount files created in /etc/systemd/ system, but the syntax of those is straightforward and it seems to "Just Work".
It might be worth trying.
See above.
doesn't work for me - the system doesn't go to sleep (suspend.target: Job suspend.target/start failed with result 'dependency'), if any file is open on one of the mounted filesystems, because the share can't be unmounted in this case. And one of the reasons I send the system to sleep is, that I can seamlessly continue working after resume.
Thx and bye. Michael.
On 2024-07-18 11:23, mh@mike.franken.de wrote:
On Mittwoch, 17. Juli 2024 15:11:28 MESZ mh@mike.franken.de wrote:
On Dienstag, 16. Juli 2024 02:14:24 MESZ Rodney Baker wrote:
On Tuesday, 16 July 2024 02:10:43 ACST mh@mike.franken.de wrote:
On Montag, 15. Juli 2024 12:45:54 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 22:44:23 MESZ mh@mike.franken.de [...]
Have you tried doing on-demand nfs mounts with systemd-mountd instead? I have that working here and it is seamless. It mounts the nfs shares when file/ folder access is requested by a user, and then unmounts them after a configurable period of time when no files are open (or when the system is shutting down). The same can be done with remote samba shares (I have both types of shares mounted that way). Yep, that is, what I am testing at the moment - but not with self created automount files. Instead I modified the entries in fstab and let systemd create the appropriate mount files. Found the solution in ArchWiki.
Nevertheless I wonder, what caused my setup to fail, though it had worked for a long time, i.e. what changed to cause the problem.
If users' home directories are mounted via nfs, it should still work because it will (or at least should) auto-mount at the appropriate time.
My home dir isn't on nfs, because this is a notebook, which is not always connected to the network, where my nfs server lives.
It needs the appropriate .mount and .automount files created in /etc/systemd/ system, but the syntax of those is straightforward and it seems to "Just Work".
It might be worth trying.
See above.
doesn't work for me - the system doesn't go to sleep (suspend.target: Job suspend.target/start failed with result 'dependency'), if any file is open on one of the mounted filesystems, because the share can't be unmounted in this case. And one of the reasons I send the system to sleep is, that I can seamlessly continue working after resume.
Try "umount --lazy" perhaps. It returns fast. -l, --lazy Lazy unmount. Detach the filesystem from the file hierarchy now, and clean up all references to this filesystem as soon as it is not busy anymore. -- Cheers / Saludos, Carlos E. R. (from Elessar, using openSUSE Leap 15.5)
On Donnerstag, 18. Juli 2024 22:39:35 MESZ Carlos E. R. wrote:
On 2024-07-18 11:23, mh@mike.franken.de wrote:
On Mittwoch, 17. Juli 2024 15:11:28 MESZ mh@mike.franken.de wrote:
[...]
doesn't work for me - the system doesn't go to sleep (suspend.target:
Job suspend.target/start failed with result
'dependency'), if any file is open on one of the mounted filesystems, because the share can't be unmounted in this case. And one of the reasons I send the system to sleep is, that I can seamlessly continue working after resume.
Try "umount --lazy" perhaps. It returns fast.
-l, --lazy Lazy unmount. Detach the filesystem from the file hierarchy now, and clean up all references to this filesystem as soon as it is not busy anymore.
In my dispatcher script I use -l -f, but as for systemd automounts there seems to be no way to provide these options.
-- Cheers / Saludos,
Carlos E. R.
Bye. Michael.
On 2024-07-19 11:40, mh@mike.franken.de wrote:
On Donnerstag, 18. Juli 2024 22:39:35 MESZ Carlos E. R. wrote:
On 2024-07-18 11:23, mh@mike.franken.de wrote:
Try "umount --lazy" perhaps. It returns fast.
-l, --lazy Lazy unmount. Detach the filesystem from the file hierarchy
now, and clean up all references to
this filesystem as soon as it is not busy anymore.
In my dispatcher script I use -l -f, but as for systemd automounts there seems to be no way to provide these options.
I have a small old laptop that now I use to play videos that reside on the server, connected directly to HDMI port via 2 metre cable. Well, the laptop automounts the server directory via systemd automount in fstab. I went to another city, and took the laptop to play videos, just that the videos were locally stored on disk. One day I noticed messages about nfs not finding the server (different LAN). It kept bitching all week, never gave up. But the machine hibernated fine every day. -- Cheers / Saludos, Carlos E. R. (from 15.5 x86_64 at Telcontar)
On 2024-07-19 15:05, Carlos E. R. wrote:
On 2024-07-19 11:40, mh@mike.franken.de wrote:
On Donnerstag, 18. Juli 2024 22:39:35 MESZ Carlos E. R. wrote:
On 2024-07-18 11:23, mh@mike.franken.de wrote:
Try "umount --lazy" perhaps. It returns fast.
-l, --lazy Lazy unmount. Detach the filesystem from the file hierarchy
now, and clean up all references to
this filesystem as soon as it is not busy anymore.
In my dispatcher script I use -l -f, but as for systemd automounts there seems to be no way to provide these options.
I have a small old laptop that now I use to play videos that reside on the server, connected directly to HDMI port via 2 metre cable. Well, the laptop automounts the server directory via systemd automount in fstab.
I went to another city, and took the laptop to play videos, just that the videos were locally stored on disk. One day I noticed messages about nfs not finding the server (different LAN). It kept bitching all week, never gave up. But the machine hibernated fine every day.
Oh, the server is on Leap 15.5, and the laptop might be on 15.4 -- Cheers / Saludos, Carlos E. R. (from 15.5 x86_64 at Telcontar)
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
It seems, that NM takes down the dhcp lease, even before the dispatcher service starts. So there is no chance to umount nfs shares due to a lack of connectivity. Jun 30 01:21:12 transformer systemd-logind[2036]: Lid closed. Jun 30 01:21:12 transformer systemd-logind[2036]: The system will suspend now! Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.8559] manager: sleep: sleep requested (sleeping: n o enabled: yes) Jun 30 01:21:12 transformer ModemManager[2236]: <info> [sleep-monitor-systemd] system is about to suspend Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.8561] device (wlp0s20f3): state change: unavailable -> unmanaged (reason 'unmanaged-sleeping', sys-iface-state: 'managed') Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.8566] device (wlp0s20f3): set-hw-addr: reset MAC address to 20:1E:88:D2:9F:5B (unmanage) Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.9772] manager: NetworkManager state is now ASLEEP Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.9774] device (enp62s0): state change: activated -> deactivating (reason 'sleeping', sys-iface-state: 'managed') Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.9775] dhcp4 (enp62s0): canceled DHCP transaction Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.9775] dhcp4 (enp62s0): activation: beginning transaction (timeout in 45 seconds) Jun 30 01:21:12 transformer NetworkManager[2543]: <info> [1719703272.9775] dhcp4 (enp62s0): state changed no lease Jun 30 01:21:12 transformer dbus-daemon[1994]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.13' (uid=0 pid=2543 comm="/usr/sbin/NetworkManager --no-daemon") Jun 30 01:21:12 transformer netconfig[12337]: Executing 'modify --service NetworkManager' for pid 2543 Jun 30 01:21:12 transformer NetworkManager[12337]: <13>Jun 30 01:21:12 netconfig: Executing 'modify --service NetworkManager' for pid 2543 Jun 30 01:21:12 transformer NetworkManager[12335]: debug: lockfile created (/run/netconfig.pid) for PID 12335 Jun 30 01:21:12 transformer NetworkManager[12335]: debug: lockfile created Jun 30 01:21:13 transformer NetworkManager[12335]: debug: write new STATE file /run/netconfig//NetworkManager.netconfig Jun 30 01:21:13 transformer NetworkManager[12335]: debug: Module order: dns-resolver dns-bind dns-dnsmasq nis ntp-runtime Jun 30 01:21:13 transformer systemd[1]: Starting Network Manager Script Dispatcher Service... Or do I misunderstand the journal entries?
TIA.
Bye. Michael.
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote:
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
It seems, that NM takes down the dhcp lease, even before the dispatcher service starts. So there is no chance to umount nfs shares due to a lack of connectivity.
Assuming you are talking about the standard nfs script which is part of openSUSE NetworkManager package - the very idea of this script is to unmount when the NFS server is no longer accessible. And yes, in the past my experience with NFS was - if the server does not answer, attempt to unmount hangs. One hack was to configure the server IP on a local interface, then "umount -l -f" would succeed. Quoting man page. --><-- -f, --force Force an unmount (in case of an unreachable NFS system). Note that this option does not guarantee that umount command does not hang. It’s strongly recommended to use absolute paths without symlinks to avoid unwanted readlink(2) and stat(2) system calls on unreachable NFS in umount. --><--
On Freitag, 19. Juli 2024 15:21:23 MESZ Andrei Borzenkov wrote:
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote:
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
It seems, that NM takes down the dhcp lease, even before the dispatcher service starts. So there is no chance to umount nfs shares due to a lack of connectivity. Assuming you are talking about the standard nfs script which is part of openSUSE NetworkManager package - the very idea of this script is to unmount when the NFS server is no longer accessible.
No, I wrote my own script, because the default script wasn't sufficient. And as already said - it worked for years until NM 1.46.
And yes, in the past my experience with NFS was - if the server does not answer, attempt to unmount hangs. One hack was to configure the server IP on a local interface, then "umount -l -f" would succeed.
Thats what my script already does. Thx anyway.
On 19.07.2024 16:28, mh@mike.franken.de wrote:
On Freitag, 19. Juli 2024 15:21:23 MESZ Andrei Borzenkov wrote:
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote:
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
It seems, that NM takes down the dhcp lease, even before the dispatcher service starts. So there is no chance to umount nfs shares due to a lack of connectivity. Assuming you are talking about the standard nfs script which is part of openSUSE NetworkManager package - the very idea of this script is to unmount when the NFS server is no longer accessible.
No, I wrote my own script, because the default script wasn't sufficient.
Without seeing this script everything will be just a wild guess.
And as already said - it worked for years until NM 1.46.
Did you verify that downgrading NetworkManager (and *only* NetworkManager) restores the previous behavior?
And yes, in the past my experience with NFS was - if the server does not answer, attempt to unmount hangs. One hack was to configure the server IP on a local interface, then "umount -l -f" would succeed.
Thats what my script already does.
Thx anyway.
On Montag, 22. Juli 2024 06:08:39 MESZ Andrei Borzenkov wrote:
On 19.07.2024 16:28, mh@mike.franken.de wrote:
On Freitag, 19. Juli 2024 15:21:23 MESZ Andrei Borzenkov wrote:
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote: [...]
No, I wrote my own script, because the default script wasn't sufficient. Without seeing this script everything will be just a wild guess.
See attachment.
And as already said - it worked for years until NM 1.46.
Did you verify that downgrading NetworkManager (and *only* NetworkManager) restores the previous behavior?
No, I am not sure, if this is even possible with Tumbleweed without replacing lots of other packages? Where would I get a working NM < 1.46 for Tumbleweed from?
On 29.07.2024 19:48, mh@mike.franken.de wrote:
On Montag, 22. Juli 2024 06:08:39 MESZ Andrei Borzenkov wrote:
On 19.07.2024 16:28, mh@mike.franken.de wrote:
On Freitag, 19. Juli 2024 15:21:23 MESZ Andrei Borzenkov wrote:
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote: [...]
No, I wrote my own script, because the default script wasn't sufficient. Without seeing this script everything will be just a wild guess.
See attachment.
And you also followed the documentation?
pre-down The interface will be deactivated but has not yet been disconnected from the network. Scripts acting on this event must be placed or symlinked into the /etc/NetworkManager/dispatcher.d/pre-down.d directory, and NetworkManager will wait for script execution to complete before disconnecting the interface from its network.
And as already said - it worked for years until NM 1.46.
Did you verify that downgrading NetworkManager (and *only* NetworkManager) restores the previous behavior?
No, I am not sure, if this is even possible with Tumbleweed without replacing lots of other packages? Where would I get a working NM < 1.46 for Tumbleweed from?
For one, you can rollback to a previous snapshot and then update only NetworkManager. You could also look on https://download.opensuse.org/history/. Otherwise the option is to rebuild earlier revision of NetworkManager.
On Montag, 29. Juli 2024 19:21:14 MESZ Andrei Borzenkov wrote:
On 29.07.2024 19:48, mh@mike.franken.de wrote:
On Montag, 22. Juli 2024 06:08:39 MESZ Andrei Borzenkov wrote:
On 19.07.2024 16:28, mh@mike.franken.de wrote:
On Freitag, 19. Juli 2024 15:21:23 MESZ Andrei Borzenkov wrote:
On Fri, Jul 19, 2024 at 3:57 PM <mh@mike.franken.de> wrote: [...]
No, I wrote my own script, because the default script wasn't sufficient.
Without seeing this script everything will be just a wild guess.
See attachment.
And you also followed the documentation?
pre-down
The interface will be deactivated but has not yet been disconnected from the network. Scripts acting on this event must be placed or symlinked into the /etc/NetworkManager/dispatcher.d/pre-down.d directory, and NetworkManager will wait for script execution to complete before disconnecting the interface from its network.
Yes, otherwise it wouldn't get started at all. As I wrote, the log shows, that it stops between two nfs umounts before sleep and continues after resume.
And as already said - it worked for years until NM 1.46.
Did you verify that downgrading NetworkManager (and *only* NetworkManager) restores the previous behavior?
No, I am not sure, if this is even possible with Tumbleweed without replacing lots of other packages? Where would I get a working NM < 1.46 for Tumbleweed from?
For one, you can rollback to a previous snapshot and then update only NetworkManager. You could also look on https://download.opensuse.org/history/. Otherwise the option is to rebuild earlier revision of NetworkManager.
This system doesn't know about snapshots. As for the history packages: Trying to install NM from there, would lead to a downgrade of about 50 other packages. So I would have to rebuild it myself - I'll try that in the next days.
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
seems to be a bug: https://bugzilla.suse.com/show_bug.cgi?id=1228154#c5
TIA.
Bye. Michael.
On Donnerstag, 22. August 2024 15:50:05 MESZ mh@mike.franken.de wrote:
On Samstag, 13. Juli 2024 16:11:41 MESZ mh@mike.franken.de wrote:
Hi *,
starting with the last two NetworkManager versions, my dispatcher script does not finish when going to sleep. It finishes on resume, though, causing some strange effects. It's runtime is 1 or 2 seconds only, so this is not a problem of hitting the dispatcher timeout. Anyway in this case it wouldn't continue after resume, I believe.
Any idea, what is happening here, and how to avoid it?
seems to be a bug:
solved with NetworkManager-1.48.8-2.1.x86_64.
TIA.
Bye. Michael.
participants (4)
-
Andrei Borzenkov
-
Carlos E. R.
-
mh@mike.franken.de
-
Rodney Baker