[opensuse-support] user disappeared
Hi,
I've got an odd problem, that probably deserves a bug report, but I don't know where to start looking for evidence.
I'm running openSUSE 15.0 "transactional server" on a vps. I had freenet daemon installed, running as a dedicated "freenet" user.
What happened is that the "freenet" user suddenly disappeared. It's not mentioned in /etc/passwd, /etc/shadow nor /etc/group. If I do "ls -l" in /home/freenet, there are numbers instead of owner and group. Service obviously fails to start.
I had a look at logs, but there's nothing alarming.
Do you have any clues, what could go wrong?
Recent logs for freenet service: http://susepaste.org/view/a0905f56
Systemd service file:
# /etc/systemd/system/freenet.service
[Unit]
Description=FreeNet Service
After=local-fs.target network.target time-sync.target
[Service]
User=freenet
Type=forking
WorkingDirectory=/home/freenet/Freenet/
ExecStart=/home/freenet/Freenet/run.sh start
ExecStop=/home/freenet/Freenet/run.sh stop
PIDFile=/home/freenet/Freenet/Freenet.pid
[Install]
WantedBy=multi-user.target
-- 
Adam Mizerski
On 24.01.2019 21:32, Adam Mizerski wrote:
> Hi,
> I've got an odd problem, that probably deserves a bug report, but I don't know where to start looking for evidence.
> I'm running openSUSE 15.0 "transactional server" on a vps. I had freenet daemon installed, running as a dedicated "freenet" user.
> What happened is that the "freenet" user suddenly disappeared. It's not mentioned in /etc/passwd, /etc/shadow nor /etc/group. If I do "ls -l" in /home/freenet, there are numbers instead of owner and group. Service obviously fails to start.
> I had a look at logs, but there's nothing alarming.
> Do you have any clues, what could go wrong?
No. If you are using btrfs with snapshots enabled, you can compare previous snapshots; this may show more precisely when it happened, which would allow more targeted log analysis.
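The snapshot comparison suggested above can be sketched as a small shell function. This is only an illustration: it assumes the default snapper layout, where snapshot N is readable under /.snapshots/N/snapshot, and the function name snapshots_with_user is made up here. On a system where /etc is an overlay, a snapshot may hold only the lower layer of /etc, so a missing entry is a hint rather than proof.

```shell
# List the snapshots whose /etc/passwd still contains a given user.
# Assumption: snapshots live under <root>/<number>/snapshot (default
# snapper layout, root = /.snapshots); pass another root if yours differs.
snapshots_with_user() (
    user=$1
    root=${2:-/.snapshots}
    for passwd in "$root"/*/snapshot/etc/passwd; do
        [ -f "$passwd" ] || continue
        # passwd entries start with "name:", so anchor the match there.
        if grep -q "^${user}:" "$passwd"; then
            snap=${passwd#"$root"/}     # e.g. 20/snapshot/etc/passwd
            echo "${snap%%/*}"          # keep only the snapshot number
        fi
    done | sort -n
)

# Example (run as root): snapshots_with_user freenet
```

The highest number printed gives a lower bound for when the entry vanished, which narrows the time window for reading the logs.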
On 24.01.2019 at 19:46, Andrei Borzenkov wrote:
> On 24.01.2019 21:32, Adam Mizerski wrote:
>> Hi,
>> I've got an odd problem, that probably deserves a bug report, but I don't know where to start looking for evidence.
>> I'm running openSUSE 15.0 "transactional server" on a vps. I had freenet daemon installed, running as a dedicated "freenet" user.
>> What happened is that the "freenet" user suddenly disappeared. It's not mentioned in /etc/passwd, /etc/shadow nor /etc/group. If I do "ls -l" in /home/freenet, there are numbers instead of owner and group. Service obviously fails to start.
>> I had a look at logs, but there's nothing alarming.
>> Do you have any clues, what could go wrong?
> No. If you are using btrfs with snapshots enabled you may compare previous snapshots; this may provide better information when it happened which will allow more targeted log analysis.
In the snapshots there's nothing wrong. The last snapshot has the "freenet" user in /etc/passwd. But the problem I see is that, unlike on a normal openSUSE installation, "transactional server" makes no "post" snapshots. And it looks like the last update broke freenet. In the logs, right after transactional-update made a new snapshot, systemd complains about the freenet user while shutting down. The update was also not very interesting: just wireguard-kmp-default and wireguard-tools from network:vpn:wireguard were updated to 0.0.20190123.
Hello,
On Thursday, 24 January 2019, 21:54:42 CET, Adam Mizerski wrote:
> In snapshots there's nothing wrong. The last snapshot has "freenet" user in /etc/passwd. But the problem I see is that, unlike on normal openSUSE installation, "transactional server" makes no "post" snapshots.
That's probably caused by the difference in how "normal" and "transactional" systems work.
In a "normal" system, your system always boots from the same snapshot, typically snapshot 1. (Maybe this changes if you use "snapper rollback", but I'd have to test that; let's just ignore this detail in the description below.)
When you start zypper, it first does a "pre" snapshot, and after installing packages etc. (in the current snapshot, typically 1), it does a "post" snapshot. You can think of these snapshots as a backup of your system before and after running zypper.
Typically, you'll have something like this:
   1  current
  20  pre
  21  post
When rebooting, you'll be in snapshot 1 again.
With the transactional setup, things work differently.
When you start zypper, it creates a snapshot (for better understanding you should call it the "future" snapshot instead of "pre"). The difference is that your currently running system doesn't get changed. Instead, the package installation is done _inside the "future" snapshot_. At the end, the to-be-booted snapshot gets set as the new default.
This will typically look like this:
  19  old
  20  current
  21  future (copy of 20 + package updates)
When rebooting, your system will boot the new ("future"/21) snapshot, and snapshot 20 becomes an "old" snapshot.
Does this explain the differences?
> And it looks like the last update broke freenet. In logs, exactly after transactional-update made a new snapshot, when shutting down, systemd complains about freenet user.
IIRC, the changes you collected in the overlayfs (like manual [1] changes to /etc/passwd) also get merged into the future snapshot when that snapshot is created (and the overlayfs gets emptied when booting the future snapshot). This _could_ also mean that changes made between creating the snapshot and rebooting get lost, but I don't know transactional systems well enough to be sure about that.
Based on your freenet.service (using /home/... paths), I'd guess that you created the freenet user manually, correct?
Did you create the user before or after you did the update? (Hint: if in doubt, check the logs in the snapshot where the user exists, and compare the timestamps to /var/log/zypp/history.)
Regards,
Christian Boltz
[1] "manual" as in "(simplified) not done by the package management", so running "useradd" also counts as "manual" in this case
-- 
So drop your primitive writing style and write something productive to the list, otherwise you are in the wrong place here! [Marcel Stein in suse-linux]
-- 
To unsubscribe, e-mail: opensuse-support+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-support+owner@opensuse.org
On 26.01.2019 at 17:54, Christian Boltz wrote:
> Hello,
> On Thursday, 24 January 2019, 21:54:42 CET, Adam Mizerski wrote:
>> In snapshots there's nothing wrong. The last snapshot has "freenet" user in /etc/passwd. But the problem I see is that, unlike on normal openSUSE installation, "transactional server" makes no "post" snapshots.
> That's probably caused by the difference in how "normal" and "transactional" works.
> In a "normal" system, your system always boots from the same snapshot - typically snapshot 1. (Maybe this changes if you use "snapper rollback", but I'd have to test that - let's just ignore this detail in the description below.)
> When you start zypper, it first does a "pre" snapshot, and after installing packages etc. (in the current snapshot, typically 1), it does a "post" snapshot. You can think of these snapshots as a backup of your system before and after running zypper.
> Typically, you'll have something like this:
>    1  current
>   20  pre
>   21  post
> When rebooting, you'll be in snapshot 1 again.
> With the transactional setup, things work differently.
> When you start zypper, it creates a snapshot (for better understanding you should call it the "future" snapshot instead of "pre"). The difference is that your currently running system doesn't get changed. Instead, the package installation is done _inside the "future" snapshot_. At the end, the to-be-booted snapshot gets set as new default.
> This will typically look like this:
>   19  old
>   20  current
>   21  future (copy of 20 + package updates)
> When rebooting, your system will boot the new ("future"/21) snapshot, and snapshot 20 becomes an "old" snapshot.
> Does this explain the differences?
I think it does. Thanks.
>> And it looks like the last update broke freenet. In logs, exactly after transactional-update made a new snapshot, when shutting down, systemd complains about freenet user.
> IIRC, the changes you collected in the overlayfs (like manual [1] changes to /etc/passwd) also get merged into the future snapshot when creating that snapshot (and the overlayfs gets emptied when booting the future snapshot). This _could_ also mean that changes between creating the snapshot and rebooting get lost - but I don't know transactional systems well enough to be sure about that.
> Based on your freenet.service (using /home/... paths), I'd guess that you create the freenet user manually - correct?
Yes, I created the user manually.
> Did you create the user before or after you did the update? (Hint: if in doubt, check the logs in the snapshot where the user exists, and compare the timestamps to /var/log/zypp/history.)
I created the user and service at least a month ago. It survived many updates and reboots.
> Regards,
> Christian Boltz
> [1] "manual" as in "(simplified) not done by the package management", so running "useradd" also counts as "manual" in this case
Hello,
On Saturday, 26 January 2019, 18:00:36 CET, Adam Mizerski wrote:
> I created the user and service at least a month ago. It survived many updates and reboots.
Then all I wrote was superfluous, and this is most probably unrelated to how transactional updates work ;-)
Totally different (and much simpler) question: did you check the journal or /var/log/messages? If the user was deleted with "userdel", you should find something about this in the log. If so, check whether other events, updates, or anything else happened at the same time.
Regards,
Christian Boltz
-- 
The tendency seems to go towards not having a forum. Not really a surprise. It is as if you were asking what the best sport is at a soccer club. ;-) [houghi in opensuse]
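The log check described above could look roughly like the following. It is a sketch under assumptions: that the audit daemon writes to /var/log/audit/audit.log, and that the account tools emit ADD_USER, DEL_USER, ADD_GROUP and DEL_GROUP records there. The helper name account_events is invented for this example.

```shell
# Print audit records about account or group creation and removal.
# Assumption: one record per line, each starting with "type=<NAME> ".
account_events() (
    log=${1:-/var/log/audit/audit.log}
    grep -E '^type=(ADD|DEL)_(USER|GROUP) ' "$log"
)

# Example (run as root):
#   account_events | grep -E 'freenet|id=1001'
# where 1001 stands for whatever numeric ID "ls -l" shows on /home/freenet.
```

If creation records are present but no matching removal record shows up, the entry was probably not removed with userdel, and a different cause (such as the file itself being replaced) becomes more likely.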
On 26.01.2019 at 23:08, Christian Boltz wrote:
> Hello,
> On Saturday, 26 January 2019, 18:00:36 CET, Adam Mizerski wrote:
>> I created the user and service at least a month ago. It survived many updates and reboots.
> Then all I wrote was superfluous, and this is most probably unrelated to how transactional updates work ;-)
> Totally different (and much simpler) question - did you check journal or /var/log/messages? If the user was deleted with "userdel", you should find something about this in the log. If so, check if other events, updates, whatever happened at the same time.
In the journal: nothing suspicious (I can share some of my logs privately, if you want). /var/log/messages? Man, I haven't seen that file in ages :) I grepped the whole of /var/log with some keywords, and all I found is /var/log/audit/audit.log, which contains all my useradd actions but nothing related to deleting the freenet user (I checked by name and ID). Now I got the idea to check the mtime of /etc/passwd, but I've recreated the freenet user, so that information is lost.
> Regards,
> Christian Boltz
On 27/01/2019 15.08, Adam Mizerski wrote:
>> Totally different (and much simpler) question - did you check journal or /var/log/messages? If the user was deleted with "userdel", you should find something about this in the log. If so, check if other events, updates, whatever happened at the same time.
> In journal - nothing suspicious (I can share some of my logs privately, if you want). /var/log/messages? Man, I haven't seen that file in ages :)
> I grepped whole /var/log with some keywords and all I found is /var/log/audit/audit.log that contains all my useradd actions, but nothing related to deleting freenet user (I checked by name and id).
The user could have been deleted by editing the passwd file directly with any editor, leaving no trace. Well, some editors would leave a .bak or ~ file.
-- 
Cheers / Saludos,
Carlos E. R. (from 15.0 x86_64 at Telcontar)
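A quick way to look for such leftovers, as a hedged sketch: the suffix list below is an assumption (common editor backup names, plus the "passwd-" style backup that the shadow tools keep next to the account files), it is not exhaustive, and backup_candidates is a hypothetical helper name.

```shell
# Print leftover copies of a file that an editor or the account tools
# may have written next to it. Only files that exist are printed.
backup_candidates() (
    file=${1:-/etc/passwd}
    for cand in "$file~" "$file.bak" "$file.orig" "$file-"; do
        if [ -e "$cand" ]; then
            echo "$cand"
        fi
    done
)

# Example: compare modification times of the file and its leftovers.
#   ls -l --time-style=full-iso /etc/passwd $(backup_candidates /etc/passwd)
```

Comparing the modification times of the file and any leftovers gives a rough idea of when the file was last rewritten, as long as nothing has touched it since.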
On 24.01.2019 at 19:32, Adam Mizerski wrote:
> Hi,
> I've got an odd problem, that probably deserves a bug report, but I don't know where to start looking for evidence.
> I'm running openSUSE 15.0 "transactional server" on a vps. I had freenet daemon installed, running as a dedicated "freenet" user.
> What happened is that the "freenet" user suddenly disappeared. It's not mentioned in /etc/passwd, /etc/shadow nor /etc/group. If I do "ls -l" in /home/freenet, there are numbers instead of owner and group. Service obviously fails to start.
> I had a look at logs, but there's nothing alarming.
> Do you have any clues, what could go wrong?
So, this happened again. Not only that, but some other files in /etc also got reverted to earlier versions.
It seems to me that the mechanism of making /etc overlays does some weird stuff.
I've found that my config files are still available in /var/lib/overlay/*, but those overlays are not mounted.
# mount | grep overlay
overlay on /etc type overlay (rw,relatime,lowerdir=/sysroot/var/lib/overlay/168/etc:/sysroot/var/lib/overlay/167/etc:/sysroot/var/lib/overlay/166/etc:/sysroot/var/lib/overlay/165/etc:/sysroot/var/lib/overlay/164/etc:/sysroot/var/lib/overlay/163/etc:/sysroot/var/lib/overlay/162/etc:/sysroot/var/lib/overlay/161/etc:/sysroot/var/lib/overlay/160/etc:/sysroot/var/lib/overlay/159/etc:/sysroot/var/lib/overlay/158/etc:/sysroot/var/lib/overlay/etc:/sysroot/etc,upperdir=/sysroot/var/lib/overlay/169/etc,workdir=/sysroot/var/lib/overlay/work-etc)
/var/lib/overlay/etc/wireguard/wg0.conf holds a very old version, which is now visible in /etc.
These are the newer versions that I've made (overlay 151 is the latest version I expected to find in /etc):
# ls -la /var/lib/overlay/*/etc/wireguard/wg0.conf
-rw------- 1 root root 600 01-14 11:43 /var/lib/overlay/146/etc/wireguard/wg0.conf
-rw------- 1 root root 725 01-25 12:05 /var/lib/overlay/151/etc/wireguard/wg0.conf
It seems the same happened to /etc/passwd, which caused my freenet user to disappear.
Do you have any suggestions on how to debug this further?
-- 
Adam Mizerski
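To check which overlay directories are actually part of the mounted stack, the option string can be split mechanically. The sketch below is an illustration under stated assumptions, not part of transactional-update: overlay_layers and unmounted_copies are hypothetical helper names, paths are assumed to contain no commas, colons or parentheses, and the /sysroot prefix is stripped because the mount output in this thread shows the layers under /sysroot (presumably as seen from the initrd).

```shell
# Print the directories that make up an overlay mount, one per line:
# the lowerdir entries in stacking order, then the upperdir.
# $1 is the option string printed by "mount" (parentheses are ignored).
overlay_layers() {
    printf '%s\n' "$1" | tr -d '()' | tr ',' '\n' |
        sed -n -e 's/^lowerdir=//p' -e 's/^upperdir=//p' | tr ':' '\n'
}

# Print overlay directories that hold a copy of a file under etc/ but
# are not part of the mounted stack.
# $1: path relative to /etc   $2: mount option string
# $3: overlay root (assumed /var/lib/overlay, as in the thread)
unmounted_copies() (
    relpath=$1
    root=${3:-/var/lib/overlay}
    mounted=$(overlay_layers "$2" | sed 's|^/sysroot||')
    for f in "$root"/*/etc/"$relpath"; do
        [ -e "$f" ] || continue
        layer=${f%/"$relpath"}          # e.g. /var/lib/overlay/151/etc
        printf '%s\n' "$mounted" | grep -qxF "$layer" || echo "$layer"
    done
)

# Example (run as root):
#   unmounted_copies wireguard/wg0.conf "$(findmnt -n -o OPTIONS /etc)"
```

For the mount output quoted in this thread, the stack consists of layers 158 to 168, the two base directories and upper layer 169, so the copies in 146 and 151 would be reported as unmounted. That matches the observation that an old wg0.conf became visible in /etc.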
I'm seeing much the same behavior. I installed MicroOS on bare metal (though I assume the same would happen with the OpenStack image), then installed docker on it (transactional-update -n pkg install docker), then rebooted. At first, everything worked. That was yesterday. I left the machine up and running and, this morning, I saw that the etc directory was empty (while my last commands had shown something in it). Weird... I rebooted the server and now it does not boot anymore, failing with this error:
Overlayfs: failed to resolve '/sysroot/var/lib/overlay/2/etc': -2
Failed to mount /sysroot/etc/...
So it's falling into emergency mode, where I can see that the overlays (/sysroot/etc for lowerdir, /sysroot/var/lib/overlay/etc for upperdir, and /sysroot/var/lib/overlay/work-etc for workdir) are all there, but not exactly at the path the system is looking at...
Any clue?
-- 
Christian Tardif
Architecte principal infonuagique | Cloud computing senior Architect
Network IaaS
Bell Canada
-----Original Message-----
From: Adam Mizerski <adam@mizerski.pl>
Sent: Wednesday, February 27, 2019 7:13 AM
To: opensuse-support@opensuse.org
Subject: [opensuse-support] Re: user disappeared
I've reported this on Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1132269.
It turned out that the "transactional-update" package on openSUSE 15.0 is old and buggy. A maintenance request updating it to a newer version is on the way: https://build.opensuse.org/request/show/694803
-- 
Adam Mizerski
On 28.02.2019 at 17:58, Tardif, Christian wrote:
> I have kind of the same behavior. Installed MicroOS on baremetal (I assume, though, that this would do the same with Openstack image), then installed docker on it (transactional-update -n pkg install docker), then rebooted. At first, everything worked. That was yesterday. Left the machine up and running and, this morning, I saw that the etc directory was empty (while my last commands was showing something in it). Weird.... Rebooted the server and now, it does not boot anymore, with this error:
> Overlayfs: failed to resolve '/sysroot/var/lib/overlay/2/etc': -2
> Failed to mount /sysroot/etc/...
> So it's falling to emergency mode, where I can see that the overlays (/sysroot/etc for lowerdir, /sysroot/var/lib/overlay/etc for upperdir, and /sysroot/var/lib/overlay/work-etc for workdir) are all there, but not exactly at the same path the system is looking at...
> Any clue ?
> -- 
> Christian Tardif
> Architecte principal infonuagique | Cloud computing senior Architect
> Network IaaS
> Bell Canada
participants (5)
- Adam Mizerski
- Andrei Borzenkov
- Carlos E. R.
- Christian Boltz
- Tardif, Christian