[Bug 1156336] New: Virtualization:containers/lxd: Bug
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336

Bug ID: 1156336
Summary: Virtualization:containers/lxd: Bug
Classification: openSUSE
Product: openSUSE.org
Version: unspecified
Hardware: x86
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: 3rd party software
Assignee: asarai@suse.com
Reporter: appspotio@gmail.com
QA Contact: bnc-team-screening@forge.provo.novell.com
CC: containers-bugowner@suse.de, fcastelli@suse.com
Found By: ---
Blocker: ---

After running systemctl start lxd, LXD does not start. For details, see this thread: https://forums.opensuse.org/showthread.php/538138-LXD-dose-not-start/

--
You are receiving this mail because: You are on the CC list for the bug.
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c1
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c2
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c3
Aleksa Sarai
> Is it working for you on Leap 15.1? Can you share which packages you installed, please?

I use the ones in Virtualization:containers, but the ones in main Leap repos should work just as well (whenever I update LXD in Virtualization:containers, I immediately forward the request to Factory and to the Leaps).

> We wish to use lxd from openSUSE packages, not snapd.

Yup, that's why I packaged it for openSUSE in the first place. :D
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
Flavio Castelli
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c4
--- Comment #4 from seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c5
--- Comment #5 from Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c6
--- Comment #6 from seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c7
--- Comment #7 from seve skeis
Hi, actually, all using root, no wired structure, and on real hardware, never tried on a vm. Can i ask what are the packages you installed?
zypper in lxd
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 11 NEW packages are going to be installed:
  criu liblxc1 libnet9 libuv1 lxcfs lxcfs-hooks-lxc lxd lxd-bash-completion python2-ipaddr python2-protobuf squashfs

11 new packages to install.
Overall download size: 45.2 MiB. Already cached: 0 B. After the operation, additional 144.8 MiB will be used.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c8
--- Comment #8 from seve skeis
(In reply to seve skeis from comment #6)
Hi, actually, all using root, no wired structure, and on real hardware, never tried on a vm. Can i ask what are the packages you installed?
zypper in lxd
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 11 NEW packages are going to be installed:
  criu liblxc1 libnet9 libuv1 lxcfs lxcfs-hooks-lxc lxd lxd-bash-completion python2-ipaddr python2-protobuf squashfs

11 new packages to install.
Overall download size: 45.2 MiB. Already cached: 0 B. After the operation, additional 144.8 MiB will be used.
Continue? [y/n/v/...? shows all options] (y):
Retrieving package libnet9-1.2~rc3-lp151.2.2.x86_64 (1/11), 44.7 KiB (100.2 KiB unpacked)
Retrieving: libnet9-1.2~rc3-lp151.2.2.x86_64.rpm ...[done]
Retrieving package libuv1-1.18.0-lp151.2.3.x86_64 (2/11), 83.8 KiB (164.7 KiB unpacked)
Retrieving: libuv1-1.18.0-lp151.2.3.x86_64.rpm ...[done]
Retrieving package python2-ipaddr-2.1.11-lp151.2.2.noarch (3/11), 37.6 KiB (193.7 KiB unpacked)
Retrieving: python2-ipaddr-2.1.11-lp151.2.2.noarch.rpm ...[done]
Retrieving package python2-protobuf-3.5.0-lp151.4.7.x86_64 (4/11), 493.0 KiB (4.0 MiB unpacked)
Retrieving: python2-protobuf-3.5.0-lp151.4.7.x86_64.rpm ...[done]
Retrieving package squashfs-4.3-lp151.2.3.x86_64 (5/11), 134.6 KiB (351.1 KiB unpacked)
Retrieving: squashfs-4.3-lp151.2.3.x86_64.rpm ...[done]
Retrieving package criu-3.8.1-lp151.2.3.x86_64 (6/11), 596.4 KiB (2.3 MiB unpacked)
Retrieving: criu-3.8.1-lp151.2.3.x86_64.rpm ...[done]
Retrieving package liblxc1-3.2.1-lp151.4.5.1.x86_64 (7/11), 355.1 KiB (1.0 MiB unpacked)
Retrieving: liblxc1-3.2.1-lp151.4.5.1.x86_64.rpm ...[done (220.3 KiB/s)]
Retrieving package lxcfs-3.1.2-lp151.2.3.1.x86_64 (8/11), 70.3 KiB (128.8 KiB unpacked)
Retrieving: lxcfs-3.1.2-lp151.2.3.1.x86_64.rpm ...[done]
Retrieving package lxcfs-hooks-lxc-3.1.2-lp151.2.3.1.noarch (9/11), 17.3 KiB (103 B unpacked)
Retrieving: lxcfs-hooks-lxc-3.1.2-lp151.2.3.1.noarch.rpm ...[done (64.4 KiB/s)]
Retrieving package lxd-3.18-lp151.11.1.x86_64 (10/11), 43.4 MiB (136.6 MiB unpacked)
Retrieving: lxd-3.18-lp151.11.1.x86_64.rpm ...[done (1.9 MiB/s)]
Retrieving package lxd-bash-completion-3.18-lp151.11.1.noarch (11/11), 14.8 KiB (12.2 KiB unpacked)
Retrieving: lxd-bash-completion-3.18-lp151.11.1.noarch.rpm ...[done]
Checking for file conflicts: ...[done]
( 1/11) Installing: libnet9-1.2~rc3-lp151.2.2.x86_64 ...[done]
( 2/11) Installing: libuv1-1.18.0-lp151.2.3.x86_64 ...[done]
( 3/11) Installing: python2-ipaddr-2.1.11-lp151.2.2.noarch ...[done]
( 4/11) Installing: python2-protobuf-3.5.0-lp151.4.7.x86_64 ...[done]
( 5/11) Installing: squashfs-4.3-lp151.2.3.x86_64 ...[done]
( 6/11) Installing: criu-3.8.1-lp151.2.3.x86_64 ...[done]
( 7/11) Installing: liblxc1-3.2.1-lp151.4.5.1.x86_64 ...[done]
Additional rpm output:
setting /usr/lib/lxc/lxc-user-nic to root:kvm 4750. (wrong permissions 0750)
( 8/11) Installing: lxcfs-3.1.2-lp151.2.3.1.x86_64 ...[done]
( 9/11) Installing: lxcfs-hooks-lxc-3.1.2-lp151.2.3.1.noarch ...[done]
(10/11) Installing: lxd-3.18-lp151.11.1.x86_64 ...[done]
(11/11) Installing: lxd-bash-completion-3.18-lp151.11.1.noarch ...[done]

seven:~ # systemctl status lxd
● lxd.service - LXD Container Hypervisor
   Loaded: loaded (/usr/lib/systemd/system/lxd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:lxd(1)
seven:~ # systemctl start lxd
...hangs
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c9
--- Comment #9 from Aleksa Sarai
> Hi, actually, all using root, no wired structure, and on real hardware, never tried on a vm. Can i ask what are the packages you installed?

I installed the same ones you did -- the ones that are in the Leap 15.1 repos. I am running LXD on my server (on bare metal) and it also works fine, but the reason I tested it in a VM is to check whether there was an issue if you did a fresh install (I've upgraded my server incrementally from Leap 42.2).

Do you only have this problem on one machine, or can you replicate the problem on any other machines? What is the output of dmesg after LXD crashes (please don't paste it as a comment -- add it as an attachment)? If you run just 'sudo lxd' in a terminal (to start the server in your shell), what happens?
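A minimal sketch of how the requested logs could be captured into files suitable for attaching to the bug. The helper name save_output and the file names in the comments are illustrative assumptions, not something from the thread; the commands themselves (dmesg, journalctl, lxd -d) are the ones discussed here.

```shell
# Run a command and save both its stdout and stderr to a file, so the
# result can be attached to the bug instead of pasted into a comment.
# Usage: save_output FILE COMMAND [ARGS...]
save_output() {
    out=$1
    shift
    "$@" > "$out" 2>&1
}

# Example invocations (as root, after LXD has crashed; file names are
# hypothetical):
#   save_output dmesg.txt dmesg
#   save_output lxd-journal.txt journalctl -b -u lxd
#   save_output lxd-debug.txt lxd -d
```

Keeping each log in its own file makes it easier to tell kernel messages (dmesg) apart from the service journal and from LXD's own debug output.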
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c10
--- Comment #10 from seve skeis
(In reply to seve skeis from comment #6)
I have no issues with the PC; it's a fresh install, no repos, no apps, I'm just trying to get LXD to work. Can you check the thread mentioned, please? All these commands, dmesgs and logs are there, and my PC hardware details too.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c11
--- Comment #11 from Aleksa Sarai
> I have no issues with the PC; it's a fresh install, no repos, no apps, I'm just trying to get LXD to work.

My question was whether this only happens on this particular machine -- do you have another PC or laptop on which you can run this test? The reason I'm asking is to figure out whether it's specific to your hardware.

> Can you check the thread mentioned, please?

I read it after you posted comment 2.

> All these commands, dmesgs and logs are there,

Ah I missed that you ran 'sudo lxd -d' (I thought you modified the .service file the last time I skimmed through it). But there isn't a dmesg log -- the logs posted were from journalctl or from the output of LXD. dmesg will give you

It would also be helpful to get the coredump (which gives useful debugging information to understand in which function the crash occurred) -- you can get it using coredumpctl. It might be too large to upload here, but you can always upload it on a temporary sharing site and I'll download it.

> and my PC hardware details too.

(For future reference.)

> CPU I7 6700K
> RAM 32GB DDR4
> MB: ASUS MAXIMUS EXTREME
> G: NVIDIA GTX 970 TURBO
> P.S: do not know if it's worth mentioning, I had to boot with kernel param: acpi_enforce_resources=lax.

I notice you're using the NVIDIA drivers:

> Fri Nov 8 06:55:09 2019
> +-----------------------------------------------------------------------------+
> | NVIDIA-SMI 440.31       Driver Version: 440.31       CUDA Version: 10.2     |
> |-------------------------------+----------------------+----------------------+

Do you have the same problem if you run with the Nouveau drivers? I ask because LXD supports GPU virtualisation (which uses a bunch of features from the GPU, GPU kernel module, and the userspace libraries).
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c12
--- Comment #12 from seve skeis
Hi, I will run the dumps later and share them. I have no other machines and have not tried with Nouveau, but it works with snapd with NVIDIA.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c13
--- Comment #13 from seve skeis
My system does not have coredumpctl or systemd-coredumpctl, and I do not know how to use it. This is going way harder than I expected. I will be using snapd till you guys fix it. I suggest you use real hardware with a fresh Leap 15.1; then it will be clear whether it's my hardware or a bug. Sorry for any inconvenience. Thank you.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c14
--- Comment #14 from Aleksa Sarai
> My system does not have coredumpctl or systemd-coredumpctl, and I do not know how to use it.

First, trigger the coredump (run `lxd`) then do

% sudo coredumpctl info lxd

which should give you the latest backtrace and a few other bits of information from LXD. This should give us plenty of information by itself, but to get the actual coredump you need to do:

% sudo coredumpctl dump lxd > lxd.core

If the above gives you errors about missing corefiles, then you might have to modify your coredump.conf configuration -- but for me it just works out of the box.

You can also disable systemd-coredump and just make a regular corefile if you modify the kernel.core_pattern sysctl to be a simple file:

% sudo sysctl -w kernel.core_pattern=%e.%p_%u.%g_%t.core

And then if you trigger the coredump, there will be a corefile in your current directory with a name that looks like "lxd.$PID_$UID.$GID_$TIME.core".

> I will be using snapd till you guys fix it.

I'm not sure how it's reasonable to think we can fix it, if we can't even reproduce it. But sure, feel free to use whatever works for you.

> I suggest you use real hardware with a fresh Leap 15.1; then it will be clear whether it's my hardware or a bug.

Unless I'm missing something, several other people in the original thread said they tried on real hardware (just as I'm running things on my real hardware) and didn't have issues.
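To make the core_pattern naming concrete, here is a rough sketch that expands such a template by hand. It only handles the five specifiers used in this thread (%e executable name, %p PID, %u UID, %g GID, %t dump time in seconds since the epoch); the function name expand_core_pattern and the sample values are made up for illustration, and the kernel itself does the real expansion.

```shell
# Expand a kernel.core_pattern template for the specifiers used above:
#   %e executable name, %p PID, %u UID, %g GID, %t time of dump (epoch secs).
# Other specifiers (and a literal %%) are deliberately not handled here.
# Usage: expand_core_pattern PATTERN EXE PID UID GID TIME
expand_core_pattern() {
    printf '%s\n' "$1" | sed \
        -e "s/%e/$2/g" \
        -e "s/%p/$3/g" \
        -e "s/%u/$4/g" \
        -e "s/%g/$5/g" \
        -e "s/%t/$6/g"
}

# Example with made-up values for an lxd process running as root:
#   expand_core_pattern '%e.%p_%u.%g_%t.core' lxd 22299 0 0 1573196109
```

With those sample values the result is lxd.22299_0.0_1573196109.core, which is the shape of file name to look for in the directory where lxd was started.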
This is the link to the core dump: https://gofile.io/?c=1dPPkj Kindly check and, if possible, get back with a fix ASAP, as I just started my other
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c15
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c16
Matei Albu
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c17
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c18
--- Comment #18 from Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c19
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c20
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c21
--- Comment #21 from Matei Albu
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c22
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c23
--- Comment #23 from Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c24
--- Comment #24 from Matei Albu
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c25
--- Comment #25 from Aleksa Sarai
Created attachment 828578 [details] lxd strace
Here's the strace output. I'm gonna provide delve output soon.
Okay, so a quick look (I'll take a better look tomorrow morning) indicates that the segfault we see in the coredump happens here -- the go runtime is killing itself:
22299 rt_sigprocmask(SIG_UNBLOCK, [SEGV], NULL, 8) = 0
22299 rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=~[], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f42572fd300}, NULL, 8) = 0
22299 gettid() = 22299
22299 tkill(22299, SIGSEGV) = 0
22299 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=22286, si_uid=0} ---
But there is an earlier segfault:
22299 mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0
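To tell a self-inflicted SIGSEGV (si_code=SI_TKILL, as in the tkill() sequence quoted above) apart from an earlier real fault, it can help to list every SIGSEGV delivery in the saved strace log together with its si_code. The following is only a minimal sketch: segv_events is a hypothetical helper name, and it assumes the log has the "PID --- SIGSEGV {...} ---" layout that strace -f writes, with each event on one line.

```shell
# segv_events FILE
# Print "line N: pid P SI_CODE" for each SIGSEGV delivery recorded in an
# strace log. SI_TKILL means the signal was sent with tkill()/tgkill();
# codes such as SEGV_MAPERR or SEGV_ACCERR indicate an actual bad access.
segv_events() {
    grep -n -- '--- SIGSEGV' "$1" |
        sed -E 's/^([0-9]+):([0-9]+) .*si_code=([A-Z_]+).*/line \1: pid \2 \3/'
}
```

It would be called as "segv_events lxd.strace", where lxd.strace is a placeholder for wherever the attachment was saved. The first entry that is not SI_TKILL is the one worth looking at.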
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c26
--- Comment #26 from Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c27
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c28
seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c29
Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c30
--- Comment #30 from Aleksa Sarai
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c31
--- Comment #31 from seve skeis
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c32
--- Comment #32 from Aleksa Sarai
I found the bug -- it was a bug in dqlite where they would accidentally double-unlock a mutex. I've sent a patch[1], and I'll backport it for testing on Matei's machine.
I've now confirmed that this patch fixes the issue. Sending SRs to Factory and Leap...
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c33
Aleksa Sarai
Good job, how can i test it?
I'm about to send an SR (so you'll need to wait a bit for the build to finish), but you can add the development repo for lxd to test it:
zypper ar -f obs://Virtualization:containers obs-vc
And then you do
zypper in -r obs-vc lxd
And you'll get the development version of LXD. Note that you should undo this afterwards since the development versions of LXD could break (you should stick with the Leap repos).
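The test-and-undo cycle described above could be wrapped roughly as follows. This is a sketch, not the packager's procedure: the function names are hypothetical, the ZYPPER variable is an illustrative dry-run switch, and the undo step assumes that "zypper rr" followed by a forced reinstall is enough to get back to the Leap build.

```shell
# By default this only prints the commands. Set ZYPPER=zypper and run as
# root to actually execute them.
ZYPPER="${ZYPPER:-echo zypper}"

# Add the development repo under the alias obs-vc and install lxd from it.
lxd_devel_enable() {
    $ZYPPER ar -f obs://Virtualization:containers obs-vc
    $ZYPPER in -r obs-vc lxd
}

# Undo: drop the development repo again, then reinstall lxd.
# Assumption: --force lets zypper reinstall lxd from the remaining (Leap)
# repos even though that means a downgrade and a vendor change.
lxd_devel_disable() {
    $ZYPPER rr obs-vc
    $ZYPPER in --force lxd
}
```

Running lxd_devel_enable, testing, and then running lxd_devel_disable keeps the development repo from staying enabled by accident.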
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c35
--- Comment #35 from seve skeis
Hi, I do not want to sound weird, but I just tried it from this repo and it is not working; same old problem.
```
systemctl status lxd
● lxd.service - LXD Container Hypervisor
   Loaded: loaded (/usr/lib/systemd/system/lxd.service; disabled; vendor preset: disabled)
   Active: activating (start-post) (Result: signal) since Fri 2020-01-31 16:03:19 +03; 1min 14s ago
     Docs: man:lxd(1)
  Process: 3324 ExecStart=/usr/bin/lxd --group=lxd --logfile=/var/log/lxd/lxd.log (code=killed, signal=SEGV)
 Main PID: 3324 (code=killed, signal=SEGV); Control PID: 3325 (lxd)
    Tasks: 8
   Memory: 64.4M
      CPU: 320ms
   CGroup: /system.slice/lxd.service
           └─control
             └─3325 /usr/bin/lxd waitready --timeout=600

Jan 31 16:03:19 l-beast systemd[1]: Starting LXD Container Hypervisor...
Jan 31 16:03:19 l-beast lxd[3324]: t=2020-01-31T16:03:19+0300 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
Jan 31 16:03:20 l-beast systemd[1]: lxd.service: Main process exited, code=killed, status=11/SEGV
```
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c36
--- Comment #36 from Aleksa Sarai
The updated packages haven't been published yet -- see that in [1] the package is listed as "finished" not "succeeded" with a published icon. For some reason, OBS has finished the build (more than 2 hours ago) but hasn't yet published the actual package. We've been having problems with this for the past few weeks, I've got no idea what's going on.
You can download the RPMs from here[2] -- but note that this is even more unsafe-to-rely-on than using the development repos. The other option would be to build the package locally using osc.
[1]: https://build.opensuse.org/package/show/Virtualization:containers/lxd
[2]: https://build.opensuse.org/package/binaries/Virtualization:containers/lxd/op...
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c38
seve skeis
The updated packages haven't been published yet -- see that in [1] the package is listed as "finished" not "succeeded" with a published icon. For some reason, OBS has finished the build (more than 2 hours ago) but hasn't yet published the actual package. We've been having problems with this for the past few weeks, I've got no idea what's going on.
You can download the RPMs from here[2] -- but note that this is even more unsafe-to-rely-on than using the development repos. The other option would be to build the package locally using osc.
[1]: https://build.opensuse.org/package/show/Virtualization:containers/lxd
[2]: https://build.opensuse.org/package/binaries/Virtualization:containers/lxd/openSUSE_Leap_15.1
I just installed it from the repo you mentioned. It complained that nothing provides libuv1; I continued anyway, then installed it from the normal repo, among other packages like dns and cirus, etc. Now bash-completion is not working, and the first time I launch a container it freezes the whole system, so I have to do a hard reboot. Did you face something similar?
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c39
seve skeis
1. I confirm that the hang after container creation was caused by a bridge misconfiguration. I am stuck with bash-completion now.
2. I do not know if this is a bug or the way it should be. After creating a container as an lxd user, shouldn't the created container use that user's subgid and subuid in its conf file? But I notice it only uses root's subuid and subgid. I made sure /etc/subuid and /etc/subgid have an entry for the user creating the containers, and that the user is a member of the lxd group too. Please correct me here.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c40
--- Comment #40 from Aleksa Sarai
1. I confirm that the hang after container creation was caused by a bridge misconfiguration. I am stuck with bash-completion now.
If bash-completion isn't working, please open a new bug with a description of the problem (I don't personally use bash -- so it's entirely possible I screwed up the bash-completion installation).
2. I do not know if this is a bug or the way it should be. After creating a container as an lxd user, shouldn't the created container use that user's subgid and subuid in its conf file? But I notice it only uses root's subuid and subgid. I made sure /etc/subuid and /etc/subgid have an entry for the user creating the containers, and that the user is a member of the lxd group too. Please correct me here.
The way we set up /etc/sub[ug]id in the package is correct and will work out of the box. LXD always[*] uses root's subid configuration because it is running as the root user -- it is not running as the user running "lxc" commands. There are also a bunch of other annoying technical reasons why this is done this way ("lxd" is a server and the client could be on a different machine -- in that case, there is no way to associate the user running "lxc" with a user on the LXD host).
There's nothing wrong with setting up your own subid configuration (though it's not a good idea to overlap the LXD ones with your own users', because it allows your user to gain privileged access to the containers) but it's not necessary.
[*]: Except if you're running under snap. In those cases, LXD has special handling but that really doesn't matter right now.
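Since LXD reads root's entries, one quick way to see which subordinate ID ranges containers will be mapped into is to print root's lines from /etc/subuid and /etc/subgid. The snippet below is only a sketch: subid_range is a hypothetical helper, and it assumes the usual name:start:count line format of those files.

```shell
# subid_range USER FILE
# Print "first-last" for every subordinate ID range that FILE
# (name:start:count per line) assigns to USER.
subid_range() {
    awk -F: -v user="$1" '$1 == user { printf "%d-%d\n", $2, $2 + $3 - 1 }' "$2"
}

# LXD runs as root, so root's entries are the ones that matter.
for f in /etc/subuid /etc/subgid; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(subid_range root "$f")"
    fi
done
```

Passing another user name instead of root makes it easy to check that a custom range does not overlap the one LXD uses.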
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c41
--- Comment #41 from Aleksa Sarai
(In reply to seve skeis from comment #39)
1. I confirm that the hang after container creation was caused by a bridge misconfiguration. I am stuck with bash-completion now.
If bash-completion isn't working, please open a new bug with a description of the problem (I don't personally use bash -- so it's entirely possible I screwed up the bash-completion installation).
Never mind, I just tried to use it with bash and you're right that it just doesn't work at all. I'll take a look at this next week. Opened boo#1162426 to track that.
http://bugzilla.opensuse.org/show_bug.cgi?id=1156336#c48
Aleksa Sarai