[Bug 964342] New: Installing TW (20160116+) as a paravirtual guest hangs
http://bugzilla.suse.com/show_bug.cgi?id=964342

Bug ID: 964342
Summary: Installing TW (20160116+) as a paravirtual guest hangs
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: 2015*
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Installation
Assignee: yast2-maintainers@suse.de
Reporter: mlatimer@suse.com
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---

Created attachment 663919
  --> http://bugzilla.suse.com/attachment.cgi?id=663919&action=edit
Console log showing hang

The latest versions of Tumbleweed, with a pvops enabled -default kernel, are hanging when installing as a paravirtual guest under Xen. I can reproduce the problem on a SLES12SP1 host and a Tumbleweed host.

The console log shows a couple of warnings, but the one which appears to be resulting in the hang is:

[ 3.822185] ------------[ cut here ]------------
[ 3.822198] WARNING: CPU: 0 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_mc_flush+0x1c5/0x1d0()
[ 3.822204] Modules linked in: sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 xen_netfront xen_blkfront lrw gf128mul glue_helper ablk_helper cryptd scsi_dh_rdac scsi_dh_emc scsi_dh_alua squashfs loop
[ 3.822223] CPU: 0 PID: 1 Comm: init Tainted: G W 4.4.0-2-default #1
[ 3.822232] ffffffff81a47911 ffff88003e2cbc10 ffffffff8137f629 0000000000000000
[ 3.822238] ffff88003e2cbc48 ffffffff8107d132 ffff88003f80a2e0 0000000000000001
[ 3.822246] 00007f39dd70a000 0000000000000000 00007f39dd70a000 ffff88003e2cbc58
[ 3.822252] Call Trace:
[ 3.822262] [<ffffffff8101a095>] try_stack_unwind+0x175/0x190
[ 3.822270] [<ffffffff81018fe9>] dump_trace+0x69/0x3a0
[ 3.822275] [<ffffffff8101a0fb>] show_trace_log_lvl+0x4b/0x60
[ 3.822281] [<ffffffff8101942c>] show_stack_log_lvl+0x10c/0x180
[ 3.822286] [<ffffffff8101a195>] show_stack+0x25/0x50
[ 3.822291] [<ffffffff8137f629>] dump_stack+0x4b/0x72
[ 3.822300] [<ffffffff8107d132>] warn_slowpath_common+0x82/0xc0
[ 3.822307] [<ffffffff8107d22a>] warn_slowpath_null+0x1a/0x20
[ 3.822313] [<ffffffff81006b05>] xen_mc_flush+0x1c5/0x1d0
[ 3.822319] [<ffffffff81007365>] xen_leave_lazy_mmu+0x15/0x30
[ 3.822325] [<ffffffff811b637d>] remap_pfn_range+0x34d/0x430
[ 3.822335] [<ffffffff81495b9f>] mmap_mem+0xcf/0x120
[ 3.822343] [<ffffffff811bc867>] mmap_region+0x3f7/0x680
[ 3.822349] [<ffffffff811bce23>] do_mmap+0x333/0x420
[ 3.822356] [<ffffffff811a33f1>] vm_mmap_pgoff+0x91/0xc0
[ 3.822362] [<ffffffff811bb1ff>] SyS_mmap_pgoff+0x19f/0x260
[ 3.822369] [<ffffffff8101c0db>] SyS_mmap+0x1b/0x30
[ 3.822377] [<ffffffff816aa076>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 3.823451] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x75
[ 3.823456]
[ 3.823458] Leftover inexact backtrace:
[ 3.823458]
[ 3.823464] ---[ end trace 17b394fd426e244c ]---

Juergen looked at the logs and believed that the system is trying to setup the framebuffer, and there was a negative return code from a hypercall. He also believes the installation kernel appears to be okay, so this may be due to a problem in the install initrd.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
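The call trace in comment #0 can be reduced to the code path that led to the warning by dropping the unwinder and warn_* helper frames. The following is a minimal Python sketch and not part of the original report; `FRAME_RE` and `call_path` are names invented here for illustration, and the sample holds only a few frames copied from the log above.

```python
import re

# Matches one kernel call-trace frame such as
# "[<ffffffff81006b05>] xen_mc_flush+0x1c5/0x1d0" and captures the symbol name.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_path(log_text, stop_at="xen_mc_flush"):
    """Return the frames from the outermost caller down to `stop_at`.

    Frames printed before `stop_at` (stack dumping and warn helpers) are
    dropped. `stop_at` is a hypothetical parameter for this sketch.
    """
    frames = FRAME_RE.findall(log_text)
    if stop_at in frames:
        frames = frames[frames.index(stop_at):]
    # The kernel prints the innermost frame first; reverse for caller order.
    return list(reversed(frames))


# A few frames copied from the console log in comment #0.
SAMPLE_TRACE = """\
[ 3.822307] [<ffffffff8107d22a>] warn_slowpath_null+0x1a/0x20
[ 3.822313] [<ffffffff81006b05>] xen_mc_flush+0x1c5/0x1d0
[ 3.822319] [<ffffffff81007365>] xen_leave_lazy_mmu+0x15/0x30
[ 3.822325] [<ffffffff811b637d>] remap_pfn_range+0x34d/0x430
[ 3.822335] [<ffffffff81495b9f>] mmap_mem+0xcf/0x120
[ 3.822343] [<ffffffff811bc867>] mmap_region+0x3f7/0x680
"""

print(" -> ".join(call_path(SAMPLE_TRACE)))
```

The resulting path ends in mmap_mem -> remap_pfn_range -> xen_leave_lazy_mmu -> xen_mc_flush. In mainline kernels mmap_mem is the mmap handler for /dev/mem, which would be consistent with Juergen's reading, but that link is an assumption here and not something the log states.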
Mike Latimer
http://bugzilla.suse.com/show_bug.cgi?id=964342#c1
Charles Arnold
Mike Latimer
http://bugzilla.suse.com/show_bug.cgi?id=964342#c2
Mike Latimer
http://bugzilla.suse.com/show_bug.cgi?id=964342#c3
--- Comment #3 from Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c4
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c5
--- Comment #5 from Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c6
Olaf Hering
> What was the problem again?

"xen_fbfront and xen_kbdfront are missing"
http://bugzilla.suse.com/show_bug.cgi?id=964342#c7
Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c8
--- Comment #8 from Olaf Hering
> How so if they are mentioned in the oops in comment 0?

blk and net is loaded, but fb and kbd is missing in initrd.
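The observation above can be checked against the "Modules linked in:" line of the warning in comment #0. This is a minimal Python sketch under stated assumptions: the list of expected Xen frontend drivers is chosen here for illustration, and the line only shows which modules were loaded at the time of the warning, not what the initrd contains.

```python
import re

# Captures everything after "Modules linked in:" on a single log line.
MODULES_RE = re.compile(r"Modules linked in:\s*(.*)")

# Hypothetical list for this sketch: the four Xen PV frontends discussed
# in the thread (block, network, framebuffer, keyboard).
EXPECTED_XEN_FRONTENDS = ["xen_blkfront", "xen_netfront", "xen_fbfront", "xen_kbdfront"]


def loaded_modules(log_text):
    """Return the module names from the first 'Modules linked in:' line."""
    for line in log_text.splitlines():
        match = MODULES_RE.search(line)
        if match:
            return set(match.group(1).split())
    return set()


def missing_modules(log_text, expected):
    """Return the expected modules that do not appear as loaded, in order."""
    loaded = loaded_modules(log_text)
    return [name for name in expected if name not in loaded]


# Line copied from the console log in comment #0.
SAMPLE_LINE = (
    "[ 3.822204] Modules linked in: sunrpc crct10dif_pclmul crc32_pclmul "
    "crc32c_intel aesni_intel aes_x86_64 xen_netfront xen_blkfront lrw gf128mul "
    "glue_helper ablk_helper cryptd scsi_dh_rdac scsi_dh_emc scsi_dh_alua squashfs loop"
)

print(missing_modules(SAMPLE_LINE, EXPECTED_XEN_FRONTENDS))
```

For the comment #0 log this reports xen_fbfront and xen_kbdfront as not loaded, which matches what Olaf describes.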
http://bugzilla.suse.com/show_bug.cgi?id=964342#c9
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c10
--- Comment #10 from Mike Latimer
(In reply to Steffen Winterfeldt from comment #7)
> How so if they are mentioned in the oops in comment 0?
> blk and net is loaded, but fb and kbd is missing in initrd.

The only modules in the initrd are loop and squashfs. Are the rest built into the kernel itself? As I mentioned in comment #2, hacking more modules into the initrd does not seem to help.
http://bugzilla.suse.com/show_bug.cgi?id=964342#c11
--- Comment #11 from Olaf Hering
(In reply to Olaf Hering from comment #8)
> (In reply to Steffen Winterfeldt from comment #7)
> How so if they are mentioned in the oops in comment 0?
> blk and net is loaded, but fb and kbd is missing in initrd.
> The only modules in the initrd are loop and squashfs. Are the rest built into the kernel itself? As I mentioned in comment #2, hacking more modules into the initrd does not seem to help.

The initrd contains a compressed image which contains additional drivers. Guess I need to fork the pkg and do a pull rq....
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c12
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c13
Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c14
--- Comment #14 from Mike Latimer
> pr merged

Thanks, but I don't think that's enough... I manually tested this change, and the installation gets slightly further, but then encounters the following errors:

openSUSE Tumbleweed installation program v5.0.70 (c) 1996-2015 SUSE LLC <<<
Starting udev...
[ 13.737791] udevd[186]: starting version 228
[ 13.744094] random: udevd urandom read with 52 bits of entropy available
[ 13.757858] udevd[188]: specified group 'input' unknown
[ 13.847098] xen_fbfront: Unknown symbol fb_sys_write (err 0)
[ 13.847116] xen_fbfront: Unknown symbol sys_imageblit (err 0)
[ 13.847124] xen_fbfront: Unknown symbol sys_fillrect (err 0)
[ 13.847145] xen_fbfront: Unknown symbol sys_copyarea (err 0)
[ 13.847155] xen_fbfront: Unknown symbol fb_sys_read (err 0)
[ 13.848120] xen_netfront: Initialising Xen virtual ethernet driver
[ 13.901052] blkfront: xvda: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: disabled;
[ 13.923135] input: Xen Virtual Keyboard as /devices/virtual/input/input0
[ 13.923216] input: Xen Virtual Pointer as /devices/virtual/input/input1

I manually changed the module.config file (in the squashfs 00_lib, in the initrd) to specify the dependencies on the fbfront driver as follows:

xen-fbfront,,,sysimgblt sysfillrect syscopyarea fb_sys_fops

However, I still see the same "Unknown symbol" errors. (All required modules are in the lib/modules/4.4.0-3-default/initrd directory, within the 00_lib fs.) It's possible my manual hacking missed something, but modules.dep seems fine and the module.config file looks correct.
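To see at a glance which modules would have to be loaded before xen_fbfront, the "Unknown symbol" lines can be grouped per module and mapped to the modules that provide them. This is a minimal Python sketch, not something from the thread: the function names are invented here, and the `ASSUMED_PROVIDERS` table is an assumption inferred from the module names in the module.config line above, not read from modules.dep.

```python
import re
from collections import defaultdict

# Matches e.g. "xen_fbfront: Unknown symbol fb_sys_write (err 0)".
UNKNOWN_SYMBOL_RE = re.compile(r"(\w+): Unknown symbol (\w+) \(err -?\d+\)")

# Assumption for this sketch: which helper module exports each symbol.
ASSUMED_PROVIDERS = {
    "fb_sys_read": "fb_sys_fops",
    "fb_sys_write": "fb_sys_fops",
    "sys_imageblit": "sysimgblt",
    "sys_fillrect": "sysfillrect",
    "sys_copyarea": "syscopyarea",
}


def unresolved_symbols(log_text):
    """Group unresolved symbols by the module that failed to load."""
    by_module = defaultdict(list)
    for module, symbol in UNKNOWN_SYMBOL_RE.findall(log_text):
        by_module[module].append(symbol)
    return dict(by_module)


def modules_to_load_first(symbols):
    """Return the sorted set of assumed provider modules for the symbols."""
    return sorted({ASSUMED_PROVIDERS[s] for s in symbols if s in ASSUMED_PROVIDERS})


# Lines copied from the installer output in comment #14.
SAMPLE_LOG = """\
[ 13.847098] xen_fbfront: Unknown symbol fb_sys_write (err 0)
[ 13.847116] xen_fbfront: Unknown symbol sys_imageblit (err 0)
[ 13.847124] xen_fbfront: Unknown symbol sys_fillrect (err 0)
[ 13.847145] xen_fbfront: Unknown symbol sys_copyarea (err 0)
[ 13.847155] xen_fbfront: Unknown symbol fb_sys_read (err 0)
"""

missing = unresolved_symbols(SAMPLE_LOG)
print(missing)
print(modules_to_load_first(missing["xen_fbfront"]))
```

Under that assumed mapping, the four provider modules are the same four listed as dependencies in the module.config line, so the dependency list itself looks complete and the load order or the module lookup is the more likely suspect.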
http://bugzilla.suse.com/show_bug.cgi?id=964342#c15
Mike Latimer
http://bugzilla.suse.com/show_bug.cgi?id=964342#c16
Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c17
--- Comment #17 from Olaf Hering
> Reopening until we get a complete solution.

I think for testing it is required to build system:install:head/installation-images locally with 'osc build --userootforbuild' and use linux/initrd from the resulting binary packages.
http://bugzilla.suse.com/show_bug.cgi?id=964342#c18
Mike Latimer
(In reply to Mike Latimer from comment #15)
> Reopening until we get a complete solution.
> I think for testing it is required to build system:install:head/installation-images locally with 'osc build --userootforbuild' and use linux/initrd from the resulting binary packages.

Thanks for the tip. I followed your advice and tested a local build. The problem I pointed out in comment #14 did not occur (so that is good). However, I then encountered the following situation:

[ 3.866285] input: Xen Virtual Keyboard as /devices/virtual/input/input0
[ 3.866384] input: Xen Virtual Pointer as /devices/virtual/input/input1
[ 3.871274] udevd[215]: failed to execute '/usr/lib/systemd/systemd-vconsole-setup' '/usr/lib/systemd/systemd-vconsole-setup': No such file or directory
[ 3.871595] udevd[191]: Process '/usr/lib/systemd/systemd-vconsole-setup' failed with exit code 2.
ok

This issue is easily resolved with something like (which will need to be pushed upstream):

diff -Nurp a/data/initrd/initrd.file_list b/data/initrd/initrd.file_list
--- a/data/initrd/initrd.file_list 2016-02-22 15:33:21.829186121 -0700
+++ b/data/initrd/initrd.file_list 2016-02-22 15:33:07.097170345 -0700
@@ -366,6 +366,7 @@ procps:
 # maybe we don't need everything...
 systemd:
+  /usr/lib/systemd/systemd-vconsole-setup
   /usr/bin/systemd-detect-virt
   /usr/lib/systemd/systemd-sysctl
   /usr/lib*/libsystemd*.so.*

However, after fixing that issue, there are no further errors or warnings about xen drivers, and I am still encountering the crash reported in comment #0:

[ 4.945829] ------------[ cut here ]------------
[ 4.945883] WARNING: CPU: 0 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_mc_flush+0x1c5/0x1d0()
[ 4.945936] Modules linked in: sunrpc xen_kbdfront xen_fbfront syscopyarea sysfillrect sysimgblt xen_netfront fb_sys_fops xen_blkfront crc32c_intel scsi_dh_rdac scsi_dh_emc scsi_dh_alua squashfs loop
[ 4.948886] CPU: 0 PID: 1 Comm: init Not tainted 4.4.1-1-default #1
[ 4.949854] ffffffff81a47959 ffff88003e2cbc10 ffffffff8137ea59 0000000000000000
[ 4.950851] ffff88003e2cbc48 ffffffff8107c552 ffff88003f80a2e0 0000000000000001
[ 4.951829] 00007f831908b000 0000000000000000 00007f831908b000 ffff88003e2cbc58
[ 4.952826] Call Trace:
[ 4.953778] [<ffffffff8101a095>] try_stack_unwind+0x175/0x190
[ 4.954737] [<ffffffff81018fe9>] dump_trace+0x69/0x3a0
[ 4.955670] [<ffffffff8101a0fb>] show_trace_log_lvl+0x4b/0x60
[ 4.956596] [<ffffffff8101942c>] show_stack_log_lvl+0x10c/0x180
[ 4.957511] [<ffffffff8101a195>] show_stack+0x25/0x50
[ 4.958410] [<ffffffff8137ea59>] dump_stack+0x4b/0x72
[ 4.959295] [<ffffffff8107c552>] warn_slowpath_common+0x82/0xc0
[ 4.960346] [<ffffffff8107c64a>] warn_slowpath_null+0x1a/0x20
[ 4.961227] [<ffffffff81006b05>] xen_mc_flush+0x1c5/0x1d0
[ 4.962109] [<ffffffff81007365>] xen_leave_lazy_mmu+0x15/0x30
[ 4.962987] [<ffffffff811b581d>] remap_pfn_range+0x34d/0x430
[ 4.963864] [<ffffffff81494f6f>] mmap_mem+0xcf/0x120
[ 4.964744] [<ffffffff811bbd07>] mmap_region+0x3f7/0x680
[ 4.965598] [<ffffffff811bc2c3>] do_mmap+0x333/0x420
[ 4.966437] [<ffffffff811a2891>] vm_mmap_pgoff+0x91/0xc0
[ 4.967260] [<ffffffff811ba69f>] SyS_mmap_pgoff+0x19f/0x260
[ 4.968080] [<ffffffff8101c0db>] SyS_mmap+0x1b/0x30
[ 4.968880] [<ffffffff816a94f6>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 4.972514] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x75
[ 4.973327]
[ 4.974107] Leftover inexact backtrace:
[ 4.974107]
[ 4.975646] ---[ end trace c909cd4ca9382cdf ]---

In other words, all of these changes have been helpful, but not directly related to the crash.
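The conclusion in comment #18 can be cross-checked by comparing the warning headers from comment #0 and comment #18. This is a minimal Python sketch with helper names invented for illustration; it only compares the two header lines quoted in this thread and says nothing about the root cause.

```python
import re

# Captures source file, line number and function from a WARNING header.
WARNING_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)
MODULES_RE = re.compile(r"Modules linked in:\s*(.*)")


def warning_site(log_text):
    """Return (file, line, function) of the first WARNING, or None."""
    match = WARNING_RE.search(log_text)
    if match is None:
        return None
    return (match.group(1), int(match.group(2)), match.group(3))


def loaded_modules(log_text):
    """Return the module names from the first 'Modules linked in:' line."""
    for line in log_text.splitlines():
        match = MODULES_RE.search(line)
        if match:
            return set(match.group(1).split())
    return set()


# Header lines copied from comment #0 and comment #18.
COMMENT_0 = """\
[ 3.822198] WARNING: CPU: 0 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_mc_flush+0x1c5/0x1d0()
[ 3.822204] Modules linked in: sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel aesni_intel aes_x86_64 xen_netfront xen_blkfront lrw gf128mul glue_helper ablk_helper cryptd scsi_dh_rdac scsi_dh_emc scsi_dh_alua squashfs loop
"""
COMMENT_18 = """\
[ 4.945883] WARNING: CPU: 0 PID: 1 at ../arch/x86/xen/multicalls.c:129 xen_mc_flush+0x1c5/0x1d0()
[ 4.945936] Modules linked in: sunrpc xen_kbdfront xen_fbfront syscopyarea sysfillrect sysimgblt xen_netfront fb_sys_fops xen_blkfront crc32c_intel scsi_dh_rdac scsi_dh_emc scsi_dh_alua squashfs loop
"""

same_site = warning_site(COMMENT_0) == warning_site(COMMENT_18)
newly_loaded = sorted(loaded_modules(COMMENT_18) - loaded_modules(COMMENT_0))
print(same_site, newly_loaded)
```

Both logs hit the warning at the same source location, while the second one additionally has xen_fbfront, xen_kbdfront and their helper modules loaded. That is consistent with the statement that the initrd fixes were useful but did not address the crash.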
http://bugzilla.suse.com/show_bug.cgi?id=964342#c19
--- Comment #19 from Mike Latimer
http://bugzilla.suse.com/show_bug.cgi?id=964342#c20
--- Comment #20 from Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c21
--- Comment #21 from Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c22
--- Comment #22 from Steffen Winterfeldt
http://bugzilla.suse.com/show_bug.cgi?id=964342#c23
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c24
--- Comment #24 from Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c25
--- Comment #25 from Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c26
--- Comment #26 from Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c27
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c28
Olaf Hering
http://bugzilla.suse.com/show_bug.cgi?id=964342#c29
Mike Latimer