New subject: [Bug 1200259] Kernel Panic in bfq_bic_update_cgroup while fstrim after Update to Kernel 5.3.18-150300.59.68.1

6 Jun 2022


      http://bugzilla.opensuse.org/show_bug.cgi?id=1200259


            Bug ID: 1200259
           Summary: Kernel Panic after Update to 5.3.18-150300.59.68.1
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.3
          Hardware: x86-64
                OS: openSUSE Leap 15.3
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: fgruener@web.de
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Created attachment 859446
  --> http://bugzilla.opensuse.org/attachment.cgi?id=859446&action=edit
dmesg from crash>log

After upgrading the default kernel ("kernel-default"), provided via the regular
update from the official openSUSE Leap 15.3 (upstream SLES) repository, from
5.3.18-150300.59.63.1-default to the newest Leap standard kernel
5.3.18-150300.59.68.1-default, I have already encountered several kernel panics.
They occurred 2 to 3 times within a single day, but most recently it took
2 to 3 days until the panic recurred.

I have not yet found a way to reproduce the issue step by step; it occurs
during normal use of my PC.

crash> log revealed a null pointer exception. Please see the attached dmesg.

Enabling kdump yielded the following backtrace:
crash> bt
PID: 23800  TASK: ffff8d1b32ae8000  CPU: 0   COMMAND: "fstrim"
 #0 [ffffb2134512f790] machine_kexec at ffffffffa7e6fe01
 #1 [ffffb2134512f7e8] __crash_kexec at ffffffffa7f595fd
 #2 [ffffb2134512f8b0] crash_kexec at ffffffffa7f5a4bd
 #3 [ffffb2134512f8c8] oops_end at ffffffffa7e36d3f
 #4 [ffffb2134512f8e8] no_context at ffffffffa7e82bbf
 #5 [ffffb2134512f950] do_page_fault at ffffffffa7e83e40
 #6 [ffffb2134512f980] page_fault at ffffffffa880130e
    [exception RIP: bfq_bio_bfqg+37]
    RIP: ffffffffa8277b55  RSP: ffffb2134512fa30  RFLAGS: 00010002
    RAX: 000000000000001f  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: ffff8d1b8f614e00  RSI: ffff8d1b7fd47200  RDI: ffff8d1b7fd47200
    RBP: ffff8d1887ff0800   R8: ffff8d199aeb54b8   R9: ffff8d199aeb5488
    R10: 0000000000000000  R11: ffff8d1ab7742e00  R12: ffff8d17c7744640
    R13: ffff8d1887ff0800  R14: ffff8d1b7fd47200  R15: ffff8d1b8d91a894
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb2134512fa40] bfq_bic_update_cgroup at ffffffffa8277e78
 #8 [ffffb2134512fa78] bfq_bio_merge at ffffffffa826ee9f
 #9 [ffffb2134512fad0] blk_mq_submit_bio at ffffffffa8248769
#10 [ffffb2134512fb58] submit_bio_noacct at ffffffffa823c343
#11 [ffffb2134512fbe8] submit_bio at ffffffffa823c3db
#12 [ffffb2134512fc38] submit_bio_wait at ffffffffa8234cc4
#13 [ffffb2134512fc78] blkdev_issue_discard at ffffffffa8243d20
#14 [ffffb2134512fd08] ext4_trim_fs at ffffffffc079a7ea [ext4]
#15 [ffffb2134512fe10] ext4_ioctl at ffffffffc0790ef6 [ext4]
#16 [ffffb2134512fef8] ksys_ioctl at ffffffffa80fadc2
#17 [ffffb2134512ff30] __x64_sys_ioctl at ffffffffa80fadf6
#18 [ffffb2134512ff38] do_syscall_64 at ffffffffa7e0538b
#19 [ffffb2134512ff50] entry_SYSCALL_64_after_hwframe at ffffffffa880008c
    RIP: 00007f73ecd54c47  RSP: 00007ffc4e041c08  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 000055764b87c310  RCX: 00007f73ecd54c47
    RDX: 00007ffc4e041c20  RSI: 00000000c0185879  RDI: 0000000000000003
    RBP: 000055764b8788c0   R8: 000055764b87c310   R9: 0000000000000002
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007ffc4e041d00
    R13: 000055764b8788c0  R14: 0000000000000003  R15: 000055764b87a890
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
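
To make the call path in the backtrace above easier to scan, the frames can be reduced to their function names. The following is a minimal sketch, assuming the `crash> bt` frame layout shown above (`#N [stack address] function at address [module]`). The names `FRAME_RE`, `parse_frames` and `SAMPLE` are illustrative and not part of the crash utility; the sample lines are copied from the backtrace.

```python
import re

# Matches crash "bt" frame lines such as:
#   " #7 [ffffb2134512fa40] bfq_bic_update_cgroup at ffffffffa8277e78"
# The trailing "[module]" part is optional (present for ext4 frames).
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_frames(bt_text):
    """Return a list of (frame_number, function, module_or_None) tuples."""
    frames = []
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line)
        if match:  # register dumps and "[exception RIP: ...]" lines are skipped
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


# A few lines copied from the backtrace in this report.
SAMPLE = """\
 #6 [ffffb2134512f980] page_fault at ffffffffa880130e
    [exception RIP: bfq_bio_bfqg+37]
 #7 [ffffb2134512fa40] bfq_bic_update_cgroup at ffffffffa8277e78
 #8 [ffffb2134512fa78] bfq_bio_merge at ffffffffa826ee9f
#14 [ffffb2134512fd08] ext4_trim_fs at ffffffffc079a7ea [ext4]
"""

if __name__ == "__main__":
    for number, function, module in parse_frames(SAMPLE):
        suffix = f" [{module}]" if module else ""
        print(f"#{number} {function}{suffix}")
```

Read from the bottom up, the extracted names show the path from the fstrim ioctl through ext4_trim_fs into bfq_bio_merge and bfq_bic_update_cgroup, where the page fault is raised.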

According to the changelog of the latest update, a patch was added to
address:
- bfq: Update cgroup information before merging bio (bsc#1197926).
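
The faulting frames (bfq_bio_merge, bfq_bic_update_cgroup) belong to the BFQ I/O scheduler, so it may help to note which block devices actually use bfq on the affected machine. The following is a minimal sketch, assuming the usual sysfs layout `/sys/block/<device>/queue/scheduler`, where the active scheduler is shown in square brackets. The helper names `active_scheduler` and `devices_using` are hypothetical.

```python
from pathlib import Path


def active_scheduler(line):
    """Return the scheduler name enclosed in [brackets], or None if absent."""
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def devices_using(name, sys_block=Path("/sys/block")):
    """List block devices whose active I/O scheduler equals `name`."""
    found = []
    for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
        try:
            content = sched_file.read_text()
        except OSError:
            continue  # device vanished or file not readable; skip it
        if active_scheduler(content) == name:
            # .../<device>/queue/scheduler -> <device>
            found.append(sched_file.parent.parent.name)
    return found


if __name__ == "__main__":
    print("Devices using bfq:", devices_using("bfq"))
```

Including this list in the report would show whether the device being trimmed is one that uses bfq; it is only a diagnostic aid and does not confirm the cause.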

Searching the internet for "bfq_bic_update_cgroup+0x28/0x1b0 core dump opensuse",
I also found:
https://lkml.kernel.org/linux-block/20220330124255.24581-2-jack@suse.cz/T/

That patch changes the same area of code that appears in the backtrace:
@@ -2457,10 +2457,17 @@ static bool bfq_bio_merge(struct request_queue *q, struct bio *bio,
+        bfq_bic_update_cgroup(bic, bio);

Maybe bio is not fully initialized?

My reference may be wrong, but it is notable that the kernel mailing list
post changes code in the same call path.

As this is my first report, I am not sure what other information might be
helpful here. Please ask if you need further information.

Thank you in advance for looking into this issue, and for openSUSE.

Best regards
cyp

-- 
You are receiving this mail because:
You are on the CC list for the bug.

    

[Bug 1200259] New: Kernel Panic after Update to 5.3.18-150300.59.68.1

bugzilla_noreply＠suse.com
