[Bug 1219406] New: [BUG] kernel NULL pointer dereference with Linux 6.7.1-2-default
https://bugzilla.suse.com/show_bug.cgi?id=1219406

Bug ID: 1219406
Summary: [BUG] kernel NULL pointer dereference with Linux 6.7.1-2-default
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: kostas.peletidis@suse.com
QA Contact: qa-bugs@suse.de
Target Milestone: ---
Found By: ---
Blocker: ---

I just saw this bug on my work laptop. The desktop froze and only the mouse pointer was responsive. I connected to the laptop from another machine and noticed that Xorg was a zombie process. I also saw the following kernel messages:

[ 8550.326847] pcieport 0000:00:08.1: PME: Spurious native interrupt!
[13172.962208] BUG: kernel NULL pointer dereference, address: 0000000000000000
[13172.962228] #PF: supervisor read access in kernel mode
[13172.962235] #PF: error_code(0x0000) - not-present page
[13172.962243] PGD 0 P4D 0
[13172.962255] Oops: 0000 [#1] PREEMPT SMP NOPTI
[13172.962266] CPU: 11 PID: 2019 Comm: Xorg.bin Not tainted 6.7.1-2-default #1 openSUSE Tumbleweed d50116cfdb1b14a701e904c894d8f1c040bf1146
[13172.962281] Hardware name: LENOVO 20XGS0V508/20XGS0V508, BIOS R1NET47W (1.17) 12/21/2021
[13172.962289] RIP: 0010:drm_mode_rmfb+0xb6/0x1c0
[13172.962308] Code: 00 00 4c 89 ef e8 7a 0e 3e 00 48 8b 83 98 00 00 00 48 2d 98 00 00 00 48 39 c3 0f 84 eb 00 00 00 31 d2 b9 01 00 00 00 4c 39 e0 <48> 8b 80 98 00 00 00 0f 44 d1 48 2d 98 00 00 00 48 39 c3 75 e8 85
[13172.962317] RSP: 0018:ffffa86fc2bbfc80 EFLAGS: 00010202
[13172.962327] RAX: ffffffffffffff68 RBX: ffff941bc60f1800 RCX: 0000000000000001
[13172.962334] RDX: 0000000000000001 RSI: ffff941bc2004920 RDI: ffff941bc60f18a8
[13172.962341] RBP: ffff941e7352b318 R08: ffff941bc2004b18 R09: ffff941c88c80200
[13172.962347] R10: 0000000000000000 R11: 0000000000000000 R12: ffff941e7352b300
[13172.962354] R13: ffff941bc60f18a8 R14: ffffa86fc2bbfd68 R15: 0000000000000004
[13172.962361] FS: 00007faa06805980(0000) GS:ffff941e9ef80000(0000) knlGS:0000000000000000
[13172.962368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13172.962374] CR2: 0000000000000000 CR3: 00000001045ce000 CR4: 0000000000750ef0
[13172.962379] PKRU: 55555554
[13172.962384] Call Trace:
[13172.962390] <TASK>
[13172.962403] ? __die+0x23/0x70
[13172.962423] ? page_fault_oops+0x14d/0x490
[13172.962434] ? ttwu_queue_wakelist+0xef/0x110
[13172.962446] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.962468] ? exc_page_fault+0x71/0x160
[13172.962480] ? asm_exc_page_fault+0x26/0x30
[13172.962495] ? drm_mode_rmfb+0xb6/0x1c0
[13172.962508] ? __pfx_drm_mode_rmfb_ioctl+0x10/0x10
[13172.962516] drm_ioctl_kernel+0xce/0x170
[13172.962525] ? __pfx_drm_mode_page_flip_ioctl+0x10/0x10
[13172.962543] drm_ioctl+0x256/0x490
[13172.962552] ? __pfx_drm_mode_rmfb_ioctl+0x10/0x10
[13172.962561] ? __pfx_drm_mode_page_flip_ioctl+0x10/0x10
[13172.962580] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu c19de16ba0fd72478b307639f09a9c13c52c8d28]
[13172.963085] __x64_sys_ioctl+0x97/0xd0
[13172.963098] do_syscall_64+0x64/0xe0
[13172.963108] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963116] ? syscall_exit_to_user_mode+0x2b/0x40
[13172.963122] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963129] ? do_syscall_64+0x70/0xe0
[13172.963137] ? switch_fpu_return+0x50/0xe0
[13172.963147] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963154] ? exit_to_user_mode_prepare+0x142/0x1f0
[13172.963165] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963172] ? syscall_exit_to_user_mode+0x2b/0x40
[13172.963178] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963185] ? do_syscall_64+0x70/0xe0
[13172.963192] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963199] ? do_syscall_64+0x70/0xe0
[13172.963206] ? syscall_exit_to_user_mode+0x2b/0x40
[13172.963212] ? srso_alias_return_thunk+0x5/0xfbef5
[13172.963219] ? do_syscall_64+0x70/0xe0
[13172.963227] ? __irq_exit_rcu+0x3b/0xb0
[13172.963242] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[13172.963254] RIP: 0033:0x7faa067139ef
[13172.963332] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[13172.963338] RSP: 002b:00007fff5e104280 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[13172.963346] RAX: ffffffffffffffda RBX: 00005608e31df550 RCX: 00007faa067139ef
[13172.963351] RDX: 00007fff5e10431c RSI: 00000000c00464af RDI: 000000000000000e
[13172.963355] RBP: 00007fff5e10431c R08: 00000005608e3575 R09: 0000000000000007
[13172.963360] R10: 00005608e35751a0 R11: 0000000000000246 R12: 00000000c00464af
[13172.963364] R13: 000000000000000e R14: 00005608e12f3ff0 R15: 0000000000000040
[13172.963377] </TASK>
[13172.963381] Modules linked in: tun rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast ccm af_packet nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink qrtr cmac algif_hash algif_skcipher af_alg bnep msr binfmt_misc snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 nls_iso8859_1 snd_soc_dmic snd_acp3x_pdm_dma snd_acp3x_rn snd_sof_amd_acp63 nls_cp437 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir vfat snd_sof_amd_acp fat snd_ctl_led snd_sof_pci snd_sof_xtensa_dsp snd_hda_codec_realtek mt7921e snd_sof mt7921_common snd_hda_codec_generic btusb mt792x_lib snd_sof_utils btrtl mt76_connac_lib snd_hda_codec_hdmi btintel uvcvideo snd_soc_core intel_rapl_msr btbcm intel_rapl_common mt76 videobuf2_vmalloc btmtk snd_compress uvc snd_pcm_dmaengine snd_hda_intel videobuf2_memops edac_mce_amd bluetooth videobuf2_v4l2 snd_pci_ps snd_intel_dspcfg snd_intel_sdw_acpi
[13172.963548] snd_rpl_pci_acp6x mac80211 videodev r8169 snd_acp_pci libarc4 thinkpad_acpi kvm_amd snd_acp_legacy_common snd_hda_codec snd_pci_acp6x videobuf2_common snd_pci_acp5x snd_hda_core realtek ecdh_generic mc ledtrig_audio snd_hwdep kvm mdio_devres cfg80211 snd_rn_pci_acp3x snd_pcm think_lmi platform_profile snd_acp_config irqbypass firmware_attributes_class snd_timer snd_soc_acpi wmi_bmof tiny_power_button efi_pstore libphy rfkill k10temp snd_pci_acp3x i2c_piix4 snd thermal soundcore ac joydev button nvme_fabrics fuse configfs dmi_sysfs ip_tables x_tables usbhid amdgpu crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched nvme drm_suballoc_helper xhci_pci drm_buddy xhci_pci_renesas ucsi_acpi hid_multitouch drm_display_helper nvme_core xhci_hcd aesni_intel cec typec_ucsi video hid_generic nvme_auth crypto_simd cryptd usbcore ccp roles rc_core t10_pi typec sp5100_tco battery
[13172.963741] i2c_hid_acpi wmi i2c_hid serio_raw btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq br_netfilter bridge stp llc efivarfs
[13172.963781] CR2: 0000000000000000
[13172.963787] ---[ end trace 0000000000000000 ]---
[13172.963792] RIP: 0010:drm_mode_rmfb+0xb6/0x1c0
[13172.963801] Code: 00 00 4c 89 ef e8 7a 0e 3e 00 48 8b 83 98 00 00 00 48 2d 98 00 00 00 48 39 c3 0f 84 eb 00 00 00 31 d2 b9 01 00 00 00 4c 39 e0 <48> 8b 80 98 00 00 00 0f 44 d1 48 2d 98 00 00 00 48 39 c3 75 e8 85
[13172.963807] RSP: 0018:ffffa86fc2bbfc80 EFLAGS: 00010202
[13172.963813] RAX: ffffffffffffff68 RBX: ffff941bc60f1800 RCX: 0000000000000001
[13172.963818] RDX: 0000000000000001 RSI: ffff941bc2004920 RDI: ffff941bc60f18a8
[13172.963822] RBP: ffff941e7352b318 R08: ffff941bc2004b18 R09: ffff941c88c80200
[13172.963827] R10: 0000000000000000 R11: 0000000000000000 R12: ffff941e7352b300
[13172.963831] R13: ffff941bc60f18a8 R14: ffffa86fc2bbfd68 R15: 0000000000000004
[13172.963836] FS: 00007faa06805980(0000) GS:ffff941e9ef80000(0000) knlGS:0000000000000000
[13172.963841] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13172.963846] CR2: 0000000000000000 CR3: 00000001045ce000 CR4: 0000000000750ef0
[13172.963851] PKRU: 55555554
[13172.963856] note: Xorg.bin[2019] exited with irqs disabled

--
You are receiving this mail because:
You are on the CC list for the bug.
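The register dump in the oops above can be cross-checked against the reported fault address with a little arithmetic. The sketch below is only an illustration, not part of the original report: it assumes the bytes at the fault marker, <48> 8b 80 98 00 00 00, decode as `mov 0x98(%rax),%rax` (a load from RAX + 0x98), and the helper name `parse_rip` is hypothetical.

```python
import re

MASK64 = (1 << 64) - 1  # registers wrap around at 64 bits


def parse_rip(line):
    """Split 'RIP: 0010:symbol+0xoff/0xsize' into (symbol, offset, size)."""
    m = re.search(r"RIP: [0-9a-f]{4}:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        raise ValueError("no kernel-mode RIP found in line")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


# Values copied from the oops in the report.
rip_line = "RIP: 0010:drm_mode_rmfb+0xb6/0x1c0"
rax = 0xFFFFFFFFFFFFFF68
cr2 = 0x0000000000000000

# Assumption: the faulting instruction reads memory at RAX + 0x98.
displacement = 0x98

symbol, offset, size = parse_rip(rip_line)
fault_address = (rax + displacement) & MASK64

print(f"fault in {symbol} at byte {offset:#x} of {size:#x}")
print(f"computed address {fault_address:#018x}, CR2 {cr2:#018x}")
```

Under that assumption the computed address matches CR2. RAX equals -0x98 as a signed 64-bit value, and the preceding bytes 48 2d 98 00 00 00 decode as `sub $0x98,%rax`, which would be consistent with a NULL pointer having had 0x98 subtracted from it before the load, although the trace alone does not show which structure was involved.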
https://bugzilla.suse.com/show_bug.cgi?id=1219406#c1

--- Comment #1 from Kostas Peletidis <kostas.peletidis@suse.com> ---
According to this email:
https://lkml.iu.edu/hypermail/linux/kernel/2401.3/05636.html
a very similar issue has been fixed recently:

"So we had a number of small annoying issues in rc1, including an amdgpu scheduling bug that could cause a hung desktop (that would *eventually* recover, but after a long enough timeout that most people probably ended up rebooting instead. That one seems to have hit a fair number of people."

Therefore, the fix for this bug may be this commit which has been included in Linux v6.8-rc2:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.8-rc2&id=bc8f6d42b1334f486980d57c8d12f3128d30c2e3
https://bugzilla.suse.com/show_bug.cgi?id=1219406#c2

Takashi Iwai <tiwai@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Flags| |needinfo?(kostas.peletidis@suse.com)
CC| |kostas.peletidis@suse.com, tiwai@suse.com

--- Comment #2 from Takashi Iwai <tiwai@suse.com> ---
(In reply to Kostas Peletidis from comment #1)
> According to this email:
> https://lkml.iu.edu/hypermail/linux/kernel/2401.3/05636.html
> a very similar issue has been fixed recently:
>
> "So we had a number of small annoying issues in rc1, including an amdgpu scheduling bug that could cause a hung desktop (that would *eventually* recover, but after a long enough timeout that most people probably ended up rebooting instead. That one seems to have hit a fair number of people."
>
> Therefore, the fix for this bug may be this commit which has been included in Linux v6.8-rc2:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.8-rc2&id=bc8f6d42b1334f486980d57c8d12f3128d30c2e3

This commit is likely irrelevant. But there have been lots of fixes in 6.7.x stable. Please try 6.7.3 (or later) from the OBS Kernel:stable repo:
http://download.opensuse.org/repositories/Kernel:/stable/standard/
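
A quick way to tell whether a given machine already runs a kernel at or above the suggested 6.7.3 is to compare the numeric part of the release string. This is a minimal sketch, not something from the thread: the helper names `parse_release` and `is_at_least` are hypothetical, and it assumes the release string starts with dot-separated numbers, as in "6.7.1-2-default".

```python
import re


def parse_release(release):
    """Return (major, minor, patch) from a string such as '6.7.1-2-default'."""
    m = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # Missing components (e.g. '6.8') are treated as zero.
    return tuple(int(part) if part else 0 for part in m.groups())


def is_at_least(release, minimum):
    """True if the release is the same as or newer than the minimum tuple."""
    return parse_release(release) >= minimum


MINIMUM = (6, 7, 3)  # the version suggested in comment #2

# The kernel from the original report is older than the suggested one.
print(is_at_least("6.7.1-2-default", MINIMUM))
# A hypothetical later release string, for comparison.
print(is_at_least("6.7.4-1-default", MINIMUM))
```

On a live system the release string could be taken from `platform.release()` in the Python standard library or from `uname -r`.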
https://bugzilla.suse.com/show_bug.cgi?id=1219406#c3

--- Comment #3 from Kostas Peletidis <kostas.peletidis@suse.com> ---
Indeed. I don't have access to my work laptop these days, but I found a TW virtual machine and saw that the commit I was hoping would fix the bug involves a file that doesn't exist in 6.7.2. So, although it fixes a null pointer dereference, it doesn't address the bug I saw.

I'll try a more recent kernel when I return from my leave.
https://bugzilla.suse.com/show_bug.cgi?id=1219406#c4

Kostas Peletidis <kostas.peletidis@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Flags|needinfo?(kostas.peletidis@suse.com) |needinfo?(tiwai@suse.com)

--- Comment #4 from Kostas Peletidis <kostas.peletidis@suse.com> ---
I haven't seen this bug with more recent kernels. Shall we close?
https://bugzilla.suse.com/show_bug.cgi?id=1219406#c5

Takashi Iwai <tiwai@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Flags|needinfo?(tiwai@suse.com) |
Resolution|--- |FIXED

--- Comment #5 from Takashi Iwai <tiwai@suse.com> ---
Yes, let's close, then.