Bug ID: 1206794
Summary: Frequent Kernel Crashes due to problem in af_key.c
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 15.4
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: Manfred.Haertel@lotto-rlp.de
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

We experienced frequent kernel crashes on some of our openSUSE-based IPsec
gateways.

The crash dump looks like this:

[141226.000411][ T3410] BUG: kernel NULL pointer dereference, address:
0000000000000000
[141226.093721][ T3410] #PF: supervisor write access in kernel mode
[141226.166051][ T3410] #PF: error_code(0x0002) - not-present page
[141226.237334][ T3410] PGD 0 P4D 0
[141226.277176][ T3410] Oops: 0002 [#1] PREEMPT SMP PTI
[141226.336929][ T3410] CPU: 13 PID: 3410 Comm: pluto Tainted: G          I    
  N 5.14.21-150400.24.33-default #1 SLE15-SP4
018fa6ac9e6418760457307f6579fc12335772aa
[141226.513033][ T3410] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380
Gen10, BIOS U30 11/13/2019
[141226.625191][ T3410] RIP: 0010:__xfrm_state_delete+0xc0/0x1e0
[141226.694385][ T3410] Code: 02 74 04 48 89 50 08 8b 93 cc 00 00 00 48 b8 22
01 00 00 00 00 ad de 48 89 43 20 85 d2 74 22 48 8b 43 38 48 8b 53 40 48 85 c0
<48> 89 02 74 04 48 89 50 08 48 b8 22 01 00 00 00 00 ad de 48 89 43
[141226.930232][ T3410] RSP: 0018:ffffb6bd410879d0 EFLAGS: 00010246
[141227.002562][ T3410] RAX: 0000000000000000 RBX: ffff9c29c543a700 RCX:
00000000fffffffd
[141227.097953][ T3410] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffffffffb84e1700
[141227.193343][ T3410] RBP: ffffffffb84e0840 R08: 0000000000000001 R09:
0000000000000002
[141227.288736][ T3410] R10: 0000000000000002 R11: 0000000000000032 R12:
ffffffffb84e1700
[141227.384127][ T3410] R13: 0000000000000000 R14: 0000000000000000 R15:
ffffffffb84e0840
[141227.479517][ T3410] FS:  00007fe961572040(0000) GS:ffff9c2c70140000(0000)
knlGS:0000000000000000
[141227.586435][ T3410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[141227.665053][ T3410] CR2: 0000000000000000 CR3: 0000000110a4c001 CR4:
00000000007706e0
[141227.760443][ T3410] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[141227.855831][ T3410] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[141227.951222][ T3410] PKRU: 55555554
[141227.993154][ T3410] Call Trace:
[141228.031944][ T3410]  <TASK>
[141228.066541][ T3410]  xfrm_state_delete+0x1e/0x40
[141228.123155][ T3410]  xfrm_del_sa+0xcf/0x130 [xfrm_user
056771028dad419158fe83c515fe22924b96b850]
[141228.230080][ T3410]  xfrm_user_rcv_msg+0x26c/0x2e0 [xfrm_user
056771028dad419158fe83c515fe22924b96b850]
[141228.344341][ T3410]  ? __kmalloc_node_track_caller+0x1ba/0x3b0
[141228.415632][ T3410]  ? netlink_deliver_tap+0x3a/0x1f0
[141228.477490][ T3410]  ? copy_to_user_tmpl.part.21+0x150/0x150 [xfrm_user
056771028dad419158fe83c515fe22924b96b850]
[141228.602229][ T3410]  netlink_rcv_skb+0x4e/0x100
[141228.657796][ T3410]  xfrm_netlink_rcv+0x30/0x40 [xfrm_user
056771028dad419158fe83c515fe22924b96b850]
[141228.768914][ T3410]  netlink_unicast+0x1b3/0x280
[141228.825527][ T3410]  netlink_sendmsg+0x320/0x450
[141228.882143][ T3410]  sock_sendmsg+0x5c/0x70
[141228.933515][ T3410]  sock_write_iter+0x97/0x100
[141228.989082][ T3410]  new_sync_write+0x1a1/0x1c0
[141229.044648][ T3410]  vfs_write+0x220/0x280
[141229.094972][ T3410]  ksys_write+0x50/0xe0
[141229.144246][ T3410]  ? do_syscall_64+0x67/0x80
[141229.198758][ T3410]  ? __x64_sys_poll+0x37/0x140
[141229.255372][ T3410]  do_syscall_64+0x58/0x80
[141229.307788][ T3410]  ? __task_pid_nr_ns+0x97/0xb0
[141229.365449][ T3410]  ? syscall_exit_to_user_mode+0x18/0x40
[141229.432544][ T3410]  ? do_syscall_64+0x67/0x80
[141229.487061][ T3410]  ? do_syscall_64+0x67/0x80
[141229.541572][ T3410]  ? do_syscall_64+0x67/0x80
[141229.596083][ T3410]  ? irq_exit_rcu+0x41/0xc0
[141229.649553][ T3410]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[141229.719793][ T3410] RIP: 0033:0x7fe960995af3
[141229.772207][ T3410] Code: 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05
<48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89
[141230.008050][ T3410] RSP: 002b:00007ffd0916cad8 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[141230.108679][ T3410] RAX: ffffffffffffffda RBX: 00007ffd0916cf80 RCX:
00007fe960995af3
[141230.204070][ T3410] RDX: 0000000000000028 RSI: 00007ffd0916cf80 RDI:
0000000000000017
[141230.299458][ T3410] RBP: 0000000000000028 R08: 00007ffd0916d4f0 R09:
00007ffd0916cf80
[141230.394847][ T3410] R10: 00007ffd0916d23b R11: 0000000000000246 R12:
000055d3d8f12139
[141230.490237][ T3410] R13: 00007ffd0916d4f0 R14: 00007ffd0916d400 R15:
00000000680f3536
[141230.585630][ T3410]  </TASK>
[141230.621277][ T3410] Modules linked in: binfmt_misc authenc echainiv cmac
rmd160 camellia_generic camellia_aesni_avx2 camellia_aesni_avx_x86_64
camellia_x86_64 cast6_avx_x86_64 cast6_generic cast5_avx_x86_64 cast5_generic
cast_common deflate gcm ccm serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64
serpent_generic blowfish_generic blowfish_x86_64 blowfish_common
twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64
twofish_common xcbc sha256_ssse3 sha512_ssse3 des_generic libdes xfrm_user ah6
ah4 esp6 esp4 xfrm4_tunnel tunnel4 ipcomp ipcomp6 xfrm6_tunnel xfrm_ipcomp
tunnel6 af_key xfrm_algo nfnetlink bluetooth ecdh_generic iptable_nat xt_nat
nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 nf_defrag_ipv6 nf_defrag_ipv4
libcrc32c xt_LOG nf_log_syslog iptable_filter xt_policy xt_TCPMSS xt_tcpudp
bpfilter af_packet iscsi_ibft iscsi_boot_sysfs rfkill dmi_sysfs intel_rapl_msr
intel_rapl_common ipmi_ssif nfit libnvdimm x86_pkg_temp_thermal
intel_powerclamp coretemp
[141230.621341][ T3410]  kvm_intel nls_iso8859_1 kvm nls_cp437 vfat fat cdc_eem
ses usbnet pl2303 usbserial irqbypass pcspkr efi_pstore(N) enclosure mii tg3
mei_me libphy mei lpc_ich mfd_core hpilo acpi_ipmi intel_pch_thermal ioatdma
ipmi_si dca ipmi_devintf ipmi_msghandler acpi_tad(N) button fuse configfs
ip_tables x_tables ext4 crc16 mbcache jbd2 hid_generic usbhid uas usb_storage
sr_mod sd_mod cdrom t10_pi crc32_pclmul crc32c_intel ghash_clmulni_intel
mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec
rc_core xhci_pci aesni_intel xhci_pci_renesas drm xhci_hcd ahci smartpqi
crypto_simd cryptd libahci ehci_pci ehci_hcd scsi_transport_sas usbcore libata
i2c_algo_bit hpwdt wmi sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc
scsi_dh_alua scsi_mod msr efivarfs
[141232.491336][ T3410] Supported: No, Unsupported modules are loaded
[141232.565764][ T3410] CR2: 0000000000000000
[141232.615034][ T3410] ---[ end trace 7154d1cff8910ed6 ]---
[141232.759152][ T3410] RIP: 0010:__xfrm_state_delete+0xc0/0x1e0
[141232.828350][ T3410] Code: 02 74 04 48 89 50 08 8b 93 cc 00 00 00 48 b8 22
01 00 00 00 00 ad de 48 89 43 20 85 d2 74 22 48 8b 43 38 48 8b 53 40 48 85 c0
<48> 89 02 74 04 48 89 50 08 48 b8 22 01 00 00 00 00 ad de 48 89 43
[141233.064192][ T3410] RSP: 0018:ffffb6bd410879d0 EFLAGS: 00010246
[141233.136524][ T3410] RAX: 0000000000000000 RBX: ffff9c29c543a700 RCX:
00000000fffffffd
[141233.231919][ T3410] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffffffffb84e1700
[141233.327309][ T3410] RBP: ffffffffb84e0840 R08: 0000000000000001 R09:
0000000000000002
[141233.422697][ T3410] R10: 0000000000000002 R11: 0000000000000032 R12:
ffffffffb84e1700
[141233.518087][ T3410] R13: 0000000000000000 R14: 0000000000000000 R15:
ffffffffb84e0840
[141233.613477][ T3410] FS:  00007fe961572040(0000) GS:ffff9c2c70140000(0000)
knlGS:0000000000000000
[141233.720401][ T3410] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[141233.799019][ T3410] CR2: 0000000000000000 CR3: 0000000110a4c001 CR4:
00000000007706e0
[141233.894409][ T3410] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[141233.989799][ T3410] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[141234.085189][ T3410] PKRU: 55555554
[141234.127122][ T3410] Kernel panic - not syncing: Fatal exception in
interrupt
[141234.241007][ T3410] Kernel Offset: 0x35800000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[141234.450223][ T3410] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
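
A note on reading the dump: the faulting bytes `<48> 89 02` are a store of RAX
through RDX, and RDX is 0 in the register dump, which matches the reported
write to address 0. The loads from RBX+0x38 and RBX+0x40 just before it, and
the constant 0xdead000000000122 just after it, look like a hash-list unlink of
a node whose back pointer is NULL, that is, a node that was never added to its
chain. The following is only a userspace sketch of that pattern, not kernel
code; the names hnode, hhead, hnode_add_head and hnode_del_checked are made up
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for a kernel-style hash list (names are illustrative). */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;   /* address of the pointer that points at this node */
};

struct hhead {
    struct hnode *first;
};

/* Insert n at the front of the chain headed by h. */
static void hnode_add_head(struct hnode *n, struct hhead *h)
{
    assert(n != NULL && h != NULL);
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* A node that was never inserted still has pprev == NULL. */
static bool hnode_unhashed(const struct hnode *n)
{
    return n->pprev == NULL;
}

/*
 * Unlink n from its chain. An unguarded unlink performs "*pprev = next"
 * first; with pprev == NULL that is a write to address 0. This sketch
 * checks for that case and reports whether anything was unlinked.
 */
static bool hnode_del_checked(struct hnode *n)
{
    if (hnode_unhashed(n))
        return false;       /* the unguarded version would fault here */

    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
    return true;
}
```

Calling hnode_del_checked() on a zero-initialised node returns false without
touching memory, while a node that went through hnode_add_head() unlinks
normally. The sketch only illustrates the failure mode visible in the dump.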

We found that this is a known problem in af_key.c, for which a one-line
patch exists:

https://patchwork.kernel.org/project/netdevbpf/patch/20221102101848.ibvumaxg2jdvk52y@intra2net.com/

However, the current openSUSE Leap 15.4 kernel does not appear to include
this patch.

Applying the patch manually fixed the problem for us.

We would like to see this patch included in the openSUSE Leap 15.4 kernel.

