Comment # 41 on bug 1199355 from
Just for fun, I tried with only the patch from
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-next&id=5d05426e2d5fd7df8afc866b78c36b37b00188b7
on top of 5.18.0-suse.

This made the box crash much earlier: about 240 seconds into a build, and I'm
not even sure it was the "known bad" build job. Usually it crashes after about
9000 seconds, close to the end of the build.

The crash was also completely different:

[  126.798937] EXT4-fs (loop0): mounted filesystem with writeback data mode.
Quota mode: none.
[  393.252701] Unable to handle kernel paging request at virtual address
ffff066bfc222e00
[  393.252735] Mem abort info:
[  393.252739]   ESR = 0x96000004
[  393.252744]   EC = 0x25: DABT (current EL), IL = 32 bits
[  393.252749]   SET = 0, FnV = 0
[  393.252754]   EA = 0, S1PTW = 0
[  393.252757]   FSC = 0x04: level 0 translation fault
[  393.252762] Data abort info:
[  393.252766]   ISV = 0, ISS = 0x00000004
[  393.252770]   CM = 0, WnR = 0
[  393.252774] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000001cd22000
[  393.252780] [ffff066bfc222e00] pgd=0000000000000000, p4d=0000000000000000
[  393.252791] Internal error: Oops: 96000004 [#1] SMP
[  393.252799] Modules linked in: loop tun af_packet iscsi_ibft
iscsi_boot_sysfs nls_iso8859_1 nls_cp437 xfs vfat libcrc32c fat btsdio
cpufreq_dt hci_uart btqca brcmfmac btrtl btbcm brcmutil btintel bluetooth
cfg80211 ecdh_generic rfkill raspberrypi_cpufreq bcm2711_thermal broadcom
bcm_phy_lib iproc_rng200 genet mdio_bcm_unimac leds_gpio simpledrm
drm_shmem_helper nvmem_rmem drm_kms_helper efi_pstore syscopyarea sysfillrect
sysimgblt uio_pdrv_genirq fb_sys_fops uio fuse drm ip_tables x_tables ext4
mbcache jbd2 uas usb_storage xhci_pci xhci_pci_renesas xhci_hcd usbcore
usb_common bcm2835_dma crct10dif_ce clk_raspberrypi gpio_raspberrypi_exp
bcm2835_wdt raspberrypi_hwmon virt_dma sdhci_iproc gpio_regulator phy_generic
sdhci_pltfm pcie_brcmstb sdhci mmc_core fixed sg efivarfs
[  393.252941] CPU: 1 PID: 20 Comm: ksoftirqd/1 Kdump: loaded Not tainted
5.18.0-seife #4 openSUSE Tumbleweed (unreleased)
67de12037530e9d1ee4775af6e6ed4ae03465d51
[  393.252959] Hardware name: Unknown Unknown Product/Unknown Product, BIOS
2022.04 04/01/2022
[  393.252966] pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  393.252973] pc : percpu_ref_get_many+0x40/0xac
[  393.252996] lr : percpu_ref_get_many+0x1c/0xac
[  393.253002] sp : ffff8000080dbb00
[  393.253008] x29: ffff8000080dbb00 x28: ffff6f74002bc000 x27:
ffff6f7401893180
[  393.253018] x26: ffff6f7400189600 x25: fffffc0000000000 x24:
0001000000000000
[  393.253027] x23: ffff6f74052f9200 x22: 0000000000000000 x21:
ffff6f7401893180
[  393.253035] x20: ffff6f7401893180 x19: 0000000000000001 x18:
0000000000000000
[  393.253045] x17: 0000000000000001 x16: ffffd87d06c26ab0 x15:
0000000000000000
[  393.253053] x14: 0000000000000000 x13: 0000000000000030 x12:
0000000000000068
[  393.253062] x11: 0000000000000100 x10: 0000000000001ba0 x9 :
ffffd87d0695c860
[  393.253070] x8 : ffff6f74002bdc00 x7 : ffff96f7f7285000 x6 :
fffffffffffffcb0
[  393.253078] x5 : ffff6f74001a4000 x4 : fffffffffff51e78 x3 :
0000000000000000
[  393.253086] x2 : ffff96f7f7285000 x1 : ffff6f74002bc000 x0 :
ffff066bfc222e00
[  393.253097] Call trace:
[  393.253101]  percpu_ref_get_many+0x40/0xac
[  393.253108]  refill_obj_stock+0x6c/0x194
[  393.253118]  obj_cgroup_uncharge+0x20/0x2c
[  393.253127]  memcg_slab_free_hook+0xa4/0x190
[  393.253136]  kmem_cache_free+0x2d4/0x310
[  393.253144]  __d_free+0x28/0x34
[  393.253153]  rcu_do_batch+0x174/0x6f0
[  393.253162]  rcu_core+0x264/0x3fc
[  393.253171]  rcu_core_si+0x1c/0x30
[  393.253176]  __do_softirq+0x128/0x3b8
[  393.253183]  run_ksoftirqd+0x6c/0x94
[  393.253192]  smpboot_thread_fn+0x230/0x254
[  393.253205]  kthread+0x114/0x120
[  393.253211]  ret_from_fork+0x10/0x20
[  393.253236] Code: 11000442 b9001022 d538d082 8b020000 (c85f7c04) 
[  393.253253] SMP: stopping secondary CPUs
[  393.253304] Starting crashdump kernel...
[  393.253311] ------------[ cut here ]------------
[  393.253318] Some CPUs may be stale, kdump will be unreliable.
[  393.253332] WARNING: CPU: 1 PID: 20 at arch/arm64/kernel/machine_kexec.c:187
machine_kexec+0x48/0x220
[  393.253351] Modules linked in: loop tun af_packet iscsi_ibft
iscsi_boot_sysfs nls_iso8859_1 nls_cp437 xfs vfat libcrc32c fat btsdio
cpufreq_dt hci_uart btqca brcmfmac btrtl btbcm brcmutil btintel bluetooth
cfg80211 ecdh_generic rfkill raspberrypi_cpufreq bcm2711_thermal broadcom
bcm_phy_lib iproc_rng200 genet mdio_bcm_unimac leds_gpio simpledrm
drm_shmem_helper nvmem_rmem drm_kms_helper efi_pstore syscopyarea sysfillrect
sysimgblt uio_pdrv_genirq fb_sys_fops uio fuse drm ip_tables x_tables ext4
mbcache jbd2 uas usb_storage xhci_pci xhci_pci_renesas xhci_hcd usbcore
usb_common bcm2835_dma crct10dif_ce clk_raspberrypi gpio_raspberrypi_exp
bcm2835_wdt raspberrypi_hwmon virt_dma sdhci_iproc gpio_regulator phy_generic
sdhci_pltfm pcie_brcmstb sdhci mmc_core fixed sg efivarfs
[  393.253473] CPU: 1 PID: 20 Comm: ksoftirqd/1 Kdump: loaded Not tainted
5.18.0-seife #4 openSUSE Tumbleweed (unreleased)
67de12037530e9d1ee4775af6e6ed4ae03465d51
[  393.253485] Hardware name: Unknown Unknown Product/Unknown Product, BIOS
2022.04 04/01/2022
[  393.253492] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  393.253501] pc : machine_kexec+0x48/0x220
[  393.253510] lr : machine_kexec+0x48/0x220
[  393.253519] sp : ffff8000080db600
[  393.253525] x29: ffff8000080db600 x28: ffff8000080db863 x27:
ffffd87d078c5178
[  393.253537] x26: ffffd87d078c5170 x25: ffffd87d0695c888 x24:
0000000000000001
[  393.253548] x23: 0000000000000001 x22: ffff8000080db668 x21:
ffffd87d0875a000
[  393.253558] x20: ffff6f7401f52000 x19: ffff6f7401f52000 x18:
ffffffffffffffff
[  393.253569] x17: 3065323232636662 x16: 3636306666666620 x15:
0720072007200720
[  393.253580] x14: 0720072007200720 x13: 0720072007200720 x12:
0720072007200720
[  393.253591] x11: 00000000ffffdfff x10: ffffd87d08489808 x9 :
ffffd87d0670290c
[  393.253602] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 :
00000000000affa8
[  393.253613] x5 : ffff6f74fef5db48 x4 : 0000000000000000 x3 :
0000000000000027
[  393.253623] x2 : 0000000000000000 x1 : 0000000000000000 x0 :
ffff6f74002bc000
[  393.253634] Call trace:
[  393.253640]  machine_kexec+0x48/0x220
[  393.253650]  __crash_kexec+0x80/0x130
[  393.253660]  crash_kexec+0x88/0xa0
[  393.253669]  die+0x16c/0x234
[  393.253678]  die_kernel_fault+0x38c/0x39c
[  393.253687]  __do_kernel_fault+0xf0/0x224
[  393.253695]  do_translation_fault+0xa0/0xd0
[  393.253704]  do_mem_abort+0x4c/0xa0
[  393.253712]  el1_abort+0x74/0xdc
[  393.253721]  el1h_64_sync_handler+0xa4/0xd0
[  393.253731]  el1h_64_sync+0x78/0x7c
[  393.253739]  percpu_ref_get_many+0x40/0xac
[  393.253747]  refill_obj_stock+0x6c/0x194
[  393.253757]  obj_cgroup_uncharge+0x20/0x2c
[  393.253766]  memcg_slab_free_hook+0xa4/0x190
[  393.253776]  kmem_cache_free+0x2d4/0x310
[  393.253784]  __d_free+0x28/0x34
[  393.253794]  rcu_do_batch+0x174/0x6f0
[  393.253802]  rcu_core+0x264/0x3fc
[  393.253810]  rcu_core_si+0x1c/0x30
[  393.253818]  __do_softirq+0x128/0x3b8
[  393.253826]  run_ksoftirqd+0x6c/0x94
[  393.253835]  smpboot_thread_fn+0x230/0x254
[  393.253845]  kthread+0x114/0x120
[  393.253853]  ret_from_fork+0x10/0x20
[  393.253861] ---[ end trace 0000000000000000 ]---
[  393.253869] Bye!

The loop0 message is from initializing the build VM; the next message is
already the crash.
So at least the one patch in linux-next alone definitely does not fix the issue
;-)
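
For anyone reading the oops by hand: the "Mem abort info" and "Data abort
info" lines are just the fields of the ESR value. Below is a minimal sketch that
splits ESR = 0x96000004 into those fields, assuming the usual arm64 ESR_ELx
data-abort layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0). The
function name decode_esr is made up for illustration and is not a kernel API.

```python
def decode_esr(esr: int) -> dict:
    """Split an arm64 ESR_ELx value into the fields the oops prints.

    Assumes the data-abort ISS layout; other exception classes use
    the ISS bits differently.
    """
    iss = esr & 0x1FFFFFF           # ISS: bits 24:0
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class, bits 31:26
        "IL": (esr >> 25) & 0x1,    # 1 = 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,   # instruction syndrome valid
        "SET": (iss >> 11) & 0x3,   # synchronous error type
        "FnV": (iss >> 10) & 0x1,   # FAR not valid
        "EA": (iss >> 9) & 0x1,     # external abort type
        "CM": (iss >> 8) & 0x1,     # cache maintenance
        "S1PTW": (iss >> 7) & 0x1,  # stage 2 fault on stage 1 walk
        "WnR": (iss >> 6) & 0x1,    # 0 = read, 1 = write
        "FSC": iss & 0x3F,          # fault status code, bits 5:0
    }


if __name__ == "__main__":
    for name, value in decode_esr(0x96000004).items():
        print(f"{name:5} = {value:#x}")
```

With the value from this crash, that gives EC = 0x25 (data abort at the current
EL), WnR = 0 (a read) and FSC = 0x04 (level 0 translation fault), matching what
the kernel printed.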

