Comment # 19 on bug 1213721 from LTC BugProxy
== Comment: #0 - Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com> -
2023-09-25 03:52:57 ==
Description :

After configuring kdump on my LPAR eonlp66 and triggering a crash, no vmcore
is generated on the LPAR.

FYI

I updated grub.cfg to add a crashkernel value of 8192M, ran
transactional-update grub.cfg, and rebooted the system.

eonlp66:~ # transactional-update grub.cfg
Checking for newer version.
transactional-update 4.3.0 started
Options: grub.cfg
Separate /var detected.
2023-09-21 14:17:21 tukit 4.3.0 started
2023-09-21 14:17:21 Options: -c2 open
2023-09-21 14:17:21 Using snapshot 2 as base for new snapshot 3.
2023-09-21 14:17:21 Syncing /etc of previous snapshot 1 as base into new
snapshot "/.snapshots/3/snapshot"
2023-09-21 14:17:21 SELinux is enabled.
ID: 3
2023-09-21 14:17:23 Transaction completed.
Creating a new grub2 config
2023-09-21 14:17:23 tukit 4.3.0 started
2023-09-21 14:17:23 Options: call 3 bash -c /usr/sbin/grub2-mkconfig >
/boot/grub2/grub.cfg
2023-09-21 14:17:25 Executing `bash -c /usr/sbin/grub2-mkconfig >
/boot/grub2/grub.cfg`:
Generating grub configuration file ...
Found linux image: /boot/vmlinux-6.4.11-1-default
Found initrd image: /boot/initrd-6.4.11-1-default
Warning: os-prober will not be executed to detect other bootable partitions.
Systems on them will not be added to the GRUB boot configuration.
Check GRUB_DISABLE_OS_PROBER documentation entry.
done
2023-09-21 14:17:26 Application returned with exit status 0.
2023-09-21 14:17:26 Transaction completed.
2023-09-21 14:17:26 tukit 4.3.0 started
2023-09-21 14:17:26 Options: close 3
2023-09-21 14:17:27 New default snapshot is #3 (/.snapshots/3/snapshot).
2023-09-21 14:17:27 Transaction completed.

Please reboot your machine to activate the changes and avoid data loss.
New default snapshot is #3 (/.snapshots/3/snapshot).
transactional-update finished
eonlp66:~ # reboot
eonlp66:~ # [FAILED] Failed unmounting /etc.
[  335.902290][    T1] watchdog: watchdog0: watchdog did not stop!
[  336.075622][    T1] watchdog: watchdog0: watchdog did not stop!
[  336.120149][    T1] dracut Warning: Killing all remaining processes
dracut Warning: Killing all remaining processes
[  336.163352][    T1] dracut Warning: Unmounted /oldroot.
dracut Warning: Unmounted /oldroot.
Rebooting.
[  338.058364][ T2941] reboot: Restarting system
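
Before triggering a test crash, it may help to confirm that the crashkernel
reservation took effect and that a crash kernel is actually loaded. The
following is a minimal sketch, not part of the original report: the function
names crashkernel_arg and crash_kernel_state are hypothetical helpers, while
/proc/cmdline, /sys/kernel/kexec_crash_size and /sys/kernel/kexec_crash_loaded
are standard kernel interfaces.

```shell
#!/bin/sh
# Sketch: pre-flight checks before "echo c > /proc/sysrq-trigger".
# Helper names are hypothetical; the /proc and /sys paths are kernel interfaces.

# Print the crashkernel= value found in a kernel command line string, if any.
crashkernel_arg() {
    for word in $1; do
        case "$word" in
            crashkernel=*) printf '%s\n' "${word#crashkernel=}" ;;
        esac
    done
}

# Translate the content of /sys/kernel/kexec_crash_loaded ("1" means loaded).
crash_kernel_state() {
    if [ "$1" = "1" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

# Only query the running system when the interfaces are present.
if [ -r /proc/cmdline ] && [ -r /sys/kernel/kexec_crash_loaded ] \
        && [ -r /sys/kernel/kexec_crash_size ]; then
    echo "crashkernel argument: $(crashkernel_arg "$(cat /proc/cmdline)")"
    echo "reserved bytes:       $(cat /sys/kernel/kexec_crash_size)"
    echo "crash kernel:         $(crash_kernel_state "$(cat /sys/kernel/kexec_crash_loaded)")"
fi
```

If the crash kernel is reported as not loaded, a panic will not produce a
vmcore regardless of the crashkernel reservation, so the kdump service state
is the next thing to look at.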

After the reboot, I triggered the crash:

eonlp66:~ # echo c > /proc/sysrq-trigger
[ 1470.241041][ T2056] sysrq: Trigger a crash
[ 1470.241062][ T2056] Kernel panic - not syncing: sysrq triggered crash
[ 1470.241074][ T2056] CPU: 8 PID: 2056 Comm: bash Tainted: G               X  
 6.4.11-1-default #1 ALP-0.1 (unreleased)
15220d295c615b06d17e022355783fb787caac5e
[ 1470.241091][ T2056] Hardware name: IBM,9043-MRX POWER10 (raw) 0x800200
0xf000006 of:IBM,FW1050.00 (NM1050_032) hv:phyp pSeries
[ 1470.241102][ T2056] Call Trace:
[ 1470.241108][ T2056] [c0000000099b3a80] [c000000000fb0f08]
dump_stack_lvl+0x6c/0x9c (unreliable)
[ 1470.241133][ T2056] [c0000000099b3ab0] [c00000000014de94] panic+0x178/0x434
[ 1470.241178][ T2056] [c0000000099b3b50] [c000000000a61d68]
sysrq_handle_crash+0x28/0x30
[ 1470.241196][ T2056] [c0000000099b3bb0] [c000000000a626b4]
__handle_sysrq+0xd4/0x220
[ 1470.241212][ T2056] [c0000000099b3c50] [c000000000a63088]
write_sysrq_trigger+0xc8/0x174
[ 1470.241230][ T2056] [c0000000099b3c90] [c0000000006627fc]
proc_reg_write+0xfc/0x160
[ 1470.241246][ T2056] [c0000000099b3cc0] [c000000000585f0c]
vfs_write+0xfc/0x4a0
[ 1470.241262][ T2056] [c0000000099b3d80] [c0000000005865f8]
ksys_write+0x88/0x150
[ 1470.241280][ T2056] [c0000000099b3dd0] [c00000000002f118]
system_call_exception+0x138/0x260
[ 1470.241294][ T2056] [c0000000099b3e50] [c00000000000cfdc]
system_call_vectored_common+0x15c/0x2ec
[ 1470.241314][ T2056] --- interrupt: 3000 at 0x7fffabd3fd98
[ 1470.241325][ T2056] NIP:  00007fffabd3fd98 LR: 0000000000000000 CTR:
0000000000000000
[ 1470.241340][ T2056] REGS: c0000000099b3e80 TRAP: 3000   Tainted: G          
    X     (6.4.11-1-default)
[ 1470.241359][ T2056] MSR:  800000000280f033
<SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48422208  XER: 00000000
[ 1470.241384][ T2056] IRQMASK: 0
[ 1470.241384][ T2056] GPR00: 0000000000000004 00007fffcf447350
00007fffabe56f00 0000000000000001
[ 1470.241384][ T2056] GPR04: 000000014ffdad80 0000000000000002
fffffffffffffff1 0000000000000063
[ 1470.241384][ T2056] GPR08: 00007fffcf447438 0000000000000000
0000000000000000 0000000000000000
[ 1470.241384][ T2056] GPR12: 0000000000000000 00007fffac0fb540
0000000040000000 0000000000000000
[ 1470.241384][ T2056] GPR16: 000000014ffc0100 000000012616a8d0
0000000000000000 0000000000000000
[ 1470.241384][ T2056] GPR20: 0000000000000000 000000014ffd6830
000000014ffe2a10 0000000000000001
[ 1470.241384][ T2056] GPR24: 0000000000000000 0000000000000000
000000014ffdad80 0000000000000002
[ 1470.241384][ T2056] GPR28: 0000000000000002 00007fffabe51980
000000014ffdad80 0000000000000002
[ 1470.241476][ T2056] NIP [00007fffabd3fd98] 0x7fffabd3fd98
[ 1470.241483][ T2056] LR [0000000000000000] 0x0
[ 1470.241494][ T2056] --- interrupt: 3000
[ 1470.253982][ T2056] pstore: backend (nvram) writing error (-1)
[ 1470.266171][ T2056] Rebooting in 90 seconds..

After the reboot, no vmcore file had been generated:

eonlp66:~ # ls -l /var/crash
total 0
eonlp66:~ #

Contact Information = rajanikanth.ha@in.ibm.com

LPAR details : eonlp66
ISO : agama-live.ppc64le-3.0.0-ALP-GM.iso
IO : VFC : NPIV: SAN LUN
Network : Virtual Ethernet
System : Power 10
FW : 1050

raylp33:~ #

---boot type---
CDROM / ISO image

---Install repository type---
CDROM
agama-live.ppc64le-3.0.0-ALP-GM.iso

Kernel /build : Latest SUSE ALP Dolomite 1.0 Milestone4 (ppc64le) 
6.4.11-1-default

---Failure description---

After triggering the crash, no vmcore file is generated on the LPAR.

== Comment: #5 - SEETEENA THOUFEEK <sthoufee@in.ibm.com> - 2023-09-26 04:55:06
==

== Comment: #7 - Rajanikanth H. Adaveeshaiah <rajanikanth.ha@in.ibm.com> -
2023-09-26 05:03:21 ==
Thanks Seeteena,

When I tried to configure kdump with the transactional-update command, it
completed successfully, but after the reboot the kdump service still does not
come up. I also tried updating grub with transactional-update and rebooted the
LPAR.

Please advise how to work with kdump in SLES16 ALP.

FYI

eonlp66:~ # transactional-update kdump
Checking for newer version.
transactional-update 4.3.0 started
Options: kdump
Separate /var detected.
2023-09-26 09:47:51 tukit 4.3.0 started
2023-09-26 09:47:51 Options: -c5 open
2023-09-26 09:47:51 Using snapshot 5 as base for new snapshot 6.
2023-09-26 09:47:51 Syncing /etc of previous snapshot 4 as base into new
snapshot "/.snapshots/6/snapshot"
2023-09-26 09:47:51 SELinux is enabled.
ID: 6
2023-09-26 09:47:52 Transaction completed.
Trying to rebuild kdump initrd
2023-09-26 09:47:53 tukit 4.3.0 started
2023-09-26 09:47:53 Options: call 6 /sbin/mkdumprd
2023-09-26 09:47:54 Executing `/sbin/mkdumprd`:
/var/lib/kdump not writable, not regenerating initrd.
2023-09-26 09:47:54 Application returned with exit status 0.
2023-09-26 09:47:54 Transaction completed.
2023-09-26 09:47:54 tukit 4.3.0 started
2023-09-26 09:47:54 Options: close 6
2023-09-26 09:47:55 New default snapshot is #6 (/.snapshots/6/snapshot).
2023-09-26 09:47:55 Transaction completed.

Please reboot your machine to activate the changes and avoid data loss.
New default snapshot is #6 (/.snapshots/6/snapshot).
transactional-update finished
eonlp66:~ #

---
eonlp66:~ # service kdump status
× kdump.service - Load kdump kernel and initrd
Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; preset:
disabled)
Active: failed (Result: exit-code) since Tue 2023-09-26 09:49:22 UTC; 3min 6s
ago
Process: 1858 ExecStart=/usr/lib/kdump/load.sh --update (code=exited,
status=1/FAILURE)
Main PID: 1858 (code=exited, status=1/FAILURE)
CPU: 64ms

Sep 26 09:49:22 localhost systemd[1]: Starting Load kdump kernel and initrd...
Sep 26 09:49:22 localhost.localdomain load.sh[1858]: Failed to open file
/var/lib/kdump/kernel:Permission denied
Sep 26 09:49:22 localhost.localdomain load.sh[1858]: Cannot open
`/var/lib/kdump/kernel': Permission denied
Sep 26 09:49:22 localhost.localdomain systemd[1]: kdump.service: Main process
exited, code=exited, status=1/FAILURE
Sep 26 09:49:22 localhost.localdomain systemd[1]: kdump.service: Failed with
result 'exit-code'.
Sep 26 09:49:22 localhost.localdomain systemd[1]: Failed to start Load kdump
kernel and initrd.
eonlp66:~ # service kdump start
Job for kdump.service failed because the control process exited with error
code.
See "systemctl status kdump.service" and "journalctl -xeu kdump.service" for
details.
eonlp66:~ # systemctl status kdump.service
× kdump.service - Load kdump kernel and initrd
Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; preset:
disabled)
Active: failed (Result: exit-code) since Tue 2023-09-26 09:52:43 UTC; 27s ago
Process: 2181 ExecStart=/usr/lib/kdump/load.sh --update (code=exited,
status=1/FAILURE)
Main PID: 2181 (code=exited, status=1/FAILURE)
CPU: 64ms

Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: Starting Load
kdump kernel and initrd...
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com load.sh[2181]: Failed to open
file /var/lib/kdump/kernel:Permission denied
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com load.sh[2181]: Cannot open
`/var/lib/kdump/kernel': Permission denied
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: kdump.service:
Main process exited, code=exited, status=1/FAILURE
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: kdump.service:
Failed with result 'exit-code'.
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: Failed to start
Load kdump kernel and initrd.
eonlp66:~ # journalctl -xeu kdump.service
░░
░░ The unit kdump.service has entered the 'failed' state with result
'exit-code'.
Sep 26 09:49:22 localhost.localdomain systemd[1]: Failed to start Load kdump
kernel and initrd.
░░ Subject: A start job for unit kdump.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit kdump.service has finished with a failure.
░░
░░ The job identifier is 306 and the job result is failed.
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: Starting Load
kdump kernel and initrd...
░░ Subject: A start job for unit kdump.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit kdump.service has begun execution.
░░
░░ The job identifier is 789.
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com load.sh[2181]: Failed to open
file /var/lib/kdump/kernel:Permission denied
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com load.sh[2181]: Cannot open
`/var/lib/kdump/kernel': Permission denied
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: kdump.service:
Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ An ExecStart= process belonging to unit kdump.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: kdump.service:
Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit kdump.service has entered the 'failed' state with result
'exit-code'.
Sep 26 09:52:43 eonlp66.isst.aus.stglabs.ibm.com systemd[1]: Failed to start
Load kdump kernel and initrd.
░░ Subject: A start job for unit kdump.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit kdump.service has finished with a failure.
░░
░░ The job identifier is 789 and the job result is failed.
eonlp66:~ #
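
Since load.sh runs as root, a "Permission denied" on /var/lib/kdump/kernel
suggests something other than plain file modes, for example an SELinux denial
(the transactional-update output above reports that SELinux is enabled). A
rough sketch for narrowing this down follows. It is an illustration, not a
verified procedure: denied_paths is a hypothetical helper that reads journal
text on stdin, and the ausearch call assumes the audit tools are installed.

```shell
#!/bin/sh
# Sketch: list the paths load.sh could not open, then show their SELinux labels.
# denied_paths is a hypothetical helper; it expects one journal entry per line.
denied_paths() {
    sed -n "s/.*Cannot open \`\(.*\)': Permission denied.*/\1/p" | sort -u
}

if command -v journalctl >/dev/null 2>&1; then
    journalctl -b -u kdump.service --no-pager 2>/dev/null | denied_paths |
    while read -r path; do
        # -Z prints the SELinux context; a symlink is listed itself, not its target.
        ls -lZ "$path" 2>/dev/null || true
    done
fi

# Assumption: auditd is running, so recent AVC denials can be queried.
if command -v ausearch >/dev/null 2>&1; then
    ausearch -m AVC -ts today 2>/dev/null | grep -i kdump || true
fi
```

If AVC records mentioning kdump show up, the failure is a policy matter rather
than a kdump configuration problem.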
(In reply to LTC BugProxy from comment #3)
> Created attachment 869766 [details]
> kdump config file

Hello SUSE,
Do you have an update about this bug?
If the system crashes during any test, then the dump is lost, so it is P1 for
IBM.
Thank you for your support.
The SELinux policy blocks kdump from reading symlinks in /var/lib/kdump, and
thus the crash kernel is not loaded. This has been fixed in bsc#1213721. For
ALP, the updated selinux-policy (version 20230523+git4.261ed027) did not make it
into Milestone4 (submit request https://build.suse.de/request/show/307424).

This will be fixed in newer snapshots.

To test on Milestone4 you can either disable selinux or get a newer snapshot of
the selinux-policy package.
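
A possible way to check which case applies on a Milestone4 system is sketched
below. It rests on assumptions that should be verified locally: sort -V only
approximates rpm's version ordering, version_at_least is a hypothetical helper,
and "setenforce 0" (permissive mode until the next reboot) is used here as a
stand-in for disabling SELinux. The script only prints a suggestion and does
not change the SELinux mode itself.

```shell
#!/bin/sh
# Sketch: compare the installed selinux-policy with the fixed version.
FIXED_POLICY="20230523+git4.261ed027"   # version quoted in this comment

# Succeeds when version $1 sorts at or after version $2 (sort -V ordering,
# which is an approximation of rpm's own comparison).
version_at_least() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v rpm >/dev/null 2>&1 && command -v getenforce >/dev/null 2>&1; then
    installed=$(rpm -q --qf '%{VERSION}' selinux-policy 2>/dev/null)
    if version_at_least "$installed" "$FIXED_POLICY"; then
        echo "selinux-policy $installed should already contain the fix"
    else
        echo "selinux-policy $installed predates $FIXED_POLICY"
        echo "current SELinux mode: $(getenforce)"
        echo "to test without enforcement: setenforce 0 && systemctl restart kdump"
    fi
fi
```

After restarting kdump in permissive mode, /sys/kernel/kexec_crash_loaded
should read 1 if the SELinux policy was the only obstacle.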

*** This bug has been marked as a duplicate of bug 1213721 ***

