http://bugzilla.novell.com/show_bug.cgi?id=1058028

Bug ID: 1058028
Summary: SLES12 SP2: During boot, within 30seconds systemd-udevd daemons are getting killed and megaraid_sas failing to configure controller
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 42.2
Hardware: x86-64
OS: SLES 12
Status: NEW
Severity: Major
Priority: P5 - None
Component: Other
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: shivasharan.srikanteshwara@broadcom.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created attachment 740146
--> http://bugzilla.novell.com/attachment.cgi?id=740146&action=edit
Boot journalctl logs with dracut and udev debug enabled.

We are seeing a boot issue on SLES12 SP2 when two MegaRAID controllers are connected to the host. If the first controller takes more than 30 seconds to configure, the second controller fails to configure as well.

Setup details:
SLES12 SP2 is installed on a local drive. The host has two MegaRAID controllers. The first controller has faulty firmware.

Observation:
After boot, neither controller is configured by the driver.

Expectation:
The second controller should get configured.

Analysis:
As per our analysis, from the megaraid_sas driver's perspective it takes about 3 minutes for the first controller's probe to fail. PCI probe happens synchronously, so the second controller's probe waits almost 3 minutes before it can start. During this time, as soon as the root device is up, initrd-cleanup.service is triggered. This service cleans up all the pending systemd daemons. Within 30 seconds all the daemons are stopped.
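
To make the timing argument above concrete, here is a rough sketch of the serialized probe timeline. It is only a model, not driver or systemd code: the names probe_schedule and ProbeResult are hypothetical, the 5-second probe time for the healthy controller is an invented example value, and the kill deadline is measured from the start of the first probe, which is a simplification (in the real boot the 30 seconds start when the stop is requested).

```python
from typing import List, NamedTuple


class ProbeResult(NamedTuple):
    start: float      # seconds after the first probe begins
    end: float        # time at which this probe would finish
    completed: bool   # True if it finishes before the kill deadline


def probe_schedule(durations: List[float], kill_deadline: float) -> List[ProbeResult]:
    """Model synchronous (one after another) probes against a fixed deadline.

    Assumption: every probe still running or not yet started at
    kill_deadline is lost, because the worker doing it has been killed.
    """
    results = []
    clock = 0.0
    for duration in durations:
        start = clock
        end = start + duration
        results.append(ProbeResult(start, end, end <= kill_deadline))
        clock = end  # the next controller cannot start before this one ends
    return results


if __name__ == "__main__":
    # First controller needs ~180 s to fail, second would need only 5 s.
    for index, result in enumerate(probe_schedule([180.0, 5.0], 30.0)):
        print(index, result)
```

In this model the second controller is lost even though its own probe is short, because its start time (180 s) is already past the deadline.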
From the systemd code, when systemd-udevd is stopped it first sends SIGTERM to all its child processes and then, after 30 seconds, sends SIGKILL.
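
The stop sequence described above (SIGTERM first, SIGKILL once a grace period expires) can be illustrated with a small standalone sketch. This is not the systemd-udevd implementation: stop_with_grace and spawn_slow_worker are hypothetical helpers, and the worker is just a Python child that ignores SIGTERM as a stand-in for a process stuck in a long-running probe. It assumes a POSIX system.

```python
import signal
import subprocess
import sys


def spawn_slow_worker() -> subprocess.Popen:
    """Start a child that ignores SIGTERM and sleeps (hypothetical stand-in)."""
    code = (
        "import signal, time;"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
        "print('ready', flush=True);"
        "time.sleep(60)"
    )
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until the child has installed its handler
    return proc


def stop_with_grace(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "exited"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be ignored by the child
        proc.wait()
        return "killed"


if __name__ == "__main__":
    worker = spawn_slow_worker()
    # A 1 s grace period is used here only to keep the demo short;
    # the report describes a 30 s period.
    print(stop_with_grace(worker, 1.0), worker.returncode)
```

A worker that needs longer than the grace period never gets the chance to finish, which matches the behaviour reported here for the second controller.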
We feel the 30-second timeout is a bit too aggressive. There may be many valid scenarios where the driver could take more than 30 seconds for a controller probe. Or consider scenarios where customers have a larger number of controllers connected to the host, with each controller taking some time.

We did not find any tunable to control this timeout either. However, after removing "KillMode=mixed" from the systemd-udevd.service file, the issue is no longer seen.

We tried a similar test on another Linux distribution (RHEL 7.3) and did not see the same issue. There, the wait time is around 3 minutes before the cleanup service is triggered, so the driver gets sufficient time to configure the controllers.

Attaching the complete boot logs with dracut and udev debug enabled. This is with an instrumented megaraid driver that adds a delay of 40 seconds before failing controller configuration.

Let me know if you need any other logs or details. If needed, I could also provide a driver patch that can simulate the 40-second delay and failure.

Thanks.