A customer has requested that Novell provide full support for the kdumpcheck STONITH plugin in SLE HAE.
The plugin is documented in the SLE HAE manual in section 9.5 "Special Fencing Devices".
This plug-in checks if a Kernel dump is in progress on a node. If so, it returns true, and acts as if the node has been fenced. The node cannot run any resources during the dump anyway. This avoids fencing a node that is already down but doing a dump, which takes some time. The plug-in must be used in concert with another, real STONITH device. For more details, see /usr/share/doc/packages/cluster-glue/README_kdumpcheck.txt.
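To make the intended behavior concrete, here is a minimal Python sketch of the ordering described above: the dump check is consulted first, and the real STONITH device is only triggered if no dump is in progress. This is an illustration only, not the cluster-glue or Pacemaker implementation; all names in it (fence_node, make_kdump_check, make_power_fence) are hypothetical, and the set of dumping nodes stands in for whatever probe the real plugin performs.

```python
from typing import Callable, List, Sequence, Set

# A "device" here is any callable that takes a node name and returns True
# if the node can be considered fenced. All names are hypothetical.
FenceDevice = Callable[[str], bool]


def fence_node(node: str, devices: Sequence[FenceDevice]) -> bool:
    """Try each device in order; stop at the first one that reports success."""
    for device in devices:
        if device(node):
            return True
    return False


def make_kdump_check(dumping_nodes: Set[str]) -> FenceDevice:
    """Stand-in for kdumpcheck: succeeds only if the node is currently dumping."""
    def check(node: str) -> bool:
        return node in dumping_nodes
    return check


def make_power_fence(log: List[str]) -> FenceDevice:
    """Stand-in for a real STONITH device: records the reset and succeeds."""
    def fence(node: str) -> bool:
        log.append(node)
        return True
    return fence


reset_log: List[str] = []
devices = [make_kdump_check({"node1"}), make_power_fence(reset_log)]

fence_node("node1", devices)  # dump in progress: treated as fenced, no reset
fence_node("node2", devices)  # no dump: falls through to the real device
```

In this sketch node1 is never power-cycled, so its dump can finish, while node2 is reset by the real device as usual.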
The plugin also ships as part of the sle-ha package pattern. However, it requires a patch to mkdumprd that is not present in the SLES distribution. SLES Bugzilla has some history on this plugin and mkdumprd: https://bugzilla.novell.com/show_bug.cgi?id=445870
Essentially, mkdumprd is a RHEL mechanism, so Novell will need to work out the equivalent changes before this STONITH plugin can be used. As shipped, the plugin is not usable, despite being included and documented in the HAE guide.
Note that the real requirement here is a robust mechanism that allows a crash dump to take place while HA is active with STONITH enabled. Is there another supported way to do this? Can kdumpcheck be made to work in a SLES 11 SP3 HAE environment?