[Bug 838871] New: Resource Agent for apache does not work in Factory
https://bugzilla.novell.com/show_bug.cgi?id=838871

https://bugzilla.novell.com/show_bug.cgi?id=838871#c0

           Summary: Resource Agent for apache does not work in Factory
    Classification: openSUSE
           Product: openSUSE Factory
           Version: 13.1 Milestone 4
          Platform: x86-64
        OS/Version: SUSE Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: High Availability
        AssignedTo: lmb@suse.com
        ReportedBy: kgronlund@suse.com
         QAContact: qa-bugs@suse.de
          Found By: Development
           Blocker: No

Using pacemaker found in network:ha-clustering:Factory, the apache resource agent is unable to start apache, and hangs indefinitely. Starting apache manually using systemctl works fine.

The configuration was carried over from an older version, so this is possibly an upgrade issue.

Pacemaker version: 1.1.10-55.1

crm configuration:

primitive apache ocf:heartbeat:apache \
        params configfile="/etc/apache2/httpd.conf" \
        op start timeout="40" interval="0" \
        op stop timeout="60" interval="0" \
        op monitor interval="10" timeout="20"
primitive virtual-ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.13" lvs_support="false" \
        op start timeout="20" interval="0" \
        op stop timeout="20" interval="0" \
        op monitor interval="10" timeout="20" \
        meta target-role="Started"
group web-server virtual-ip apache \
        meta target-role="Started"
property $id="cib-bootstrap-options" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        placement-strategy="balanced" \
        dc-version="1.1.10-55.1-5d0a223" \
        cluster-infrastructure="corosync" \
        expected-quorum-votes="3" \
        symmetric-cluster="true"
rsc_defaults $id="rsc-options" \
        resource-stickiness="1" \
        migration-threshold="3"
op_defaults $id="op-options" \
        timeout="600" \
        record-pending="true"

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c1

Lars Marowsky-Bree <lmb@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P3 - Medium
                 CC|                            |lmb@suse.com
         AssignedTo|lmb@suse.com                |dmuhamedagic@suse.com

--- Comment #1 from Lars Marowsky-Bree <lmb@suse.com> 2013-09-06 10:14:40 UTC ---
I'm afraid this may not be the only one: some of the scripts may rely on call-outs to init scripts or other things that have changed with systemd.

This probably needs to be addressed upstream too.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c2

Dejan Muhamedagic <dmuhamedagic@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
       InfoProvider|                            |kgronlund@suse.com

--- Comment #2 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-07 06:23:56 CEST ---
Please provide a hb_report. A resource agent trace would probably be good too:

# crm resource trace apache start

The trace files should be captured by hb_report.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c3

--- Comment #3 from Kristoffer Gronlund <kgronlund@suse.com> 2013-09-09 08:10:22 UTC ---
Created an attachment (id=556332)
 --> (http://bugzilla.novell.com/attachment.cgi?id=556332)
hb_report for cluster (no usable information as far as I can tell)

This is an hb_report output for the cluster. It looks like hb_report is unable to capture any relevant information. Attaching journalctl -u pacemaker output separately.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c4

--- Comment #4 from Kristoffer Gronlund <kgronlund@suse.com> 2013-09-09 08:11:23 UTC ---
Created an attachment (id=556333)
 --> (http://bugzilla.novell.com/attachment.cgi?id=556333)
journalctl -u pacemaker

Output from journalctl -u pacemaker
https://bugzilla.novell.com/show_bug.cgi?id=838871#c5

--- Comment #5 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-09 11:24:52 CEST ---
(In reply to comment #3)
> Created an attachment (id=556332) --> (http://bugzilla.novell.com/attachment.cgi?id=556332) [details]
> hb_report for cluster (no usable information as far as I can tell)
>
> This is an hb_report output for the cluster. It looks like hb_report is unable
> to capture any relevant information.

There are a few oddities in the report:

- The By: field in the description says:

  By: hb_report -Z -f Wed Aug 21 08:56:12 2013 /var/cache/crm/history/live

  Did this really happen in August? Can you try hb_report -f "date_from" -t "date_to" with the right dates?

- The distribution is reported as 12.3:

  Distribution: /etc/SuSE-release
  openSUSE 12.3 (x86_64)
  VERSION = 12.3
  CODENAME = Dartmouth

  Is it that Factory doesn't update the release info?
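For reference, a minimal sketch of the explicit-window invocation suggested above. The -f and -t options come from the comment itself; the helper name hb_report_cmd, the timestamps, their format, and the destination path are illustrative assumptions, and the helper only prints the command instead of running it.

```shell
# Hypothetical helper: print an hb_report command line for an explicit
# time window, so it can be reviewed before being run.
# $1 = start of window, $2 = end of window, $3 = report destination.
hb_report_cmd() {
    printf 'hb_report -f "%s" -t "%s" %s\n' "$1" "$2" "$3"
}

# Example values only; pick a window that brackets the failed start.
hb_report_cmd "2013/9/9 07:30" "2013/9/9 08:30" /tmp/apache-ra-report
```

Passing both bounds avoids relying on whatever default timespan the calling front end picks.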
https://bugzilla.novell.com/show_bug.cgi?id=838871#c6

--- Comment #6 from Kristoffer Gronlund <kgronlund@suse.com> 2013-09-09 09:34:25 UTC ---
Created an attachment (id=556350)
 --> (http://bugzilla.novell.com/attachment.cgi?id=556350)
hb_report attempt 2

Sorry, I used hb_report via the crm history interface, and thought it would pick the last hour as the default timespan. Here's another attempt.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c7

--- Comment #7 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-09 14:10:54 CEST ---
Much better :)

+ 08:03:36: apache_monitor:267: silent_status
+ 08:03:36: silent_status:128: '[' -f /var/run//httpd2.pid ']'
+ 08:03:36: silent_status:132: : No pid file
+ 08:03:36: silent_status:133: false

It looks like httpd creates the pid file elsewhere. Can you check /etc/apache2/httpd.conf and see if there's a PidFile directive? If there is, and it differs from the path above, then we have a parser issue. Otherwise, take a look at the apache logs.
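The check asked for here can be sketched as a small lookup. This is an illustration only, not the RA's parser: the function name find_pidfile_directive is hypothetical, and the sketch reads a single file without following Include directives.

```shell
# Hypothetical helper: print the value of the last uncommented PidFile
# directive in the given config file, or print nothing if there is none.
# Directive names are matched case-insensitively; surrounding double
# quotes around the value are dropped.
find_pidfile_directive() {
    awk 'tolower($1) == "pidfile" { value = $2; gsub(/"/, "", value) }
         END { if (value != "") print value }' "$1"
}
```

An empty result for /etc/apache2/httpd.conf would mean the pid file location is left to a default chosen elsewhere, which is the case to compare against the /var/run//httpd2.pid path in the trace.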
https://bugzilla.novell.com/show_bug.cgi?id=838871#c8

--- Comment #8 from Kristoffer Gronlund <kgronlund@suse.com> 2013-09-09 12:22:20 UTC ---
There is no PidFile directive configured at all. I suspect that if apache has been modified to be started by systemd, the use of a pid file may have been removed completely (since systemd does not need pid files, if I remember correctly).
https://bugzilla.novell.com/show_bug.cgi?id=838871#c9

--- Comment #9 from Lars Marowsky-Bree <lmb@suse.com> 2013-09-09 13:21:22 UTC ---
The documentation for PidFile explicitly states that they discourage the use of the pidfile for starting/stopping the server, and would rather have us use the "apachectl" command.

http://httpd.apache.org/docs/2.2/programs/apachectl.html
https://bugzilla.novell.com/show_bug.cgi?id=838871#c10

--- Comment #10 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-09 16:28:07 CEST ---
Using apachectl would result in a completely new apache RA. Somebody tried to do that once, but it didn't completely work out. Let me see if I can find it...

https://developerbugs.linuxfoundation.org/show_bug.cgi?id=1943
https://bugzilla.novell.com/show_bug.cgi?id=838871#c11

--- Comment #11 from Lars Marowsky-Bree <lmb@suse.com> 2013-09-11 08:09:59 UTC ---
It was worth a thought ;-)
https://bugzilla.novell.com/show_bug.cgi?id=838871#c12

Dejan Muhamedagic <dmuhamedagic@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tserong@suse.com

--- Comment #12 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-11 11:44:20 CEST ---
(In reply to comment #11)
> It was worth a thought ;-)

Definitely. Adding Tim, who'll try to revive the "start default apache" code.
https://bugzilla.novell.com/show_bug.cgi?id=838871#c13

Tim Serong <tserong@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
       InfoProvider|kgronlund@suse.com          |
         Resolution|                            |FIXED

--- Comment #13 from Tim Serong <tserong@suse.com> 2013-09-11 13:19:27 UTC ---
Irritatingly, it works fine for me :-/

So the PidFile business is as follows, looking at the RA source on my test VM:

- If there's no PidFile in your apache config, the parser in apache-conf.sh notices, deliberately sets PidFile=$HA_VARRUNDIR/${httpd_basename}.pid (which becomes /var/run//httpd2.pid), and sets PIDFILE_DIRECTIVE="true"

- Later, the RA checks if $PIDFILE_DIRECTIVE is set (as it will be from the above) and starts apache with:

  ocf_run $HTTPD $HTTPDOPTS $OPTIONS -f $CONFIGFILE -c "PidFile $PidFile"

Looking at your RA traces, I can see it setting PidFile=/var/run//httpd2.pid, but I *can't* see any mention of PIDFILE_DIRECTIVE being either set or checked, anywhere:

+ 08:02:40: GetParams:139: case $PidFile in
+ 08:02:40: GetParams:142: PidFile=/var/run//httpd2.pid
+ 08:02:40: GetParams:145: for p in '"$PORT"' '"$Port"' 80
+ 08:02:40: GetParams:146: CheckPort ''
..
+ 08:02:40: apache_start:175: '[' -d /var/run/apache2 ']'
+ 08:02:40: apache_start:175: mkdir /var/run/apache2
+ 08:02:40: apache_start:176: ocf_run /usr/sbin/httpd2 -DSTATUS -f /etc/apache2/httpd.conf

Whereas if I trace a start on my system, I get:

+ 22:40:38: GetParams:139: case $PidFile in
+ 22:40:38: GetParams:145: PidFile=/var/run//httpd2.pid
+ 22:40:38: GetParams:149: PIDFILE_DIRECTIVE=true
+ 22:40:38: GetParams:153: for p in '"$PORT"' '"$Port"' 80
+ 22:40:38: GetParams:154: CheckPort ''
..
+ 22:40:38: apache_start:172: '[' -d /var/run/apache2 ']'
+ 22:40:38: apache_start:174: '[' -z true ']'
+ 22:40:38: apache_start:177: ocf_run /usr/sbin/httpd2 -DSTATUS -f /etc/apache2/httpd.conf -c 'PidFile /var/run//httpd2.pid'

Note some variance in line numbering. I've tracked this down to:

https://github.com/ClusterLabs/resource-agents/commit/2ce0b97

resource-agents in openSUSE:Factory does not have this fix. resource-agents in network:ha-clustering:Factory *does* have it, which is why it works for me.

Some further digging around shows that apache actually does create a pidfile by itself if one isn't specified, but it's creating /var/run/httpd.pid (note the missing '2', which doesn't match what the RA wants). Interestingly, apache on SLE 11 SP3 *does* create the pidfile /var/run/httpd2.pid by default, which is why we never saw this before.

I hate this RA.
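The two steps described in this comment can be sketched roughly as follows. This is a simplified illustration built from the description and the trace values, not the actual RA source: the function names resolve_pidfile and build_start_cmd are hypothetical, the variable values are copied from the traces, and the command is printed rather than passed to ocf_run.

```shell
# Values as they appear in the traces; assumptions for this sketch.
HA_VARRUNDIR=/var/run/
httpd_basename=httpd2
HTTPD=/usr/sbin/httpd2
HTTPDOPTS=-DSTATUS
OPTIONS=
CONFIGFILE=/etc/apache2/httpd.conf

# Step 1: $1 is the PidFile value found in the apache config (empty if
# none). With no directive, fall back to a default path and remember that
# the path must be passed to httpd explicitly.
resolve_pidfile() {
    PidFile=$1
    PIDFILE_DIRECTIVE=
    if [ -z "$PidFile" ]; then
        PidFile=$HA_VARRUNDIR/${httpd_basename}.pid
        PIDFILE_DIRECTIVE=true
    fi
}

# Step 2: print the start command, appending -c 'PidFile ...' only when
# the default path was chosen in step 1. The variables are deliberately
# unquoted so that an empty OPTIONS drops out of the command line.
build_start_cmd() {
    if [ -n "$PIDFILE_DIRECTIVE" ]; then
        echo $HTTPD $HTTPDOPTS $OPTIONS -f $CONFIGFILE -c "'PidFile $PidFile'"
    else
        echo $HTTPD $HTTPDOPTS $OPTIONS -f $CONFIGFILE
    fi
}
```

Calling resolve_pidfile "" followed by build_start_cmd yields the command line seen in the working trace, ending in -c 'PidFile /var/run//httpd2.pid'. Without the flag being set, the -c argument is absent, as in the failing trace, and httpd falls back to its own default pid file location.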
https://bugzilla.novell.com/show_bug.cgi?id=838871#c14

--- Comment #14 from Dejan Muhamedagic <dmuhamedagic@suse.com> 2013-09-11 16:46:28 CEST ---
(In reply to comment #13)
> Irritatingly, it works fine for me :-/

That's cool. I mean not the irritation but the other part :)

> So the PidFile business is as follows, looking at the RA source on my test VM:
>
> - If there's no PidFile in your apache config, the parser in apache-conf.sh
> notices, deliberately sets PidFile=$HA_VARRUNDIR/${httpd_basename}.pid (which
> becomes /var/run//httpd2.pid), and sets PIDFILE_DIRECTIVE="true"
>
> - Later, the RA checks if $PIDFILE_DIRECTIVE is set (as it will be from the
> above) and starts apache with:
>
> ocf_run $HTTPD $HTTPDOPTS $OPTIONS -f $CONFIGFILE -c "PidFile $PidFile"

Yeah, I can recall now, that's pretty new.

> Some further digging around shows that apache actually does create a pidfile by
> itself if one isn't specified, but it's creating /var/run/httpd.pid (note the
> missing '2', which doesn't match what the RA wants). Interestingly, apache on
> SLE 11 SP3 *does* create the pidfile /var/run/httpd2.pid by default, which is
> why we never saw this before.

The pidfile stuff in Factory:

[0]hex-10:obs > grep /var/run/httpd2.pid openSUSE:Factory/apache2/*
openSUSE:Factory/apache2/apache2.changes:- set DEFAULT_PIDLOG to /var/run/httpd2.pid, so we don't need to
...
openSUSE:Factory/apache2/rc.apache2:: ${pidfile:=/var/run/httpd2.pid}

The very same as in SLE11SP3. The difference is that apache this time gets started through systemd and obviously this particular setting wasn't propagated. I guess that somebody would need to open a bugzilla for that.

> I hate this RA.

Heh, you're not the only one. But this time it was not at fault.
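To confirm which pid file a running httpd actually wrote, a small diagnostic along these lines may help. It is a sketch only: the function name first_existing_pidfile is hypothetical, and the two candidate paths in the usage note are simply the ones named in this thread.

```shell
# Hypothetical helper: print the first candidate pid file that exists and
# return 0; return 1 if none of the candidates is present.
first_existing_pidfile() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Running first_existing_pidfile /var/run/httpd2.pid /var/run/httpd.pid after a manual systemctl start would show whether the file the RA expects or the unsuffixed one was created.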