https://bugzilla.novell.com/show_bug.cgi?id=347200
User robin.listas@telefonica.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347200#c1
Summary: spamd/spamc grinds to a halt with two zombi children;
sysread(8) failed after 300 secs at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Spamd
ForkScaling.pm line 642.
Product: openSUSE 10.3
Version: Final
Platform: i686
OS/Version: openSUSE 10.3
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Other
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: robin.listas@telefonica.net
QAContact: qa@suse.de
Found By: ---
Product: spamassassin --> other
I have problems with spamd children dying on me. When I noticed, there were 8
emails stuck in the queue. The processes are dead:
[Watch: long lines. I suggest you increase the available line length for
reporting]
root 29170 0.0 2.4 30804 25280 ? SNs 09:52 0:02
/usr/sbin/spamd -d -c -r /var/run/spamd.pid
root 14295 0.0 0.0 0 0 ? ZN 13:34 0:00 \_ [spamd]
<defunct>
root 14296 0.0 0.0 0 0 ? ZN 13:34 0:00 \_ [spamd]
<defunct>
and spamc is stuck:
root 5062 0.0 0.1 6408 1048 ? Ss 03:01 0:00
/usr/lib/postfix/master
postfix 14791 0.0 0.1 6248 1820 ? S 13:35 0:00 \_ qmgr -l -t
unix -u
postfix 14983 0.0 0.2 6380 2264 ? S 13:43 0:00 \_ local -t
unix
cer 15699 0.0 0.0 2004 572 ? Ss 13:54 0:00 | \_
/usr/bin/procmail
cer 15709 0.0 0.0 2004 240 ? S 13:54 0:00 | \_
/usr/bin/procmail
cer 15710 0.0 0.0 2536 580 ? S 13:54 0:00 |
\_ /usr/bin/spamc -s 350000
postfix 14993 0.0 0.2 6384 2272 ? S 13:43 0:00 \_ local -t
unix
cer 15703 0.0 0.0 2008 576 ? Ss 13:54 0:00 | \_
/usr/bin/procmail
cer 15713 0.0 0.0 2008 244 ? S 13:54 0:00 | \_
/usr/bin/procmail
cer 15714 0.0 0.0 2536 580 ? S 13:54 0:00 |
\_ /usr/bin/spamc -s 350000
postfix 15688 0.0 0.1 6204 1660 ? S 13:54 0:00 \_ pickup -l
-t unix -u
postfix 15940 0.0 0.1 6208 1716 ? S 13:58 0:00 \_ showq -t
unix -u
The log shows problems:
Dec 9 13:34:41 nimrodel spamd[14295]: spamd: processing message
for cer:500
Dec 9 13:34:42 nimrodel spamd[14296]: spamd: connection from localhost
[127.0.0.1] at port 28643
Dec 9 13:34:42 nimrodel spamd[14296]: spamd: setuid to cer succeeded
Dec 9 13:34:42 nimrodel spamd[14296]: spamd: processing message
for cer:500
Dec 9 13:34:49 nimrodel spamd[14295]: spamd: clean message (-6.6/5.0) for
cer:500 in 7.9 seconds, 5361 bytes.
Dec 9 13:34:49 nimrodel spamd[14295]: spamd: result: . -6 -
BAYES_00,RCVD_IN_DNSWL_MED
scantime=7.9,size=5361,user=cer,uid=500,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=28640,mid=,bayes=0.000000,autolearn=disabled
Dec 9 13:34:49 nimrodel postfix/local[14312]: 723F9B747F:
to=, relay=local, delay=8.3, delays=0.25/0.02/0/8,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 13:34:49 nimrodel postfix/qmgr[5093]: 723F9B747F: removed
Dec 9 13:34:49 nimrodel spamd[14296]: spamd: clean message (-6.6/5.0) for
cer:500 in 7.3 seconds, 5929 bytes.
Dec 9 13:34:49 nimrodel spamd[14296]: spamd: result: . -6 -
BAYES_00,RCVD_IN_DNSWL_MED
scantime=7.3,size=5929,user=cer,uid=500,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=28643,mid=,bayes=0.000000,autolearn=disabled
Dec 9 13:34:50 nimrodel spamd[29170]: prefork: child states: IB
Dec 9 13:34:50 nimrodel spamd[29170]: prefork: child states: II
Dec 9 13:34:50 nimrodel postfix/local[14322]: E339DB7277:
to=, relay=local, delay=8.1, delays=0.09/0.05/0/8,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 13:34:50 nimrodel postfix/qmgr[5093]: E339DB7277: removed
Dec 9 13:35:09 nimrodel postfix/postfix-script[14786]: refreshing the Postfix
mail system
Now comes the problems
Dec 9 13:43:31 nimrodel spamd[14295]: prefork: sysread(8) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 9 13:43:31 nimrodel spamd[14296]: prefork: sysread(9) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: write of ping failed to 14296
fd=8: Broken pipe at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
404.
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: killing failed child 14296 fd=8
at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
171.
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: killed child 14296
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: write of ping failed to 14295
fd=7: Broken pipe at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
404.
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: killing failed child 14295 fd=7
at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
171.
Dec 9 13:43:31 nimrodel spamd[29170]: prefork: killed child 14295
And there is no more. The children are "defunct" but not really dead (state
"ZN"). I have to do a "rcspamd restart" to recover the system:
Dec 9 14:07:27 nimrodel spamd[29170]: spamd: server killed by SIGTERM,
shutting down
Dec 9 14:07:27 nimrodel postfix/local[16490]: EC926B7489:
to=, relay=local, delay=923, delays=0.07/751/0/172,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:27 nimrodel postfix/qmgr[14791]: EC926B7489: removed
Dec 9 14:07:27 nimrodel postfix/local[14983]: B2FAEB7486:
to=, relay=local, delay=1428, delays=0.1/1248/0/180,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:27 nimrodel postfix/qmgr[14791]: B2FAEB7486: removed
Dec 9 14:07:27 nimrodel postfix/qmgr[14791]: B2FAEB7486: removed
Dec 9 14:07:27 nimrodel spamc[16742]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:27 nimrodel spamc[16746]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:28 nimrodel spamc[16742]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:28 nimrodel spamc[16746]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:29 nimrodel spamc[16742]: connect to spamd on 127.0.0.1 failed,
retrying (#3 of 3): Connection refused
Dec 9 14:07:29 nimrodel spamc[16746]: connect to spamd on 127.0.0.1 failed,
retrying (#3 of 3): Connection refused
Dec 9 14:07:30 nimrodel spamc[16742]: connection attempt to spamd aborted
after 3 retries
Dec 9 14:07:30 nimrodel spamc[16746]: connection attempt to spamd aborted
after 3 retries
Dec 9 14:07:30 nimrodel postfix/local[16490]: 5DA41B748D:
to=, relay=local, delay=926, delays=0.07/923/0/3.1,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:30 nimrodel postfix/qmgr[14791]: 5DA41B748D: removed
Dec 9 14:07:30 nimrodel spamc[16764]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:31 nimrodel spamc[16764]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:32 nimrodel spamc[16764]: connect to spamd on 127.0.0.1 failed,
retrying (#3 of 3): Connection refused
Dec 9 14:07:33 nimrodel spamc[16764]: connection attempt to spamd aborted
after 3 retries
Dec 9 14:07:33 nimrodel postfix/local[14983]: 3C9A6B7490:
to=, relay=local, delay=928, delays=0.08/925/0/3.1,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:33 nimrodel postfix/qmgr[14791]: 3C9A6B7490: removed
Dec 9 14:07:33 nimrodel spamc[16779]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:34 nimrodel spamc[16779]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:35 nimrodel spamc[16779]: connect to spamd on 127.0.0.1 failed,
retrying (#3 of 3): Connection refused
Dec 9 14:07:36 nimrodel spamc[16779]: connection attempt to spamd aborted
after 3 retries
Dec 9 14:07:36 nimrodel postfix/local[16490]: 6340EB747F:
to=, relay=local, delay=428, delays=0.09/425/0/3.1,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:36 nimrodel postfix/qmgr[14791]: 6340EB747F: removed
Dec 9 14:07:36 nimrodel spamc[16787]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:37 nimrodel spamc[16787]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:38 nimrodel postfix/local[16730]: C62C1B748F:
to=, relay=local, delay=934, delays=0.06/923/0/11,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:38 nimrodel postfix/qmgr[14791]: C62C1B748F: removed
Dec 9 14:07:38 nimrodel spamc[16807]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:38 nimrodel spamc[16787]: connect to spamd on 127.0.0.1 failed,
retrying (#3 of 3): Connection refused
Dec 9 14:07:39 nimrodel spamc[16807]: connect to spamd on 127.0.0.1 failed,
retrying (#2 of 3): Connection refused
Dec 9 14:07:39 nimrodel spamc[16787]: connection attempt to spamd aborted
after 3 retries
Dec 9 14:07:39 nimrodel postfix/local[14983]: E33FAB7483:
to=, relay=local, delay=431, delays=0.08/428/0/3.1,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:39 nimrodel postfix/qmgr[14791]: E33FAB7483: removed
Dec 9 14:07:39 nimrodel spamc[16816]: connect to spamd on 127.0.0.1 failed,
retrying (#1 of 3): Connection refused
Dec 9 14:07:42 nimrodel spamd[16794]: logger: removing stderr method
Dec 9 14:07:42 nimrodel spamd[16817]: rules: meta test FM_DDDD_TIMES_2 has
dependency 'FH_HOST_EQ_D_D_D_D' with a zero score
Dec 9 14:07:42 nimrodel spamd[16817]: rules: meta test FM_SEX_HOSTDDDD has
dependency 'FH_HOST_EQ_D_D_D_D' with a zero score
Dec 9 14:07:42 nimrodel spamd[16817]: rules: meta test HS_PHARMA_1 has
dependency 'HS_SUBJ_ONLINE_PHARMACEUTICAL' with a zero score
Dec 9 14:07:43 nimrodel spamd[16817]: spamd: server started on port 783/tcp
(running version 3.2.3)
Dec 9 14:07:43 nimrodel spamd[16817]: spamd: server pid: 16817
Dec 9 14:07:43 nimrodel spamd[16817]: spamd: server successfully spawned child
process, pid 16823
Dec 9 14:07:43 nimrodel spamd[16817]: spamd: server successfully spawned child
process, pid 16824
Dec 9 14:07:43 nimrodel spamd[16823]: spamd: connection from localhost
[127.0.0.1] at port 25670
Dec 9 14:07:43 nimrodel spamd[16823]: spamd: setuid to cer succeeded
Dec 9 14:07:43 nimrodel spamd[16817]: prefork: child states: BS
Dec 9 14:07:43 nimrodel spamd[16817]: spamd: server successfully spawned child
process, pid 16825
Dec 9 14:07:43 nimrodel spamd[16824]: spamd: connection from localhost
[127.0.0.1] at port 25671
Dec 9 14:07:43 nimrodel spamd[16824]: spamd: setuid to cer succeeded
Dec 9 14:07:43 nimrodel spamd[16824]: spamd: processing message
<200712091352.59148.suse_mailing_list@jimmo.com> for cer:500
Dec 9 14:07:43 nimrodel spamd[16817]: prefork: child states: BBI
Dec 9 14:07:43 nimrodel spamd[16823]: spamd: processing message
for cer:500
Dec 9 14:07:50 nimrodel spamd[16823]: spamd: identified spam (14.0/5.0) for
cer:500 in 7.7 seconds, 2507 bytes.
Dec 9 14:07:50 nimrodel spamd[16823]: spamd: result: Y 14 -
BAYES_50,HTML_MESSAGE,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,URIBL_AB_SURBL,URIBL_BLACK,URIBL_JP_SURBL,URIBL_OB_SURBL,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL
scantime=7.7,size=2507,user=cer,uid=500,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=25670,mid=,bayes=0.518719,autolearn=disabled
Dec 9 14:07:50 nimrodel postfix/local[16490]: A1653B7493:
to=, relay=local, delay=441, delays=0.14/429/0/13,
dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail)
Dec 9 14:07:50 nimrodel postfix/qmgr[14791]: A1653B7493: removed
Dec 9 14:07:51 nimrodel spamd[16817]: prefork: child states: IBI
Dec 9 14:07:51 nimrodel spamd[16823]: spamd: connection from localhost
[127.0.0.1] at port 25672
Dec 9 14:07:51 nimrodel spamd[16823]: spamd: setuid to cer succeeded
Dec 9 14:07:51 nimrodel spamd[16823]: spamd: processing message
<200712091359.31845.ajh@rydsbo.net> for cer:500
Dec 9 14:07:51 nimrodel spamd[16824]: spamd: clean message (-6.6/5.0) for
cer:500 in 7.9 seconds, 5470 bytes.
.. and its working again, but several mails were not scanned.
[another day]
I had an idea. I grepped my log files to find out where these messages
started to appear; I grepped for
"/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm":
zgrep -c
"/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm"
/var/log/mail.debug-2007* | less
counting the messages per file. I have them as early as:
/var/log/mail.debug-20070217.bz2:0
/var/log/mail.debug-20070219.bz2:0
/var/log/mail.debug-20070225.bz2:6
/var/log/mail.debug-20070304.bz2:14
/var/log/mail.debug-20070311.bz2:2
/var/log/mail.debug-20070318.bz2:2
..
/var/log/mail.debug-20071028.gz:18
/var/log/mail.debug-20071103.bz2:15
/var/log/mail.debug-20071105.bz2:0 <=== upgraded to O.S: 10.3 here
/var/log/mail.debug-20071106.bz2:0
/var/log/mail.debug-20071107.bz2:0
/var/log/mail.debug-20071108.bz2:0
You see that they stopped when I upgraded to 10.3, but they restart on
this file:
/var/log/mail.debug-20071207.bz2:16
They started "Dec 7 15:29:07". I was hopping to see if some thing I
installed then could have an effect... but I don't know. This is what I
installed Thu Dec 06 2007, from 08:16:38 to 08:55:37 PM CET - it's
wonderful what you can learn from the rpm database:
BUILDTIME:day NAME VERSION
Sat Nov 24 2007 htdig 3.2.0b6-110.2
Fri Nov 30 2007 glib2 2.14.1-4.2
Sat Nov 24 2007 release-notes 10.3.19-0.1
Fri Nov 23 2007 alsa 1.0.14-31.2
Fri Nov 23 2007 libasound2 1.0.14-31.2
Fri Nov 23 2007 libcom_err2 1.40.2-20.2
Fri Nov 23 2007 libuuid1 1.40.2-20.2
Mon Nov 26 2007 kernel-source 2.6.22.13-0.3
Mon Nov 26 2007 kernel-default 2.6.22.13-0.3
Fri Nov 30 2007 glib2-devel 2.14.1-4.2
Fri Nov 23 2007 alsa-devel 1.0.14-31.2
Fri Nov 30 2007 OpenOffice_org 2.3.0.1.2-10.3
Fri Nov 23 2007 libext2fs2 1.40.2-20.2
Fri Nov 23 2007 libuuid-devel 1.40.2-20.2
Mon Nov 26 2007 kernel-syms 2.6.22.13-0.3
Wed Nov 28 2007 gnome-screensaver 2.20.0-6.2
Wed Nov 28 2007 xorg-x11 7.2-135.4
Fri Nov 23 2007 libblkid1 1.40.2-20.2
Fri Nov 23 2007 libcom_err-devel 1.40.2-20.2
Fri Nov 30 2007 OpenOffice_org-base 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-draw 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-gnome 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-impress 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-kde 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-math 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-mono 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-officebean 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-pyuno 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-writer 2.3.0.1.2-10.3
Fri Nov 23 2007 libext2fs-devel 1.40.2-20.2
Fri Nov 30 2007 OpenOffice_org-calc 2.3.0.1.2-10.3
Fri Nov 23 2007 e2fsprogs 1.40.2-20.2
Fri Nov 23 2007 libblkid-devel 1.40.2-20.2
Fri Nov 30 2007 OpenOffice_org-mailmerge 2.3.0.1.2-10.3
Fri Nov 30 2007 OpenOffice_org-filters 2.3.0.1.2-10.3
Fri Nov 23 2007 e2fsprogs-devel 1.40.2-20.2
Do you see some rpm related to perl or spamassassin there? Perhaps glibc?
I have reported this error in the mail list: no ideas. I have googled for it,
and found some bugzillas in the spamassassin site that also talk about dying
children, years ago, solved, but nobody knows why [4476, 4370, 4590,4696,
4407]. Some reported had only an instance of this problem: I have it
continuously:
Dec 10 00:24:00 nimrodel spamd[1867]: prefork: sysread(9) failed after 300 secs
at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
642.
Dec 10 00:34:47 nimrodel spamd[17637]: prefork: sysread(8) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 10 00:34:47 nimrodel spamd[17636]: prefork: sysread(7) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 10 00:34:47 nimrodel spamd[1860]: Use of uninitialized value in
concatenation (.) or string at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
323.
Dec 10 00:34:47 nimrodel spamd[1860]: prefork: select returned error on server
filehandle:
Dec 10 00:34:47 nimrodel spamd[17636]: prefork: sysread(7) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 10 00:34:47 nimrodel spamd[1860]: Use of uninitialized value in
concatenation (.) or string at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
323.
Dec 10 00:34:47 nimrodel spamd[1860]: prefork: select returned error on server
filehandle:
Dec 10 00:44:09 nimrodel spamd[1860]: Use of uninitialized value in
concatenation (.) or string at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm line
323.
Dec 10 00:44:09 nimrodel spamd[1860]: prefork: select returned error on server
filehandle:
Dec 10 00:44:09 nimrodel spamd[17964]: prefork: sysread(7) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
Dec 10 00:44:09 nimrodel spamd[17965]: prefork: sysread(8) failed after 300
secs at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm
line 642.
I'm willing to conduct tests, but I'm no perl programmer. If you want to enable
more detailed logs, tell me how.
--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.