Re: [opensuse] Why using postfix at all? [Was: Why is dovecot using user home directory?]
On Tuesday, 2014-08-19 at 09:51 -0400, Patrick Shanahan wrote:
* Anton Aylward <> [08-19-14 09:22]:
2. The spam/av processing can be done within procmail so why not eliminate the sendmail/postfix step and have fetchmail hand off to local procmail directly?
What this boils down to is this:
Do you have a good reason to apply the processing that can *only* be carried out by sendmail/procmail in this path?
I pored over this and could not find any.
Please note: I'm not saying "Postfix is not needed" in any universal sense. I said "in this path".
I have postfix and by fetchmail handing to postfix I get *all* mail logged in /var/log/mail, ie: not just fetchmail's and procmail's logs.
It was a *conscious* decision :^).
We forgot another important reason (or several): parallelization, speed, and safety.

spamd/spamc called from procmail takes 1..3 seconds per email on my machine, but often it goes up to 5 seconds per email (I see this in the log), and even more. This is not due to a lack of CPU power, but to waiting for online tests to respond or time out. See (all are entries trimmed from the 2014-08-15 log):

_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD scantime=1.1,size=5696,user=cer,uid=1000,required_score=5.0,
_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD scantime=1.2,size=6345,user=cer,uid=1000,required_score=5.0,
_00,DKIM_ADSP_CUSTOM_MED,DKIM_SIGNED,FREEMAIL_FROM,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALID,URIBL_BLOCKED scantime=1.7,size=7578,
_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,URIBL_BLOCKED scantime=1.9,size=5677,user=cer,uid=1000,re
_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_TVD_MIME_EPI,URIBL_BLOCKED scantime=1.6,size=6297,user=cer
_00,DKIM_SIGNED,RCVD_IN_DNSWL_HI,T_DKIM_INVALID,URIBL_BLOCKED scantime=2.7,size=7320,user=cer,uid
_00,DKIM_ADSP_CUSTOM_MED,DKIM_SIGNED,FREEMAIL_FROM,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALI
_00,DKIM_SIGNED,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,T_DKIM_INVALID,URIBL_BLOCKED scantime=1.6,size=7151
_00,RCVD_IN_DNSWL_HI,URIBL_BLOCKED scantime=13.3,size=6114,user=cer,uid=1000,required_score=5.0
_00,RCVD_IN_DNSWL_HI scantime=13.3,size=5946,user=cer,uid=1000,required_score=5.0,rhost=localhost
00,RCVD_IN_DNSWL_HI scantime=4.0,size=7131,user=cer,uid=1000,required_score=5.0,rhost=localhost
00,DKIM_SIGNED,RCVD_IN_DNSWL_HI,T_DKIM_INVALID,URIBL_BLOCKED scantime=4.1,size=8650,user=cer
0,RCVD_IN_DNSWL_HI,URIBL_BLOCKED scantime=5.3,size=6182,user=cer,uid=1000,required_score=5.0
00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_SOFTFAIL,URIBL_BLOCKED scantime=6.6,size=45027,

You can see one scantime=13.3 (seconds) in there (with size=6114). Two, actually.
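A small sketch, in Python, of how the scantime values could be pulled out of log lines like the ones above. The three sample lines are copied from the excerpt; the function names and the 10-second "slow" threshold are illustrative assumptions, not anything from the original setup.

```python
import re

# Three entries copied from the trimmed log excerpt above.
LOG_LINES = [
    "_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD scantime=1.1,size=5696,user=cer,uid=1000,required_score=5.0,",
    "_00,RCVD_IN_DNSWL_HI,URIBL_BLOCKED scantime=13.3,size=6114,user=cer,uid=1000,required_score=5.0",
    "00,RCVD_IN_DNSWL_HI scantime=4.0,size=7131,user=cer,uid=1000,required_score=5.0,rhost=localhost",
]

# Matches "scantime=" followed by a decimal number of seconds.
SCANTIME_RE = re.compile(r"scantime=(\d+(?:\.\d+)?)")


def scantimes(lines):
    """Return the scantime (in seconds) of every line that carries one."""
    times = []
    for line in lines:
        match = SCANTIME_RE.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def slow_scans(lines, threshold=10.0):
    """Return the scantimes above a threshold (hypothetical default: 10 s)."""
    return [t for t in scantimes(lines) if t > threshold]


if __name__ == "__main__":
    times = scantimes(LOG_LINES)
    print("max:", max(times), "mean:", round(sum(times) / len(times), 2))
    print("slow:", slow_scans(LOG_LINES))
```

Run over a whole day's log, something like this makes it easy to spot the day a network test starts timing out.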
On occasion, 12 seconds per email becomes typical, until I find out that a particular online spamassassin test has to be disabled because the internet server it queries no longer responds.

Well, if you use fetchmail calling procmail directly, only one email at a time can be processed. But if you put postfix in the toolchain, it queues email internally, and does so safely, because fetchmail cannot tell the imap/pop server to delete that email until postfix has saved it to a file in the queue and told fetchmail "Ok, I got it". This is so in the entire chain of servers and daemons, local and remote (or should be). Old Unix hands would tell you to mount the partition used for mail processing with the option "sync", for this safety reason.

This way fetchmail can run at top speed, without waiting for spam/av processing, because postfix is safely storing email in queues on disk prior to processing it. And postfix can feed typically 2 procmail processes per user, but if you have the power, you can let it call dozens simultaneously, and that means several spamc/spamd processes running simultaneously. When each email takes a few seconds to be processed, most of that time waiting idle (no cpu used), well, feeding, say, 5 emails simultaneously does speed things up a bit.

This is a big issue when you get a queue of a thousand emails. It can take hours if you do one at a time.

Ah, and procmail can handle several simultaneous processes, even when using mbox files, because it by default uses a different lock file per destination folder. Of course, if all email goes to the same folder, it has to wait. And typically, the spamc call is in another rule.

--
Cheers, Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
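To put rough numbers on the "thousand emails" case: at 12 seconds each, 1000 messages scanned one at a time take 12000 seconds (3 hours 20 minutes); five at a time, about 40 minutes. The Python sketch below does only that arithmetic. It assumes every scan takes the same time and that scans are mostly idle waiting, so running them side by side does not slow any of them down; both are simplifications, and the function name is made up for illustration.

```python
import math


def drain_time_seconds(messages, seconds_per_message, parallel=1):
    """Estimate the wall-clock time needed to scan a backlog.

    Simplified model: identical scan times, and scans that spend their
    time waiting on the network, so parallel scans do not interfere.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    # Messages are handled in rounds of `parallel` at a time.
    rounds = math.ceil(messages / parallel)
    return rounds * seconds_per_message


if __name__ == "__main__":
    for per_msg in (3, 5, 12):
        for workers in (1, 2, 5):
            secs = drain_time_seconds(1000, per_msg, workers)
            print(f"{per_msg:>2} s/msg, {workers} at a time: {secs / 60:.0f} min")
```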
Carlos E. R. wrote:
spamd/spamc called from procmail takes 1..3 seconds per email on my machine, but often it goes up to 5 seconds per email (I see this in the log), and even more. This is not due to a lack of CPU power, but to waiting for online tests to respond or time out.
Yup. [snip]
This is a big issue when you get a queue of a thousand emails. It can take hours if you do one at a time.
Anyone with a lot of email will really want to run spamassassin in parallel - it takes up very little space/cpu because most of the time is spent waiting for the network (DNS queries).

(apologies if this has been mentioned earlier, I haven't followed the entire thread.)

--
Per Jessen, Zürich (15.1°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 2014-08-20 13:03, Per Jessen wrote:
Carlos E. R. wrote:
(apologies if this has been mentioned earlier, I haven't followed the entire thread.)
Not that I know - Actually I remembered this as I was getting unconscious... (that is, sleeping :-) ) -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Carlos E. R. wrote:
On 2014-08-20 13:03, Per Jessen wrote:
Carlos E. R. wrote:
(apologies if this has been mentioned earlier, I haven't followed the entire thread.)
Not that I know - Actually I remembered this as I was getting unconscious... (that is, sleeping :-) )

... actually ...

-------- Original Message --------
Subject: Re: Why is dovecot using user home directory?
Date: Tue, 19 Aug 2014 14:52:26 -0700
From: Linda Walsh
Linda Walsh wrote:
... In my home dir I have a ".forward" file that forwards my incoming email to my "procmail" equivalent (the perl script that calls spamassassin). ... My perl script drops a summary of the incoming in mail/.log that looks like:
20140701:031807 (Tue) OpenSuSE mail (Re: [opensuse] Firefox Location-Aware Browsing ?)[CLEANED] received (john layt <>) (9.19sec processing)
20140701:032216 (Tue) OpenSuSE mail ( Re: [opensuse] Firefox Location-Aware Browsing ?)[CLEANED] received (per jessen <>) (9.32sec processing)
20140701:033029 (Tue) Firebug mail ([firebug] Re: Missing files on script tab with Firefox 30 + Firebug 2.0)[CLEANED] received () (7.96sec processing)
20140701:033036 (Tue) SpamAssassin mail (Changes in Spamhaus DBL DNSBL return codes) received (axb <>) (9.00sec processing)
...
20140819:144406 (Tue) SPAM tagged mail , Supposedly to localaddress, thru localalias: mail (***SPAM*** Perfect solution: Up to 25pounds OFF!) received (daily health tip <>) (8.08sec processing)
---
95% of the processing time is really waiting on network responses. Fortunately, when email comes in, a different copy of the script gets invoked for each message... so if fetchmail has been inactive for some reason, it will pull down hundreds of messages that it tries to process mostly in parallel... (as most of it is waiting)...
Yeah, as Anton says, I could short-circuit the path, but that can create problems if I try to integrate other standard tools in to replace something in my "supply chain"...
The processing w/out calling spamassassin is under 1 sec. If I don't use network tests, then spamassassin takes about 3 seconds to do local processing.

The 'CLEANED' notation in the log, above, indicates that at least 1 redundant listname has been removed from the subject.

So, yeah, parallelization in running SA was mentioned "in passing"... :-| (str8face)

Linda
On 08/21/2014 11:17 AM, Linda Walsh wrote:
The 'CLEANED' notation in the log, above, indicates the subject has had at least 1 redundant listname in the subject removed.
Ah!

One reason I have spamassassin in procmail rather than up front in Postfix is that I have many procmail rules BEFORE I apply spamassassin.

* if the ISP has already labelled it as SPAM then put it in the SpamBox right away
* if it's for any one of a number of lists that I subscribe to, then put it in the list folder right away. Having bracketed list markers on the subject line is nice :-)
* if it's not in English, put it in the ForeignBox right away
* if it meets a pile of garbage conditions then put it in the SpamBox
* if it's from a 'whitelist' of senders don't bother with spamassassin (yes I know spamassassin has a whitelist; this is faster)

All this relieves SpamAssassin of a lot of processing. However you cut it, SpamAssassin is a choke-point. Only using it when I have to relieves a big load.

This addresses many of the objections that have come up in this thread. What was that about "metricating"? This isn't instrumentation. This is more 'architecture'.

--
Think then act - There is nothing so useless as doing efficiently that which should not be done at all -- Peter Drucker.
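The ordering described in the list above (cheap header checks first, the expensive scanner only as a last resort) can be sketched in Python as follows. This is only an illustration of the idea, not Anton's procmail setup: the header name `X-ISP-Spam`, the folder names other than SpamBox, and the `expensive_scan` callable are assumptions, and the language and "garbage conditions" checks are left out.

```python
import re

# A bracketed list marker such as "[opensuse]" in the Subject line.
LIST_TAG_RE = re.compile(r"\[[^\]]+\]")


def route(headers, whitelist, expensive_scan):
    """Return a folder name, calling expensive_scan only when needed.

    headers:        dict of header name -> value (simplified message model)
    whitelist:      set of lower-cased sender addresses
    expensive_scan: callable(headers) -> bool, True means spam
    """
    # 1. The ISP already tagged it (hypothetical header name).
    if headers.get("X-ISP-Spam", "").lower() == "yes":
        return "SpamBox"
    # 2. Mailing-list traffic goes straight to the list folder.
    if LIST_TAG_RE.search(headers.get("Subject", "")):
        return "Lists"
    # 3. Known senders skip the scanner entirely.
    if headers.get("From", "").lower() in whitelist:
        return "Inbox"
    # 4. Everything else pays for the full scan.
    return "SpamBox" if expensive_scan(headers) else "Inbox"


if __name__ == "__main__":
    calls = []

    def fake_scan(h):
        calls.append(h)
        return False

    wl = {"friend@example.org"}
    print(route({"Subject": "Re: [opensuse] hello"}, wl, fake_scan))
    print(route({"From": "stranger@example.com"}, wl, fake_scan))
    print("scanner calls:", len(calls))
```

The point of the design is visible in the call count: only the message that survives every cheap test reaches the scanner.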
On Thursday, 2014-08-21 at 12:00 -0400, Anton Aylward wrote:
One reason I have spamassassin in procmail rather than up front in Postfix is that I have many procmail rules BEFORE I apply spamassassin.
* if the ISP has already labelled it as SPAM then put it in the SpamBox right away
I don't. Often, the ISP does the wrong checking, so I let SA have a chance at correcting it. On my telefonica accounts, I explicitly tell fetchmail to also fetch from the spam folder, so that I can verify it, and also to reduce what is stored on the ISP server, which doesn't allow much storage. On the other hand, google does quite good filtering, but not always.
* if it's for any one of a number of lists that I subscribe to, then put it in the list folder right away.
Again, I don't: I see spam email on several of the mail lists. On some I get a lot (the xfs mail list, for instance). So I let SA check.
Having bracketed list markers on the subject line is nice :-)
I don't look at them at all. I look at the "X-Mailinglist" header instead.
* if it's not in English, put it in the ForeignBox right away
I'm bilingual, so I have to handle emails in two languages every day. I don't sort by language.
* if it meets a pile of garbage conditions then put it in the SpamBox
* if it's from a 'whitelist' of senders don't bother with spamassassin (yes I know spamassassin has a whitelist; this is faster)
But not as precise; and it is easier to maintain SA than a huge lot of procmail recipes (and I have a lot).
All this relieves SpamAssassin of a lot of processing. However you cut it, SpamAssassin is a choke-point. Only using it when I have to relieves a big load.
I find it easier to let it run and do its job than have me thinking and adjusting rules and filters all the time.

--
Cheers, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 08/22/2014 08:23 PM, Carlos E. R. wrote:
On Thursday, 2014-08-21 at 12:00 -0400, Anton Aylward wrote:
One reason I have spamassassin in procmail rather than up front in Postfix is that I have many procmail rules BEFORE I apply spamassassin.
* if the ISP has already labelled it as SPAM then put it in the SpamBox right away
I don't. Often, the ISP does the wrong checking,
Mine does too. That's why I have varying levels of spambox and why I have whitelisting of known things, regardless of what the ISP says, up front. The idea is to reduce the load on spamassassin. Actually I have

  10Spam 20Spam 30Spam 40Spam 50Spam blacklist fontSpam isSpam probablySpam noSubject numericIP

and only after checking do items get deleted or put in spam-learn or spam-unlearn.

But the point is that where possible and where sensible I deal with it before spamassassin because of the comparative cost & delay of spamassassin processing.
* if it's for any one of a number of lists that I subscribe to, then put it in the list folder right away.
Again, I don't: I see spam email on several of the mail lists. On some I get a lot (the xfs mail list, for instance). So I let SA check.
What's this with being absolutist? There are lists I'm on that are, unlike this one, closed/subscription only. You can't just automatically sign up. Even on this one, spamming is very, very rare and I can put up with that and deal with it manually. I don't want SA to learn "[opensuse]".
Having bracketed list markers on the subject line is nice :-)
I don't look at them at all. I look at the "X-Mailinglist" header instead.
Being absolutist again? Actually I have a 'trust but verify' with nested tests. If the subject line has "\[.*\]" then and only then do I bring in the mailing list processing rules and look at the other fields.

Did I mention I have a highly modular procmail? ~anton/.procmail/<modules>

Procmail is very good with patterns!
* if it's not in English, put it in the ForeignBox right away
I'm bilingual, so I have to handle emails in two languages every day. I don't sort by language.
About 99.8% of what gets rejected by that module is non-European characters. Mostly Asian. I can make some sense of most of the "Romance" languages, having studied Latin and Greek at school, but the Asian ones I can't make sense of. Don't you get any in Asian and Russian character sets?
* if it meets a pile of garbage conditions then put it in the SpamBox
* if it's from a 'whitelist' of senders don't bother with spamassassin (yes I know spamassassin has a whitelist; this is faster)
But not as precise; and it is easier to maintain SA than a huge lot of procmail recipes (and I have a lot).
YMMV. I find procmail easier to maintain than SA. SA needs to be compiled; procmail I just add to a text file. This is the "long" version of the whitelist.rc module that makes it all very obvious.

---------------------------------------------------
# Test if the email's sender is whitelisted; if so, send it straight to
# $DEFAULT. Note that this comes before any other filters.
:0:
* ? formail -rt -x"From" -x"From:" -x"Sender:" \
    -x"Reply-To:" -x"Return-Path:" -x"To:" \
    | egrep -is -f ${HOME}/.whitelist
{
    LOG="Found on whitelist$NL"
    $DEFAULT
}

So all I have to do is edit ~/.whitelist
------------------------------------------------------

If there's a way to use an external dynamically modified list with SA, one that includes patterns as well, I haven't found it.
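For readers who don't speak procmail, here is a rough Python analogue of what the recipe above does: collect the sender-related headers and test them against a file of patterns, one per line, case-insensitively. It is a sketch under assumptions, not a replacement for the recipe: Python's `re` syntax differs in places from the POSIX extended regexes that egrep uses, and the function names are invented for illustration.

```python
import re
from email import message_from_string

# The headers the recipe extracts with formail -x.
SENDER_HEADERS = ("From", "Sender", "Reply-To", "Return-Path", "To")


def load_patterns(text):
    """Compile one case-insensitive regex per non-empty line of text."""
    return [
        re.compile(line.strip(), re.IGNORECASE)
        for line in text.splitlines()
        if line.strip()
    ]


def is_whitelisted(raw_message, patterns):
    """Return True if any sender-related header matches any pattern."""
    msg = message_from_string(raw_message)
    values = []
    for name in SENDER_HEADERS:
        values.extend(msg.get_all(name, []))
    return any(p.search(v) for p in patterns for v in values)


if __name__ == "__main__":
    # In real use the text would come from a file such as ~/.whitelist.
    patterns = load_patterns("boss@example\\.com\n@lists\\.example\\.org$\n")
    raw = "From: Boss <BOSS@example.com>\nTo: me@example.net\n\nhello\n"
    print(is_whitelisted(raw, patterns))
```

As in the recipe, editing the pattern file is all the maintenance there is; nothing needs reloading because the file is read for every message.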
All this relieves SpamAssassin of a lot of processing. However you cut it, SpamAssassin is a choke-point. Only using it when I have to relieves a big load.
I find it easier to let it run and do its job than have me thinking and adjusting rules and filters all the time.
Well EXCUSE ME! This isn't an "all the time" any more than running sa-learn is "all the time". It was all up-front thought and design, put in place, tested AND THEN LET RUN. Apart from adding to the whitelist or blacklist, which I haven't had to do in the last 4 months, this is maintenance free.

Back on the resource-starved 800MHz 1G RAM system that was out of the Closet of Anxieties and was the mail hub, this was great. I've had no reason to change it.

--
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?
Anton Aylward wrote:
I find procmail easier to maintain than SA. SA needs to be compiled;
Uh, compiling is optional. I've never bothered, it's fast enough as it is.
procmail I just add to a text file:
This is the "long" version of the whitelist.rc module that makes it all very obvious.
---------------------------------------------------
# Test if the email's sender is whitelisted; if so, send it straight
# to $DEFAULT. Note that this comes before any other filters.
:0:
* ? formail -rt -x"From" -x"From:" -x"Sender:" \
    -x"Reply-To:" -x"Return-Path:" -x"To:" \
    | egrep -is -f ${HOME}/.whitelist
{
    LOG="Found on whitelist$NL"
    $DEFAULT
}
So all I have to do is edit ~/.whitelist
Same with SA - just add/remove rules such as:

  whitelist_from
  whitelist_from_rcvd
  whitelist_from_dkim
  whitelist_from_spf

(depending on the kind of whitelisting you want).
If there's a way to use an external dynamically modified list with SA, one that includes patterns as well, I haven't found it.
SA handles patterns extremely well, virtually every rule is a regex. Only the whitelisting rules are a little picky (no regexes, only wildcards). I have occasionally had a need for using regex patterns in e.g. "whitelist_from_rcvd".

As for using "an external dynamically modified list with SA", here's what I do - my whitelist is just an SA ruleset called whitelist.cf (for instance). When I modify it, I copy it to the servers that run SA, and do an "rcspamd reload". (of course this is all automated).

Anyway, I'm not saying SA is better than your method with procmail, only that it could just as easily be done with SA. I do however think SA does the whitelisting a lot better with whitelist_from_{rcvd,dkim,spf}, but if you don't need that, that's obviously not an advantage.

--
Per Jessen, Zürich (15.8°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
On 2014-08-23 10:51, Per Jessen wrote:
Anton Aylward wrote:
I find procmail easier to maintain than SA. SA needs to be compiled;
Uh, compiling is optional. I've never bothered, it's fast enough as it is.
I didn't even know it was possible?
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
I edit the file ".spamassassin/user_prefs" and add them. How do you do it? But I have only used whitelist_from and blacklist_from, I don't know what the others are for (I can guess a bit, though).
As for using "an external dynamically modified list with SA", here's what I do - my whitelist is just an SA ruleset called whitelist.cf (for instance). When I modify it, I copy it to the servers that run SA, and do an "rcspamd reload". (of course this is all automated).
I'd be interested in learning more details about that ;-)
Anyway, I'm not saying SA is better than your method with procmail, only that it could just as easily be done with SA. I do however think SA does the whitelisting a lot better with whitelist_from_{rcvd,dkim,spf}, but if you don't need that, that's obviously not an advantage.
Yes, that's what I think. I sometimes add rules in procmail to block specific mail bomb raids, though. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 08/23/2014 09:36 AM, Carlos E. R. wrote:
As for using "an external dynamically modified list with SA", here's what I do - my whitelist is just an SA ruleset called whitelist.cf (for instance). When I modify it, I copy it to the servers that run SA, and do an "rcspamd reload". (of course this is all automated).
I'd be interested in learning more details about that ;-)
It's a ridiculously simple rule and comes from a list of recipes I found on the 'Net.

---------------- here you are... AGAIN -----------
# Test if the email's sender is whitelisted; if so, send it straight to
# $DEFAULT. Note that this comes before any other filters.
:0:
* ? formail -rt -x"From" -x"From:" -x"Sender:" \
    -x"Reply-To:" -x"Return-Path:" -x"To:" \
    | egrep -is -f ${HOME}/.whitelist
{
    LOG="Found on whitelist$NL"
    $DEFAULT
}
--------------------------------------------

Actually there's a short-form for all that, but this illustrates it clearly.

The key is a simple grep against a list of patterns.

On the machine this originally ran on, formail and grep were small and fast and used already cached libraries. grep is amazingly fast. Compared to starting perl and loading the script and having spamassassin precompile...

Well yes, running spamd/spamc changes all that, but not as much as you might think.

When you start spamassassin or spamd it starts perl and reads in the basic script *and* *compiles* it if you don't have the precompiled version. If you are calling spamassassin from procmail, or from most of the other configs I know, this gets done on a per message basis.

Starting perl is heavy. That's why we use spamd/spamc, which has it started already. Only the version running as a daemon is not user-specific; it still has to read all the per user settings and compile them as well. And yes, this gets done on a per message basis. It's a lot better than having to start perl on a per message basis! But the custom user settings, that per user whitelist stuff, still get read in and compiled and the modules found etc etc.

So what it comes down to is this: what's the process switching overhead? The spamd/spamc method involves a lot of blocking and system calls and switching of execution processes. How well that is handled varies with CPU and memory. I measured this with the old machine and it was neck and neck.
I know I could make my procmail more efficient, doing the 'formail' once "up-front", for example, rather than in each module. I often prefer clarity over efficiency (one reason I don't use C or C++ any more and prefer scripting languages, even Perl). The reality is that it may be years before I come back to do 'maintenance' and the least bit of obscurity makes life too difficult.

Once, in an upgrade, I had something about spamd fail. I forget the details but think it was to do with perl. But the bulk of procmail still worked: formail, grep. So my mail came through, and whitelisting and blacklisting and all the pre-spam checking, such as filtering out the Asian character sets, kept working. To be honest it was a few days before I noticed the problem, mail was flowing so well.

I like robustness and resilience and 'fail-soft'.

--
/"\
\ /  ASCII Ribbon Campaign
 X   Against HTML Mail
/ \
Anton Aylward wrote:
On 08/23/2014 09:36 AM, Carlos E. R. wrote:
As for using "an external dynamically modified list with SA", here's what I do - my whitelist is just an SA ruleset called whitelist.cf (for instance). When I modify it, I copy it to the servers that run SA, and do an "rcspamd reload". (of course this is all automated).
I'd be interested in learning more details about that ;-)
Its a ridiculously simple rule and comes from a list of recipes I found on the 'Net
[snip]
Actually there's a short-form for all that, but this illustrates clearly
Key is a simple grep against a list of patterns.
On the machine this originally ran on, formail and grep were small and fast and used already cached libraries. grep is amazingly fast. Compared to starting perl and loading script and having spamassassin precompile...
But what is the point Anton? Does it matter if you receive each email 1 or 2 seconds faster? You're optimizing for no reason at all, IMHO. Maybe it's a hobby, fair enough, but that's not a selling point. Given that we already have perl and DNS queries in the mix, performance has already been heavily compromised and is therefore irrelevant, IMHO.
The reality is that it may be years before I come back to do 'maintenance' and the least bit of obscurity makes life too difficult.
That's precisely my argument for keeping it all in SA. Apply the KISS principle.

--
Per Jessen, Zürich (17.2°C)
http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
Carlos E. R. wrote:
On 2014-08-23 10:51, Per Jessen wrote:
Anton Aylward wrote:
I find procmail easier to maintain than SA. SA needs to be compiled;
Uh, compiling is optional. I've never bothered, it's fast enough as it is.
I didn't even know it was possible?
There is a new(ish) function called sa-compile : http://spamassassin.apache.org/full/3.2.x/doc/sa-compile.html Compiling the ruleset will no doubt speed up things, but as you and I have both mentioned, most of the time is spent waiting for DNS replies. I'm planning on playing with it at some point, but I also subscribe to "if it ain't broke, ...."
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
I edit the file ".spamassassin/user_prefs" and add them. How do you do it?
My setup is slightly different, but in essence I also edit ".spamassassin/user_prefs".
But I have only used whitelist_from and blacklist_from, I don't know what the others are for (I can guess a bit, though).
The problem with the plain whitelist is that sender addresses are so easily forged. When you use whitelist_from_{rcvd,spf,dkim} instead, the rule is a lot more accurate.
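A toy Python model of the difference, for illustration only. It is not SpamAssassin's implementation: as far as I understand it, whitelist_from_rcvd compares the reverse DNS name of the relay that handed over the message (taken from trusted Received headers) with a domain given in the rule, and this sketch simply takes that relay host name as an input. All names and addresses are made up.

```python
def plain_whitelist(from_addr, allowed):
    """Trust the From address alone, which any sender can forge."""
    return from_addr.lower() in allowed


def whitelist_with_relay(from_addr, relay_host, allowed):
    """Additionally require the relay to belong to the expected domain.

    allowed maps a sender address to the relay domain expected for it.
    Simplified model: the caller supplies the relay host name.
    """
    domain = allowed.get(from_addr.lower())
    if domain is None:
        return False
    host = relay_host.lower()
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    allowed = {"joe@example.com": "example.com"}
    # A forged From address passes the plain check...
    print(plain_whitelist("joe@example.com", set(allowed)))
    # ...but not the relay check when it arrives from an unrelated host.
    print(whitelist_with_relay("joe@example.com", "bot.attacker.example", allowed))
    print(whitelist_with_relay("joe@example.com", "mail.example.com", allowed))
```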
As for using "an external dynamically modified list with SA", here's what I do - my whitelist is just an SA ruleset called whitelist.cf (for instance). When I modify it, I copy it to the servers that run SA, and do an "rcspamd reload". (of course this is all automated).
I'd be interested in learning more details about that ;-)
One correction - for the per-user rules, spamd isn't reloaded, those rules are loaded when needed. In the above, I was talking about my site-wide whitelist.

There isn't much to it really - I run spamd as an smtp relay within postfix. For our test-system (this is a corporate setup) I have four old desktop boxes with PentiumII 400MHz and 384MB. I maintain the SA config on a central system. From here it is rsync'ed to the four servers. Each server monitors (see "incron") the SA config for changes and does an rcspamd reload when there are changes.

--
Per Jessen, Zürich (16.7°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
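The "reload only when the config changed" step can be sketched in Python as below. Note the swap: incron reacts to inotify events, whereas this sketch detects a change by hashing the *.cf files, which is a different (polling-style) technique chosen to keep the example self-contained. The function names are hypothetical, and `reload` stands in for whatever actually restarts the daemon (an "rcspamd reload" in the setup described above).

```python
import hashlib
from pathlib import Path


def config_digest(directory):
    """Hash the names and contents of every *.cf file in a directory."""
    digest = hashlib.sha256()
    for path in sorted(Path(directory).glob("*.cf")):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def reload_if_changed(directory, last_digest, reload):
    """Call reload() if the config differs from last_digest.

    Returns the current digest so the caller can pass it in next time.
    """
    current = config_digest(directory)
    if current != last_digest:
        reload()
    return current


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as d:
        cf = Path(d) / "whitelist.cf"
        cf.write_text("whitelist_from someone@example.com\n")
        seen = reload_if_changed(d, None, lambda: print("reload"))
        # Nothing changed, so this call does not reload.
        seen = reload_if_changed(d, seen, lambda: print("reload"))
```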
On 08/23/2014 04:51 AM, Per Jessen wrote:
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
Yes, I have a couple of those, but the use-case example has it in the SA config and not in an end-user maintainable text file.

--
Most of what we call management consists of making it difficult for people to get their work done. --Peter F. Drucker
On 2014-08-23 15:43, Anton Aylward wrote:
On 08/23/2014 04:51 AM, Per Jessen wrote:
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
Yes, I have a couple of those, but the use-case example has it in the SA config and not in an end-user maintainable text file.
~/.spamassassin/user_prefs -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Anton Aylward wrote:
On 08/23/2014 04:51 AM, Per Jessen wrote:
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
Yes, I have a couple of those, but the use-case example has it in the SA config and not in an end-user maintainable text file.
FYI, spamd is perfectly capable of using rules from an end-user home directory.

--
Per Jessen, Zürich (17.1°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 08/23/2014 11:06 AM, Per Jessen wrote:
Anton Aylward wrote:
On 08/23/2014 04:51 AM, Per Jessen wrote:
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
Yes, I have a couple of those, but the use-case example has it in the SA config and not in an end-user maintainable text file.
FYI, spamd is perfectly capable of using rules from an end-user home directory.
*sigh*

Please: I'm not saying that isn't so.

What I am saying is that if you run spamd then it starts perl when you start the daemon, reads the application (and compiles it if you haven't precompiled it), reads the system rules (and compiles them if you haven't precompiled them) and sits waiting.

It is a shared resource and is not at this point reading the per user config. As I understand it, when you run spamc what happens is that spamd forks off a child process to deal with the per user, per message processing. That child reads the per user settings of which you speak. It reads them and compiles them.

What has been saved is the heavy overhead of perl startup and compiling the basic system rules.

Whether you consider this 'dynamically reading the whitelist' or not depends on your outlook. I think it involves more than the formail/egrep, but that's my opinion.

As I say, all this forking, exec'ing, loading the whitelist code, compiling same, etc is going to vary with the type (as well as the speed) of the CPU. Advances in how perl deals with all this will change the balance.

I don't believe there is a 'one size fits all'. That's very much a simplistic approach typical of the way Microsoft delivers, never mind what they actually market.

--
/"\
\ /  ASCII Ribbon Campaign
 X   Against HTML Mail
/ \
Anton Aylward wrote:
On 08/23/2014 11:06 AM, Per Jessen wrote:
Anton Aylward wrote:
On 08/23/2014 04:51 AM, Per Jessen wrote:
Same with SA - just add/remove rules such as:
whitelist_from whitelist_from_rcvd whitelist_from_dkim whitelist_from_spf
Yes, I have a couple of those, but the use-case example has it in the SA config and not in an end-user maintainable text file.
FYI, spamd is perfectly capable of using rules from an end-user home directory.
*sigh*
Please: I'm not saying that isn't so.
Your last paragraph above sounds like that's what you thought when you wrote it. You seem to be saying that "the SA config" is not an end-user maintainable text file. I just wanted to say that it can be. Sorry if I've misunderstood.
What I am saying is that if you run spamd then it starts perl when you start the daemon, reads the application (and compiles it if you haven't precompiled it), reads the system rules (and compiles them if you haven't precompiled them) and sits waiting.
It is a shared resource and is not at this point reading the per user config. As I understand it, when you run spamc what happens is that spamd forks off a child process to deal with the per user, per message processing. That child reads the per user settings of which you speak. It reads them and compiles them.
Right.
What has been saved is the heavy overhead of perl startup and compiling the basic system rules.
Yup.
Whether you consider this 'dynamically reading the whitelist' or not depends on your outlook. I think it involves more than the formail/egrep, but that's my opinion.
It quite clearly is 'dynamically reading the whitelist'. The per-user config containing the whitelist directives is read by the spawned spamd child, dynamically. Whether it involves more or less than formail/grep is, well, irrelevant. I mean, we can discuss spamassassin performance if you want, but it means doing the measurements, comparing rulesets etc, and I'm simply not really concerned about it. I'm sure my SA setup can be tuned and pimped, but as long as it runs very well on these ancient Pentium IIs, it's just not a priority.
As I say, all this forking, exec'ing, loading the whitelist code, compiling same, etc is going to vary with the type (as well as the speed) of the CPU.
Sure, a faster CPU will do those bits faster. -- Per Jessen, Zürich (16.8°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland. -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 8/22/2014 10:20 PM, Anton Aylward wrote:
The idea is to reduce the load on spamassassin.
Why?
Do you also occasionally pound nails with your fist or a frying pan to reduce the load on your hammer?
What else were you going to do with those machine cycles? When you offload checking to a boat load of scripts that YOU have to maintain rather than a well trained Spamassassin you end up spending MORE cycles, and more of your time. To what end?
I find SA is just about perfect. I might get one spam a month that sneaks through. I have a "Probably Spam" folder that catches the questionable emails (which gets FIFO purged in 4 days), and I outright discard large quantities of high scoring spam.
(False positives? Don't even go there! You score above my spam discard level and I don't want your email period, and I don't want you contacting anyone in my company. I'm not so anal about email that I worry about false positives.)
-- ---This space for rent---
John Andersen wrote:
On 8/22/2014 10:20 PM, Anton Aylward wrote:
The idea is to reduce the load on spamassassin.
Why?
Do you also occasionally pound nails with your fist or a frying pan to reduce the load on your hammer?
What else were you going to do with those machine cycles? When you offload checking to a boat load of scripts that YOU have to maintain rather than a well trained Spamassassin you end up spending MORE cycles, and more of your time.
John, thanks for putting it so succinctly, my point exactly. -- Per Jessen, Zürich (15.3°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Anton Aylward wrote:
On 08/21/2014 11:17 AM, Linda Walsh wrote:
The 'CLEANED' notation in the log, above, indicates the subject has had at least 1 redundant listname in the subject removed.
Ah!
One reason I have spamassassin in procmail rather than up front in Postfix is that I have many procmail rules BEFORE I apply spamassassin.
* if the ISP has already labelled it as SPAM then put it in the SpamBox right away
* if it's for any one of a number of lists that I subscribe to, then put it in the list folder right away.
Having bracketed list markers on the subject line is nice :-)
* if it's not in English, put it in the ForeignBox right away
* if it meets a pile of garbage conditions then put it in the SpamBox
* if it's from a 'whitelist' of senders don't bother with spamassassin (yes I know spamassassin has a whitelist; this is faster)
All this relieves SpamAssassin of a lot of processing. However you cut it, SpamAssassin is a choke-point. Only using it when I have to relieves a big load.
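The ordering described in the list above (cheap checks first, the expensive scanner only as a last resort) can be sketched like this. It is an illustration in Python rather than an actual procmail recipe; the header names, folder names, `LIST_FOLDERS` and `WHITELIST` values are all assumptions made for the example.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr
from typing import Callable, List, Optional

# Hypothetical configuration; none of these values come from the thread.
LIST_FOLDERS = {"opensuse.opensuse.org": "lists/opensuse"}
WHITELIST = {"friend@example.org"}

Rule = Callable[[Message], Optional[str]]


def isp_flagged(msg: Message) -> Optional[str]:
    # Assumes the ISP marks spam with an "X-Spam-Flag: YES" header.
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    return "SpamBox" if flagged else None


def mailing_list(msg: Message) -> Optional[str]:
    list_id = msg.get("List-Id", "")
    for key, folder in LIST_FOLDERS.items():
        if key in list_id:
            return folder
    return None


def whitelisted(msg: Message) -> Optional[str]:
    _name, address = parseaddr(msg.get("From", ""))
    return "INBOX" if address.lower() in WHITELIST else None


PRE_RULES: List[Rule] = [isp_flagged, mailing_list, whitelisted]


def route(msg: Message, scan: Callable[[Message], str]) -> str:
    """Return the first folder a cheap rule picks; otherwise call the scanner."""
    for rule in PRE_RULES:
        folder = rule(msg)
        if folder is not None:
            return folder
    return scan(msg)


def route_raw(raw: str, scan: Callable[[Message], str]) -> str:
    """Convenience wrapper for a message held as text."""
    return route(message_from_string(raw), scan)
```

Whether this saves anything worthwhile depends on how many messages the early rules catch and how slow the scanner is, which is exactly the "is it worth it?" question raised in the reply.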
I guess we're down to a matter of "is it worth it?". I don't see SA as a choke-point at all - it's sufficiently fast, even on my ancient hardware and including my own ~1400 extra rules. I sincerely doubt if pre-filtering with <whatever> will reduce the load significantly, so I leave all filtering to spamassassin. Also simplifies the overall setup.
I'm curious, how do you determine if something is not in English?
-- Per Jessen, Zürich (15.7°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On 2014-08-23 10:30, Per Jessen wrote:
Anton Aylward wrote:
I guess we're down to a matter of "is it worth it?". I don't see SA as a choke-point at all - it's sufficiently fast, even on my ancient hardware and including my own ~1400 extra rules. I sincerely doubt if pre-filtering with <whatever> will reduce the load significantly, so I leave all filtering to spamassassin. Also simplifies the overall setup.
That's my idea, yes.
I'm curious, how do you determine if something is not in English?
There is an easy-to-activate plugin in SA that tells you the countries of the relays of every email. One of yours:
you: X-Spam-Relay-Country: DE ** US ** ** DE DE CH **
Me: X-Spam-Relay-Country: DE ** US ** ** DE ES ES ES ES ** ** ** ES
Anton: X-Spam-Relay-Country: DE ** US ** ** DE US ** CA
I still have not deciphered the exact meaning; I activated it 2 or 3 days ago.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 08/23/2014 09:27 AM, Carlos E. R. wrote:
On 2014-08-23 10:30, Per Jessen wrote:
Anton Aylward wrote:
I guess we're down to a matter of "is it worth it?". I don't see SA as a choke-point at all - it's sufficiently fast, even on my ancient hardware and including my own ~1400 extra rules. I sincerely doubt if pre-filtering with <whatever> will reduce the load significantly, so I leave all filtering to spamassassin. Also simplifies the overall setup.
That's my idea, yes.
I'm curious, how do you determine if something is not in English?
There is an easy-to-activate plugin in SA that tells you the countries of the relays of every email. One of yours:
you: X-Spam-Relay-Country: DE ** US ** ** DE DE CH **
Me: X-Spam-Relay-Country: DE ** US ** ** DE ES ES ES ES ** ** ** ES
Anton: X-Spam-Relay-Country: DE ** US ** ** DE US ** CA
I still have not deciphered the exact meaning, I activated it 2 or 3 days ago.
Makes no sense. There are people, even people whose first language is not English, who can mail me stuff in English from countries far away. There are people here in North America who mail stuff that is not in English; NA has lots of non-English speakers.
-- ASCII Ribbon Campaign Against HTML Mail
Carlos E. R. wrote:
On 2014-08-23 10:30, Per Jessen wrote:
Anton Aylward wrote:
I guess we're down to a matter of "is it worth it?". I don't see SA as a choke-point at all - it's sufficiently fast, even on my ancient hardware and including my own ~1400 extra rules. I sincerely doubt if pre-filtering with <whatever> will reduce the load significantly, so I leave all filtering to spamassassin. Also simplifies the overall setup.
That's my idea, yes.
I'm curious, how do you determine if something is not in English?
There is an easy-to-activate plugin in SA that tells you the countries of the relays of every email.
Yes, I use something similar, but that doesn't tell me which language something is written in.
One of yours:
you: X-Spam-Relay-Country: DE ** US ** ** DE DE CH **
Me: X-Spam-Relay-Country: DE ** US ** ** DE ES ES ES ES ** ** ** ES
Anton: X-Spam-Relay-Country: DE ** US ** ** DE US ** CA
I still have not deciphered the exact meaning, I activated it 2 or 3 days ago.
It's not very complicated - it's just a list of country codes of the relays in chronological order (most recent first).
-- Per Jessen, Zürich (16.8°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
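Taking that description at face value (whitespace-separated country codes, most recent relay first), the header values quoted above can be read with a few lines of Python. Treating `**` as "no country resolved for this relay" is an assumption, and the function names are made up for the sketch.

```python
from typing import List, Optional

UNKNOWN = "**"  # assumed to mean "no country resolved for this relay"


def parse_relay_countries(header_value: str) -> List[str]:
    """Split an X-Spam-Relay-Country value into codes, most recent relay first."""
    return header_value.split()


def oldest_known_country(header_value: str) -> Optional[str]:
    """Country of the earliest relay that resolved to a country, if any."""
    known = [c for c in parse_relay_countries(header_value) if c != UNKNOWN]
    return known[-1] if known else None


if __name__ == "__main__":
    for who, value in [
        ("Per", "DE ** US ** ** DE DE CH **"),
        ("Carlos", "DE ** US ** ** DE ES ES ES ES ** ** ** ES"),
        ("Anton", "DE ** US ** ** DE US ** CA"),
    ]:
        print(who, oldest_known_country(value))
```

As noted in the thread, this says where a message travelled, not which language it is written in.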
On 08/23/2014 04:30 AM, Per Jessen wrote:
I'm curious, how do you determine if something is not in English?
The module which does that is about 300 lines long and checks for a few other similar things and some odd boundary-value conditions and contradictory things and the characteristics of known spamming applications. For the most part, it checks for the "character-set" line in the header or in the MIME declaration. Sometimes messages with Subject lines that are non-English have language identifiers there as well. All this is from standard procmail recipes out there on the 'Net.
-- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
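The character-set check described above can be approximated with Python's standard `email` package. This is a sketch of the idea, not the 300-line procmail module: `SUSPECT_CHARSETS` is a hypothetical list chosen for the example, and a declared charset is only a rough proxy for language (a UTF-8 message reveals nothing this way).

```python
from email import message_from_string
from email.header import decode_header
from typing import Set

# Hypothetical list of charsets the recipient does not expect to read.
SUSPECT_CHARSETS = {
    "koi8-r", "windows-1251", "gb2312", "big5", "euc-kr", "iso-2022-jp",
}


def declared_charsets(raw_message: str) -> Set[str]:
    """Charsets named in Content-Type parameters and in an encoded Subject."""
    msg = message_from_string(raw_message)
    found = {c.lower() for c in msg.get_charsets() if c}
    for _text, charset in decode_header(msg.get("Subject", "")):
        if charset:
            found.add(charset.lower())
    return found


def looks_foreign(raw_message: str) -> bool:
    """True if any declared charset is on the suspect list."""
    return bool(declared_charsets(raw_message) & SUSPECT_CHARSETS)
```

A filter built on this would file a message in the ForeignBox when `looks_foreign` returns true and pass everything else on to the next rule.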
On 08/23/2014 04:30 AM, Per Jessen wrote:
I sincerely doubt if pre-filtering with <whatever> will reduce the load significantly
When I measured it on the old hardware it did. As I've commented before, the Closet of Anxieties gives me access to a lot of old hardware. It's not quite the "one machine, one task" of Microsoft, but it does mean that many functions can be off-loaded from my workstation.
As a side observation: for some reason those old 500-800MHz machines with old RAM and old 20-30G drives seem rock solid. They don't overheat and even though those drives are more than 10 years old, they have outlived more modern and higher speed 500G, 750G and two 1T drives. The one under my desk doing email was decommissioned as a desktop nearly 10 years ago and has been there under my desk for the last 3 years collecting dust-balls. I upgraded my desktop and knew that my email was still being collected :-)
Perhaps some day I'll replace it with this desktop, dual core, gobs of memory, terabyte raid, but let's face it; this box overheats in the summer if the a/c fails and in the winter since, this being Canada, it's a LOT warmer in the winter than the summer -- indoors.
That old box is more reliable than the desktops that replaced it, but wtf, it couldn't run later versions of Windows. It still runs up to date RH and openSUSE and others. There's a moral here somewhere.
-- ASCII Ribbon Campaign Against HTML Mail
On 2014-08-23 15:41, Anton Aylward wrote:
summer if the a/c fails and in the winter since, this being Canada, it's a LOT warmer in the winter than the summer -- indoors.
I know that! I have never been as hot anywhere as in Canadian shops in winter. Snow outside, and me not having a car I used buses and walked, fully equipped for a Real Winter (remember I'm from the south-eastern coast of Spain, at the Mediterranean: so locally bought thick coat, scarf, boots, gloves, the lot), and I started to sweat soon after entering the shops. Houses were even warmer, but there I was soon invited to shed clothes. I love visiting Ottawa during the Winterlude...
That old box is more reliable than the desktops that replaced it, but wtf, it couldn't run later versions of Windows. It still runs up to date RH and openSUSE and others. There's a moral here somewhere.
Just buy one of those gadgets to measure electricity. Sometimes those old boxes use too much. I use old laptops for similar tasks, though... about 50W. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
Anton Aylward wrote:
As a side observation: for some reason those old 500-800MHz machines with old RAM and old 20-30G drives seem rock solid. They don't overheat and even though those drives are more than 10 years old, they have outlived more modern and higher speed 500G, 750G and two 1T drives.
Yup, that is also my experience. A 9.1Gb SCSI drive will easily outlive a modern 600Gb. A 6.4Gb ATA drive will easily outlive an outdated 40Gb drive. The engineers are getting very very good at building stuff that has a very exact lifetime. -- Per Jessen, Zürich (16.9°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On 08/23/2014 11:04 AM, Per Jessen wrote:
Anton Aylward wrote:
As a side observation: for some reason those old 500-800MHz machines with old RAM and old 20-30G drives seem rock solid. They don't overheat and even though those drives are more than 10 years old, they have outlived more modern and higher speed 500G, 750G and two 1T drives.
Yup, that is also my experience. A 9.1Gb SCSI drive will easily outlive a modern 600Gb. A 6.4Gb ATA drive will easily outlive an outdated 40Gb drive. The engineers are getting very very good at building stuff that has a very exact lifetime.
Perhaps it's because we have those long-lived older, smaller drives around that we get annoyed by the up-front "bathtub" failures of modern drives. 50% of the 1T drives I've installed this year have failed one way or another within the first 4 months; one of them within the first week. Did those old, old drives do this? We don't remember; all we remember is that they lasted longer than the Energizer Bunny.
-- ASCII Ribbon Campaign Against HTML Mail
Anton Aylward wrote:
On 08/23/2014 11:04 AM, Per Jessen wrote:
Anton Aylward wrote:
As a side observation: for some reason those old 500-800MHz machines with old RAM and old 20-30G drives seem rock solid. They don't overheat and even though those drives are more than 10 years old, they have outlived more modern and higher speed 500G, 750G and two 1T drives.
Yup, that is also my experience. A 9.1Gb SCSI drive will easily outlive a modern 600Gb. A 6.4Gb ATA drive will easily outlive an outdated 40Gb drive. The engineers are getting very very good at building stuff that has a very exact lifetime.
Perhaps it's because we have those long-lived older, smaller drives around that we get annoyed by the up-front "bathtub" failures of modern drives. 50% of the 1T drives I've installed this year have failed one way or another within the first 4 months; one of them within the first week.
I haven't been counting, but I've had quite a few RMAs of 1/2/3/4 Tb drives - all consumer level though.
Did those old, old drives do this? We don't remember; all we remember is that they lasted longer than the Energizer Bunny.
I don't remember them doing it, and just the fact that I'm still employing 9.1Gb SCSI drives for swap space is a testament to their longevity. I frequently put a RAID1 of two 9.1Gb drives as swap space in older servers that still have U320 SCSI slots. -- Per Jessen, Zürich (17.2°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On 2014-08-23 19:09, Per Jessen wrote:
Anton Aylward wrote:
Did those old, old drives do this? We don't remember; all we remember is that they lasted longer than the Energizer Bunny.
I don't remember them doing it, and just the fact that I'm still employing 9.1Gb SCSI drives for swap space is a testament to their longevity. I frequently put a RAID1 of two 9.1Gb drives as swap space in older servers that still have U320 SCSI slots.
The first HD I bought, a 30 MB (32?) unit, came with a defect list on a label glued to the back. You had to call a program in the BIOS by starting "debug" from MS-DOS, loading the program counter directly with an address, and telling it to run. A menu appeared. Then you had to low-level format the HD, decide on an interleave factor (something about 12 made my disk work way faster than the recommended value of 2 or 3 - yes, I tested all of them, one by one, from 1 to about 15), and enter the defect list.
The defect list was a known list of bad sectors. Supposedly you marked them as bad, and the high-level DOS format would later skip them. Actually something failed, because DOS did detect those sectors as bad and marked them as such in the FAT, so the BIOS marking did not work as intended.
So disks came with defects. But they seldom developed more.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Wed, Aug 20, 2014 at 01:03:12PM +0200, Per Jessen wrote:
Carlos E. R. wrote:
spamd/spamc called from procmail takes 1..3 seconds per email in my machine, but often it goes up to 5 seconds per email (I see this in the log), and even more. This is due not to lack of cpu power, but for waiting for online tests to respond or timeout.
Yup.
[snip]
This is a big issue when you get a queue of a thousand emails. It can take hours if you do one at a time.
Anyone with a lot of email will really want to run spamassassin in parallel - it takes up very little space/cpu because most of the time is spent waiting for the network (DNS queries).
(apologies if this has been mentioned earlier, I haven't followed the entire thread.)
The last time I used SpamAssassin, aside from pegging my CPU and creating a huge database file, it also took forever to train. Admittedly, this was a long time ago, but I used to recommend that SpamAssassin be put on its own machine back then. Ruben
-- Per Jessen, Zürich (15.1°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
-- So many immigrant groups have swept through our town that Brooklyn, like Atlantis, reaches mythological proportions in the mind of the world - RI Safir 1998 http://www.mrbrklyn.com DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002 http://www.nylxs.com - Leadership Development in Free Software http://www2.mrbrklyn.com/resources - Unpublished Archive http://www.coinhangout.com - coins! http://www.brooklyn-living.com Being so tracked is for FARM ANIMALS and extermination camps, but incompatible with living as a free human being. -RI Safir 2013
On 8/20/2014 4:40 PM, Ruben Safir wrote:
The last time I used SpamAssassin, aside from pegging my CPU and creating a huge database file, it also took forever to train. Admittedly, this was a long time ago, but I used to recommend that SpamAssassin be put on its own machine back then.
Ruben
Nah. Not unless you have a monstrous mail queue for a large organization. Using Postfix + Amavis, spamassassin is run by a binary process rather than sending everything through spamassassin itself. And it gets virus scanned as well. http://www.ijs.si/software/amavisd/
-- ---This space for rent---
Ruben Safir wrote:
On Wed, Aug 20, 2014 at 01:03:12PM +0200, Per Jessen wrote:
Carlos E. R. wrote:
spamd/spamc called from procmail takes 1..3 seconds per email in my machine, but often it goes up to 5 seconds per email (I see this in the log), and even more. This is due not to lack of cpu power, but for waiting for online tests to respond or timeout.
Yup.
[snip]
This is a big issue when you get a queue of a thousand emails. It can take hours if you do one at a time.
Anyone with a lot of email will really want to run spamassassin in parallel - it takes up very little space/cpu because most of the time is spent waiting for the network (DNS queries).
(apologies if this has been mentioned earlier, I haven't followed the entire thread.)
The last time I used SpamAssassin, aside from pegging my CPU and creating a huge database file, it also took forever to train.
If you're using Bayes, which is optional.
Admittedly, this was a long time ago, but I used to recommend that SpamAssassin be put on its own machine back then.
No real need - certainly not today with multiple cores and gigabytes of memory. I have a test system running spamassassin on a (set of) Pentium II 400MHz with 384Mb RAM. -- Per Jessen, Zürich (15.1°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
Per Jessen wrote:
Ruben Safir wrote:
On Wed, Aug 20, 2014 at 01:03:12PM +0200, Per Jessen wrote:
Carlos E. R. wrote:
spamd/spamc called from procmail takes 1..3 seconds per email in my machine, but often it goes up to 5 seconds per email (I see this in the log), and even more. This is due not to lack of cpu power, but for waiting for online tests to respond or timeout.
Yup.
[snip]
This is a big issue when you get a queue of a thousand emails. It can take hours if you do one at a time.
Yes -- vs. 5-10 minutes w/load average in the 20's-30's.
Anyone with a lot of email will really want to run spamassassin in parallel - it takes up very little space/cpu because most of the time is spent waiting for the network (DNS queries).
This is one reason for having fetchmail talk to an MTA -- sendmail was designed with multiple queuing options -- for back when various types of mail were sometimes handled in batches. That's one reason why sendmail is so complex -- that and handling different, non-internet addresses and *routing* that are little used today, but w/support for them still in sendmail.
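The claim that parallel scanning costs little CPU because the time goes to waiting on the network can be illustrated with a small simulation. Nothing here talks to spamd: `fake_scan` simply sleeps to stand in for a DNS wait, and the delays and worker count are arbitrary values chosen for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_scan(delay: float) -> float:
    """Stand-in for one scan whose time is dominated by network waits."""
    time.sleep(delay)
    return delay


def run_serial(delays: List[float]) -> float:
    """Scan one message at a time; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    for d in delays:
        fake_scan(d)
    return time.perf_counter() - start


def run_parallel(delays: List[float], workers: int) -> float:
    """Scan with a pool of workers; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_scan, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.05] * 8
    print("serial   %.2fs" % run_serial(delays))
    print("parallel %.2fs" % run_parallel(delays, workers=8))
```

With waits like these the parallel run finishes in roughly the time of the slowest single scan, which is the effect an MTA queue with several concurrent deliveries gives you.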
The last time I used SpamAssassin, aside from pegging my CPU and creating a huge database file, it also took forever to train.
I keep the past few years of spam primarily for that purpose -- Can train it on a large specific workload.
If you're using Bayes, which is optional.
I used to ONLY run Bayes... it's the best part. It used to have a > 90% accuracy rate (dunno about current rates) as combine them w/the network tests.
On 08/20/2014 07:40 PM, Ruben Safir wrote:
The last time I used SpamAssassin, aside from pegging my CPU and creating a huge database file, it also took forever to train. Admittedly, this was a long time ago, but I used to recommend that SpamAssassin be put on its own machine back then.
I'm sorry if "it works for me" and "yes it used to be but we changed all that" and "that was then, this is now" all sound hokey, but the reality is that SpamAssassin of the now is PDQ.
Earlier this year I set up a new 13.1 on a new 1T drive and the arrangement Carlos says is inefficient, the 'one at a time', fully interlocked one. That is:
fetchmail -> procmail -> spamassassin -> procmail -> INBOX
There were about 4 rules before procmail handed off to spamassassin and about 20 after. Originally I had all logging for everything turned on so I could make sure it was all operating correctly. I had procmail doing 'full' logging, which is about three lines per message, and I had each rule issue a message saying that it had been invoked and whether or not it did its job. The output of that procmail trace went to a log file, and I ran 'tail -f' on that file to the console.
What went by on the console was too fast for me to read in detail, somewhere around 20-30 lines per second. I did see that the invocation of spamassassin was there; it occupied a constant position in the vertical scrolling.
Now this may not be fast by corporate standards and I agree with Carlos that the use of a MTA like postfix and doing queueing and letting spamassassin run in parallel (on a multi-core machine) is a good idea. But my point is that for a single user who handles only about 200-500 messages a day, this is no slouch.
-- ASCII Ribbon Campaign Against HTML Mail
On 2014-08-21 13:54, Anton Aylward wrote:
Now this may not be fast by corporate standards and I agree with Carlos that the use of a MTA like postfix and doing queueing and letting spamassassin run in parallel (on a multi-core machine) is a good idea.
But my point is that for a single user who handles only about 200-500 messages a day, this is no slouch.
There are periods when every email I get takes about 12 seconds to process. That's well over one hour... for a single user.
It is not normal, but it happens to me, now and then (it doesn't if you disable online tests). That's why I have to make sure that parallelization works.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
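The arithmetic behind "well over one hour" is easy to check. The sketch below assumes 500 messages, the upper end of the 200-500 per day quoted above (the thread does not give Carlos's own daily volume), and a hypothetical pool of 10 parallel scans; it also assumes scans are wait-bound so that parallel runs do not slow each other down.

```python
import math


def serial_seconds(messages: int, seconds_each: float) -> float:
    """Total time when messages are scanned strictly one after another."""
    return messages * seconds_each


def parallel_seconds(messages: int, seconds_each: float, workers: int) -> float:
    """Idealised total time with `workers` wait-bound scans running at once."""
    return math.ceil(messages / workers) * seconds_each


if __name__ == "__main__":
    n, t = 500, 12  # assumed volume; 12 s per message is from the thread
    print("serial:   %.0f min" % (serial_seconds(n, t) / 60))        # 100 min
    print("parallel: %.0f min" % (parallel_seconds(n, t, 10) / 60))  # 10 min
```

Even at 300 messages the serial figure is already a full hour (300 x 12 s = 3600 s), which is why the slow periods matter for a single user.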
On 08/21/2014 08:39 AM, Carlos E. R. wrote:
On 2014-08-21 13:54, Anton Aylward wrote:
Now this may not be fast by corporate standards and I agree with Carlos that the use of a MTA like postfix and doing queueing and letting spamassassin run in parallel (on a multi-core machine) is a good idea.
But my point is that for a single user who handles only about 200-500 messages a day, this is no slouch.
There are periods when every email I get takes about 12 seconds to process. That's well over one hour... for a single user.
It is not normal, but it happens to me, now and then (it doesn't if you disable online tests). That's why I have to make sure that parallelization works.
I think your point earlier in this thread makes sense and hope to have time to convert before the year is out. -- A lot of managers talk about 'thinking out of the box,' but they don't understand the communication process by which that happens. You do not think out of the box by commanding the box! You think out of the box precisely by bringing ideas together that don't allow dominant ideas to continue to dominate. -- Stan Deetz
participants (7)
* Anton Aylward
* Carlos E. R.
* Carlos E. R.
* John Andersen
* Linda Walsh
* Per Jessen
* Ruben Safir