Re: [SLE] Spam

3 Feb 2006


      Carlos E. R. wrote:
...
(I forgot to say that many of those false positives are from
newsletters).
Same here.  I'm in the process of building bayes-style filters that are
meant for recognising just newsletters.  That way I'll be able to add
perhaps a couple of points, stopping a newsletter from ending up as a
false positive.
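
A rough sketch of that idea, in Python rather than anything I actually run: a tiny Bayes-style classifier trained on "newsletter" versus "other" mail, whose verdict then shifts the spam score by a couple of points. The class name, the 2.0-point bonus and the 0.9 cutoff are all assumptions made up for illustration, and the token model is deliberately simplistic.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")

def tokens(text):
    """Lower-case the text and return its set of distinct word tokens."""
    return set(TOKEN_RE.findall(text.lower()))

class NewsletterBayes:
    """Hypothetical two-class filter: 'newsletter' versus 'other'."""

    def __init__(self):
        self.counts = {"newsletter": Counter(), "other": Counter()}
        self.docs = {"newsletter": 0, "other": 0}

    def train(self, text, label):
        # Count each token once per message (document frequency).
        self.docs[label] += 1
        self.counts[label].update(tokens(text))

    def probability(self, text):
        """Estimated probability that the text is a newsletter."""
        n_news, n_other = self.docs["newsletter"], self.docs["other"]
        log_odds = math.log((n_news + 1) / (n_other + 1))
        for tok in tokens(text):
            # Laplace smoothing keeps unseen tokens from zeroing the result.
            p_news = (self.counts["newsletter"][tok] + 1) / (n_news + 2)
            p_other = (self.counts["other"][tok] + 1) / (n_other + 2)
            log_odds += math.log(p_news / p_other)
        # Clamp so math.exp cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))

def adjusted_score(spam_score, newsletter_prob, bonus=2.0, cutoff=0.9):
    """Shift the spam score in the mail's favour if it looks like a newsletter.

    bonus and cutoff are assumed values, not tuned ones.
    """
    if newsletter_prob >= cutoff:
        return spam_score - bonus
    return spam_score
```

The point of keeping this separate from the spam/ham Bayes database is that a newsletter can look spammy and still be wanted, so the second filter only nudges the score rather than overriding it.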
...
Looks good... it must depend on the kind of spam you receive, I
suppose. Also, I suppose you must be using the network tests: it's
true that they flag a lot of spam, but sometimes they are unfair.
Yep, I'm using network tests, my own blacklists, honeypots etc.
...
BAYES_95           0.0001 0.0001 3.0   3.0
DNS_FROM_RFC_POST  0      1.440  0     1.708  Envelope sender in postmaster.rfc-ignorant.org
DNS_FROM_RFC_WHOIS 0      0.879  0     1.447  Envelope sender in whois.rfc-ignorant.org
I don't use rfc-ignorant other than as an indicator of a possibly dodgy
server.  Given the number of poorly configured mail servers, using
rfc-ignorant is a very aggressive step, IMHO.
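
To put numbers on it, here is a small Python sketch using the scores quoted above. It assumes the four columns are SpamAssassin's usual score sets (0: no Bayes, no network tests; 1: network only; 2: Bayes only; 3: both). The 0.1 "indicator only" overrides and the 5.0 threshold are assumptions for illustration, not values from my setup.

```python
# Scores as quoted above, one tuple per rule: (set 0, set 1, set 2, set 3).
SCORES = {
    "BAYES_95": (0.0001, 0.0001, 3.0, 3.0),
    "DNS_FROM_RFC_POST": (0, 1.440, 0, 1.708),
    "DNS_FROM_RFC_WHOIS": (0, 0.879, 0, 1.447),
}

# Hypothetical overrides: keep the rules visible but nearly weightless.
INDICATOR_ONLY = {"DNS_FROM_RFC_POST": 0.1, "DNS_FROM_RFC_WHOIS": 0.1}

def score_set(use_bayes, use_network):
    """Index of the score column that applies to this configuration."""
    return (2 if use_bayes else 0) + (1 if use_network else 0)

def total(hits, use_bayes=True, use_network=True, overrides=None):
    """Sum the scores of the rules that hit, honouring any overrides."""
    column = score_set(use_bayes, use_network)
    overrides = overrides or {}
    return sum(overrides.get(rule, SCORES[rule][column]) for rule in hits)

hits = ["BAYES_95", "DNS_FROM_RFC_POST", "DNS_FROM_RFC_WHOIS"]
threshold = 5.0  # assumed threshold

stock = total(hits)                               # about 6.155
gentle = total(hits, overrides=INDICATOR_ONLY)    # about 3.2
print(f"stock: {stock:.3f} spam={stock >= threshold}")
print(f"indicator only: {gentle:.3f} spam={gentle >= threshold}")
```

With the quoted scores, the two rfc-ignorant hits alone contribute more than 3 points, which is enough to tip a mail that Bayes already dislikes over the threshold.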
...
Even lower. SuSE must be using heavily altered values, and a badly
trained Bayesian database: mine scores that same email at 5%, not 95%.
Bayes is a double-edged sword - you've got to be very particular about
what you record as spam/ham.  Especially if you're not just training
your bayes filters for purely personal use.  And you've got to be
careful with cleaning up the database too.
...
However... it proves my point that the postmaster (ISP) being ignorant
of the RFC doesn't prove that their users send spam,
Totally agree.  


/Per Jessen, Zürich

