Bayesian Filter still operational?

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
Printed Date: 16 February 2019 at 8:07am

Topic: Bayesian Filter still operational?
Posted By: yapadu
Subject: Bayesian Filter still operational?
Date Posted: 24 June 2017 at 2:42am
I know Bayesian filtering was popular with LogSat several years ago, but we ended up disabling it (can't really remember why).

Today we wanted to turn it back on and see how it performed.

We have it set to activate at 5000 messages ham/spam.

Currently the corpus file says we have much more than that processed:


We have yet to see it catch anything. Our logs show entries like this:

06/24/17 06:13:43:788 -- (2664055264)     Token    Good    Spam    Prob is Spam
06/24/17 06:13:43:788 -- (2664055264) EMail from to passes Bayesian filter - 0% spam  (172ms)

Should the "Token / Good / Spam / Prob is Spam" header line be followed by any entries?

All messages in the logs say 0% spam. I would have assumed messages have at least some 'spam' component, so they should not all be 0%.

I am a user of SF, not an employee. Use any advice offered at your own risk.

Posted By: LogSat
Date Posted: 24 June 2017 at 10:07pm
The Bayesian filter is very selective: most of the emails it classifies end up with probabilities very close to one of the two extremes, either 0% or 100%. That is normal and is simply the nature of Bayesian filtering.
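As a rough illustration of why scores cluster at the extremes, here is a minimal sketch of the commonly used naive Bayes combination of per-token probabilities. It is not SpamFilter's actual code; the function name `combine`, the clamping range, and the token values are assumptions made up for the example.

```python
from math import prod

def combine(token_probs):
    """Combine per-token spam probabilities into one message score.

    Assumes tokens are independent (the "naive" Bayes assumption).
    Probabilities are clamped to [0.01, 0.99] so a single token can
    never force the score to exactly 0 or 1 or cause a division by zero.
    """
    clamped = [min(0.99, max(0.01, p)) for p in token_probs]
    spam = prod(clamped)
    ham = prod(1.0 - p for p in clamped)
    return spam / (spam + ham)

# Fifteen tokens that each lean only mildly one way (hypothetical values).
mildly_hammy = [0.3] * 15
mildly_spammy = [0.7] * 15

print(f"mostly ham tokens:  {combine(mildly_hammy):.4%}")
print(f"mostly spam tokens: {combine(mildly_spammy):.4%}")
```

Even though no single token is far from 50%, multiplying fifteen of them together drives the combined score to well under 1% or well over 99%, which is why a log full of "0% spam" entries is not unusual in itself.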
SpamFilter applies several different filters to incoming emails in a specific order to optimize performance. The Bayesian filter is one of the last to run, so it catches only a very small percentage of spam compared to the other filters. At our own ISP, for example, the Bayesian filter (when activated) catches only about 0.1% of spam, versus 99.9% caught by the other filters (we disabled it on our own live server a few years ago). In addition, Bayesian filters were "the thing" 9-10 years ago, and for a while this was the "star" filter in SpamFilter. Spammers have since learned how to bypass them easily, making the Bayesian filter even less effective. We often suggest disabling this filter for companies that receive large volumes of email (~250,000 or more messages per day), as it uses a lot of resources for only minor gains.

Roberto Franceschetti - LogSat Software - Spam Filter ISP