Spam Filter ISP Support Forum


Bayesian filter is not working properly

Nadir
Guest Group
    Posted: 16 July 2004 at 7:55pm

Hello,

I've just downloaded the trial version of SPAMFilter ISP and installed it successfully. I've also configured it to work with an SMTP server listening on another IP on the same machine. Everything seems to be working fine, as messages are being accepted by SPAMFilter and forwarded to the SMTP server.

Since the Bayesian filter comes empty, I was wondering how SPAMFilter ISP knows whether an incoming message is spam or ham. Furthermore, I viewed the log file and saw that it always writes "passes Bayesian filter - 0% spam  (0ms)" for each message that it forwards.

The corpus folder is also empty, and I don't know how to teach the Bayesian filter of SPAMFilter ISP.

Any ideas how to proceed?

Thanks. Nadir.

 

LogSat
Admin Group
Joined: 25 January 2005
Location: United States
Posted: 17 July 2004 at 10:29am
Nadir,

The following is an excerpt from the readme.html file in the SpamFilter directory, which explains how the Bayesian filter trains itself. If the corpus folder is empty, please check that the box labeled "learn new incoming emails" under the Settings - Bayesian Filter tab is checked.

Roberto F.
LogSat Software

Bayesian Statistical Filtering
The new v2 release of SpamFilter ISP features statistical DNA fingerprinting of incoming emails. The statistical analysis is performed using Bayesian rules. Tokens within incoming emails are scanned and categorized in a corpus file. The content of all new incoming email is fingerprinted and checked against the historical data. If there is a high statistical probability that the email is spam, it is rejected.
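To make the idea of token fingerprinting concrete, here is a minimal sketch of a token-based naive Bayes scorer. It is not SpamFilter ISP's actual engine, whose internals are not described in the readme; the class name `TinyBayes`, the tokenizer, and the Laplace smoothing are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a message into lowercase word-like tokens (illustrative tokenizer)."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class TinyBayes:
    """Hypothetical in-memory corpus; not the product's corpus file format."""

    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of spam messages containing it
        self.ham_tokens = Counter()   # token -> number of ham messages containing it
        self.spam_count = 0
        self.ham_count = 0

    def learn(self, text, is_spam):
        tokens = set(tokenize(text))  # count each token once per message
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_count += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_count += 1

    def spam_probability(self, text):
        # Assumption of this sketch: with no examples of one class there is
        # nothing to compare against, so report 0.0.
        if self.spam_count == 0 or self.ham_count == 0:
            return 0.0
        total = self.spam_count + self.ham_count
        log_spam = math.log(self.spam_count / total)
        log_ham = math.log(self.ham_count / total)
        for tok in set(tokenize(text)):
            # Laplace smoothing keeps unseen tokens from zeroing the product.
            log_spam += math.log((self.spam_tokens[tok] + 1) / (self.spam_count + 2))
            log_ham += math.log((self.ham_tokens[tok] + 1) / (self.ham_count + 2))
        diff = log_ham - log_spam
        if diff > 700:  # avoid overflow in exp(); the probability is effectively 0
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))
```

After calling `learn()` on a handful of labeled messages, `spam_probability()` returns a value between 0 and 1 that rises as a message shares more tokens with the spam examples.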

The statistical engine kicks in after 5,000 non-spam and 5,000 spam emails have been received (values customizable by editing the SpamFilter.ini file). This is done to build a valid statistical base to use before emails are rejected. During this period of time, it is critical to avoid false positives. If a good email is quarantined, forcing its redelivery either through the web interface or the SpamFilter GUI will "teach" SpamFilter that the fingerprint in that email is a "good" one, and the statistical DNA database will adapt itself to it. It is very important initially to check the quarantine often to force delivery of legitimate email that has been blocked by the "regular" filtering rules.
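The warm-up period can be pictured as a simple gate on two counters. The sketch below assumes the engine becomes active once both counts reach their minimums; the `TrainingState` class and its field names are hypothetical, and the actual key names in SpamFilter.ini are not given in the excerpt, so none are shown here.

```python
from dataclasses import dataclass


@dataclass
class TrainingState:
    """Hypothetical counters for the warm-up period described above."""
    ham_seen: int = 0
    spam_seen: int = 0
    min_ham: int = 5000   # default from the readme excerpt
    min_spam: int = 5000  # default from the readme excerpt

    def record(self, is_spam):
        # Count one more learned message of the given class.
        if is_spam:
            self.spam_seen += 1
        else:
            self.ham_seen += 1

    def engine_active(self):
        # Assumption: "after 5,000" is read as "at least 5,000" of each class.
        return self.ham_seen >= self.min_ham and self.spam_seen >= self.min_spam
```

With a freshly installed, empty corpus, `TrainingState().engine_active()` is `False`, so in this sketch no message would be rejected on statistical grounds until both counters fill up.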

A slider is used to control the accuracy of the statistical filter. Incoming emails are assigned a probability of being Spam, ranging from 0% (most likely a valid email) to 100% (most likely Spam). Any emails that have a probability of being spam above the value you set will be rejected. Typical threshold values are in the 99.9% range.
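As a rough illustration of the slider, the decision reduces to comparing a message's spam probability with the configured threshold. The function name `should_reject` and the 0-to-1 probability scale are assumptions of this sketch, not part of the product.

```python
def should_reject(spam_probability, threshold_percent=99.9):
    """Return True when the probability (0.0 to 1.0) exceeds the threshold (in percent)."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0.0 and 1.0")
    # "Above the value you set" is read here as a strict comparison.
    return spam_probability * 100.0 > threshold_percent
```

For example, a message scored at 0.9995 would be rejected at the default threshold, while one scored at 0.95 would pass. A lower threshold catches more spam at the cost of more false positives.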

Edited by LogSat - 24 January 2009 at 10:12am
Roberto Franceschetti

LogSat Software

Spam Filter ISP