Bayesian filter is not working properly
Nadir (Guest Group)
Posted: 16 July 2004 at 7:55pm
Hello,

I've just downloaded the trial version of SPAMFilter ISP and installed it successfully. I've also configured it to work with an SMTP server listening on another IP on the same machine. Everything seems to be working fine: messages are accepted by SPAMFilter and forwarded to the SMTP server.

Since the Bayesian filter starts out empty, I was wondering how SPAMFilter ISP knows whether an incoming message is spam or ham. I also looked at the log file and saw that it always writes "passes Bayesian filter - 0% spam (0ms)" for each message it forwards. The corpus folder is empty as well, and I don't know how to train the Bayesian filter in SPAMFilter ISP.

Any ideas on how to proceed?

Thanks.
Nadir.
LogSat (Admin Group) | Joined: 25 January 2005 | Location: United States | Status: Offline | Points: 4105
Nadir,
The following is an excerpt from the readme.html file in the SpamFilter directory, which explains how the Bayesian filter trains itself. If the corpus folder is empty, please check that the box labeled "learn new incoming emails" under the Settings - Bayesian Filter tab is checked.

Roberto F.
LogSat Software

Bayesian Statistical Filtering

The new v2 release of SpamFilter ISP features statistical DNA fingerprinting of incoming emails. The statistical analysis is performed using Bayesian rules. Tokens within incoming emails are scanned and categorized in a corpus file. The content of every new incoming email is fingerprinted and checked against the historical data. If there is a high statistical probability that the email is spam, it is rejected.

The statistical engine kicks in after 5,000 non-spam and 5,000 spam emails have been received (both values can be customized by editing the SpamFilter.ini file). This builds a valid statistical base before any emails are rejected. During this period it is critical to avoid false positives. If a good email is quarantined, forcing its redelivery through either the web interface or the SpamFilter GUI will "teach" SpamFilter that the fingerprint in that email is a "good" one, and the statistical DNA database will adapt to it. It is very important initially to check the quarantine often and force delivery of legitimate email that has been blocked by the "regular" filtering rules.

A slider controls the accuracy of the statistical filter. Incoming emails are assigned a probability of being spam, ranging from 0% (most likely a valid email) to 100% (most likely spam). Any email whose probability of being spam is above the value you set will be rejected. Typical threshold values are in the 99.9% range.

Edited by LogSat - 24 January 2009 at 10:12am
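To make the mechanism described in the readme excerpt more concrete, here is a minimal Python sketch of a token-based Bayesian filter with a warm-up period and a rejection threshold. It is not SpamFilter ISP's actual implementation. The class `TinyBayesFilter`, its method names, the tokenizer, and the smoothing scheme are all hypothetical choices made for illustration; only the defaults (5,000 ham, 5,000 spam, 99.9% threshold) come from the excerpt. Returning 0.0 while the engine is still warming up is likewise an assumption, chosen because it would be consistent with the "0% spam" log lines in the question.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Illustrative naive Bayes spam scorer (hypothetical, not the product's code)."""

    def __init__(self, min_ham=5000, min_spam=5000, threshold=0.999):
        self.min_ham = min_ham        # ham messages needed before scoring starts
        self.min_spam = min_spam      # spam messages needed before scoring starts
        self.threshold = threshold    # reject when probability exceeds this
        self.ham_tokens = Counter()   # token -> number of ham messages containing it
        self.spam_tokens = Counter()  # token -> number of spam messages containing it
        self.ham_count = 0
        self.spam_count = 0

    @staticmethod
    def tokenize(text):
        # Each distinct token counts once per message.
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def learn(self, text, is_spam):
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_count += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_count += 1

    def is_active(self):
        # The engine only scores once both classes have enough examples.
        return self.ham_count >= self.min_ham and self.spam_count >= self.min_spam

    def spam_probability(self, text):
        if not self.is_active():
            return 0.0  # assumption: report 0% during the warm-up period
        total = self.ham_count + self.spam_count
        log_spam = math.log(self.spam_count / total)
        log_ham = math.log(self.ham_count / total)
        for tok in self.tokenize(text):
            # Add-one smoothing so unseen tokens never give a zero probability.
            log_spam += math.log((self.spam_tokens[tok] + 1) / (self.spam_count + 2))
            log_ham += math.log((self.ham_tokens[tok] + 1) / (self.ham_count + 2))
        diff = log_ham - log_spam
        if diff > 700:   # guard against overflow in exp()
            return 0.0
        if diff < -700:
            return 1.0
        return 1.0 / (1.0 + math.exp(diff))

    def should_reject(self, text):
        return self.spam_probability(text) > self.threshold
```

In this sketch, forcing redelivery of a quarantined good message would correspond to calling `learn(text, is_spam=False)`, which shifts the token counts toward ham. With the default settings the scorer stays inactive until 5,000 messages of each kind have been learned, so a freshly installed filter with an empty corpus would report 0% for every message.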