Bayesian filter |
Zoro Newbie Joined: 27 October 2005 Status: Offline Points: 7 |
Posted: 22 December 2005 at 1:01pm |
I've been running spamfilter for 2 months (spam 16453, good 289412), but
I've never seen any mail blocked by the Bayesian filter. Thinking something
was wrong in the corpus db, I deleted it to reinitialize.

With the new corpus db I tried this test:
- I set the keyword filter to reject 'viagra'
- I sent a mail to spamfilter containing 'viagra' in the subject
- I dumped the corpus and found the token *Subject*viagra,0,1,0,400000005960464,22/12/2005
- I sent another mail like the first one
- I dumped the corpus again and found *Subject*viagra,0,1,0,400000005960464,22/12/2005

As you can see, the data are the same, while I expected to get spam counter=2 and an increased probability value. Note that most tokens have the same probability value (0,400000005960464), as they also did in the former, much larger corpus.

As a last test, I copied the quarantined mail and pasted it into the Bayes prob box: the result is 0% spam.

Is something wrong in my configuration? Thank you for any suggestions.
LogSat Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4104 |
Zoro,
All new tokens from incoming emails are cached for several minutes, and the main corpus database is updated with the new tokens at regular intervals (I believe around every 30 minutes). Because of this, you will get misleading results if you send emails and then check the corpus right away, as the content of those emails will not yet have been added to the main corpus database.

From the spam/good numbers you posted, however, we see that your proportion of spam is very low: only around 5% of your email is spam. Most installations see between 60% and 80% spam instead. The Bayesian filter usually catches only a very small percentage of spam, as it is the last filter to be applied. Since your spam percentage is so low, the share of emails blocked by the Bayesian filter will likely be a small fraction of that, so it is possible you will not see any blocks from that filter.
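
To make the two points above concrete, here is a rough Python sketch. It is only a toy model, not SpamFilter's actual code: the names `spam_share` and `CachedCorpus`, the exact 30-minute interval, and the "flush when due" check are all assumptions made for illustration. It shows how the roughly 5% figure follows from the posted counts, and why a corpus dump taken right after a test mail can still show the old spam counter.

```python
from dataclasses import dataclass, field


def spam_share(spam: int, good: int) -> float:
    """Fraction of all processed mail that was classified as spam."""
    return spam / (spam + good)


@dataclass
class CachedCorpus:
    """Toy token corpus with a write-behind cache (hypothetical model).

    New token counts are held in `pending` and only merged into `main`
    once `flush_interval` seconds have passed since the last flush.
    """
    flush_interval: float = 30 * 60  # assumed; the reply only says "around 30 minutes"
    main: dict = field(default_factory=dict)     # what a corpus dump would show
    pending: dict = field(default_factory=dict)  # counts cached since the last flush
    last_flush: float = 0.0

    def record_spam(self, token: str, now: float) -> None:
        """Count one more spam occurrence of `token`, in the cache only."""
        self.pending[token] = self.pending.get(token, 0) + 1
        self._flush_if_due(now)

    def dump(self, now: float) -> dict:
        """Return a copy of the main corpus as it looks at time `now`."""
        self._flush_if_due(now)
        return dict(self.main)

    def _flush_if_due(self, now: float) -> None:
        """Merge cached counts into the main corpus if the interval has elapsed."""
        if now - self.last_flush >= self.flush_interval:
            for token, count in self.pending.items():
                self.main[token] = self.main.get(token, 0) + count
            self.pending.clear()
            self.last_flush = now


if __name__ == "__main__":
    print(f"spam share: {spam_share(16453, 289412):.1%}")

    corpus = CachedCorpus(main={"*Subject*viagra": 1})
    corpus.record_spam("*Subject*viagra", now=60)   # second test mail, one minute in
    print(corpus.dump(now=120))    # dump taken right away: counter unchanged
    print(corpus.dump(now=1800))   # dump after the interval: counter updated
```

In this model, the second test mail only shows up in the dump once the flush interval has passed, which matches the behaviour described above: checking the corpus immediately after sending a test mail tells you nothing about whether the token was counted.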