bayesian filter kicked in after 5000 mails, and is now stopping much n |
Author | |
igor.L Guest Group |
Posted: 11 June 2004 at 9:18am |
We put LogSat into production 2 days ago. Today 5000 emails went through, and Bayesian filtering kicked in. Did something go wrong with the learning process? What can I do to stop this? Thanks, igor L. |
|
|
LogSat Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4105 |
Igor, As more and more emails are received, SpamFilter adapts to the kind of email traffic it sees and recognizes more and more spam as it comes in. But during the initial training period, roughly the first 24 hours / 10000 emails, it is important that the number of false positives be kept to a minimum so the learning process is accurate. When an email is taken out of the quarantine, SpamFilter will know and will learn that similar emails are probably going to be legitimate. We'd recommend you stop SpamFilter, delete the SpamFilter\corpus directory, then restart SpamFilter. This will clear your existing statistical database, so you can start from scratch. We'd also suggest doing this on the morning of a regular workday (not a weekend), as most legitimate emails are usually sent during the daytime. This will allow more "clean" emails to go through the statistical filter during the initial training period, allowing a better learning process. Roberto F. |
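To illustrate why the mix of mail seen during training matters, here is a minimal sketch of a token-based Bayesian scorer. It is not SpamFilter's actual implementation: the `TinyBayes` class, its method names, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


class TinyBayes:
    """Hypothetical toy Bayesian filter; NOT SpamFilter's real algorithm."""

    def __init__(self):
        self.spam_tokens = Counter()  # token -> number of spam mails containing it
        self.ham_tokens = Counter()   # token -> number of clean mails containing it
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_msgs += 1

    def token_prob(self, token):
        # Add-one smoothing keeps the result strictly between 0 and 1.
        p_spam = (self.spam_tokens[token] + 1) / (self.spam_msgs + 2)
        p_ham = (self.ham_tokens[token] + 1) / (self.ham_msgs + 2)
        return p_spam / (p_spam + p_ham)

    def spam_probability(self, text):
        # Combine per-token probabilities in log-odds space.
        log_odds = 0.0
        for token in set(text.lower().split()):
            if token not in self.spam_tokens and token not in self.ham_tokens:
                continue  # tokens never seen in training carry no evidence
            p = self.token_prob(token)
            log_odds += math.log(p) - math.log(1.0 - p)
        return 1.0 / (1.0 + math.exp(-log_odds))


f = TinyBayes()
f.train("cheap pills online", is_spam=True)
f.train("meeting agenda attached", is_spam=False)
f.train("lunch tomorrow", is_spam=False)
f.train("project report attached", is_spam=False)

print(f.spam_probability("cheap pills"))       # leans towards spam
print(f.spam_probability("meeting attached"))  # leans towards clean
```

In a model like this, every score depends on what ended up in each of the two counters during training, which is why a training window full of misclassified mail skews all later results.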
|
|
igor L. Guest Group |
Hello, I did as suggested on Monday morning: deleted the corpus folder and restarted the service. I then closely monitored for false positives; there were maybe 4. Today we passed 5000 mails again, and Bayesian filtering started. Now nothing is being stopped by the Bayesian filter? Also, when I paste an obvious spam mail into the Bayesian probability test, it says 0% spam. This is the opposite of the previous behaviour. As suggested, I kept false positives to a minimum. Did something go wrong with training the filter again? Thanks, igor |
|
|
LogSat Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4105 |
Igor, Can you please zip us your corpus directory so we can take a look at your corpus database? Can you also please run the following query on the SpamFilter database and let us know the results? SELECT tblQuarantine.RejectID, tblRejectCodes.RejectDesc, COUNT(tblQuarantine.RejectID) AS Total Roberto F. |
|
|
nippe Newbie Joined: 03 February 2005 Status: Offline Points: 12 |
You wrote: "When an email is taken out of the quarantine, SpamFilter will know and will learn that similar emails are probably going to be legitimate." Q: Deliver and delete: are they the same thing (in this case)? |
|
|
LogSat Admin Group Joined: 25 January 2005 Location: United States Status: Offline Points: 4105 |
Not sure I understood the question. Email in the quarantine has already been "learned" by the filter as being bad. If it's a false positive, the only way for the filter to "unlearn" it as bad is for a user (or an admin) to force that email to be delivered to the end user. Only when that happens does the filter reverse the score; it then goes on to assign different probabilities to the tokens so that they are more likely to be considered "good" in the future. Roberto F. |
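The "reverse the score" step can be pictured as moving a message's tokens from the bad counts to the good counts. The sketch below is only an assumption about how such a step could work, not SpamFilter's actual code; the `Corpus` class and the `learn`, `unlearn`, and `release_from_quarantine` names are hypothetical.

```python
from collections import Counter


class Corpus:
    """Hypothetical token store; NOT SpamFilter's real corpus format."""

    def __init__(self):
        self.spam = Counter()  # token -> count in mail learned as bad
        self.ham = Counter()   # token -> count in mail learned as good
        self.spam_msgs = 0
        self.ham_msgs = 0

    def learn(self, text, is_spam):
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham.update(tokens)
            self.ham_msgs += 1

    def unlearn(self, text, was_spam):
        # Undo an earlier learn() call for the same message.
        tokens = set(text.lower().split())
        counts = self.spam if was_spam else self.ham
        counts.subtract(tokens)
        for token in tokens:
            if counts[token] <= 0:
                del counts[token]  # drop tokens that no longer have evidence
        if was_spam:
            self.spam_msgs = max(0, self.spam_msgs - 1)
        else:
            self.ham_msgs = max(0, self.ham_msgs - 1)

    def release_from_quarantine(self, text):
        # A forced delivery: remove the "bad" evidence, then record it as "good".
        self.unlearn(text, was_spam=True)
        self.learn(text, is_spam=False)


corpus = Corpus()
corpus.learn("invoice attached", is_spam=True)      # quarantined as a false positive
corpus.release_from_quarantine("invoice attached")  # user forces delivery
print(corpus.spam_msgs, corpus.ham_msgs)
```

After the release, the tokens of that message only count as evidence for legitimate mail, so similar messages score lower in a model of this kind.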
|
|
nippe Newbie Joined: 03 February 2005 Status: Offline Points: 12 |
You wrote: "the only way for the filter to "unlearn" about it being bad is when a user (or an admin) forces that email to be delivered". OK - and deleting a mail from the quarantine does not teach (or unteach) the filter anything? |
|