Spam Filter ISP Support Forum


Bayesian Filtering Concerns

Jason E (Guest Group)
Posted: 09 February 2006 at 7:52pm
I have read all of your docs regarding filtering and understand that Bayesian filtering is the very last check applied, but it does not appear to be very aggressive. I have also communicated with you via email.

Out of 100,000 emails, 13,283 have gone through, which is very nice, but it seems that Bayesian filtering is not working at all, even if it is the last check applied. I have checked the logs, and every email that has gone through shows a 0% probability of spam, which I know is not the case. Among those 13,283 emails there are still quite a few that should show at least some probability of being spam. It has not caught even one.

We use Cerberus Helpdesk as our internal support desk. It also has Bayesian filtering, goes through a learning process, and currently catches nearly every spam email that Spam Filter ISP misses. This is great for us but does not help our customers.

Is there something I am missing in setting up this service? We have it running as a service, learning is activated, and we have the spam probability set to 99%.
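For readers unfamiliar with how such a score is produced, here is a minimal sketch of a token-based Bayesian scorer. It is not Spam Filter ISP's actual algorithm, which this thread does not describe; the class name, the 0.4 score for unseen tokens, and the 0.01/0.99 clamping are assumptions borrowed from commonly published Bayesian spam filtering schemes. Only the 99% threshold comes from the post above, and it is treated here as a cutoff, which is also an assumption.

```python
import math
from collections import Counter

SPAM_THRESHOLD = 0.99      # mirrors the "99% spam probability" setting (assumed to be a cutoff)
UNKNOWN_TOKEN_PROB = 0.4   # assumed score for tokens the corpus has never seen


class TinyBayesFilter:
    """Hypothetical illustration only; not SpamFilter's implementation."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.spam_msgs = 0
        self.ham_msgs = 0

    def learn(self, text, is_spam):
        """Update the corpus counts with the distinct tokens of one message."""
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_msgs += 1

    def token_prob(self, token):
        """Per-token spam probability, clamped to [0.01, 0.99]."""
        s, h = self.spam_tokens[token], self.ham_tokens[token]
        if s + h == 0 or self.spam_msgs == 0 or self.ham_msgs == 0:
            return UNKNOWN_TOKEN_PROB
        spam_rate = s / self.spam_msgs
        ham_rate = h / self.ham_msgs
        return min(0.99, max(0.01, spam_rate / (spam_rate + ham_rate)))

    def spam_probability(self, text):
        """Combine token probabilities via summed log-odds (avoids underflow)."""
        logit = 0.0
        for token in set(text.lower().split()):
            p = self.token_prob(token)
            logit += math.log(p) - math.log(1.0 - p)
        logit = max(-700.0, min(700.0, logit))  # keep math.exp in range
        return 1.0 / (1.0 + math.exp(-logit))

    def is_spam(self, text):
        return self.spam_probability(text) >= SPAM_THRESHOLD
```

Under these assumptions, a filter whose corpus is empty or never updated scores every token at the unknown value, so longer messages drift toward a probability near zero and never reach a 99% cutoff. That is one plausible way to end up with uniformly low scores, not a confirmed diagnosis of the behavior reported here.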

I know that we cannot eliminate all spam, but I don't believe that the Bayesian filtering is doing its job at all, or at least it is not doing a very good job of analyzing the emails that are passed to it.

Jason E.

LogSat (Admin Group)
Joined: 25 January 2005
Location: United States
Posted: 09 February 2006 at 10:05pm
Jason,

Over the past few days we have been gathering and examining the statistical corpus databases from some users, and you are correct: there is an issue. In some cases the data is not being updated correctly. We are internally testing a patched build that we believe solves the problem, but it will take another day or two of "live" usage to see if there are improvements. If you would like to test this update before we make it public, please contact us by email and we'll be glad to provide it to you.
Roberto Franceschetti

LogSat Software

Spam Filter ISP

jasone (Newbie)
Joined: 09 February 2006
Posted: 12 February 2006 at 11:34am
The update seems to be changing values in the corpus database. However, looking at the corpus.ini file, it appears to be stuck: the message counts are no longer increasing.

I reset the database, and since then the server has processed more than 5,000 emails, both spam and non-spam, but the totals in the GUI do not match the corpus.ini file.

LogSat (Admin Group)
Posted: 12 February 2006 at 2:47pm
If you're comparing the values in the corpus.ini file against the ones the GUI reports under "Emails forwarded" and "Emails blocked", there will be a mismatch, because in some cases incoming emails are not analyzed. However, if the values in the corpus database are not being updated, as you said, then yes, that would be a problem.

For the latter, could you open the "Settings - Bayesian Filter" tab in SpamFilter's GUI and check that the "bayesian Learning Status" indicator on the left panel says "Active"? If it is active, could you please zip the contents of the SpamFilter\corpus directory, wait about 30-60 minutes, zip the same directory again into a different file, and email us both zips so we can compare them? If the zip files are too large, I'll send you a private forum message with instructions on how to upload them to our FTP site.
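The two-snapshot comparison can also be done locally before mailing anything. The following is a minimal sketch, assuming only that the corpus is a directory of ordinary files; the function names and the example paths are hypothetical and not part of SpamFilter.

```python
import zipfile
from pathlib import Path


def snapshot_dir(src_dir, zip_path):
    """Zip every file under src_dir into zip_path, storing relative paths.

    zip_path should live outside src_dir so the archive does not include itself.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(src).as_posix())


def changed_files(old_zip, new_zip):
    """Return entry names that were added, removed, or differ in CRC or size."""
    def index(zip_path):
        with zipfile.ZipFile(zip_path) as zf:
            return {i.filename: (i.CRC, i.file_size) for i in zf.infolist()}

    old, new = index(old_zip), index(new_zip)
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))
```

A possible use: call `snapshot_dir(r"C:\SpamFilter\corpus", "corpus_before.zip")` (the path is an example), wait 30-60 minutes, take a second snapshot into a different file, and then inspect `changed_files("corpus_before.zip", "corpus_after.zip")`. An empty list while mail is flowing would suggest the corpus files are not being rewritten.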
Roberto Franceschetti

jasone (Newbie)
Posted: 12 February 2006 at 10:05pm
It seems to be moving again.

This seems to happen when it is running as a service and the GUI is opened, some changes are made, and the GUI is then closed. I restarted the service after the GUI had been opened, and it appears to be working now. This could, however, just be a coincidence. I have opened the GUI again to see if I can reproduce it or whether it was just a fluke.

I will give you details tomorrow. I have an early flight, but when I get to my hotel I will log on to see what the stats are.

Thanks for all your help.

Jason

LogSat (Admin Group)
Posted: 12 February 2006 at 10:40pm
Thanks for the update. Please note that the updated corpus database is flushed to disk only about every 10 minutes, so changes may not show up until the next flush has occurred.
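Because of the delayed flush, a file timestamp only hints at a problem once it is much older than the flush interval. Here is a minimal sketch of such a check, taking the roughly 10-minute interval from the note above; the function names, the grace factor of 3, and the idea of watching the file's modification time are assumptions for illustration, not documented SpamFilter behavior.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Age of the file's last modification, in seconds."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_stuck(path, flush_interval_s=600, grace_factor=3, now=None):
    """True if the file has not changed for several flush intervals.

    A file younger than flush_interval_s * grace_factor is treated as normal,
    since updates may simply not have been flushed to disk yet.
    """
    return seconds_since_modified(path, now) > flush_interval_s * grace_factor
```

For example, `looks_stuck(r"C:\SpamFilter\corpus\corpus.ini")` (the path is an example) would report a stall only after about 30 minutes without a write, which avoids mistaking the normal flush delay for a stuck corpus.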
Roberto Franceschetti