Bayesian Filtering Concerns

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=5498
Printed Date: 03 July 2025 at 3:01am


Topic: Bayesian Filtering Concerns
Posted By: Guests
Subject: Bayesian Filtering Concerns
Date Posted: 09 February 2006 at 7:52pm
I have read all of your docs regarding filtering, and I understand that the Bayesian filter is the very last check that is applied, but it does not appear to be very aggressive. I have also communicated with you via email.

Out of 100,000 emails, 13,283 have gone through, which is very nice, but it seems that Bayesian filtering is not working at all, even if it is the last thing that is checked. I have checked the logs, and every email that has gone through shows a 0% probability of being SPAM, which I know is not the case. Among those 13,283 emails there are still quite a few that should show at least some probability of being SPAM. It has not caught even one.

We use Cerberus Helpdesk as our internal support desk. It also has Bayesian filtering, goes through a learning process, and at this point captures nearly every SPAM email that Spam Filter ISP misses. This is great for us but does not help our customers.

Is there something that I am missing in setting up this service? We have it running as a service. Learning is activated and we have it set to 99% spam probability.

I know that we cannot eliminate all SPAM, but I just don't believe that the Bayesian filtering is doing its job at all, or at least it is not doing a very good one when analyzing the emails that are passed to it.

Jason E.




Replies:
Posted By: LogSat
Date Posted: 09 February 2006 at 10:05pm
Jason,

During the past few days we've been gathering and examining the statistical corpus databases from some users, and you are correct, there is an issue. Data is not being updated correctly in some cases. We are internally testing a patched build that we believe solves the problem, but it will take another day or two of "live" usage to see if there are improvements. If you would like to test this update before we make it public, please contact us by email and we'll be glad to provide it to you.


-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: jasone
Date Posted: 12 February 2006 at 11:34am
The update seems to be changing values in the corpus database; however, when looking at the corpus.ini file, it appears to be stuck and is no longer increasing the numbers for the messages.

I reset the database, and since then the server has processed more than 5000 emails, both SPAM and non-SPAM, but the totals in the GUI do not match the corpus.ini file.


Posted By: LogSat
Date Posted: 12 February 2006 at 2:47pm
If you're comparing the values in the corpus.ini file against the ones reported in the GUI under "Emails forwarded" and "Emails blocked", there will be a mismatch, as there are some cases where incoming emails are not analyzed. However, if the values in the corpus database are not being updated as you said, then yes, that would be a problem.

For the latter, could you check SpamFilter's GUI under the "Settings - Bayesian Filter" tab, and verify that the "Bayesian Learning Status" indicator on the left panel says "Active"? If it is active, could you please zip the contents of the SpamFilter\corpus directory into a zip file, wait about 30-60 minutes, then zip the same directory into a different file, and email us both zips so we can compare them? If the zip files are too large, I'll send you a private forum message with instructions on how to upload them to our ftp site.
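
For anyone who wants to compare the two snapshots locally before sending them, here is a minimal Python sketch. It is not part of SpamFilter; it only assumes that the two zips were made from the same directory, and the function names are made up for illustration. It lists which files were added, removed, or changed between the earlier and the later archive, using the size and CRC-32 that every zip already stores for each file.

```python
import zipfile


def zip_fingerprint(zip_path):
    """Map each file name in the archive to its (size, CRC-32) pair."""
    with zipfile.ZipFile(zip_path) as archive:
        return {
            info.filename: (info.file_size, info.CRC)
            for info in archive.infolist()
            if not info.is_dir()
        }


def compare_snapshots(earlier_zip, later_zip):
    """Return the files added, removed, or changed between two zip snapshots."""
    before = zip_fingerprint(earlier_zip)
    after = zip_fingerprint(later_zip)
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        # Same name in both archives, but different size or checksum.
        "changed": sorted(name for name in common if before[name] != after[name]),
    }
```

Calling `compare_snapshots("corpus_first.zip", "corpus_second.zip")` (hypothetical file names) returns a dictionary with the three lists. If the "changed" list is empty after 30-60 minutes of normal mail flow, that would be consistent with the corpus not being updated, although it does not prove it on its own.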


-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


Posted By: jasone
Date Posted: 12 February 2006 at 10:05pm
It seems to be moving again.

This seems to happen when it is running as a service and the GUI is opened, some changes are made, and the GUI is then closed. I restarted the service after the GUI had been opened and it appears to be working now. This, however, could just be a coincidence. I have opened up the GUI again to see if I can duplicate it or whether it was just a fluke.

I will give you details tomorrow. I have an early flight, but when I get to my hotel I will log on to see what the stats are.

Thanks for all your help.

Jason



Posted By: LogSat
Date Posted: 12 February 2006 at 10:40pm
Thanks for the update. Please note that the updated corpus database is flushed to disk every 10 minutes or so, so changes/updates may not show up until the flush has occurred.
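
As a rough way to tell "not flushed yet" apart from "stuck", the age of a corpus file on disk can be compared against the flush interval. The Python sketch below is not part of SpamFilter. The 10-minute interval comes from the note above, while the 5-minute grace period and the function names are assumptions made for illustration.

```python
import os
import time

FLUSH_INTERVAL_SECONDS = 10 * 60  # "every 10 minutes or so", per the note above
GRACE_SECONDS = 5 * 60            # assumed slack for the "or so"; not a documented value


def seconds_since_modified(path, now=None):
    """Return how many seconds ago the file at `path` was last written."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def looks_stale(age_seconds, flush_interval=FLUSH_INTERVAL_SECONDS, grace=GRACE_SECONDS):
    """True if the file is older than one flush interval plus the grace period."""
    return age_seconds > flush_interval + grace
```

For example, `looks_stale(seconds_since_modified(path))` can be run against a file in the SpamFilter\corpus directory. An old timestamp only suggests a problem if mail was actually being analyzed during that time; with no new mail, an unchanged file would be expected.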

-------------
Roberto Franceschetti

http://www.logsat.com - LogSat Software

http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP


