Spam Filter ISP Support Forum


2.0 Beta Question

Dan B (Guest Group)
Posted: 20 October 2003 at 4:29pm

R,

We have been running the 2.0 beta for 4 days or so.  At what percentage does the spam filter mark an email message as spam?  Will there be a way to change that percentage from the GUI in a future release, or even to turn off the Bayesian portion?  I did read on your website that there is an issue with legitimate emails being blocked once the corpus has grown to several MB in size.  Any updates on this?  We are seeing legitimate emails being blocked.

Thanks,
Dan B

LogSat (Admin Group)
Posted: 21 October 2003 at 12:26am

Dan,

The current beta blocks an email if its Bayesian probability is above 0.9 (90%). The second beta, which we are going to release in a few days, will either have this value hardcoded to 0.99 or make it user-selectable. The final version will definitely have it user-selectable.
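
For readers who want to see how a cutoff like this behaves, here is a rough sketch. It is not the filter's actual code: the function names are made up for illustration, and the way per-token probabilities are combined is an assumption (a common naive Bayes combination). Only the 0.9 and 0.99 cutoffs come from the post above.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities into one message score.

    Assumption: a plain naive Bayes combination. The real filter may
    combine tokens differently.
    """
    # Clamp so a single 0.0 or 1.0 token cannot force a division by zero.
    clamped = [min(max(p, 0.01), 0.99) for p in token_probs]
    spam = prod(clamped)
    ham = prod(1.0 - p for p in clamped)
    return spam / (spam + ham)


def is_blocked(probability, threshold=0.9):
    """Block the message only if its score is above the threshold."""
    return probability > threshold


# Hypothetical token probabilities for one message.
score = combined_spam_probability([0.9, 0.8])
print(f"score={score:.4f}")
print("blocked at 0.90:", is_blocked(score, 0.9))
print("blocked at 0.99:", is_blocked(score, 0.99))
```

With these made-up token values the message scores between the two cutoffs, so it is blocked at 0.9 but delivered at 0.99. That is the kind of borderline message a higher cutoff is meant to let through.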

The second beta will also automatically prune the corpus database, removing keyword tokens that have not been seen in recent emails. This should decrease the size of the corpus and help with the memory leak issues present in the current beta.
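
As an illustration of the pruning idea only: the sketch below assumes each token record carries a "last_seen" timestamp and uses a 30-day cutoff. Both are assumptions for the example; the actual layout of the corpus file and the age limit used by the beta are not stated here.

```python
from datetime import datetime, timedelta


def prune_corpus(corpus, now, max_age=timedelta(days=30)):
    """Return a copy of the corpus without tokens older than max_age.

    Assumption: corpus maps token -> dict with a "last_seen" datetime.
    """
    cutoff = now - max_age
    return {
        token: record
        for token, record in corpus.items()
        if record["last_seen"] >= cutoff
    }


# Hypothetical corpus with one recent and one stale token.
corpus = {
    "offer": {"spam": 10, "ham": 0, "last_seen": datetime(2003, 10, 20)},
    "meeting": {"spam": 0, "ham": 7, "last_seen": datetime(2003, 8, 1)},
}
pruned = prune_corpus(corpus, now=datetime(2003, 10, 21))
print(sorted(pruned))
```

Only tokens seen within the window survive, so the corpus stops growing with words that no longer show up in mail.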

We still do not have any valid data on possible high rejections with large corpus databases. Any user input on this, where the corpus db.dat file is > 10MB will be appreciated.

Roberto F.
LogSat Software

Dan B (Senior Member)
Posted: 22 October 2003 at 9:34am

R,

>>We still do not have any valid data on possible high rejections with large corpus databases. Any user input on this, where the corpus db.dat file is > 10MB will be appreciated.

What info do you need?  Samples of legitimate emails that were rejected, or our corpus files?
Just let me know,

Dan B

LogSat (Admin Group)
Posted: 22 October 2003 at 11:24pm

Ideally we'd like statistics, not the email content itself. We'd like to know what happens to the percentage of false positives (good emails being blocked) and the percentage of "misses" (spam slipping through the filters) as the corpus increases in size.
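
One possible way to tally those two percentages per corpus size is sketched below. The function names and the report layout are made up for illustration, and the counts are placeholders invented to show the shape of the output, not measurements from any real installation.

```python
def filter_rates(ham_blocked, ham_total, spam_missed, spam_total):
    """Return (false_positive_pct, miss_pct) as percentages."""
    fp_pct = 100.0 * ham_blocked / ham_total if ham_total else 0.0
    miss_pct = 100.0 * spam_missed / spam_total if spam_total else 0.0
    return fp_pct, miss_pct


def report(samples):
    """Format one line per (size_mb, ham_blocked, ham_total, spam_missed, spam_total)."""
    lines = []
    for size_mb, ham_blocked, ham_total, spam_missed, spam_total in samples:
        fp, miss = filter_rates(ham_blocked, ham_total, spam_missed, spam_total)
        lines.append(
            f"{size_mb:>4} MB  false positives {fp:.2f}%  misses {miss:.2f}%"
        )
    return lines


# Placeholder counts only, NOT real data.
samples = [
    (2, 1, 1000, 50, 1000),
    (12, 5, 1000, 20, 1000),
]
for line in report(samples):
    print(line)
```

Counting good mail and spam separately matters here: a filter can look accurate overall while still blocking a noticeable share of legitimate mail.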

Roberto F.
LogSat Software
