Spam Filter ISP Support Forum


Bayesian Filter "Tip"

Desperado
Senior Member

Joined: 27 January 2005
Location: United States
Posted: 26 July 2004 at 4:14pm

All,

On my higher volume ISP mail server, I was getting enough false positives after a couple of days running that I had disabled the filter due to customer complaints.

About 2 weeks ago, I changed the INI setting to "CleanUpCorpusIntervalDays=1".

I cleared the corpus and started over.  The result is that I am not getting nearly as many false positives (none that I have personally seen), and I have not heard a peep out of my customers.  I am also getting LOTS of Bayesian matches, so it has not hurt my filtering at all.  I am not 100% sure, but I think that because I am also scanning the headers with the setting "ScanReceivedHeaders=1", the Bayesian filter was getting overly aggressive with messages whose headers had some of the same components as some of the spam.

I may, in fact, be "all wet" on this point, but the end results seem to be fantastic, so I am happy.  On my lower-traffic servers, I did not have this issue.

Regards,

Dan S.

bpogue99
Groupie


Joined: 26 January 2005
Posted: 29 July 2004 at 11:55am
I had noticed on one of my servers that I get unusually large numbers of false positives. I've tried clearing and restarting the corpus but it doesn't seem to change the numbers. I'll try this tip. Thanks for the pointers Dan!!

fdickey
Guest Group
Posted: 21 September 2004 at 5:46pm

One thing I am wondering now about the Bayesian filter: when it checks the headers, how does it decide to classify the parts that are constantly repetitive, such as the entry for our mail server and the entry for the spam filter itself?

The reason I ask is that I was getting a huge amount of false positives the past couple of days since upgrading to 2.1.1.367.  I turned on the option to show the tokens in the logs so I could get a better idea of what the heck is going on and noticed that these entries were tokens in the database.

Now I'm wondering if I should turn off the function to check the headers.  I want to do whatever is most efficient and effective.

LogSat
Admin Group

Joined: 25 January 2005
Location: United States
Posted: 23 September 2004 at 12:16am
Fred,

The statistical filter examines both good and bad emails and assigns scores to each word accordingly. If an entry is in a "good" email it will be assigned a much higher score than the same entry in a "bad" email. It's not only headers that are repetitive; so are email addresses and IP addresses, for example. The combination of *all* these scores counts towards the final outcome, so a single, possibly high, spam score for a server name will not make the difference.
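
To illustrate how combined scores behave, here is a minimal Python sketch. It assumes a simplified naive Bayes combination of per-token probabilities, which is a common approach for statistical spam filters; it is not taken from the filter's actual code, and the function names and message counts are made up for illustration.

```python
from math import prod


def token_spam_probability(spam_count, ham_count, n_spam, n_ham):
    """Estimate P(spam | token) from how often the token was seen.

    Hypothetical helper: a simplified per-token estimate, clamped so
    that no single token can ever be treated as absolute proof.
    """
    spam_freq = min(1.0, spam_count / n_spam)
    ham_freq = min(1.0, ham_count / n_ham)
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen before: treat it as neutral
    p = spam_freq / (spam_freq + ham_freq)
    return max(0.01, min(0.99, p))


def combined_spam_probability(token_probs):
    """Combine per-token probabilities with the naive Bayes product rule."""
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# A mail server name that shows up in nearly every message, good or bad,
# ends up close to neutral (roughly 0.49 with these made-up counts).
server_token = token_spam_probability(900, 950, 1000, 1000)

# One token with a high spam score, surrounded by tokens typical of good
# mail, still yields a low combined result (about 0.04 here).
message_score = combined_spam_probability([0.9, 0.1, 0.2, 0.15])

print(f"server token: {server_token:.3f}")
print(f"message score: {message_score:.3f}")
```

Under these assumptions, a token that appears about equally often in good and bad mail contributes almost nothing either way, and the other tokens in the message decide the outcome.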

Roberto F.

LogSat Software