
Bayesian Database

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=3569
Printed Date: 01 July 2025 at 9:16pm


Topic: Bayesian Database
Posted By: Guests
Subject: Bayesian Database
Date Posted: 08 May 2004 at 8:12am

I understand that the Bayesian filter doesn't start until 5000 emails have gone through. Is there a way to import someone else's database into mine? It will take a long time for me to hit 5000 emails. Also, are there any instructions on how to read the following messages in the log file?

05/08/04 07:00:05:062 -- (5932) Begin Sync Corpus.db
05/08/04 07:00:05:062 -- (5932) Sync Corpus.db pass 0a
05/08/04 07:00:05:062 -- (5932) Sync Corpus.db pass 0b
05/08/04 07:00:05:062 -- (5932) Sync Corpus.db pass 0c
05/08/04 07:00:05:062 -- (5932) Sync Corpus.db - 24488 - 0
05/08/04 07:00:05:062 -- (5932) Sync Corpus.db pass 1 (0)
05/08/04 07:00:05:109 -- (5932) Sync Corpus.db pass 2 (46)
05/08/04 07:00:05:109 -- (5932) Sync Corpus.db pass 3 (46)
05/08/04 07:00:05:109 -- (5932) Sync Corpus.db pass 4 (46)
05/08/04 07:00:05:125 -- (5932) Sync Corpus.db pass 5 (63)
05/08/04 07:00:05:187 -- (5932) Sync Corpus.db pass 6 (124)
05/08/04 07:00:05:187 -- (5932) Sync Corpus.db pass 7 (124)
05/08/04 07:00:05:187 -- (5932) Sync Corpus.db pass 8 (124)
05/08/04 07:00:05:187 -- (5932) End Sync Corpus.db (124)
05/08/04 07:00:06:015 -- (3044) BayesianThread starting
05/08/04 07:00:06:015 -- (3044) TBayesianThread - Begin LoadFromFile for corpus.db (db.dat)
05/08/04 07:00:06:015 -- (3044) TBayesianThread - LoadFromFile for Corpus.db - copied db.dat -> Ind15EE.tmp
05/08/04 07:00:06:015 -- (3044) TBayesianThread - LoadFromFile for Corpus.db - copied db.dat.prb -> Ind15EF.tmp
05/08/04 07:00:06:046 -- (3044) TBayesianThread - LoadFromFile for Corpus.db - loaded files in memory - Ind15EE.tmp
05/08/04 07:00:06:046 -- (3044) TBayesianThread - LoadFromFile for Corpus.db - loaded files in memory - Ind15EF.tmp
05/08/04 07:00:06:109 -- (3044) TBayesianThread - End LoadFromFile for corpus.db (db.dat) (94)


Replies:
Posted By: LogSat
Date Posted: 08 May 2004 at 5:54pm

Kelsky,

Each company's spam is different. This is what makes our statistical filter so efficient: it "learns" the spam you receive and adapts to it. If you wish to use someone else's database you can certainly do so, but please note that the results may not be as accurate as they would be with a database trained on your own mail.

All you need to do is to stop SpamFilter, replace the db.dat, db.dat.prb and corpus.ini files in the SpamFilter\corpus directory with the new ones, and restart SpamFilter.
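
If you would rather script the file swap, here is a minimal Python sketch of the copy step. Only the three file names come from the instructions above; the function name, the directory arguments, and the backup step are assumptions added for illustration, and the script does not stop or restart SpamFilter for you, so run it only while SpamFilter is stopped.

```python
import shutil
from pathlib import Path

# File names are taken from the reply above; everything else here is an assumption.
CORPUS_FILES = ("db.dat", "db.dat.prb", "corpus.ini")


def replace_corpus(source_dir, corpus_dir, backup_dir):
    """Back up the current corpus files, then copy the donor files in.

    Hypothetical helper: run it only while SpamFilter is stopped.
    """
    source_dir = Path(source_dir)
    corpus_dir = Path(corpus_dir)
    backup_dir = Path(backup_dir)

    # Refuse to start if the donor set is incomplete, so the corpus
    # directory never ends up with a mix of old and new files.
    missing = [name for name in CORPUS_FILES if not (source_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"donor set is incomplete: {missing}")

    backup_dir.mkdir(parents=True, exist_ok=True)
    for name in CORPUS_FILES:
        current = corpus_dir / name
        if current.is_file():
            # Keep a copy of the existing file so the swap can be undone.
            shutil.copy2(current, backup_dir / name)
        shutil.copy2(source_dir / name, current)

    return [corpus_dir / name for name in CORPUS_FILES]
```

Call it with the folder holding the donor files, your SpamFilter\corpus directory, and any empty folder for the backup. To undo the swap, stop SpamFilter again and copy the backed-up files back.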

The entries you refer to are logged to indicate that SpamFilter is updating the token corpus with the updated values it has learned in the previous 10-30 minutes. They're there for troubleshooting problems.
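
For reading the entries themselves, the lines quoted in the question all follow the same layout: a date, a time with milliseconds, a number in parentheses, and a message. Below is a rough Python sketch that splits a line into those parts. The layout is inferred only from the sample lines in this thread, and the names `parse_line` and `elapsed_ms` are made up for illustration. Treating the parenthesized number as a thread ID is an assumption, not something documented here.

```python
import re
from datetime import datetime, timedelta

# Layout inferred from the sample lines in this thread:
#   MM/DD/YY HH:MM:SS:mmm -- (number) message
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}):(\d{3}) -- \((\d+)\) (.*)$"
)


def parse_line(line):
    """Return (timestamp, number_in_parentheses, message), or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, millis, ident, message = match.groups()
    when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
    when = when.replace(microsecond=int(millis) * 1000)
    # The number in parentheses is assumed to be a thread ID.
    return when, int(ident), message


def elapsed_ms(first, last):
    """Whole milliseconds between two parsed entries."""
    return (last[0] - first[0]) // timedelta(milliseconds=1)


begin = parse_line("05/08/04 07:00:05:062 -- (5932) Begin Sync Corpus.db")
end = parse_line("05/08/04 07:00:05:187 -- (5932) End Sync Corpus.db (124)")
print(elapsed_ms(begin, end))  # 125
```

The 125 ms between the two timestamps is close to the (124) printed at the end of the "End Sync" line, which suggests the trailing number is an elapsed time in milliseconds. That is only an inference from the sample lines, not a documented meaning.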

Roberto F.
LogSat Software



