SURBL and MAPS good for bayesian learning
Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=5366
Printed Date: 09 May 2025 at 7:41pm
Topic: SURBL and MAPS good for bayesian learning
Posted By: Guests
Subject: SURBL and MAPS good for bayesian learning
Date Posted: 24 October 2005 at 11:45am
The Bayesian filter requires 5000 spam and non-spam e-mails
to function. To speed up the learning process, I've set SpamFilter to quarantine
mail blocked by the MAPS and SURBL blacklists, and I then manually delete or deliver
mail in the quarantine folder. My question is: does manually deleting
MAPS/SURBL-blocked email speed up the Bayesian learning process, or is this
manual method entirely unnecessary?
|
Replies:
Posted By: LogSat
Date Posted: 25 October 2005 at 6:56am
Ivan82,
Deleting entries in the database has no effect on the Bayesian filter.
However, if a good email was incorrectly blocked by one of the filters,
forcing its delivery will resubmit the email to the Bayesian filter,
tagging it as "good". This helps train the Bayesian filter to
recognize "good" tokens within that email, which lessens the
chance of mistakes in the future.
------------- Roberto Franceschetti
http://www.logsat.com - LogSat Software
http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP
|
Posted By: Guests
Date Posted: 25 October 2005 at 12:25pm
OK, so I've removed the MAPS & SURBL quarantine and kept quarantine only for SPF issues. I've also added some keywords (e.g., cialis, viagra, penis enlargement, morgage). When I check the quarantined e-mails, I find plenty that have SPF issues and also contain those blacklisted keywords. If I pass these emails on to be delivered, will they get blocked because of the keywords they contain? Or must I delete them from quarantine manually?
|
Posted By: Guests
Date Posted: 26 October 2005 at 10:05am
Oh, and the Bayesian filter just isn't kicking in, not even after 60,000+ mail attempts discarded, 8,000+ mails forwarded, and 17,000+ spam blocked.
|
Posted By: Guests
Date Posted: 26 October 2005 at 11:45am
Possible new feature request:
I have a list of authorized e-mail addresses for which I receive mail; mail to addresses not on this list is dropped by SpamFilter. How about another check, so that mail sent by unknownusername@mydomain.com to knownusername@mydomain.com gets blocked automatically? I suspect domain spoofing would be detected by SPF, but what about a user inside my domain who is attempting to spam all users in my domain?
|
Posted By: LogSat
Date Posted: 26 October 2005 at 4:14pm
Ivan,
Please note that the Bayesian filter "learns" emails as they are
quarantined. If you delete them from the quarantine, this has no effect
on the Bayesian learning process, as they have already been processed.
If, however, you force the delivery of those quarantined emails, you will
be telling the Bayesian filter "this spam that you blocked is not
really spam, reprocess the emails and re-learn that the content is
actually good". The Bayesian filter will thus re-examine the email, try
to learn about its contents so that similar emails will be
passed next time, and will then deliver the current email to the intended
recipient.
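To illustrate the difference between deleting and forcing delivery, here is a minimal sketch in Python. It is not SpamFilter's actual code: the `TokenStore` class, its method names, and the whitespace tokenizer are all assumptions made up for this example. It only shows the idea that learning happens once at quarantine time, that deleting changes nothing, and that forced delivery moves the message's tokens from "spam" to "good".

```python
from collections import Counter


class TokenStore:
    """Toy per-class token counts (hypothetical, not SpamFilter's internals)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}

    @staticmethod
    def tokenize(text):
        # Assumption: a naive lowercase whitespace split stands in for real tokenizing.
        return text.lower().split()

    def learn(self, text, label):
        # Called once, when the message is first classified (e.g. quarantined).
        self.counts[label].update(self.tokenize(text))

    def relearn(self, text, old_label, new_label):
        # Called on forced delivery: undo the old label, then learn the new one.
        tokens = self.tokenize(text)
        self.counts[old_label].subtract(tokens)
        self.counts[old_label] = +self.counts[old_label]  # drop counts <= 0
        self.counts[new_label].update(tokens)


store = TokenStore()
msg = "quarterly invoice attached"

store.learn(msg, "spam")            # learned as spam when it was quarantined
# Deleting the message from quarantine calls nothing: the counts stay as they are.
store.relearn(msg, "spam", "good")  # forcing delivery re-learns it as good
```

After the `relearn` call, the tokens of the message count toward "good" only, which is the effect described above for a forced delivery.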
RE: your statement "so I've removed the MAPS & SURBL quarantine",
we're not sure we understood. The more filters you have active, the
better the chances that SpamFilter will block spam. Both the MAPS and
SURBL filters are very effective, and will thus greatly help the
Bayesian filter in the learning process.
As for the Bayesian filter not kicking in, please note that the
Bayesian filter is used as a last resort to check for spam, after all
the other filters have had a chance to do so. Only if they all fail is
the Bayesian filter used. As such, it will have mostly pre-screened
emails to check, and will only tag a very small percentage of them.
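The "last resort" ordering can be sketched as follows. This is only an illustration under assumptions: the `classify` function, the two sample checks, the sample addresses, and the stand-in statistical test are hypothetical, and only the reason strings are taken from the filter stats in this thread.

```python
def classify(message, filters, statistical_filter):
    """Return the reason a message was blocked, or None if it passes."""
    for reason, matches in filters:
        if matches(message):
            return reason  # an earlier filter caught it; the rest never run
    # Only messages that every other filter let through reach this point.
    if statistical_filter(message):
        return "Statistical filter match"
    return None


# Hypothetical checks standing in for the real filters.
filters = [
    ("IP found in MAPS search", lambda m: m["ip"] in {"192.0.2.1"}),
    ("Keywords found in content", lambda m: "viagra" in m["body"].lower()),
]
statistical = lambda m: "winner" in m["body"].lower()

listed = {"ip": "192.0.2.1", "body": "You are a winner"}
unlisted = {"ip": "198.51.100.7", "body": "You are a winner"}
clean = {"ip": "198.51.100.7", "body": "Meeting at noon"}

results = [classify(m, filters, statistical) for m in (listed, unlisted, clean)]
```

The first message is credited to the MAPS check even though the statistical test would also have matched it, which is why the statistical filter's own count stays small.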
As an example, we provided on the forum a snapshot of our filter stats for 3 days' worth of emails, as follows:
94,828 IP found in MAPS search
74,161 IP address is from a blacklisted country
10,810 Invalid sender domain MX record
7,896 SPF Sender Policy Framework match
3,044 Keywords found in content
763 Exceeded maximum number of RCPT TO
526 Mail From and Mail To domains are equal
345 Statistical filter match
27 Mail From and Mail To are equal
According to the above, the Bayesian statistical filter on our own
server blocked only about 0.2% as much spam as the other filters combined.
However, that is still 345 spam emails that were successfully blocked.
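For anyone who wants to check the arithmetic, the share can be recomputed from the counts in the table. The snippet below is just a worked calculation in Python; the variable names are made up for this example.

```python
# Counts copied from the 3-day filter stats above.
stats = {
    "IP found in MAPS search": 94828,
    "IP address is from a blacklisted country": 74161,
    "Invalid sender domain MX record": 10810,
    "SPF Sender Policy Framework match": 7896,
    "Keywords found in content": 3044,
    "Exceeded maximum number of RCPT TO": 763,
    "Mail From and Mail To domains are equal": 526,
    "Statistical filter match": 345,
    "Mail From and Mail To are equal": 27,
}

total = sum(stats.values())
statistical = stats["Statistical filter match"]
others = total - statistical

share_of_total = 100 * statistical / total    # percent of everything blocked
share_vs_others = 100 * statistical / others  # percent relative to the other filters

print(f"total blocked: {total}")
print(f"statistical filter: {share_of_total:.1f}% of total, "
      f"{share_vs_others:.1f}% relative to the other filters")
```

Either way of computing it rounds to 0.2%.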
To answer your last posting, "if mail is being sent by
unknownusername@mydomain.com to knownusername@mydomain.com", as you
mentioned the SPF filter will take care of that. There is also another
option that rejects emails if the "FROM" domain is the same as the "TO"
domain. That will also cause a reject if the user is within your domain
in the scenario you described.
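As a rough sketch of what a "FROM domain equals TO domain" test looks like, here is a small Python example. The function names are hypothetical and this is not how SpamFilter implements the option; it only shows the comparison itself.

```python
def domain_of(address):
    """Return the lowercased part after the last '@' of an address."""
    return address.rsplit("@", 1)[-1].strip().lower()


def same_domain(mail_from, rcpt_to):
    """True when the sender and recipient addresses share a domain."""
    return domain_of(mail_from) == domain_of(rcpt_to)


spoofed = same_domain("unknownusername@mydomain.com", "knownusername@MyDomain.com")
outside = same_domain("someone@example.org", "knownusername@mydomain.com")
```

A comparison this simple cannot tell a spoofed sender from a genuine user inside the domain, so it matches both, which is the behavior described above.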
------------- Roberto Franceschetti
http://www.logsat.com - LogSat Software
http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP
|
Posted By: Guests
Date Posted: 27 October 2005 at 5:39am
Thanks for the replies. What I meant when I said I disabled the MAPS and SURBL quarantine is that I unchecked the option to quarantine mail blocked by those filters, so now it just deletes them automatically. We've been running SpamFilter for 4 days now and it was too time-consuming to manually sort through the quarantined list. All filters are still active, though. My question was: without the manual process of deleting/delivering quarantined possible spam (assuming all email blocked by SPF/MAPS/SURBL is spam), does the Bayesian filter still get the data it needs to detect possible spam?
|
Posted By: LogSat
Date Posted: 27 October 2005 at 8:03pm
Ivan82,
Yes, SpamFilter will still pass on the emails to the Bayesian learning
engine, even if you disable the quarantining of emails for some filters.
------------- Roberto Franceschetti
http://www.logsat.com - LogSat Software
http://www.logsat.com/sfi-spam-filter.asp - Spam Filter ISP
|