<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>Spam Filter ISP Forums : "RegEx for fun &amp; profit"</title>
  <link>https://www.logsat.com/spamfilter/forums/</link>
  <description><![CDATA[This is an XML content feed of: Spam Filter ISP Forums : Spam Filter ISP Support : "RegEx for fun & profit"]]></description>
  <pubDate>Sun, 14 Jun 2026 02:56:10 +0000</pubDate>
  <lastBuildDate>Fri, 11 Jul 2003 08:45:00 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 11.04</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>https://www.logsat.com/spamfilter/forums/RSS_post_feed.asp?TID=1355</WebWizForums:feedURL>
  <image>
   <title><![CDATA[Spam Filter ISP Forums]]></title>
   <url>https://www.logsat.com/spamfilter/forums/forum_images/web_wiz_forums.png</url>
   <link>https://www.logsat.com/spamfilter/forums/</link>
  </image>
  <item>
   <title><![CDATA["RegEx for fun & profit" :  All,  I thought I would throw...]]></title>
   <link>https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=1355&amp;PID=1355&amp;title=regex-for-fun-profit#1355</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="https://www.logsat.com/spamfilter/forums/member_profile.asp?PF=22">Desperado</a><br /><strong>Subject:</strong> 1355<br /><strong>Posted:</strong> 11 July 2003 at 8:45am<br /><br /><FONT face=Arial size=2>&nbsp;</FONT><DIV><FONT face=Arial size=2>All,</FONT></DIV><DIV>&nbsp;</DIV><DIV><FONT face=Arial size=2>I thought I would throw out a few ideas to convince some of the SpamFilter ISP users of the potential power of RegEx's (Regular Expressions).&nbsp; Perhaps this will help when you are trying to nail some of the more ingenious Spam techniques without "throwing out the baby with the bath water".&nbsp; </FONT></DIV><DIV>&nbsp;</DIV><DIV><FONT face=Arial size=2>Let me preface this with a "Disclaimer".&nbsp; I am no expert with Regular Expressions, but having a fair amount of experience with Perl, I have been forced to learn and use them over the years.&nbsp; Each software package has its own "Engine" to interpret the expressions, so you always have to "Play" with them to get them right.&nbsp; I make no claims whatsoever about the accuracy of the information below. DO NOT USE the expressions I have here as-is; use them only as a starting point. I should also state that I am in no way affiliated with LogSat and, as such, LogSat cannot take any responsibility for any of my stupid mistakes!</FONT></DIV><DIV>&nbsp;</DIV><DIV>I feel that anything that knocks out a few Spams here and a few Spams there eventually adds up, but it is important to make sure that every filter is actually doing something useful, because the longer your blacklists are, the harder the software has to work.&nbsp; I do a log parse run each day to see if my filters are effective, and I take out anything that is not helping.</DIV><DIV>&nbsp;</DIV><DIV>OK ... 
One thing I did was come up with a "standard" expression that <FONT face=Arial size=2>describes a generic email address construct:</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>((&#091;\-a-zA-Z0-9_\.\+&#093;)+@(&#091;\-a-zA-Z0-9_\.\+&#093;+\.)+&#091;a-z&#093;{2,6})</FONT></DIV><DIV>&nbsp;</DIV><DIV>Once you have this, you should be able to use the format to kill off "Bad" addresses.&nbsp;&nbsp;<FONT face=Arial size=2>As an example, Hotmail has announced that any address starting with a digit is not valid.&nbsp; Therefore, I can construct an expression such as:</FONT></DIV><DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV><DIV><FONT face=Arial size=2>(\b\d(&#091;\-a-zA-Z0-9_\.\+&#093;)*@hotmail\.com)&nbsp; to detect and block it.&nbsp; WARNING:&nbsp; I believe that if there is one bad address in the "TO" field, the entire message gets blocked, so this should only be used in the "From" field.</FONT></DIV><DIV>&nbsp;</DIV><DIV>Here is a list I have come up with that describes some known "Bad" email constructs:</DIV><DIV>&nbsp;</DIV><UL><LI>numeric-only localparts at aol.com, msn.com, bellsouth.net, brandeis.edu</LI><LI>localparts starting with a digit at juno.com and hotmail.com</LI><LI>localparts longer than 16 characters at aol.com, hotbot.com or canada.com</LI><LI>localparts containing an underscore, longer than 16 characters and with at least 1 digit&nbsp;@(hotbot|juno|rocketmail|excite|hotmail|mail).com</LI><LI>test*@test.com<BR></LI></UL>
<DIV>For a good laugh, this is the regular expression that I used on my Sendmail server to try to slow the flood down.&nbsp; I AM NOT RECOMMENDING THIS!&nbsp; This EXACT RegEx does, in fact, work with "ActiveState" Perl!</DIV><DIV>&nbsp;</DIV><DIV><FONT face=Arial size=2>&nbsp;</FONT>^(mailer\-daemon&#091;0-9&#093;+.*&lt;@.*|.*(&#091;0-9&#093;.*prsesly|discounts|software&#091;0-9&#093;)&lt;@yahoo\.com|.*(saveonink|printsupplies|inkjet|toner_).*&lt;@.*|subscriber_services&#091;0-9&#093;+&lt;@.*|test.*&lt;@test.*\.com|&#091;0-9&#093;+&lt;@(aol\.com|msn\.com|bellsouth\.net|brandeis\.edu)|&#091;0-9&#093;&#091;^&lt;&#093;*&lt;@(hotmail|juno)\.com|.{16}&#091;^&lt;&#093;+&lt;@(canada|aol|hotbot)\.com|.{10}.*_.{2}.*&#091;0-9&#093;.{2}.*&lt;@(hotmail|juno|rocketmail|hotbot|excite|yahoo|msn|mail)\.com|.*free4you&lt;@.*|.*_...._._._.&lt;@.*brandeis\.edu|INVESTMENT_ALERT-.*|xtrafreeporn.*|Nasdaq_Newsdesk.*|ListsOnSale.*|InvestorInsights__.*|subscriptionssavings_.*|MarketingLists.*&#091;0-9&#093;.*&lt;@.*)\.?&gt;<BR>&nbsp;</DIV><DIV>Wasn't that fun?</DIV><DIV>&nbsp;</DIV><DIV>Dan S.</DIV><DIV>&nbsp;</DIV>]]>
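    <![CDATA[<DIV>A minimal sketch of how the first three "Bad" constructs listed in the post could be tried out against sample From addresses before going anywhere near a live filter. It assumes Python's re module, whose syntax may differ from the SpamFilter ISP engine. The names GENERIC_ADDRESS, BAD_SENDER_RULES and classify_sender are hypothetical and come from neither the post nor SpamFilter ISP, and the domain-label class leaves out the dot to keep backtracking down.</DIV><PRE>
```python
import re

# Generic address shape adapted from the post: localpart, "@", dotted
# domain labels, then a 2-6 letter TLD.
# Assumption: Python re syntax; other engines may need different escaping.
GENERIC_ADDRESS = re.compile(r"[-a-zA-Z0-9_.+]+@(?:[-a-zA-Z0-9_+]+\.)+[a-z]{2,6}")

# Hypothetical rule table built from three of the constructs in the post's list.
BAD_SENDER_RULES = {
    "numeric-only localpart": re.compile(
        r"^[0-9]+@(?:aol\.com|msn\.com|bellsouth\.net|brandeis\.edu)$"
    ),
    "localpart starts with a digit": re.compile(
        r"^[0-9][^@]*@(?:hotmail|juno)\.com$"
    ),
    "localpart longer than 16 characters": re.compile(
        r"^[^@]{17,}@(?:aol|hotbot|canada)\.com$"
    ),
}


def classify_sender(address):
    """Return the name of the first rule the From address trips, or None."""
    candidate = address.strip().lower()
    # Reject anything that does not even look like an address.
    if GENERIC_ADDRESS.fullmatch(candidate) is None:
        return "not a well-formed address"
    # Rules are checked in insertion order; the first hit wins.
    for rule_name, pattern in BAD_SENDER_RULES.items():
        if pattern.search(candidate):
            return rule_name
    return None
```
</PRE><DIV>Calling classify_sender on a From address returns the name of the first rule it trips, or None when the address passes all three rules. Running a day's worth of logged senders through something like this is one way to see whether a rule is pulling its weight before adding it to a blacklist.</DIV>]]>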
   </description>
   <pubDate>Fri, 11 Jul 2003 08:45:00 +0000</pubDate>
   <guid isPermaLink="true">https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=1355&amp;PID=1355&amp;title=regex-for-fun-profit#1355</guid>
  </item> 
 </channel>
</rss>