To all RegEx Experts |
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Topic: To all RegEx Experts | Posted: 27 July 2003 at 10:29pm |
|
All, Perhaps we should have a contest! This is one that has me going around in circles. Anyone have a good block for this? <html> <CENTER> <body> </body> </html> Dan S.
|
|
|
Alan
Guest Group
|
Posted: 28 July 2003 at 12:23pm |
|
I cannot think of any HTML tags that use more than one consecutive space inside a tag or that use the "@" symbol. Why not filter on those? The one that gets me is spammers putting common keywords inside tags, instead of HTML comments, as a way of getting around statistical Bayesian filtering. Any good browser ignores them, yet other filtering methods will not catch them either. Now this would be a challenge. |
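As a rough illustration of the idea above (flag tags that contain an "@" or two or more consecutive spaces), here is a minimal Python sketch. It is not a filter rule for any particular product; the function name `suspicious_tags` and the sample markup are made up for the example, and the tag pattern is deliberately naive.

```python
import re

# Matches anything that looks like an HTML tag: "<", then non-bracket characters, then ">".
TAG_RE = re.compile(r"<[^<>]+>")

def suspicious_tags(html):
    """Return tags that contain an "@" or two or more consecutive spaces."""
    hits = []
    for tag in TAG_RE.findall(html):
        if "@" in tag or "  " in tag:
            hits.append(tag)
    return hits

# Hypothetical sample: one tag with a double space, one with an "@" in the link.
sample = ('<html><body><font  color="red">Hi</font>'
          '<a href="http://x@example.com/">go</a></body></html>')
print(suspicious_tags(sample))
```

One caveat: a legitimate `mailto:` link also puts an "@" inside a tag, so a rule like this would probably need an exception for those.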
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 28 July 2003 at 12:57pm |
|
Alan, In this email, I was more concerned about the function that seems to create or decode the tags. I do not see where it is actually being invoked, but if I remove it the message gets garbled. Do you understand the function? Dan S.
|
|
|
Erik Reed
Guest Group
|
Posted: 29 July 2003 at 11:08am |
|
Laughing... I know this is not what you are asking, but I could not resist! Simple Keywords in Black List: penis,3 inches. This works great for me. I have to laugh that they always use 3 inches! Why not 2 or 4, nooooo, there is something about that 3 extra inches, LOL! The real problem that I am having is the newest spam that has almost ZERO text and 1 image tag that pulls the spam image from their website. I HATE THAT, and it appears there is nothing you can do about it, since many valid newsletters and support emails contain those linked images also... Erik!
|
|
|
Gabriel Langen
Guest Group
|
Posted: 29 July 2003 at 11:50am |
|
Hi, I think you are wrong. I have this one! Increase your penis size by 2 to 5 full inches booth Do you want more information? :-) Gaby |
|
|
Alan
Guest Group
|
Posted: 29 July 2003 at 11:52am |
|
Are you posting the actual email content or just the generated page code? I am unclear what you are removing that is causing it to generate garbled output. I am no expert, but to me it doesn't appear to do much of anything (beyond the basic HTML portion, that is). It looks like there may be some other components missing for it to do what it's supposed to. It looks almost like something that an amateur spammer cobbled together. Maybe this was created using a basic template and this script portion was not properly utilized? If I had to make a guess based on the various hanging fragments, I suspect the intent was to get you to click on the link (uh, yeah right) and you would get a multitude of child windows opening up which all submit your email address and their referral id (sz27t) for credit, but the sender didn't know what they were doing. Either way, you had originally asked about blocking this. Seems to me you can still block via the "@" or extra spaces in a tag. So do I get a Kewpie doll? |
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 29 July 2003 at 7:34pm |
|
Wow! 5 Inches? (that's gotta be a life changer!) ... Seriously ... the "Function" is my concern. Something that I am not yet able to understand is using that function (I believe) to create or decode the obscured code. Every message I get is totally different EXCEPT the function part. Also, "Simple Keywords" are out of the question from my vantage point. First, we have a "No Censorship" policy and second, we find that literal keywords have little value due to all the obscured code. Last thought ... This is for LogSat support also: I wonder if any would-be spammers look at any of our posts about filters. Is this a possible concern? Dan S.
|
|
|
ashley
Guest Group
|
Posted: 30 July 2003 at 8:42am |
|
I agree, a spam that only has a picture to click on defeats a good regex filter. In that case I resort to a keyword filter: @mags.net They won't put it in the email address, but it's always in the link they want you to click on. The downside is that this filter only applies to this one spammer. |
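A minimal Python sketch of that keyword-in-the-link approach, for anyone who wants to try it outside the filter itself. The function name `links_containing` and the domain `spam.example` are placeholders for illustration, and the `href` pattern only handles quoted attribute values.

```python
import re

# Pulls the value out of href="..." or href='...' attributes (case-insensitive).
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)

def links_containing(body, keyword):
    """Return every link target in the body that contains the keyword."""
    keyword = keyword.lower()
    return [url for url in HREF_RE.findall(body) if keyword in url.lower()]

# Hypothetical message body: almost no text, one linked image.
body = ('<a href="http://user@spam.example/offer">'
        '<img src="http://spam.example/a.gif"></a>')
print(links_containing(body, "@spam.example"))
```

As noted above, a keyword like this only catches the one sender whose links carry it.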
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 30 July 2003 at 8:56pm |
|
Should I take it personally that you used my domain as an example? Dan S.
|
|
|
ashley
Guest Group
|
Posted: 30 July 2003 at 10:29pm |
|
Occam's Razor! ;) |
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 30 July 2003 at 10:41pm |
|
OK .... Or ... KISS! So I won't read anything into it! (I didn't anyway) das |
|
|
Frank Schreier
Guest Group
|
Posted: 31 July 2003 at 7:41am |
|
Did I do anything wrong with (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)) ? Maybe a typing error?
07.31.03 13:13:29:800 -- (752) String matching error for (mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de --and-- (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)) : TRegExpr(exec): Loop Stack Exceeded
07.31.03 13:13:29:800 -- (752) Mail from: mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de
07.31.03 13:13:30:080 -- (752) - MAPS search done... .
07.31.03 13:13:30:080 -- (752) RCPT TO: [del]@brainlift.de accepted
07.31.03 13:13:35:749 -- (752) EMail from mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de to [del]@brainlift.de was queued. Size: 41 KB
07.31.03 13:13:35:769 -- (1004) Sending email from gemeindebrief@geizkragen.de to [del]@brainlift.de
07.31.03 13:13:35:819 -- (752) Disconnect
|
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 31 July 2003 at 8:44am |
|
Frank, You have an extra close paren on the end.
YOUR RegEx: (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com))
Should be: (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)
Also, what build are you running? Dan
|
|
|
Frank Schreier
Guest Group
|
Posted: 31 July 2003 at 9:03am |
|
Dan, thanks a lot (too stupid of me). It's build 190. Have a nice day...
|
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 31 July 2003 at 9:19am |
|
Hold on a second ... On build 190 and up, I am getting a string match error also. I have been working directly with LogSat on this for another RegEx, so I will look into this also. I had ZERO errors in the past and this has been a good block for me, so it is a high priority. Dan S.
|
|
|
Frank Schreier
Guest Group
|
Posted: 31 July 2003 at 9:26am |
|
OK, when I looked again to fix it, I saw there is no extra close paren in the filter; it's only the close paren for the whole log text line:
(mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de --and-- (\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com))
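To make the point about the log line concrete, here is a small Python sketch that splits such a line into the tested string, the filter, and the error text, so the wrapping parens are not mistaken for part of the filter. The layout `(<subject> --and-- <regex>) : <reason>` is an assumption based only on the one log line quoted in this thread, and `split_log_line` is a made-up helper name.

```python
import re

# Assumed log layout, taken from the single line quoted in this thread:
#   ... String matching error for (<subject> --and-- <regex>) : <reason>
LOG_RE = re.compile(r"String matching error for \((.*?) --and-- (.*)\) : (.*)$")

def split_log_line(line):
    """Return (subject, regex, reason), or None if the line does not fit the layout."""
    m = LOG_RE.search(line)
    return m.groups() if m else None

line = (r"07.31.03 13:13:29:800 -- (752) String matching error for "
        r"(mos.187902.33882.gemeindebrief.dbounce@news.messagizer.de --and-- "
        r"(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)) : "
        r"TRegExpr(exec): Loop Stack Exceeded")

subject, regex, reason = split_log_line(line)
print(regex)   # the filter alone, without the log's wrapping parens
print(reason)
```

With the wrapper stripped, the filter has three opening and three closing parens, which agrees with the observation that the extra paren belongs to the log line.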
|
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 31 July 2003 at 9:32am |
|
Frank, The extra Paren, as you saw, is only in the log. This is what I have so far and what I sent off to LogSat: Using the RegEx test in the SF GUI, the following RegEx In the fromemail bl fails as follows:
(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)
If there are 32 or fewer characters before the @, the test works. If there are more than 32, it fails. Could this be related to the String Matching error?
Dan
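For anyone who wants to repeat this kind of length test outside the SF GUI, here is a minimal Python probe. Big caveat: Python's `re` module is a different engine from TRegExpr and has no loop-stack limit of this kind, so every length should match here; the sketch only shows how to generate test addresses with a known number of characters before the @. The helper name `probe` and the test addresses are made up for the example.

```python
import re

# The filter as quoted in the thread, compiled with Python's engine (not TRegExpr).
PATTERN = re.compile(r"(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)")

def probe(lengths):
    """Map each local-part length to whether the pattern matched a test address."""
    results = {}
    for n in lengths:
        # One leading digit plus filler letters gives exactly n characters before the @.
        address = "1" + "a" * (n - 1) + "@hotmail.com"
        results[n] = PATTERN.search(address) is not None
    return results

print(probe([8, 31, 32, 33, 64]))
```

Running the same generated addresses through the GUI test should show where that engine's threshold actually sits.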
|
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 31 July 2003 at 9:43am |
|
Frank, Do not panic on this yet. LogSat has placed a limit on the string length due to a REALLY bad issue that came up on my server. I have requested that we take another look at the limit setting, but in the meantime it is NOT a failure, although the message may "sneak" past the filter. I am looking into that now, but I am leaving the RegEx in my server for now. Dan |
|
|
Frank Schreier
Guest Group
|
Posted: 31 July 2003 at 10:23am |
|
Dan, I am relaxed and leaving it in my server too for the next days. Frank.
|
|
|
Desperado
Senior Member
Joined: 27 January 2005 Location: United States Status: Offline Points: 1143 |
Posted: 31 July 2003 at 10:41am |
|
Frank, This is from Roberto of LogSat: If the loop stack is exceeded for a RegEx, that single RegEx expression will be skipped, so there won't be a match on it. All other keywords/blacklists are still processed.
I'll [LogSat] see how much I can increase the loop stack without running into the problems you [Dan S.] had.
[LogSat's] been working on the statistical filtering for a while now, and will most likely have an alpha version ready for internal testing by this weekend. If all goes well, a public beta will be released within a week or two. This should help a LOT in catching more spam.
Dan S. |
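The behaviour Roberto describes (a RegEx that exceeds the loop stack is skipped, everything else still runs) can be pictured with a toy Python model. This is not LogSat's code: `LoopStackExceeded`, `limited_search`, `is_blocked`, and the 32-character limit are all invented for the illustration, and the "limit" here is just a crude length check standing in for the real engine limit.

```python
import re

class LoopStackExceeded(Exception):
    """Hypothetical stand-in for the engine's "Loop Stack Exceeded" error."""

def limited_search(pattern, text, limit=32):
    # Toy model of a resource limit: refuse subjects longer than `limit` characters.
    if len(text) > limit:
        raise LoopStackExceeded(pattern)
    return re.search(pattern, text) is not None

def is_blocked(sender, regexes, keywords, limit=32):
    """Return (blocked, skipped). A regex that hits the limit is skipped, not fatal."""
    skipped = []
    for pattern in regexes:
        try:
            if limited_search(pattern, sender, limit):
                return True, skipped
        except LoopStackExceeded:
            skipped.append(pattern)  # counts as "no match"; keep processing the rest
    # Plain keyword entries are unaffected by the regex limit.
    return any(k in sender for k in keywords), skipped

FILTER = r"(\b[\d+]+([\-a-za-z0-9_\.\+])+(@hotmail|@juno)\.com)"
long_sender = "1" + "a" * 40 + "@hotmail.com"   # made-up address, 53 characters

print(is_blocked("123abc@hotmail.com", [FILTER], []))       # short: regex runs
print(is_blocked(long_sender, [FILTER], []))                # regex skipped, nothing else to catch it
print(is_blocked(long_sender, [FILTER], ["@hotmail.com"]))  # regex skipped, keyword still applies
```

The middle case is the "sneak past" situation mentioned earlier in the thread: the skipped RegEx simply does not match, so the message is only stopped if another entry catches it.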
|
|