
iso-8859 encoding

Printed From: LogSat Software
Category: Spam Filter ISP
Forum Name: Spam Filter ISP Support
Forum Description: General support for Spam Filter ISP
URL: https://www.logsat.com/spamfilter/forums/forum_posts.asp?TID=2924
Printed Date: 19 December 2025 at 10:09am


Topic: iso-8859 encoding
Posted By: Guests
Subject: iso-8859 encoding
Date Posted: 10 February 2004 at 12:24pm

Hi there,

I'm wondering how I can decode iso-8859 subject lines. Is there an easy way to bring these back to plain text? Any help is appreciated!

Marc

Replies:
Posted By: Guests
Date Posted: 15 February 2004 at 2:33pm

Marc,

Your question has been out here a few days, so I'll try to respond... but a little more information might be needed to get the exact response you're looking for.  I'll provide a few different possible answers...

The easiest way to decode an iso-8859 subject line is simply to deliver it to a mail client.  Both Outlook and Outlook Express feature automatic decoding.

Another way would be to search for a web page that provides this capability.  There are a few web pages around that provide online utilities that convert between different character sets and encoding methods.  You paste your encoded character string into a text area on the page, click a "convert" button and see the results.  Sorry, I don't have the address of any of these pages handy, but a Google search should produce one for you if this is what you're looking for.
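If you would rather decode the subject yourself than rely on a mail client or a web page, here is a minimal sketch in Python using the standard library's email.header module. It assumes the subject uses the usual encoded-word form (=?charset?Q?...?= or =?charset?B?...?=); the function name decode_subject is just an illustrative name, not anything from SpamFilter.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject: str) -> str:
    """Turn an encoded-word subject (=?charset?Q?...?=) back into readable text.

    decode_header splits the header into (bytes, charset) chunks and
    make_header reassembles them into a single Unicode string.
    """
    return str(make_header(decode_header(raw_subject)))


if __name__ == "__main__":
    # Hypothetical sample subject, Q-encoded in iso-8859-1.
    print(decode_subject("=?iso-8859-1?Q?Caf=E9_sp=E9cial?="))  # Café spécial
```

A subject that is not encoded at all passes through unchanged, so the same function can be applied to every message.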

To better answer your question, it would be helpful to know WHY you want to do this.  Do you want to perform keyword matching on the decoded text?  If so, this might not be possible in SpamFilter as of yet... Roberto mentioned in this posting http://www.logsat.com/spamfilter/forums/showmessage.asp?messageID=1945 that the subject line is not currently being decoded.

A more direct approach might be possible for spam filtering of these messages, depending on your locality and the language(s) used in the email messages you process.  iso-8859 subject line encoding is NOT needed for text that contains only plain ASCII characters; it is only required once a subject contains accented or other non-ASCII characters.  On the other hand, this method of subject obfuscation is one of the spammer's favorite tricks.  Consider using a RegEx to quarantine ALL messages containing iso-8859 encoded subject lines.  This can be a very effective spam blocker in some small mail systems where all of the email communication occurs in languages normally written without accented characters (English in particular; French and Spanish subjects only when they are written without accents).
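As a rough illustration of that kind of RegEx, here is a sketch written with Python's re module. The pattern and the function name are my own assumptions for illustration; SpamFilter's own RegEx syntax may differ, so treat this as a starting point rather than a rule to paste in as-is.

```python
import re

# Encoded-word layout: =?charset?encoding?payload?=
# Matches any iso-8859-N charset with either B (base64) or Q
# (quoted-printable) encoding, in upper or lower case.
ENCODED_ISO8859 = re.compile(
    r"=\?iso-8859-\d{1,2}\?[bq]\?[^?\s]*\?=",
    re.IGNORECASE,
)


def has_iso8859_encoded_subject(subject_line: str) -> bool:
    """Return True if the subject contains an iso-8859 encoded-word."""
    return ENCODED_ISO8859.search(subject_line) is not None
```

Note that this only flags iso-8859 charsets; subjects encoded in other character sets are left alone.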

You need to be careful when doing this, however, to ensure that you don't end up blocking legitimate messages.  If you run a mail system for a small company, and 100% of your email communication occurs within North America and/or western Europe, then you're probably safe in blocking all iso-8859 encoded subject lines.  Monitor your quarantine very closely for a few days, to ensure that you are not trapping legitimate messages.

If any of your users communicate in languages that require umlauts, diphthongs, or accents other than those customarily used in Spanish, French or English, etc... or if you're an ISP and must remain open to all languages... then this would not be a good idea for you.



Posted By: Desperado
Date Posted: 23 February 2004 at 1:32am

We ARE an ISP and many of our customers are offshore.  We have simply informed all our customers that if they use this type of encoding, they need to keep the subject line SHORT.  Like less than 15 letters.  Anything beyond that and they WILL, not may, get blocked.  So far, we have ZERO reported false positives on our RegEx and the customers would rather have the Spam blocked than write a book in the subject.  The only thing we thought might happen is that some foreign list servers would get blocked because they DO tend to have stupidly huge subject lines.
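For anyone who wants to experiment with the same idea in the meantime, here is a hypothetical sketch in Python. This is NOT the RegEx described above (that one has not been posted); it is just one way to express "encoded subjects must stay short". The 15-character limit comes from the rule of thumb in this post, and the names should_block and MAX_ENCODED_SUBJECT_LEN are made up for illustration.

```python
import re
from email.header import decode_header, make_header

# Assumption: "less than 15 letters" is read as "decoded length below 15".
MAX_ENCODED_SUBJECT_LEN = 15

# Same encoded-word shape as before: =?iso-8859-N?B or Q?payload?=
ENCODED_WORD = re.compile(
    r"=\?iso-8859-\d{1,2}\?[bq]\?[^?\s]*\?=",
    re.IGNORECASE,
)


def should_block(raw_subject: str) -> bool:
    """Block only subjects that are iso-8859 encoded AND too long."""
    if not ENCODED_WORD.search(raw_subject):
        return False  # plain subjects are never blocked by this rule
    # Measure the decoded text, not the raw encoded form.
    # Malformed base64 payloads can raise an error here; a real
    # filter would need to decide how to treat those.
    decoded = str(make_header(decode_header(raw_subject)))
    return len(decoded) >= MAX_ENCODED_SUBJECT_LEN
```

Measuring the decoded text rather than the raw header keeps the limit meaningful, since the encoded form is always longer than what the user actually typed.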

Once Sean & I get together with a secured BBS, I can post the RegEx I am using but it is actually a fairly simple one to write.

Regards,

Dan S.



