Spam Filter ISP Support Forum


Non Printable ASCII Characters in Log Files

pcmatt
Senior Member
Joined: 15 February 2005
Location: United States
Posted: 27 December 2004 at 12:08pm

There are quite a few non-printable ASCII characters scattered throughout the log files, which cause Windows programs to malfunction when processing or parsing them.  I wrote a program that parses the SpamFilter log files and compiles them into a database where each row is a complete record of one processed email.  The non-printable codes scattered throughout the SpamFilter log files have been very problematic.  There are even carriage return/line feeds that show up in the middle of log entries!

For those who need a quick reference on ASCII codes:

http://www.cplusplus.com/doc/papers/ascii.html

Questions:

Are the SawMill users having problems with log processing due to non-printable ASCII characters too?  Typically, once a parser runs into certain codes it thinks it's finished, so the symptom is that the log files don't get parsed in their entirety.

Is this only a problem on Windows boxes?  Do SpamFilter logs also contain non-printable characters when running in *nix environments?
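
One plausible reason a parser "thinks it's finished" is that byte 0x1A (Ctrl-Z) has historically been treated as an end-of-file marker by text-mode file reads on DOS and Windows. A quick way to check whether a log is affected is to read it in binary mode and count the control bytes. The sketch below is only an illustration: the function names, the sample data, and the choice to leave tab, CR, and LF out of the "suspect" set are assumptions, not anything taken from SpamFilter or SawMill.

```python
from collections import Counter

# Assumption: tab, LF and CR are legitimate in a log file; every other
# ASCII control code (0-31) plus DEL (127) is treated as suspect.
ALLOWED = {0x09, 0x0A, 0x0D}
SUSPECT_BYTES = (set(range(32)) | {0x7F}) - ALLOWED


def count_control_bytes(data: bytes) -> Counter:
    """Return how many times each suspect control byte occurs in data."""
    return Counter(b for b in data if b in SUSPECT_BYTES)


def scan_log(path: str) -> Counter:
    """Read a file in binary mode so that no byte can end the read early."""
    with open(path, "rb") as handle:
        return count_control_bytes(handle.read())


if __name__ == "__main__":
    # Made-up sample data, not a real SpamFilter log line.
    sample = b"first entry\x1a still the same entry\r\nsecond\x00entry\r\n"
    for byte, count in sorted(count_control_bytes(sample).items()):
        print(f"0x{byte:02X}: {count}")
    # prints:
    # 0x00: 1
    # 0x1A: 1
```

Calling `scan_log` on a real log file (the path is whatever your installation uses) gives the same kind of tally for the whole file.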

 

 

Desperado
Senior Member
Joined: 27 January 2005
Location: United States
Posted: 27 December 2004 at 1:13pm

Matt,

I do not seem to be able to find anything in my logs that fits your description.  Can you post one or two lines?

Dan S.

pcmatt
Posted: 27 December 2004 at 1:20pm

DEC(26), aka HEX(1A), is frequent. Here is the text representation that you can copy and search by:

[0x1A character, not preserved in this copy]

These chars come from returned values from MAPS and SPF queries.

I've seen just about every code from DEC 1-32 scattered throughout, including CR/LFs in the middle of SPF returned values.

On Windows boxes the unprintable chars show up as little boxes like the above. I don't have any *nix boxes to see what the results are like on those.
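
Because control characters tend to get lost or turn into boxes when a log line is pasted somewhere, it can help to convert them to a visible form first. The following is a minimal sketch of that idea; the function name, the `<0xNN>` tag style, and the sample string are all made up for illustration.

```python
def show_control_chars(text: str) -> str:
    """Replace each ASCII control character with a visible <0xNN> tag."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 32 or code == 127:
            out.append(f"<0x{code:02X}>")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Invented sample: an SPF-style value with an embedded 0x1A and a CR/LF.
    print(show_control_chars("v=spf1\x1a mx\r\n"))
    # prints: v=spf1<0x1A> mx<0x0D><0x0A>
```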

 

Desperado
Posted: 27 December 2004 at 1:29pm

Matt,

I do see the following in my logs:

The IP 63.195.191.12 is Blacklisted by dnsbl.njabl.org. Sender, See Link.relay tested -- 1102398010

BUT ... this is not causing me any problems.  What is it doing to your system?

Dan S.

pcmatt
Posted: 27 December 2004 at 1:42pm

Not causing any SpamFilter malfunctions.  

This is something to be concerned about if you are using a log file parser like SawMill or are parsing the log files in your own program, like I am.

The problems begin when you try to parse the log files.  If you saw any symptoms, they would be in your parser.  You're using SawMill, right?  If SawMill is already stripping the non-printable characters, you would see few problems.  The one remaining problem is an embedded carriage return/line feed, because I don't know that you can strip those out of the log file before processing.  If SawMill is not stripping those characters out of the log file, then it's possible that SawMill is not processing your log files completely.
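
An embedded CR/LF cannot simply be deleted, since it looks the same as a real end of line. One possible workaround is to treat any line that does not begin like a log record as a continuation of the previous one. The sketch below assumes every genuine record starts with a date and time such as "12/27/04 12:08:10"; that prefix is a hypothetical placeholder, not the documented SpamFilter layout, so the pattern would need to be adjusted to the real format.

```python
import re

# Hypothetical record prefix (date and time). The real SpamFilter log
# layout may differ; change this pattern to match it.
RECORD_START = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")


def merge_broken_lines(lines, record_start=RECORD_START):
    """Yield one string per record, joining continuation lines onto it."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank fragments left behind by a stray CR/LF
        if current is None or record_start.match(line):
            if current is not None:
                yield current
            current = line
        else:
            # No record prefix: assume an embedded CR/LF split the entry.
            current += " " + line.strip()
    if current is not None:
        yield current


if __name__ == "__main__":
    # Invented sample lines for illustration only.
    broken = [
        "12/27/04 12:08:10 SPF result: v=spf1\r\n",
        " mx -all\r\n",
        "12/27/04 12:08:11 next entry\r\n",
    ]
    for record in merge_broken_lines(broken):
        print(record)
```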

I'm now stripping all non-printable characters out of the entire log file before processing it with the program I mentioned.  That program parses the log into a SQL database with these columns for each individual email processed: Server, Recipient Domain, Date, Time, Full Date & Time, DOW, Month, Thread ID, Source IP Address, Source Host Name, Country, Sender Email Address, Recipient Email Address, Action Taken, Reason, Message, Completed Time, Keywords, SPF Checked.
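
A pre-processing pass of this kind might look roughly like the sketch below. It is not the program described above, just one way to do it: the function names and file arguments are hypothetical, and keeping tab, CR, and LF is an assumption made so that line boundaries survive the cleanup.

```python
# Bytes to delete: ASCII control codes 0-31 and DEL (127), except
# tab (9), LF (10) and CR (13). Keeping those three is an assumption.
STRIP_BYTES = bytes(b for b in range(32) if b not in (0x09, 0x0A, 0x0D)) + b"\x7f"


def strip_control_bytes(data: bytes) -> bytes:
    """Return data with every byte listed in STRIP_BYTES removed."""
    return data.translate(None, STRIP_BYTES)


def clean_log(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of src_path to dst_path, in binary mode."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Read in 64 KiB chunks so large logs do not need to fit in memory.
        for chunk in iter(lambda: src.read(64 * 1024), b""):
            dst.write(strip_control_bytes(chunk))


if __name__ == "__main__":
    print(strip_control_bytes(b"a\x1ab\x00c\r\n"))
    # prints: b'abc\r\n'
```

Both files are opened in binary mode so that a 0x1A byte in the input is read like any other byte.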

LogSat
Admin Group
Joined: 25 January 2005
Location: United States
Posted: 27 December 2004 at 4:27pm
Matt is correct. He provided us with samples and we were able to verify the problem. Some MAPS DNS responses and some SPF DNS searches were returning invalid characters. Other times, instead of sending a CR/LF sequence, some sites are sending either a lone CR or a lone LF.

Pre-release build 2.1.2.401 is now available in the registered user area to correct this.

Roberto F. LogSat Software