Spam Filter ISP Support Forum


New feature request - counters in database

peet
Posted: 05 March 2010 at 11:58pm
Hi,
I've written an enhanced web GUI (still a work in progress) that reads the quarantined e-mail data from the SQL database.
Each time the filter app adds a new e-mail, a trigger adds any new e-mail address to a separate table and increments a per-day counter, so I can see how many e-mails come in each day for that address.
I do the same for the domain name of that e-mail address to get domain totals.
SQL does all the work using a stored procedure.

But, I can only capture what the filter quarantines.

I'd love to see a more comprehensive set of counters per e-mail account and domain name:
- incoming e-mail count
- blocked e-mail count
- good e-mail (forwarded on) count
- quarantined e-mail count (this is all I have now)

I build a chart from the daily counts so the user can see how traffic fluctuates over time. It is amazing how an account that averages 30-50 quarantined spam messages per day can suddenly drop to under 10 for a week, and then in another month jump into the hundreds for just one day.

But I'm only seeing quarantined.

So would it be possible, and would others also benefit from this? 

Basically, the filter would only need one-way communication: it would write to the quarantine DB, or to a local file, or to a cache that is written to file later.
There would be one record per e-mail address per day in a counter table, covering each of the counters.
Perhaps it could call a stored procedure and let that do the updating and counting, freeing the filter's process from that work.
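
To make the idea above concrete, here is a rough sketch, not anything the filter actually does. It uses Python with SQLite standing in for the quarantine DB. The table name `mail_counter`, its column names, and the `DailyCounters` class are all made up for illustration. It assumes one row per address per day holding all four counters, and it keeps counts in an in-memory cache until `flush()` writes them out.

```python
import sqlite3
from collections import Counter

# Hypothetical counter names, taken from the list in the post.
COUNTERS = ("incoming", "blocked", "forwarded", "quarantined")


class DailyCounters:
    """Cache counts in memory, then write them to the database in one batch."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = Counter()  # (address, day, counter) -> count not yet saved
        conn.execute(
            """CREATE TABLE IF NOT EXISTS mail_counter (
                   address     TEXT NOT NULL,
                   day         TEXT NOT NULL,  -- ISO date such as '2010-03-05'
                   incoming    INTEGER NOT NULL DEFAULT 0,
                   blocked     INTEGER NOT NULL DEFAULT 0,
                   forwarded   INTEGER NOT NULL DEFAULT 0,
                   quarantined INTEGER NOT NULL DEFAULT 0,
                   PRIMARY KEY (address, day)
               )"""
        )

    def record(self, address, day, counter):
        """One-way call from the filter: remember one event and return."""
        if counter not in COUNTERS:
            raise ValueError(f"unknown counter: {counter}")
        self.pending[(address.lower(), day, counter)] += 1

    def flush(self):
        """Write the cached counts out, one row per address per day."""
        with self.conn:  # commits on success, rolls back on error
            for (address, day, counter), n in self.pending.items():
                self.conn.execute(
                    "INSERT OR IGNORE INTO mail_counter (address, day) VALUES (?, ?)",
                    (address, day),
                )
                # `counter` was checked against COUNTERS in record(), so it is
                # safe to use as a column name here.
                self.conn.execute(
                    f"UPDATE mail_counter SET {counter} = {counter} + ? "
                    "WHERE address = ? AND day = ?",
                    (n, address, day),
                )
        self.pending.clear()
```

In this sketch the filter's only work per message is a dictionary increment in `record()`. The database work happens in `flush()`, which could run on a timer or in a separate process. Whether that is cheap enough under heavy load is an open question.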

It would be great to know the total of good vs. bad e-mails per e-mail account.

Others, please add your comments or support for this feature if you'd like to see it!

yapadu
Posted: 06 March 2010 at 1:23am
I have actually been working on a stats system for our users for several weeks already.

I am using Sawmill to process the raw logs, then I extract the data I want from the database that Sawmill builds and put it into my own tables.

We generate about 1 GB of logs per day, so the major issue has been the volume of data to deal with.

I am tracking messages received, quarantined, caught as viruses, and forwarded as good. This data is broken down by e-mail address (including invalid addresses), by domain and by day, so users can see what is going on.
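
The breakdown described above (by address, by domain, by day) can be pictured with a small sketch. This is not how Sawmill works, and the post does not give the log format. The sketch therefore assumes some earlier step has already turned log lines into `(day, recipient, outcome)` tuples. The function name `summarize` and the outcome labels are hypothetical.

```python
from collections import Counter


def summarize(records):
    """Count messages per (day, address, outcome) and per (day, domain, outcome).

    `records` is a hypothetical iterable of (day, recipient, outcome) tuples
    produced by an earlier log-parsing step that is not shown here.
    """
    by_address = Counter()
    by_domain = Counter()
    for day, recipient, outcome in records:
        address = recipient.strip().lower()
        # Everything after the last '@' is treated as the domain; an address
        # without '@' is counted under its own name.
        domain = address.rpartition("@")[2]
        by_address[(day, address, outcome)] += 1
        by_domain[(day, domain, outcome)] += 1
    return by_address, by_domain
```

The two counters returned here map directly onto rows of a per-address table and a per-domain table. Only these small summary tables need to be kept once the raw logs have been processed.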

I would like to also capture country data and inbound email senders... but that is just too much data.

peet
Posted: 06 March 2010 at 1:50am
Wow, processing the entire raw log. I hadn't thought of that, but on the other hand I didn't want the server processing that much data; I just wanted a simple counter.

My current stored procedure relies on MS SQL's temporary copy of the records being added: an event such as an INSERT fires the trigger, which runs the stored procedure, grabs the e-mail address, extracts the domain, and updates the count for that day.
It is really easy and fast, and generating a bar chart from that is fast too.
So, for example, a user can see the daily fluctuation of quarantined e-mails for their domain (around 50,000 e-mails daily) and compare it with a graph next to it of their personal mailbox's quarantine, which is around 50 daily for that particular e-mail account.
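
For readers who want to see the shape of such a trigger, here is a minimal sketch. It is not the actual stored procedure from this post. It uses Python with SQLite rather than MS SQL so that it runs anywhere; in SQL Server the trigger would read the new rows from the `inserted` table instead of `NEW`. The table names `quarantine` and `daily_count`, their columns, and the helper functions are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE quarantine (
    id          INTEGER PRIMARY KEY,
    recipient   TEXT NOT NULL,
    received_on TEXT NOT NULL           -- ISO date such as '2010-03-05'
);
CREATE TABLE daily_count (
    scope       TEXT NOT NULL,          -- 'address' or 'domain'
    name        TEXT NOT NULL,          -- the e-mail address or the domain
    day         TEXT NOT NULL,
    quarantined INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (scope, name, day)
);
CREATE TRIGGER count_quarantined AFTER INSERT ON quarantine
BEGIN
    -- Make sure a counter row exists for the address and for its domain.
    INSERT OR IGNORE INTO daily_count (scope, name, day)
        VALUES ('address', lower(NEW.recipient), NEW.received_on);
    INSERT OR IGNORE INTO daily_count (scope, name, day)
        VALUES ('domain',
                lower(substr(NEW.recipient, instr(NEW.recipient, '@') + 1)),
                NEW.received_on);
    -- Bump the per-address counter for that day.
    UPDATE daily_count SET quarantined = quarantined + 1
     WHERE scope = 'address' AND day = NEW.received_on
       AND name = lower(NEW.recipient);
    -- Bump the per-domain counter for that day.
    UPDATE daily_count SET quarantined = quarantined + 1
     WHERE scope = 'domain' AND day = NEW.received_on
       AND name = lower(substr(NEW.recipient, instr(NEW.recipient, '@') + 1));
END;
"""


def open_db():
    """Create an in-memory database with the tables and the trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def quarantined_per_day(conn, scope, name):
    """Return [(day, count), ...] for one address or domain, ready to chart."""
    return conn.execute(
        "SELECT day, quarantined FROM daily_count "
        "WHERE scope = ? AND name = ? ORDER BY day",
        (scope, name.lower()),
    ).fetchall()
```

The filter (or whatever writes to `quarantine`) only issues a plain INSERT. The counting happens inside the database, and the chart query reads a handful of pre-aggregated rows instead of scanning the quarantine table.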

So if possible I'd like to avoid using raw logs. The server is busy enough already blocking junk e-mail with its many filters.


Edited by peet - 06 March 2010 at 1:51am

yapadu
Posted: 06 March 2010 at 3:36am
I don't know how much additional load it would place on the spamfilter server to have the software do it, but under heavy loads (hundreds of connections) I would imagine the overhead of the stats would be quite a bit.

Even for my servers, which usually only have a few connections (maybe 10), I have had to set up two servers to process the data: one for Sawmill and one for the MySQL database that stores the stats. Crunching data on a table with 50 million rows takes some serious power, so there is no way I could do it on the spamfilter server itself.

I am actually trying to figure out a way to compress the data and store it in the Amazon cloud or something, so that it is there when the users want it but does not take up massive amounts of space on the production servers.