Cluster/Load Balance SF
lyndonje (Senior Member, Joined: 31 January 2006, Location: United Kingdom, Points: 192)
Posted: 04 August 2006 at 8:38am
I know there have been numerous posts about clustering SF before, but we are getting to the point where we really need this feature, and for it to work well for failover purposes.

Would it be possible to configure SF to save all whitelist info etc. to a centralised database instead of localised TXT files? Since SF loads and updates the TXT files every minute, and holds everything in between in memory, couldn't the SF servers pass this information directly to each other, with only one server committing the data to the database to prevent duplication? I'm not thinking through the ins and outs in too much detail, but seriously, how difficult would it be to achieve this?

Regards,
Lyndon.
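For concreteness, here is a minimal sketch of the kind of centralised store being proposed, using Python's built-in sqlite3. The table layout and the "one server commits" rule are assumptions for illustration, not anything SpamFilter actually exposes:

```python
import sqlite3

# Hypothetical central store -- SpamFilter does not expose this schema;
# it just illustrates the "one server commits" idea from the post above.
conn = sqlite3.connect("spamfilter_central.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS autowhitelist (
        entry      TEXT PRIMARY KEY,              -- e.g. a sender address
        server_id  INTEGER,                       -- which SF server wrote it
        updated_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def commit_entry(entry, server_id):
    # INSERT OR IGNORE: if several servers see the same sender, only the
    # first commit lands, preventing the duplication lyndonje mentions.
    conn.execute(
        "INSERT OR IGNORE INTO autowhitelist (entry, server_id) VALUES (?, ?)",
        (entry, server_id))
    conn.commit()
```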
kspare (Senior Member, Joined: 26 January 2005, Location: Canada, Points: 334)
I kind of have this set up already. I run a script against the database to make every message appear to come from server id 2, which causes only one server to update the autowhitelist. I then have a script that copies the autowhitelist to a repository for all the other servers to download.

The server that generates the whitelist doesn't write to the proper file name. I've had problems where a server could overwrite the data in the file due to a copy conflict, so that server's sync script copies the autowhitelist to the repository under the proper name for the other servers to download. The repository holds the master copy of all the other files, domain names, etc. The only thing I don't replicate is the honeypot, which hasn't been a big deal anyway.
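A rough sketch of the kind of sync script kspare describes, in Python. All paths, the staging file name, and the file-share repository are hypothetical:

```python
import os
import shutil

STAGING = r"C:\SpamFilter\autowhitelist.staging.txt"  # written by server id 2
REPO    = r"\\fileserver\sfrepo\autowhitelist.txt"    # master copy

def publish():
    """Publish the staged autowhitelist to the repository under the proper name."""
    tmp = REPO + ".tmp"
    shutil.copyfile(STAGING, tmp)
    os.replace(tmp, REPO)  # rename is atomic, so readers never see a partial file

def download(local_path):
    """Other servers pull the master copy into their own SF directory."""
    shutil.copyfile(REPO, local_path)
```

Copying to a temporary name and renaming is one way to avoid the copy-conflict overwrite kspare ran into, since a half-copied file is never visible under the proper name.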
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
What I would do is create a script to update any black/whitelist info directly to a db (not SFI's), do all the updates there, and set a bit in the db whenever anything changes. On each server running SFI, a script would check all the tables for any update bits that are set and rebuild just that one list as a txt file (see the sketch below). I would set this script to run every 10-15 minutes (or sooner).

Bayes is the biggest problem, as you can't share the corpus. Blacklists are a problem because they are global. You almost have to rely on autowhitelistdelivery.txt, but it's going to get very big very quickly.

I still think we need SFE (SpamFilter Enterprise). While SFI's speed is excellent, I think most of those with larger requirements would trade speed for being able to do things like keep the authorizedto list in a db instead of a file, along with all the other lists. The autowhitelistdelivery.txt file can get immense; an alternative would be to load it from the db but cache it in memory for x amount of time, which would make it superb without giving up a lot of speed. The same could be true for all the other lists: load from the db, keep in memory, and reload upon update.

Roberto, I want to beta this when you get started on it. SFI has most of the pieces in place and with just a little work (easy for me to say) could be made into something really scalable.
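A sketch of that update-bit polling script, assuming a hypothetical schema with a `lists` table carrying an `updated` flag per list and a `list_entries` table holding the entries:

```python
import sqlite3

conn = sqlite3.connect("lists.db")  # the separate (non-SFI) database

def rebuild_changed_lists():
    """Rebuild only the lists whose update bit is set, then clear the bit."""
    rows = conn.execute(
        "SELECT list_name FROM lists WHERE updated = 1").fetchall()
    for (name,) in rows:
        entries = conn.execute(
            "SELECT entry FROM list_entries WHERE list_name = ?", (name,))
        with open(name + ".txt", "w") as f:
            for (entry,) in entries:
                f.write(entry + "\n")
        conn.execute(
            "UPDATE lists SET updated = 0 WHERE list_name = ?", (name,))
    conn.commit()
```

Run on a 10-15 minute schedule, only the one changed list is rewritten, and SFI picks up the new text file on its next reload.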
http://www.webguyz.net
lyndonje (Senior Member, Joined: 31 January 2006, Location: United Kingdom, Points: 192)
When I've mentioned this in the past, people have come up with similar suggestions, which are viable, but I'm not too keen on them because of the way you're bodging pieces together. I was thinking the same about the soon-to-be-released SFI Enterprise.

kspare: Are there any implications in your script setting the SF server to id 2? You lost me a bit in the paragraph below, could you clarify so I understand?

"The server that generates the whitelist doesn't write to the proper file name. I've had problems where a server could overwrite the data in the file due to a copy conflict, so that server's sync script copies the autowhitelist to the repository under the proper name for the other servers to download."

WebGuyz: What do you mean when you say "Blacklists are a problem because they are global"?
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
If a customer asks us to blacklist *@somedomain.com and we do, and another customer (e.g. mydomain.com) happens to want email from *@somedomain.com, then they won't get it unless we whitelist using the autowhitelistdelivery.txt file:

*@somedomain.com|*@mydomain.com

It would be great if there were a blacklist that worked the same way the autowhitelistdelivery.txt file does, but in reverse.

If there were an SFE, how else could it centralize all the functions other than in the DB, and how else could each server access that data other than through the DB? It's just a matter of how the lists could be handled, and I've strained my brain to come up with something more efficient and can't. But I am open to suggestions.
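To make the interaction concrete, here is a toy matcher for the sender|recipient exception format quoted above. The pattern handling is simplified to shell-style wildcards; SpamFilter's real matching rules may differ:

```python
import fnmatch

blacklist = ["*@somedomain.com"]                               # global
delivery_whitelist = [("*@somedomain.com", "*@mydomain.com")]  # sender|recipient

def accept(sender, recipient):
    # A per-recipient exception wins over the global blacklist.
    for s_pat, r_pat in delivery_whitelist:
        if fnmatch.fnmatch(sender, s_pat) and fnmatch.fnmatch(recipient, r_pat):
            return True
    return not any(fnmatch.fnmatch(sender, p) for p in blacklist)

print(accept("sales@somedomain.com", "joe@mydomain.com"))     # True
print(accept("sales@somedomain.com", "joe@otherdomain.com"))  # False
```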
http://www.webguyz.net
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
I just remembered why I thought the .txt files were a good idea. Let's say you have an enterprise version with a centralized db, getting all its info from there, and the DB goes down: your SFS (SpamFilter Satellites) would be stuck. If, however, your SFS's get the info and copy it to a local text file, then update every minute like they do now, they can keep working autonomously until the SF main db comes back up. A sketch of that fallback follows.

Roberto, we are doing the hard part, coming up with the names for the components in SFE. All you have to do is code it....
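A sketch of that fallback behaviour, assuming a hypothetical master DB and list table. The satellite refreshes its local text file whenever the DB answers and falls back to the last good copy when it doesn't:

```python
import sqlite3

LOCAL_CACHE = "autowhitelist.txt"  # path is hypothetical

def load_list():
    """Prefer the master DB; fall back to the last good local text file."""
    try:
        conn = sqlite3.connect("file:master.db?mode=ro", uri=True, timeout=2)
        entries = [row[0] for row in conn.execute("SELECT entry FROM autowhitelist")]
        with open(LOCAL_CACHE, "w") as f:   # refresh the local copy
            f.write("\n".join(entries))
        return entries
    except sqlite3.Error:
        with open(LOCAL_CACHE) as f:        # DB down: keep working autonomously
            return f.read().splitlines()
```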
http://www.webguyz.net
LogSat (Admin Group, Joined: 25 January 2005, Location: United States, Points: 4104)
You guys are really hammering us on this!
Ok ok ok... Joking aside, we *are* listening to all comments on the forum, even if sometimes we don't respond to the "wish requests".

Going back to this: some admins are using Windows' file replication to replicate the directory with the various whitelist/blacklist files between servers. This works fine except that SpamFilter only checks for file updates every 60 seconds, so it may skip a beat if two files get updated by different servers within 60 seconds.

This does mean that yes, a database needs to be involved. We've always opted to use text files instead of a database for (1) performance, as our text files are extremely fast, and (2) reliability, as we still want SpamFilter to work if the database server goes down.

Keeping in mind that no matter where we go with the database, we will always be exporting the data used by SpamFilter to local files, so as not to risk downtime in case of DB problems: what (oh geez, now I'm going to open Pandora's box...) exactly would you all like for SpamFilter to do at this point, with regard to multiple servers/multiple domains?

....please have mercy... be gentle in the flow of requests
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
What I think a scalable SF would look like:

1) Keep the master copy of all the current text files in the DB. Each local SF server would download each list from the DB, create the text files, and read them into memory as it does now. Instead of checking a text file every 60 seconds to see if it has been updated, it would check whether the db master copy has been updated and, if so, download just that one file to a text file and read it into memory (if the DB were down, keep working off the local text file copy). Any customization by us could be done right in the db and SF would propagate it; no more worrying about copying files back and forth or synchronizing them across multiple SF servers.

2) The autowhitelistdelivery.txt file is powerful, but its Achilles' heel is that it can get too big over time. Find a way to let each domain have its own autowhitelist. For example, if mydomain.com's autowhitelist gets updated in the SQL db, then just reload mydomain.com's autowhitelist to disk and read it into memory. The structure on disk could look like:

\autowhitelist

3) Make the equivalent of the autowhitelist but for blacklists: a) do not allow any mail from anyone at somedomain.com to reach anyone at mydomain.com.

A sketch of how points 1 and 2 might work together is below.

Roberto, SFI is very good as it is now, but I'm thinking of areas that could be bottlenecks as your customers (everybody on this forum ;-) go forth and prosper and need to scale upward and want to bring SF along.
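A sketch of how points 1 and 2 might work, under an assumed schema where each autowhitelist row carries a domain and an updated_at timestamp. Only lists changed since the last check are rewritten to disk, one file per domain:

```python
import os
import sqlite3

conn = sqlite3.connect("master.db")        # hypothetical master DB
last_check = "1970-01-01 00:00:00"

def refresh_changed(base_dir="autowhitelist"):
    """Rewrite only the per-domain autowhitelist files that changed."""
    global last_check
    os.makedirs(base_dir, exist_ok=True)
    changed = conn.execute(
        "SELECT domain, MAX(updated_at) FROM autowhitelist "
        "WHERE updated_at > ? GROUP BY domain", (last_check,)).fetchall()
    for domain, newest in changed:
        entries = conn.execute(
            "SELECT entry FROM autowhitelist WHERE domain = ?", (domain,))
        # one file per domain under the \autowhitelist directory
        with open(os.path.join(base_dir, domain + ".txt"), "w") as f:
            f.writelines(e + "\n" for (e,) in entries)
        last_check = max(last_check, newest)
```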
http://www.webguyz.net
Web123 (Newbie, Joined: 26 January 2005, Location: Finland, Points: 31)
Why not skip the local text files? We use local DBs that we sync with the master DB (and then generate the text files locally). This way we can also quarantine to the local db when the master db is down!

/Web123
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
The overhead of the db queries would make it slower, unless you're saying to read them directly into memory from the DB, which would be OK. But keeping multiple db's synced could be a big pain in the rear. How do you keep the local db copies synced with the master db? Do you look for differences in timestamps, or what? How often does it run?
http://www.webguyz.net
Web123 (Newbie, Joined: 26 January 2005, Location: Finland, Points: 31)
Yep, each record has its own create date and time. We have a small VB app that syncs the DBs every 10 min. The "best" part of this is that we can quarantine to the local db when the master db is down, and then when it is up again we sync the DBs, so users only have to check quarantined e-mails in one db. We only manage about 60 domains, so the load is not that big.

/Web123
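A Python stand-in for that sync app (Web123's real one is VB); the table layouts and the pushed flag are assumptions:

```python
import sqlite3

def sync(local, master, last_sync):
    """Two-way sync using per-record create timestamps, as Web123 describes."""
    # Pull: copy any master rows created since the last sync into the local DB.
    new_rows = master.execute(
        "SELECT id, entry, created_at FROM lists WHERE created_at > ?",
        (last_sync,)).fetchall()
    local.executemany(
        "INSERT OR IGNORE INTO lists (id, entry, created_at) VALUES (?, ?, ?)",
        new_rows)
    # Push: quarantine rows written locally while the master was down.
    pending = local.execute(
        "SELECT id, message, created_at FROM quarantine WHERE pushed = 0").fetchall()
    master.executemany(
        "INSERT OR IGNORE INTO quarantine (id, message, created_at) VALUES (?, ?, ?)",
        pending)
    local.execute("UPDATE quarantine SET pushed = 1 WHERE pushed = 0")
    local.commit()
    master.commit()
```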
WebGuyz (Senior Member, Joined: 09 May 2005, Location: United States, Points: 348)
I forgot to add: the ability to add a check of our own into the datastream so that we can use products like SA's spamc. Kind of like what you're doing with the anti-virus product, but make it generic so that we can run a command-line scanner against the message. I know this would slow down SF, but the ability to run SA would be worth it. Also the ability to use an anti-virus like ClamAV or F-Prot through that entry point as well.
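A sketch of such a generic hook: pipe the raw message to an external command-line scanner and act on its exit code. spamc's -E flag makes it exit non-zero for spam; other scanners use different conventions, so the command here is just a configurable default:

```python
import subprocess

def external_scan(raw_message: bytes, cmd=("spamc", "-E")) -> bool:
    """Return True if the external scanner flags the message.

    spamc -E exits non-zero when the message is spam; swap in any other
    command-line scanner (ClamAV's clamdscan, F-Prot, ...) along with the
    exit-code convention it uses.
    """
    result = subprocess.run(cmd, input=raw_message,
                            stdout=subprocess.DEVNULL, timeout=30)
    return result.returncode != 0
```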
http://www.webguyz.net