Son of spam: 4 spam filtering packages tested



 Spam filtering software

 Anti-spam software:

 GFI MailEssentials
 NetIQ MailMarshal
 NAI/McAfee SpamKiller
 SurfControl

 Specifications
 How we tested
 Look out for...
 Final words
 About RMIT

Can you trust software to block all the spam your company receives and let all your legitimate e-mail through? We evaluate four top spam filtering packages for their accuracy.

Three months ago, in the July edition of Technology & Business, we compiled an overview of five anti-spam filtering applications that were available at the time. That initial review introduced spam and its concepts and covered the usability and technical implementation of each application. However, it did not look at how accurately the individual packages filtered e-mail.

We are therefore now revisiting the anti-spam issue with a more results-based review. We invited the same five vendors back for a head-to-head shootout to show the packages' accuracy in filtering unwanted e-mail while keeping as much useful e-mail as possible. All vendors accepted this challenge except for Clearswift, which cited the imminent release of a new redesigned application. We hope that Clearswift will be involved in the next similar accuracy review.

As you will see in this review, testing these packages for accuracy is a tricky business and to do so fairly and accurately took several months.

As detailed in the previous review, anti-spam filters can be set up in any number of ways, utilising black lists, white lists, and custom-made rule sets. Some applications come configured with basic rules; others come as a blank slate. Some also employ quite advanced learning techniques (touted by some vendors as heuristics or Bayesian analysis).

Not so simple
When we sat down to work out this test and what results we could reliably achieve, several issues presented themselves. Spam and spammers are dynamic and constantly evolving, always looking to develop different techniques to get past the filters and deliver their message of "lose weight by eating more" and "XXX wholesale". Therefore the tests could only be a snapshot of the particular period in which they were performed (naturally with some legacy "classic" spam messages thrown in for good measure). With this in mind, we collected a static data set over a couple of weeks to ensure that we had the "latest" in the spammers' arsenal.

In addition to running tests on this set of static data, we also needed to run the software on live e-mail to ensure the products achieved similar results, given that static test data may not be filtered in exactly the same way as mail in a live environment.

In order to do this, each vendor needed its own test rig so that the live tests could be run simultaneously. We therefore needed to set up a domain name, subdomain records in the name servers, live public IP addresses and so on before the testing could commence.

The human factor
The last part of the testing, and often the most difficult (certainly when you consider the rules-based nature of these applications), is the human factor. This is why we invited the individual vendors to send their own engineers to the Labs to install and configure the applications on the servers.

Sure, from a basic installation and administration point of view the Labs staff could have installed and configured the rule sets for all these applications as they did in the previous review. However, this is a far cry from being an "expert" in each application.

It is one thing to do a usability test to ensure that a person with a reasonable level of technical competency can install and configure an application to get it running. That's nothing like the skill of an engineer who works for the company that creates and maintains the application, and who knows the many little nuances and tweaks that need to be applied to achieve the best possible results. Remember, these are not basic antivirus applications that you can just install and then feed the latest definition file. The rules on many of these filtering systems are highly complex and evolved.

This is not particularly different from how it works in the real world, anyway. Because the anti-spam market is very competitive, vendors invest a great deal in keeping their products working efficiently. For instance, some vendors run training courses for your staff on the best ways to configure their product. And for your average medium-to-large installation, it's not at all out of the ordinary to have a technician come in to help you install and configure the product.

What we looked for
We designed this review around two tests. The first was a static, or controlled, test using content we had gathered over a period of time, which included:

  • defined unwanted e-mail (spam),
  • unsolicited circulars/newsletters (news spam),
  • legitimate e-mails (ham), and
  • solicited circulars/newsletters (news ham).

This ran to more than 1800 items of mail, which we sent to each vendor's application. The static test was run at least twice to ensure accuracy.
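The four categories above imply a simple rule for judging each filtering decision: spam and unsolicited newsletters should be stopped, while legitimate e-mail and solicited newsletters should be delivered. The following is a minimal sketch of that rule, not the Labs' actual tooling; the category keys and the function name `classify_outcome` are hypothetical names chosen for illustration.

```python
# Hypothetical category keys for the four kinds of mail in the static set.
# True means a filter is expected to block that kind of message.
SHOULD_BLOCK = {
    "spam": True,        # defined unwanted e-mail
    "news_spam": True,   # unsolicited circulars/newsletters
    "ham": False,        # legitimate e-mail
    "news_ham": False,   # solicited circulars/newsletters
}


def classify_outcome(category, was_blocked):
    """Label one filtering decision as correct, a false positive, or a false negative."""
    should_block = SHOULD_BLOCK[category]
    if was_blocked == should_block:
        return "correct"
    # Blocking wanted mail is a false positive; passing unwanted mail is a false negative.
    return "false_positive" if was_blocked else "false_negative"
```

For example, `classify_outcome("ham", True)` labels a blocked legitimate e-mail as a false positive, and `classify_outcome("spam", False)` labels spam that got through as a false negative.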

The second test was a "live" test that combined several real-world mailboxes into one and then fed a copy of that mail to each of the anti-spam filtering servers the vendors had configured. This test ran for over two weeks. We then took several days' worth of the collected mail, manually went through each e-mail that had arrived, and sorted it according to its status.

This live testing period was useful to confirm that the static testing was doing its job correctly in a controlled environment. Naturally, if any large differences occurred, then that application and the testing methodology would need to come under closer scrutiny to find out where and why the differences had occurred. Each test would essentially act as a validation of the other, but as it turned out there were no discrepancies.
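As a rough illustration of how one test can validate the other, the sketch below compares the proportion of correctly handled messages in a static run and a live run and reports whether they agree. The review does not state a numeric threshold for a "large difference", so the `tolerance` default of two percentage points is purely an assumption, as are the function names and the counts in the usage note.

```python
def correct_rate(correct, total):
    """Fraction of messages handled correctly in one test run."""
    if total <= 0:
        raise ValueError("total must be a positive message count")
    return correct / total


def runs_agree(static_run, live_run, tolerance=0.02):
    """Return True if the two runs' correct-handling rates differ by at most `tolerance`.

    Each run is a (correct, total) pair. The 0.02 default is an assumed
    threshold for illustration, not a figure from the review.
    """
    static_rate = correct_rate(*static_run)
    live_rate = correct_rate(*live_run)
    return abs(static_rate - live_rate) <= tolerance
```

With invented counts, `runs_agree((1750, 1800), (580, 600))` returns `True`, because the two rates are within the tolerance, whereas `runs_agree((1750, 1800), (500, 600))` returns `False` and would call for a closer look at the product and the methodology.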

Scoring
Once we had run the static tests, we scored each product and totalled the points as follows:

    +1 point for each spam message, legitimate e-mail, and solicited newsletter filtered correctly

    -2 points for every unwanted spam message allowed through (false negatives)

    -3 points for every unsolicited newsletter allowed through (false negatives) and

    -5 points for every legitimate e-mail blocked incorrectly (false positives).

The rationale behind this scoring is simple: spam allowed through is an annoyance, but legitimate e-mail blocked can have very serious repercussions. Ironically, it is the false negatives that are more likely to get administrators in trouble (especially if the boss receives a pornographic spam or the like) rather than the false positives, which can be a much more serious matter. But then, how are people supposed to know that an e-mail was blocked if it never arrived? While newsletters may be important, we acknowledge that they are more difficult to filter correctly and therefore have fewer points deducted for improper handling.
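To make the arithmetic concrete, here is a minimal sketch of the scoring scheme using the point values listed above. The outcome labels and the function name `total_score` are hypothetical, and the scheme as published assigns no value to outcomes outside the list (for example, a solicited newsletter that is blocked), so the sketch rejects them rather than guessing.

```python
from collections import Counter

# Point values from the scoring list; the outcome labels are hypothetical names.
POINTS = {
    "correct": 1,            # spam, legitimate e-mail, or solicited newsletter filtered correctly
    "spam_passed": -2,       # false negative: spam allowed through
    "news_spam_passed": -3,  # false negative: unsolicited newsletter allowed through
    "ham_blocked": -5,       # false positive: legitimate e-mail blocked
}


def total_score(outcomes):
    """Sum the points for an iterable of outcome labels, one per message."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(POINTS)
    if unknown:
        # The published scheme gives no value for these, so refuse to score them.
        raise ValueError(f"no point value defined for: {sorted(unknown)}")
    return sum(POINTS[label] * n for label, n in counts.items())
```

With invented figures, a product that handled 1700 messages correctly, let 40 spam messages and 10 unsolicited newsletters through, and blocked 5 legitimate e-mails would score 1700 - 80 - 30 - 25 = 1565. The five false positives cost almost as much as the ten missed newsletters, which reflects the weighting the rationale describes.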

Live testing
As intended, the live testing did indeed confirm that the static/controlled test results were correct. The live test results were basically identical, given the volume of messages sent via both methods.

By its very nature, live testing also introduces several variables that are potentially beyond our control, especially the "human" factor in counting and classifying the messages. Naturally, the live testing could only be run once.

Interestingly, the vendors who noted that their applications apply "learning" principles to their filtering did indeed sometimes record different results during the course of the static testing when the same data sets were sent through. However, since the captured test data was limited to fewer than 2000 messages, the variation was not sufficient to show any great differences in the test results here. Still, this is a good sign that over the course of several months and thousands of messages, these packages may well get better at learning your e-mail patterns and filter more accurately.

That said, these applications did not always produce better results when the "smarts" were activated. In a couple of cases the results went the other way, but only by one or two messages, and we're confident that with a combination of learning and tweaking you could improve the accuracy of filtering.
