Friday, July 13, 2012

The 2012 Web Application Scanner Benchmark


Top 10:
The Web Application Vulnerability Scanners Benchmark, 2012
Commercial & Open Source Scanners
An Accuracy, Coverage, Versatility, Adaptability, Feature and Price Comparison of 60 Commercial & Open Source Black Box Web Application Vulnerability Scanners

By Shay Chen
Information Security Consultant, Researcher and Instructor
sectooladdict-$at$-gmail-$dot$-com
July 2012
Assessment Environments: WAVSEP 1.2, ZAP-WAVE (WAVSEP integration), WIVET v3-rev148

Table of Contents
1. Introduction
2. List of Tested Web Application Scanners
3. Benchmark Overview & Assessment Criteria
4. A Glimpse at the Results of the Benchmark
5. Test I - Scanner Versatility - Input Vector Support
6. Test II – Attack Vector Support – Counting Audit Features
7. Introduction to the Various Accuracy Assessments
8. Test III – The Detection Accuracy of Reflected XSS
9. Test IV – The Detection Accuracy of SQL Injection
10. Test V – The Detection Accuracy of Path Traversal/LFI
11. Test VI – The Detection Accuracy of RFI (XSS via RFI)
12. Test VII - WIVET - Coverage via Automated Crawling
13. Test VIII – Scanner Adaptability - Crawling & Scan Barriers
14. Test IX – Authentication and Usability Feature Comparison
15. Test X – The Crown Jewel - Results & Features vs. Pricing
16. Additional Comparisons, Built-in Products and Licenses
17. What Changed?
18. Initial Conclusions – Open Source vs. Commercial
19. Verifying The Benchmark Results
20. So What Now?
21. Recommended Reading List: Scanner Benchmarks
22. Thank-You Note
23. FAQ - Why Didn't You Test NTO, Cenzic and N-Stalker?
24. Appendix A – List of Tools Not Included In the Test

1. Introduction
Detailed Result Presentation at http://www.sectoolmarket.com
Tools, Features, Results, Statistics and Price Comparison
A Step by Step Guide for Choosing the Right Web Application Vulnerability Scanner for *You*
A Perfectionist Guide for Optimal Use of Web Application Vulnerability Scanners
[Placeholder]

Getting the information was the easy part. All I had to do was invest a couple of years in gathering the list of tools, and a couple more in documenting their various features. It's really a daily routine - you read a couple of posts in news groups in the morning, and a couple of blogs in the evening. Once you get used to it, it's fun, and even quite addictive.

Then came the "best" fantasy, and with it, the inclination to test the proclaimed features of all the web application vulnerability scanners against each other, only to find out that things are not that simple, and that finding the "best", if there is such a tool, was not an easy task.
Inevitably, I tried searching for alternative assessment models, methods of measurement that would handle the imperfections of the previous assessments.

I tried to change the perspective, add tests (hundreds of them - 940+, to be exact), examine different aspects, and even make parts of the test process obscure, and now, I'm finally ready for another shot.

In spite of everything I had invested in past research, the focus I had on features and accuracy, and the policy I used when interacting with the various vendors, made it difficult, especially for me, to gain insights from the massive amount of data - insights that would enable me to choose, and more importantly, properly use the various tools in real-life scenarios.

Is the most accurate scanner necessarily the best choice for a point-and-shoot scenario? And what good will it do if it can't scan an application due to a specific scan barrier it can't handle, or because it does not support the input delivery method?

I needed to gather other pieces of the puzzle, and even more importantly, I needed a method, or more accurately, a methodology.

I'm sorry to disappoint you, dear reader, so early in the article, but I still don't have a perfect answer or one recommendation... But I sure am much closer than I ever was, and although I might not have the answer, I have many answers, and a very comprehensive, logical and clear methodology for making use of all the information I'm about to present.

In the previous benchmarks, I focused on assessing 3 major aspects of web application scanners, which revolved mostly around features & accuracy, and even though the information was very interesting, it wasn't necessarily useful, at least not in all scenarios.

So I decided to take it to the edge, but since I had already reached the number of 60 scanners, it was hard to make an impression with a couple of extra tools, so instead, I focused my efforts on additional aspects.

This time, I compared 10 different aspects of the tools (or 14, if you count the non-competitive charts), and chose the collection with the aim of providing practical tools for making a decision, and getting a glimpse of the bigger picture.

Let me assure you - this time, the information is presented in a manner that is very helpful, is easy to navigate, and is supported by presentation platforms, articles and step by step methodologies.

Furthermore, I wrapped it all in a summary that includes the major results and features in relation to the price, for those of us who prefer the overview and avoid the drill-down - information and insights that, I believe, will help testers invest their time in better-suited tools, and consumers properly invest their money, in the long term or the short term (but not necessarily both*).

As mentioned earlier, this research covers various aspects of the latest versions of 11 commercial web application scanners, and the latest versions of most of the 49 free & open source web application scanners. It also covers some scanners that were not covered in previous benchmarks, and includes, among others, the following components and tests:

A Price Comparison - in Relation to the Rest of the Benchmark Results
Scanner Versatility - A Measure for the Scanner's  Support of Protocols & Input Delivery Vectors
Attack Vector Support - The Amount & Type of Active Scan Plugins (Vulnerability Detection)
Reflected Cross Site Scripting Detection Accuracy
SQL Injection Detection Accuracy
Path Traversal / Local File Inclusion Detection Accuracy
Remote File Inclusion Detection Accuracy (XSS/Phishing via RFI)
WIVET Score Comparison - Automated Crawling / Input Vector Extraction
Scanner Adaptability - Complementary Coverage Features and Scan Barrier Support
Authentication Features Comparison
Complementary Scan Features and Embedded Products
General Scanning Features and Overall Impression
License Comparison and General Information

And just before we delve into the details, one last tip: don't focus solely on the charts - if you want to really understand what they reflect, dig in.
Lists and charts first, detailed description later.

2. List of Tested Web Application Scanners

The following commercial scanners were included in the benchmark:
The following new free & open source scanners were included in the benchmark:
IronWASP v0.9.1.0

The updated versions of the following free & open source scanners were re-tested in the benchmark:
Zed Attack Proxy (ZAP) v1.4.0.1, sqlmap v1.0-Jul-5-2012 (Github), W3AF 1.2-rev509 (SVN), Acunetix Free Edition v8.0-20120509, Safe3WVS v10.1 FE (Safe3 Network Center), WebSecurify v0.9 (free edition - the new commercial version was not tested), Syhunt Mini (Sandcat Mini) v4.4.3.0, arachni v0.4.0.3, Skipfish 2.07b, N-Stalker 2012 Free Edition v7.1.1.121 (N-Stalker), Watobo v0.9.8-rev724 (a few new WATOBO 0.9.9 pre versions were released a few days before the publication of the benchmark, but I didn't manage to test them in time)

Different aspects of the following free & open source scanners were tested in the benchmark:
VEGA 1.0 beta (Subgraph), Netsparker Community Edition v1.7.2.13, Andiparos v1.0.6, ProxyStrike v2.2, Wapiti v2.2.1, Paros Proxy v3.2.13, Grendel Scan v1.0

The results were compared to those of unmaintained scanners tested in previous benchmarks:
PowerFuzzer v1.0, Oedipus v1.8.1 (v1.8.3 is around somewhere), Scrawler v1.0, WebCruiser v2.4.2 FE (corrections), Sandcat Free Edition v4.0.0.1, JSKY Free Edition v1.0.0, N-Stalker 2009 Free Edition v7.0.0.223, UWSS (Uber Web Security Scanner) v0.0.2, Grabber v0.1, WebScarab v20100820, Mini MySqlat0r v0.5, WSTool v0.14001, crawlfish v0.92, Gamja v1.6, iScan v0.1, LoverBoy v1.0, DSSS (Damn Simple SQLi Scanner) v0.1h, openAcunetix v0.1, ScreamingCSS v1.02, Secubat v0.5, SQID (SQL Injection Digger) v0.3, SQLiX v1.0, VulnDetector v0.0.2, Web Injection Scanner  (WIS) v0.4, Xcobra v0.2, XSSploit v0.5, XSSS v0.40, Priamos v1.0, XSSer v1.5-1 (version 1.6 was released but I didn't manage to test it), aidSQL 02062011 (a newer revision exists in the SVN but was not officially released)
For a full list of commercial & open source tools that were not tested in this benchmark, refer to the appendix.

3. Benchmark Overview & Assessment Criteria
The benchmark focused on testing commercial & open source tools that are able to detect (and not necessarily exploit) security vulnerabilities on a wide range of URLs, and thus, each tool tested was required to support the following features:
·         The ability to detect Reflected XSS and/or SQL Injection and/or Path Traversal/Local File Inclusion/Remote File Inclusion vulnerabilities.
·         The ability to scan multiple URLs at once (using either a crawler/spider feature, URL/Log file parsing feature or a built-in proxy).
·         The ability to control and limit the scan to an internal or external host (domain/IP).

The testing procedure of all the tools included the following phases:
Feature Documentation
The features of each scanner were documented and compared, according to documentation, configuration, plugins and information received from the vendor. The features were then divided into groups, which were used to compose various hierarchal charts.
Accuracy Assessment
The scanners were all tested against the latest version of WAVSEP (v1.2, integrating ZAP-WAVE), a benchmarking platform designed to assess the detection accuracy of web application scanners, which was released with the publication of this benchmark. The purpose of WAVSEP’s test cases is to provide a scale for understanding which detection barriers each scanning tool can bypass, and which common vulnerability variations can be detected by each tool.
·         The various scanners were tested against the following test cases (GET and POST attack vectors):
o   816 test cases that were vulnerable to Path Traversal attacks.
o   108 test cases that were vulnerable to Remote File Inclusion (XSS via RFI) attacks.
o   66 test cases that were vulnerable to Reflected Cross Site Scripting attacks.
o   80 test cases that contained Error Disclosing SQL Injection exposures.
o   46 test cases that contained Blind SQL Injection exposures.
o   10 test cases that were vulnerable to Time Based SQL Injection attacks.
o   7 different categories of false positive RXSS vulnerabilities.
o   10 different categories of false positive SQLi vulnerabilities.
o   8 different categories of false positive Path Traversal / LFI vulnerabilities.
o   6 different categories of false positive Remote File Inclusion vulnerabilities.
·        The benchmark included 8 experimental RXSS test cases and 2 experimental SQL Injection test cases, and although the scan results of these test cases were documented in the various scans, their results were not included in the final score, at least for now.
·         In order to ensure the result consistency, the directory of each exposure sub category was individually scanned multiple times using various configurations, usually using a single thread and using a scan policy that only included the relevant plugins.
In order to ensure that the detection features of each scanner were truly effective, most of the scanners were tested against an additional benchmarking application that was prone to the same vulnerable test cases as the WAVSEP platform, but had a different design, slightly different behavior and different entry point format, in order to verify that no signatures were used, and that any improvement was due to the enhancement of the scanner's attack tree.



Attack Surface Coverage Assessment
In order to assess the scanners' attack surface coverage, the assessment included tests that measure the efficiency of the scanner's automated crawling mechanism (input vector extraction), and feature comparisons meant to assess its support for various technologies and its ability to handle different scan barriers.
This section of the benchmark also included the WIVET test (Web Input Vector Extractor Teaser), in which scanners were executed against a dedicated application that can assess their crawling mechanism in the aspect of input vector extraction. The specific details of this assessment are provided in the relevant section.
Public tests vs. Obscure tests
In order to make the test as fair as possible, while still enabling the various vendors to show improvement, the benchmark was divided into tests that were publically announced, and tests that were obscure to all vendors:
·         Publicly announced tests: the active scan feature comparison, and the detection accuracy assessment of SQL Injection and Reflected Cross Site Scripting, composed of test cases which were published as part of WAVSEP v1.1.1.
·         Tests that were obscure to all vendors until the moment of the publication: the various new groups of feature comparisons, the WIVET assessment, and the detection accuracy assessment of the Path Traversal / LFI and Remote File Inclusion (XSS via RFI), implemented as 940+ test cases in WAVSEP 1.2 (a new version that was only published alongside this benchmark).

The results of the main test categories are presented within three graphs (commercial graph, free & open source graph, unified graph), and the detailed information of each test is presented in a dedicated section in benchmark presentation platform at http://www.sectoolmarket.com.

Now that we're finally done with the formalities, let's get to the interesting part... the results.

4. A Glimpse at the Results of the Benchmark
The presentation of results in this benchmark, alongside the dedicated website (http://www.sectoolmarket.com/) and a series of supporting articles and methodologies ([placeholder]), are all designed to help the reader make a decision - to choose the proper product/s or tool/s for the task at hand, within the limits of time or budget.

For those of us that can't wait, and want to get a glimpse at the summary of the unified results, there is a dedicated page available at the following links:

Price & Feature Comparison of Commercial Scanners
http://sectoolmarket.com/price-and-feature-comparison-of-web-application-scanners-commercial-list.html
Price & Feature Comparison of a Unified List of Commercial, Free and Open Source Products


Some of the sections might not be clear to some readers at this point, which is why I advise you to read the rest of the article prior to analyzing this summary.

5. Test I - Scanner Versatility - Input Vector Support
The first assessment criterion was the number of input vectors each tool can scan (and not just parse).

Modern web applications use a variety of sub-protocols and methods for delivering complex inputs from the browser to the server. These methods include standard input delivery methods such as HTTP querystring parameters and HTTP body parameters, modern delivery methods such as JSON and XML, and even binary delivery methods for technology specific objects such as AMF, Java serialized objects and WCF.
Since the vast majority of active scan plugins rely on input that is meant to be injected into client originating parameters, supporting the parameter (or rather, the input) delivery method of the tested application is a necessity.
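To make the notion of an "input delivery method" concrete, here is a minimal sketch (in Python, using the requests library) that delivers the same payload through four different input vectors; the target URL and the "id" parameter are hypothetical placeholders, not taken from WAVSEP or from any tested scanner:

# A minimal sketch (not taken from WAVSEP or any tested scanner) that sends
# the same payload through four different input delivery vectors. The target
# URL and the "id" parameter are hypothetical placeholders.
import requests

target = "http://testserver.example/app/item"
payload = "test_value"

# 1. Classic HTTP querystring parameter (GET)
requests.get(target, params={"id": payload})

# 2. Classic HTTP body parameter (POST, application/x-www-form-urlencoded)
requests.post(target, data={"id": payload})

# 3. JSON body parameter - the scanner must parse the JSON structure in order
#    to locate the "id" field and inject into its value
requests.post(target, json={"id": payload})

# 4. XML body parameter - same idea, a different parser is required
xml_body = "<request><id>{0}</id></request>".format(payload)
requests.post(target, data=xml_body, headers={"Content-Type": "text/xml"})

A scanner that only understands the first two vectors simply has nothing to inject into when the application expects the third or fourth.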

Although the charts in this section don't necessarily represent the most important score, they do represent the most important prerequisite the scanner must comply with when scanning a specific technology.

Reasoning: An automated tool can't detect a vulnerability in a given parameter, if it can't scan the protocol or mimic the application's method of delivering the input. The more vectors of input delivery that the scanner supports, the more versatile it is in scanning different technologies and applications (assuming it can handle the relevant scan barriers, supports necessary features such as authentication, or alternatively, contains features that can be used to work around the specific limitations).

The detailed comparison of the scanners' support for various input delivery methods is documented in the following section of sectoolmarket (recommended - too many scanners in the chart):

The following chart shows how versatile each scanner is in scanning different input delivery vectors (and although not entirely accurate - different technologies):

The Number of Input Vectors Supported – Commercial Tools




The Number of Input Vectors Supported – Free & Open Source Tools


The Number of Input Vectors Supported – Unified List



6. Test II – Attack Vector Support – Counting Audit Features
The second assessment criterion was the number of audit features each tool supports.

Reasoning: An automated tool can't detect an exposure that it can't recognize (at least not directly, and not without manual analysis), and therefore, the number of audit features will affect the amount of exposures that the tool will be able to detect (assuming the audit features are implemented properly, that vulnerable entry points will be detected, that the tool will be able to handle the relevant scan barriers and scanning perquisites,  and that the tool will manage to scan the vulnerable input vectors).

For the purpose of the benchmark, an audit feature was defined as a common generic application-level scanning feature, supporting the detection of exposures which could be used to attack the tested web application, gain access to sensitive assets or attack legitimate clients.

The definition of the assessment criterion rules out product specific exposures and infrastructure related vulnerabilities, while unique and extremely rare features were documented and presented in a different section of this research, and were not taken into account when calculating the results. Exposures that were specific to Flash/Applet/Silverlight and Web Services Assessment (with the exception of XXE) were treated in the same manner.

The detailed comparison of the scanners' support for various audit features is documented in the following section of sectoolmarket:

The Number of Audit Features in Web Application Scanners – Commercial Tools




The Number of Audit Features in Web Application Scanners – Free & Open Source Tools


The Number of Audit Features in Web Application Scanners – Unified List



So once again, now that we're done with the quantity, let's get to the quality…

7. Introduction to the Various Accuracy Assessments
The following sections present the results of the detection accuracy assessments performed for Reflected XSS, SQL Injection, Path Traversal and Remote File Inclusion (RXSS via RFI) - four of the most commonly supported features in web application scanners. Although the detection accuracy of a specific exposure might not reflect the overall condition of the scanner on its own, it is a crucial indicator of how good a scanner is at detecting specific vulnerability instances.
The various assessments were performed against the various test cases of WAVSEP v1.2, which emulate different common test case scenarios for generic technologies.
Reasoning: a scanner that is not accurate enough will miss many exposures, and might classify non-vulnerable entry points as vulnerable. These tests aim to assess how good each tool is at detecting the vulnerabilities it claims to support, in a supported input vector, which is located in a known entry point, without any restrictions that can prevent the tool from operating properly.
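As a trivial illustration of how the accuracy figures in the following sections can be read, here is a small sketch that turns raw counts into the kind of per-scanner score reported; the sample counts are hypothetical, chosen only to match the WAVSEP totals listed earlier (66 reflected XSS test cases, 7 false-positive RXSS categories):

# A small sketch showing how the accuracy figures can be read: detection rate
# over the vulnerable test cases, reported next to the number of
# false-positive categories flagged. The sample counts are hypothetical.
def accuracy_summary(detected, total_vulnerable, false_positive_categories):
    detection_rate = 100.0 * detected / total_vulnerable
    return "{0:.2f}% detection, {1} false-positive categories".format(
        detection_rate, false_positive_categories)

# Example: a hypothetical scanner detecting 60 of the 66 RXSS test cases
# while flagging 2 of the 7 false-positive categories.
print(accuracy_summary(60, 66, 2))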

8. Test III – The Detection Accuracy of Reflected XSS
The third assessment criterion was the detection accuracy of Reflected Cross Site Scripting, a common exposure which is the 2nd most commonly implemented feature in web application scanners, and the one in which I noticed the greatest improvement across the various tested tools.
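For readers unfamiliar with how such plugins work, here is a deliberately naive sketch of a reflected XSS probe; real scanners use many payload variations and context-aware response analysis, and the URL and parameter name below are hypothetical:

# A deliberately naive sketch of a reflected XSS probe: inject a marked
# script payload and check whether it is reflected back unencoded. Real
# scanners use many payload variations and context-aware response analysis;
# the URL and parameter name are hypothetical.
import requests

def naive_rxss_probe(url, param):
    marker = "rxss_probe_12345"
    payload = "<script>alert('{0}')</script>".format(marker)
    response = requests.get(url, params={param: payload})
    # A verbatim (unencoded) reflection marks the entry point as a candidate
    return payload in response.text

print(naive_rxss_probe("http://testserver.example/page", "q"))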

The comparison of the scanners' reflected cross site scripting detection accuracy is documented in detail in the following section of sectoolmarket:


Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents the false positive categories detected by the tool (which may translate into more false positive instances than what the bar actually presents, when compared to the detection accuracy bar).

The Reflected XSS Detection Accuracy of Web Application Scanners – Commercial Tools



The Reflected XSS Detection Accuracy of Web Application Scanners – Open Source & Free Tools



The Reflected XSS Detection Accuracy of Web Application Scanners – Unified List



9. Test IV – The Detection Accuracy of SQL Injection
The fourth assessment criterion was the detection accuracy of SQL Injection, one of the most famous exposures and the most commonly implemented attack vector in web application scanners.

The evaluation was performed on an application that uses MySQL 5.5.x as its data repository, and thus, will reflect the detection accuracy of the tool when scanning an application that uses similar data repositories.
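As background for these results, here is a simplified sketch of two common SQL injection checks relevant to a MySQL-backed application - an error-based probe and a time-based blind probe; the URL, parameter and payloads are illustrative assumptions, not the payloads of any tested scanner:

# A simplified sketch of two common SQL injection checks against a
# MySQL-backed page: an error-based probe (look for a database error string)
# and a time-based blind probe (measure the response delay). The URL,
# parameter and payloads are illustrative assumptions only.
import time
import requests

def error_based_probe(url, param):
    response = requests.get(url, params={param: "1'"})
    return "You have an error in your SQL syntax" in response.text

def time_based_probe(url, param, delay=5):
    start = time.time()
    requests.get(url, params={param: "1' AND SLEEP({0})-- -".format(delay)})
    return time.time() - start >= delay

base = "http://testserver.example/product"
print(error_based_probe(base, "id"), time_based_probe(base, "id"))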

The comparison of the scanners' SQL injection detection accuracy is documented in detail in the following section of sectoolmarket:

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents the false positive categories detected by the tool (which may translate into more false positive instances than what the bar actually presents, when compared to the detection accuracy bar).


The SQL Injection Detection Accuracy of Web Application Scanners – Commercial Tools



The SQL Injection Detection Accuracy of Web Application Scanners – Open Source & Free Tools



The SQL Injection Detection Accuracy of Web Application Scanners – Unified List



Although there are many changes in the results since the last benchmark, both of these exposures (SQLi, RXSS) were previously assessed, so, I believe it's time to introduce something new... something none of the tested vendors could have prepared for in advance...

10. Test V – The Detection Accuracy of Path Traversal/LFI
The fifth assessment criterion was the detection accuracy of Path Traversal (a.k.a Directory Traversal), a newly implemented feature in WAVSEP v1.2, and the third most commonly implemented attack vector in web application scanners.

The reason it was tagged along with Local File Inclusion (LFI) is simple - many scanners don't make the differentiation between inclusion and traversal, and neither do a few online vulnerability documentation sources. In addition, the results obtained from the tests performed on the vast majority of tools lead to the same conclusion - many plugins listed under the name LFI detected the path traversal test cases.

While implementing the path traversal test cases and consuming nearly every relevant piece of documentation I could find on the subject, I decided to take the current path, in spite of some acute differences some of the documentation sources suggested (but I did implement an infrastructure in WAVSEP for "true" inclusion exposures).

The point is not to get into a discussion of whether or not path traversal, directory traversal and local file inclusion should be classified as the same vulnerability, but simply to explain why in spite of the differences some organizations / classification methods have for these exposures, they were listed under the same name (In sectoolmarket - path traversal detection accuracy is listed under the title LFI).

The evaluation was performed on a WAVSEP v1.2 instance that was hosted on Windows XP, and although there are specific test cases meant to emulate servers that are running with low-privileged OS user accounts (using the servlet context file access method), many of the test cases emulate web servers that are running with administrative user accounts.

[Note - in addition to the wavsep installation, to produce identical results to those of this benchmark, a file by the name of content.ini must be placed in the root installation directory of the Tomcat server - which is different from the root directory of the web server]

Although I didn't perform the path traversal scans on Linux for all the tools, I did perform the initial experiments on Linux, and even a couple of verifications on Linux for some of the scanners, and as weird as it sounds, I can clearly state that the results were significantly worse, and although I won't get the opportunity to discuss the subject in this benchmark, I might handle it in the next.

In order to assess the detection accuracy of different path traversal instances, I designed a total of 816 OS-adapting path traversal test cases (meaning - the test cases adapt themselves to the OS and the server they are executed on, in the aspects of file access delimiters and file access paths). I know it might seem like a lot, and I guess I did get carried away with the perfectionism, but you will be surprised to see that these tests really represent common vulnerability instances, and not necessarily super extreme scenarios, and that the results of the tests did prove the necessity.
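To give a feel for what "OS-adapting" variations look like, here is a small illustrative sketch of traversal payload families (traversal depth, OS-specific target file, path delimiter, and simple encoding tricks); it is not the actual WAVSEP attack set:

# An illustrative sketch of traversal payload families (traversal depth,
# OS-specific target file, path delimiter, and simple encoding tricks).
# This is not the actual WAVSEP attack set.
def traversal_payloads(os_family="windows", depth=6):
    target = "boot.ini" if os_family == "windows" else "etc/passwd"
    sep = "\\" if os_family == "windows" else "/"
    up = ".." + sep
    plain = up * depth + target
    return [
        plain,                                      # plain traversal
        plain.replace("\\", "/"),                   # forward-slash variant
        (up * depth).replace(".", "%2e") + target,  # URL-encoded dots
        plain + "%00.jpg",                          # null-byte suffix (legacy)
    ]

for p in traversal_payloads("windows"):
    print(p)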

The tests were designed to emulate various combinations of the following conditions and restrictions:



If you take a closer look at the detailed scan-specific results at www.sectoolmarket.com, you'll notice that some scanners were completely unaffected by the response content type and HTTP code variation, while other scanners were dramatically affected by the variety (gee, it's nice to know that I didn't write them all for nothing... :) ).

In reality, there were supposed to be more test cases, primarily because I intended to test injection entry points in which the input only affected the filename without the extension, or was injected directly into the directory name. However, due to the sheer amount of tests and the deadline I had for this benchmark, I decided to delete (literally) the test cases that handled these anomalies, and focus on test cases in which the entire filename/path was affected. That being said, I might publish these test cases in future versions of wavsep (they amount to a couple of hundred).

The comparison of the scanners' path traversal detection accuracy is documented in detail in the following section of sectoolmarket:

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents the false positive categories detected by the tool (which may translate into more false positive instances than what the bar actually presents, when compared to the detection accuracy bar).


The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Commercial Tools



The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Open Source & Free Tools



The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Unified List



And what of LFI's evil counterpart, Remote File Inclusion?
(yeah yeah, I know, it was path traversal...)

11. Test VI – The Detection Accuracy of RFI (XSS via RFI)
The sixth assessment criterion was the detection accuracy of Remote File Inclusion (or more accurately, vectors of RFI that can result in XSS or Phishing - and currently, not necessarily in server code execution), a newly implemented feature in WAVSEP v1.2, and one of the most commonly implemented attack vectors in web application scanners.
I didn't originally plan to assess the detection accuracy of RFI in this benchmark, however, since I implemented a new structure in wavsep that enables me to write a lot of test cases faster, I couldn't resist the urge to try it... and thus, found a new way to decrease the amount of sleep I get each night.
The interesting thing I found was that although RFI is supposed to work a bit differently than LFI/Path Traversal, many LFI/Path Traversal plugins effectively detected RFI exposures, and in some instances, the tests for both of these vulnerabilities were actually implemented in the same plugin (usually named "file inclusions"); thus, while scanning for Traversal/LFI/RFI, I usually activated all the relevant plugins in the scanner, and lo and behold - got results from the LFI/Path Traversal plugins that even the RFI dedicated plugins did not detect.
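For clarity, here is a rough sketch of what an "XSS via RFI" probe boils down to - supply the URL of a tester-controlled file as the parameter value and check whether its content ends up in the response; the hosts, parameter name and marker below are hypothetical:

# A rough sketch of what an "XSS via RFI" probe boils down to: supply the URL
# of a tester-controlled file as the parameter value, and check whether its
# content ends up in the response (i.e. the remote file was included). The
# hosts, parameter name and marker are hypothetical.
import requests

PROBE_URL = "http://tester.example/probe.txt"          # assumed to be under the tester's control
PROBE_MARKER = "<script>alert('rfi_probe')</script>"  # assumed content of probe.txt

def rfi_xss_probe(url, param):
    response = requests.get(url, params={param: PROBE_URL})
    return PROBE_MARKER in response.text

print(rfi_xss_probe("http://testserver.example/include", "page"))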
In order to assess the detection accuracy of different remote file inclusion exposures (again, RXSS/Phishing via RFI vectors), I designed a total of 108 remote file inclusion test cases.
The tests were designed to emulate various combinations of the following conditions and restrictions:



Just like in the case of path traversal, there were supposed to be more XSS via RFI test cases, primarily because I intended to test injection entry points in which the input only affected the filename without the extension, or was injected directly into the directory name. However, due to the sheer amount of tests and the deadline I had for this benchmark, I decided to delete (literally) the test cases that handled these anomalies, and focus on test cases in which the entire filename/path was affected. That being said, I might publish these test cases in future versions of wavsep (they amount to dozens).

[Note: Although the tested versions of Appscan and Nessus contain RFI detection plugins, they did not support the detection of XSS via RFI.]

The comparison of the scanners' remote file inclusion detection accuracy is documented in detail in the following section of sectoolmarket:

Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents the false positive categories detected by the tool (which may translate into more false positive instances than what the bar actually presents, when compared to the detection accuracy bar).


The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Commercial Tools



The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Open Source & Free Tools



The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Unified List


And after covering all those accuracy aspects, it's time to cover a totally different subject - coverage.

12. Test VII - WIVET - Coverage via Automated Crawling
The seventh assessment criterion was the scanner's WIVET score, which is related to coverage.

The concept of coverage can mean a lot of things, but in general, what I'm referring to is the ability of the scanner to increase the attack surface of the tested application - to locate additional resources and input delivery methods to attack.

Although a scanner can increase the attack surface in a number of ways, from detecting hidden files to exposing device-specific interfaces, this section of the benchmark focuses on automated crawling and an efficient input vector extraction.

This aspect of a scanner is extremely important in point-and-shoot scans, scans in which the user does not "train" the scanner to recognize the application structure, URLs and requests, either due to time/methodology restrictions, or when the user is not a security expert that knows how to properly use manual crawling with the scanner.

In order to evaluate these aspects in scanners, I used a wonderful OWASP Turkey project called WIVET (Web Input Vector Extractor Teaser); the WIVET project is a benchmarking project that was written by an application security specialist by the name of Bedirhan Urgun, and released under the GPL2 license.

The project is implemented as a web application which aims to "statistically analyze web link extractors", by measuring the amount of input vectors extracted by each scanner while crawling the WIVET website, in order to assess how well each scanner can increase the coverage of the attack surface.

Plainly speaking, the project simply measures how well a scanner is able to crawl the application, and how well it can locate input vectors, by presenting a collection of challenges that contain links, parameters and input delivery methods that the crawling process should locate and extract.

Although WIVET used to have an online instance, with my luck, by the time I decided to use it the online version was already gone... so I checked out the latest subversion revision from the project's google code website (v3-revision148), installed FastCGI on an IIS server (Windows XP), copied the application files to a directory called wivet under the C:\Inetpub\wwwroot\ directory, and started the IIS default website.

In order for WIVET to work, the scanner must crawl the application while consistently using the same session identifier in its crawling requests, while avoiding the 100.php logout page (which initializes the session, and thus the results). The results can then be viewed by accessing the application index page, while using the session identifier used during the scan.
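As a toy illustration of these two requirements (a single fixed session identifier, and avoiding the 100.php logout page), here is a bare-bones crawler sketch; the host and the PHP session cookie value are assumptions, and a real scanner's crawling engine obviously does far more (forms, frames, JavaScript):

# A bare-bones illustration of the two requirements described above: crawl
# with a single fixed session identifier, and never touch the 100.php logout
# page. This toy breadth-first crawler is not a replacement for a scanner's
# crawling engine; the host and the session cookie value are assumptions.
import re
import requests

BASE = "http://localhost/wivet/"
COOKIES = {"PHPSESSID": "fixed-session-id-used-for-the-whole-scan"}

def crawl(start="menu.php", limit=200):
    seen, queue = set(), [start]
    while queue and len(seen) < limit:
        page = queue.pop(0)
        if page in seen or page.endswith("100.php"):  # skip the logout page
            continue
        seen.add(page)
        html = requests.get(BASE + page, cookies=COOKIES).text
        # naive link extraction - relative links only, no form/JS handling
        queue.extend(link for link in re.findall(r'href="([^"#]+)"', html)
                     if not link.startswith("http"))
    return seen

print(len(crawl()))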

A very nice idea that makes the assessment process easy and effective; however, for me, things weren't that easy. Although some scanners did work properly with the platform, many scanners did not receive any score, even though I configured them exactly according to the recommendations (valid session identifier and logout URL exclusion), so after a careful examination, I discovered the source of my problem: some of the scanners don't send the predefined session identifier in their crawling requests (even though it's explicitly defined in the product), and others simply ignore URL exclusions (in certain conditions).

Since even without these bugs, not all the scanners supported URL exclusions (the 100.php logout page) and predefined cookies, I had to come up with a solution that would enable me to test all of them... so I changed the WIVET platform a little bit by deleting the link to the logout page (100.php) from the main menu page (menu.php), and forwarded the communication of the vast majority of scanners through a fiddler instance in which I defined a valid WIVET session identifier (using the filter features). In extreme scenarios in which an upstream proxy was not supported by the scanner, I defined the WIVET website as a proxy in an IE browser, loaded fiddler (so it would forward the communication to the system defined proxy - WIVET), defined burp as a transparent proxy that forwards the communication to fiddler (upstream proxy), and scanned burp instead of the WIVET application (the scanner scans burp, which forwards the communication to fiddler, which forwards the communication to the system defined proxy - the WIVET website).

These solutions seemed to be working for most vendors, that is until I discovered two more bugs that caused these solutions not to work for another small group of products...

The first bug was related to the emulation of modern browser behavior when interpreting the relative context of links in a frameset (browsers use the link's target frame as the path basis, but some scanners used the path basis of the link's origin page), and the other bug was related to another browser emulation issue - some scanners did not manage to submit forms without an action property (while a browser usually submits such a form to the same URL the form originated from).

I managed to solve the first bug by editing the menu page and manually adding additional links with an alternate context (added "pages/" to all URLs) to the same WIVET pages, while the second bug was reported to some vendors (and was handled by them).

Finally, some of the scanners had bugs that I did not manage to isolate in the given timeframe, and thus, I didn't manage to get any WIVET score for them (a list of these products is presented at the end of this section).
However, the vast majority of the scanners did get a score, which can be viewed in the following charts and links.

The comparison of the scanners' WIVET score is documented in detail in the following section of sectoolmarket:
http://sectoolmarket.com/wivet-score-unified-list.html

The WIVET Score of Web Application Scanners – Commercial Tools


The WIVET Score of Web Application Scanners – Free and Open Source Tools


The WIVET Score of Web Application Scanners – Unified List


It is important to clarify that due to these scanner bugs (and the current WIVET structure) - low scores and non-existing scores might differ once minor bugs are fixed, but the scores presented in this chart are currently all I can offer.

The following scanners didn't manage to get a WIVET score at all (even after all the adjustments and enhancements I tried); this does not mean that their score is necessarily low, or that there isn't any possible way to execute them in front of WIVET, simply that there isn't a simple method of doing so (at least not one that I discovered):
Syhunt Mini (Sandcat Mini), Webcruiser, IronWASP, Safe3WVS free edition, N-Stalker 2012 free edition, Vega, Skipfish.
In addition, I didn't try scanning WIVET with various unmaintained scanners, scanners that didn't have a spider feature (WATOBO in the assessed version, Ammonite, etc), or with the following assessed tools: Nessus, sqlmap.
It's crucial to note that scanners with burp-log parsing features (such as sqlmap and IronWASP) can effectively be assigned the WIVET score of burp, that scanners with internal proxy features (such as ZAP, Burpsuite, Vega, etc) can be used with the crawling mechanisms of other scanners (such as Acunetix FE), and that as a result of both of these conclusions, any scanner that supports any of those features can be assigned the WIVET score of any scanner in the possession of the tester (by using the crawling mechanism of a scanner through a proxy such as burp, in order to generate scan logs).

13. Test VIII – Scanner Adaptability - Crawling & Scan Barriers
By using the seemingly irrelevant term "adaptability" in relation to scanners, I'm actually referring to the scanner's ability to adapt and scan the application, despite different technologies, abnormal crawling requirements and varying scan barriers, such as Anti-CSRF tokens, CAPTCHA mechanisms, platform specific tokens (such as required viewstate values) or account lock mechanisms.

Although not necessarily a measurable quality, the ability of the scanner to handle different technologies and scan barriers is an important prerequisite, and in a sense, almost as important as being able to scan the input delivery method.

Reasoning: An automated tool can't detect a vulnerability in a point and shoot scenario if it can't locate & scan the vulnerable location due to the lack of support for a certain browser add-on, the lack of support for extracting data from certain non-standard vectors, or the lack of support for overcoming a specific barrier, such as a required token or challenge. The more barriers the scanner is able to handle, the more useful it is when scanning complex applications that employ the use of various technologies and scan barriers (assuming it can handle the relevant input vectors, supports the necessary features such as authentication, or has a feature that can be used to work around the specific limitations).

The following charts show how many types of barriers each scanner claims to be able to handle (these features were not verified, and the information currently relies on documentation or vendor supplied information):

The Adaptability Score of Web Application Scanners – Commercial Tools


The Adaptability Score of Web Application Scanners – Free and Open Source Tools


The Adaptability Score of Web Application Scanners – Unified List


The detailed comparison of the scanners' support for various barriers is documented in the following section of sectoolmarket:



14. Test IX – Authentication and Usability Feature Comparison
Although supporting the authentication required by the application seems like a crucial quality, in reality, certain scanner chaining features can make up for the lack of support for certain authentication methods, by employing the use of a 3rd party proxy to authenticate on the scanner's behalf.

For example, if we wanted to use a scanner that does not support NTLM authentication (but does support an upstream proxy), we could define the relevant credentials in burpsuite FE, and define it as an upstream proxy for the tested scanner.

However, chaining the scanner to an external tool that supports the authentication still has some disadvantages, such as potential stability issues, thread limitation and inconvenience.
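To illustrate the two options, here is a small Python sketch that uses plain HTTP client code as a stand-in for a scanner: authenticate directly when NTLM is supported, or route all traffic through an upstream proxy that holds the credentials when it is not; the hostnames, ports, credentials and the third-party requests_ntlm package are assumptions for the example:

# A small sketch of both options, using plain HTTP client code as a stand-in
# for a scanner: authenticate directly when the tool supports NTLM, or route
# all traffic through an upstream proxy that holds the credentials. The
# hostnames, ports, credentials and the requests_ntlm package are assumptions.
import requests
from requests_ntlm import HttpNtlmAuth

target = "http://intranet.example/app/"

# Option 1: the tool itself speaks NTLM
requests.get(target, auth=HttpNtlmAuth("DOMAIN\\user", "password"))

# Option 2: the tool only knows how to send traffic through an upstream
# proxy; the proxy performs the NTLM handshake on the tool's behalf
proxies = {"http": "http://127.0.0.1:8080", "https": "http://127.0.0.1:8080"}
requests.get(target, proxies=proxies)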

The following comparison table shows which authentication methods and features are supported by the various assessed scanners:

15. Test X – The Crown Jewel - Results & Features vs. Pricing
Finally, after reading through all the sections and charts, and analyzing the different aspects by which each scanner was measured, it's time to expose the price (at least for those of you that did manage to resist the temptation to access this link at the beginning).

The important thing to notice, specifically in relation to commercial scanner pricing, is that each product might be a bundle of several semi-independent products that cover different aspects of the assessment process, which are not necessarily related to web application security. These products currently include web service scanners, flash application scanners and CGI scanners (SAST and IAST features were not included on purpose).

In short, the scanner price might reflect (or not) a set of products that might have been priced separately as an independent product.

Another issue to pay attention to is the type of license acquired. In general, I did not cover non-commercial prices in this comparison, and in addition, did not include any vendor specific bundles, sales, discounts and sales pitches. I presented the base prices listed on the vendor's website or provided to me by the vendor, according to a total of 6 predefined categories, which are, in fact, combinations of the following concepts:
Consultant Licenses: although there isn't a commonly accepted term, I defined "Consultant" licenses as licenses that fit the common requirements of a consulting firm - scanning an unrestricted amount of IP addresses, without any boundaries or limitations.

Limited Enterprise Licenses: Any license that allowed scanning an unlimited but restricted set of addresses (for example - internal network addresses or organization-specific assets) was defined as an enterprise license, which might not be suited for a consultant, but will usually suffice for an organization interested in assessing its own applications.
Website/Year - a license to install the software on a single station and use it for a single year against a single IP address (the exception to this rule is Netsparker, in which the per website price reflects 3 Websites).
Seat/Year - a license to install the software on a single station and use it for a single year.
Perpetual Licenses - pay once, and it's yours (might still be limited by seat, website, enterprise or consultant restrictions). The vendor's website usually includes additional prices for optional support and product updates.

The various prices can be viewed in the dedicated comparison in sectoolmarket, available in the following address:

It is important to remember that these prices might change, vary or be affected by numerous variables, from special discounts and sales to a strategic, conscious decision of a vendor to invest in you as a customer or a beta testing site.

16. Additional Comparisons, Built-in Products and Licenses
While in the past I used to present additional information in external PDF files, with the new presentation platform I am now able to present the information in a medium that is much easier to use and analyze. Although anyone can access the root URL of sectoolmarket and search the various sections on their own, I decided to provide a short summary of additional lists and features that were not covered in a dedicated section of this benchmark, but were still documented and published in sectoolmarket.

List of Tools
The list of tools tested in this benchmark, and in the previous benchmarks, can be accessed through the following link:
Additional Features
Complementary scan features that were not evaluated or included in the benchmark:
·         Complementary Scan Features
·         General Scanner Features

In order to clarify what each column in the report table means, use the following glossary table:
Configuration & Usage Scale
·         Very Simple - GUI + Wizard
·         Simple - GUI with simple options, Command line with scan configuration file or simple options
·         Complex - GUI with numerous options, Command line with multiple options
·         Very Complex - Manual scanning feature dependencies, multiple configuration requirements
Stability Scale
·         Very Stable - Rarely crashes, Never gets stuck
·         Stable - Rarely crashes, Gets stuck only in extreme scenarios
·         Unstable - Crashes every once in a while, Freezes on a consistent basis
·         Fragile – Freezes or Crashes on a consistent basis, Fails performing the operation in many cases
Performance Scale
·         Very Fast - Fast implementation with limited amount of scanning tasks
·         Fast - Fast implementation with plenty of scanning tasks
·         Slow - Slow implementation with limited amount of scanning tasks
·         Very Slow - Slow implementation with plenty of scanning tasks

Scan Logs
In order to access the scan logs and detailed scan results of each scanner, simply access the scan-specific information for that scanner, by clicking on the scanner version in the various comparison charts:

17. What Changed?
Since the latest benchmark, many open source & commercial tools added new features and improved their detection accuracy.

The following list presents a summary of changes in the detection accuracy of commercial tools that were tested in the previous benchmark (+new):
·         IBM AppScan - no significant changes, new results for Path Traversal and WIVET.
·         WebInspect - a dramatic improvement in the detection accuracy of SQLi and XSS (fantastic result!), new results for Path Traversal, RFI (fantastic result!), and WIVET (fantastic result!)
·         Netsparker - no significant changes, new results for Path Traversal and WIVET.
·         Acunetix WVS - a dramatic improvement in the detection accuracy of SQLi (fantastic result!) and XSS (fantastic result!), and new results for Path Traversal, RFI and WIVET.
·         Syhunt Dynamic - a dramatic improvement in the detection accuracy of XSS (fantastic result!) and SQLi, and new results for Path Traversal, RFI and WIVET.
·         Burp Suite - a dramatic improvement in the detection accuracy of XSS and SQLi (fantastic result!), and new results for Path Traversal and WIVET.
·         ParosPro - New results for Path Traversal and WIVET.
·         JSky - New results for RFI, Path Traversal and WIVET.
·         WebCruiser - No significant changes.
·         Nessus - a dramatic improvement in the detection accuracy of Reflected XSS, potential bug in the LFI/RFI detection features.
·         Ammonite - New results for RXSS, SQLi, RFI and Path Traversal (fantastic result!)
The following list presents a summary of changes in the detection accuracy of free and open source tools that were tested in the previous benchmark (+new):
·         Zed Attack Proxy (ZAP) – a dramatic improvement in the detection accuracy of Reflected XSS exposures (fantastic result!), in addition to new results for Path Traversal and WIVET.
·         IronWASP - New results for SQLi, XSS, Path Traversal and RFI (fantastic result!).
·         arachni – an improvement in the detection accuracy of Reflected XSS exposures (mainly due to the elimination of false positives), but a decrease in the accuracy of SQL injection exposures (due to additional false positives being discovered). There are also new results for RFI, Path Traversal (incomplete due to a bug), and WIVET.
·         sqlmap – a dramatic improvement in the detection accuracy of SQL Injection exposures (fantastic result!).
·         Acunetix Free Edition – a dramatic improvement in the detection accuracy of Reflected XSS exposures, in addition to a new WIVET result.
·         Syhunt Mini (Sandcat Mini) - a dramatic improvement in the detection accuracy of both XSS (fantastic result!) and SQLi. New results for RFI.
·         Watobo – Identical results, in addition to new results for Path Traversal and WIVET. I did not manage to test the latest WATOBO version, which was released a few days before the publication of this benchmark.
·         N-Stalker 2012 FE – no significant changes, although it seems that the decreased accuracy is actually an unhandled bug in the release (unverified theory).
·         Skipfish –  insignificant changes that probably result from the testing methodology and/or testing environment. New results for Path Traversal, RFI and WIVET.
·         WebSecurify – a major improvement in the detection accuracy of RXSS exposures, and new results for Path Traversal and WIVET.
·         W3AF – a slight increase in the SQL Injection detection accuracy. New results for Path Traversal (fantastic result!), RFI and WIVET.
·         Netsparker Community Edition – New results for WIVET.
·         Andiparos & Paros – New results for WIVET.
·         Wapiti – New results for Path Traversal, RFI and WIVET.
·         ProxyStrike – New results for WIVET (Fantastic results for an open source product! again!)
·         Vega - New results for Path Traversal, RFI and WIVET.
·         Grendel Scan – New results for WIVET.

18. Initial Conclusions – Open Source vs. Commercial
The following section presents my own personal opinions on the results, and is not based purely on accurate statistics, like the rest of the benchmark.

After testing various versions of over 51 open source scanners on multiple occasions, and after comparing the results and experiences to the ones I had after testing 15 commercial ones (including tools tested in the previous benchmarks and tools I did not report), I have reached the following conclusions:
·         As far as accuracy & features go, the distance between open source tools and commercial tools is insignificant, and open source tools already rival, and in some rare cases, even exceed the capabilities of commercial scanners (and vice versa).

·         Although most open source scanners have not yet adjusted to support applications that use new technologies (AJAX, JSON, etc), recent advancement in the crawler of ZAP proxy (not tested in the benchmark, and might be reused by other projects), and the input vectors supported by a new project named IronWASP are a great beginning to the process. On the other hand, most of the commercial vendors already adjusted themselves to some of the new technologies, and can be used to scan them in a variety of models.

·         The automated crawling capability of most commercial scanners is significantly better than that of open source projects, making these tools better for point and shoot scenarios... the difference, however, is not significant for some open source projects, which can "import" or employ the crawling capabilities of a free version of a commercial product (requires some experience with certain tools - probably more suited for a consultant than a QA engineer).

·         Some open source tools, even the most accurate ones, are relatively difficult to install & use, and still require fine-tuning in various fields, particularly stability. Other open source projects however, improved over the last year, and enhanced their user experience in many ways.

19. Verifying The Benchmark Results
The results of the benchmark can be verified by replicating the scan methods described in the scan log of each scanner, and by testing the scanner against WAVSEP v1.2 and WIVET v3-revision148.
The same methodology can be used to assess vulnerability scanners that were not included in the benchmark.
The latest version of WAVSEP can be downloaded from the web site of project WAVSEP (binary/source code distributions, installation instructions and the test case description are provided in the web site download section):

The latest version of WIVET can be downloaded from the project web site, or preferably, checked-out from the project subversion repository:
svn checkout http://wivet.googlecode.com/svn/trunk/ wivet-read-only

20. So What Now?
So now that we have all those statistics, it's time to analyze them properly, and see which conclusions we can reach. I already started writing a couple of articles that will make the information easy to use, and defined a methodology that will explain exactly how to use it. Analyzing the results, however, will take me some time, since most of my time in the next few months will be invested in another project I'm working on (will be released soon), one I've been working on for the past year.

Since I didn't manage to test all the tools I wanted, I might update the results of the benchmark soon with additional tools (so you can think of it as a dynamic benchmark), and I will surely update the results in sectoolmarket (made some promises).

If you want to get notifications on new scan results, follow my blog or twitter account, and I'll do my best to tweet a notification when I find the time to perform some major updates.

Since I have already been in this situation in the past, I know what's coming… so I apologize in advance for any delays in my responses in the next few weeks, especially during August.

21. Recommended Reading List: Scanner Benchmarks
The following resources include additional information on previous benchmarks, comparisons and assessments in the field of web application vulnerability scanners:
·         "SQL Injection through HTTP Headers", by Yasser Aboukir (an analysis and enhancement of the 2011 60 scanners benchmark, with a different approach for interpreting the results, March 2012)
·         "The Scanning Legion: Web Application Scanners Accuracy Assessment & Feature Comparison", one of the predecessors of the current benchmark, by Shay Chen (a comparison of 60 commercial & open source scanners, August 2011)
·         "Building a Benchmark for SQL Injection Scanners", by Andrew Petukhov (a commercial & opensource scanner SQL injection benchmark with a generator that produces 27680 (!!!) test cases, August 2011)
·         "Webapp Scanner Review: Acunetix versus Netsparker", by Mark Baldwin (commercial scanner comparison, April 2011)
·         "Effectiveness of Automated Application Penetration Testing Tools", by Alexandre Miguel Ferreira and Harald Kleppe (commercial & freeware scanner comparison, February 2011)
·         "Web Application Scanners Accuracy Assessment", one of the predecessors of the current benchmark, by Shay Chen (a comparison of 43 free & open source scanners, December 2010)
·         "State of the Art: Automated Black-Box Web Application Vulnerability Testing", by Jason Bau, Elie Bursztein, Divij Gupta, John Mitchell (original paper, May 2010)
·         "Analyzing the Accuracy and Time Costs of Web Application Security Scanners", by Larry Suto (commercial scanners comparison, February 2010)
·         "Why Johnny Can’t Pentest: An Analysis of Black-box Web Vulnerability Scanners", by Adam Doupé, Marco Cova, Giovanni Vigna (commercial & open source scanner comparison, 2010)
·         "Web Vulnerability Scanner Evaluation", by AnantaSec (commercial scanner comparison, January 2009)
·         "Analyzing the Effectiveness and Coverage of Web Application Security Scanners", by Larry Suto (commercial scanners comparison, October 2007)
·         "Rolling Review: Web App Scanners Still Have Trouble with Ajax", by Jordan Wiens (commercial scanners comparison, October 2007)
·         "Web Application Vulnerability Scanners – a Benchmark", by Andreas Wiegenstein, Frederik Weidemann, Dr. Markus Schumacher, Sebastian Schinzel (anonymous scanner comparison, October 2006)

22. Thank-You Note
During the research described in this article, I have received help from plenty of individuals and resources, and I’d like to take the opportunity to thank them all.

I might be reusing the texts, due to the late night hour and the constant lack of sleep I have been through in the last couple of months, but I mean every word that is written here.

For all the open source tool authors who assisted me in testing the various tools at unreasonably late hours, bothered to adjust their tools for me, discussed their various features and invested their time in explaining how I could optimize their use,
For the kind souls who helped me obtain evaluation licenses for commercial products; for the CEOs, marketing executives, QA engineers, and support and development teams of commercial vendors, who saved me tons of time, supported me throughout the process, helped me overcome obstacles and proved to me that the process of interacting with a commercial vendor can be a pleasant one; and for the various individuals who helped me contact these vendors -
I can't thank you enough, and wish you all the best.

For the information sources that helped me gather the list of scanners over the years and gain knowledge, ideas and insights, including (but not limited to) information security sources such as Security Sh3ll (http://security-sh3ll.blogspot.com/), PenTestIT (http://www.pentestit.com/), The Hacker News (http://thehackernews.com/), Toolswatch (http://www.vulnerabilitydatabase.com/toolswatch/), Darknet (http://www.darknet.org.uk/), Packet Storm (http://packetstormsecurity.org/), Google (of course), Twitter (my latest addiction) and many other great sources I have used over the years.

I hope that the conclusions, ideas, information and payloads presented in this research (and the benchmarks and tools that will follow) will benefit all the vendors, projects and, most importantly, the testers who choose to rely on them.

23. FAQ - Why Didn't You Test NTO, Cenzic and N-Stalker?
Prior to the benchmark, I made an important decision. I decided to go through official channels, and either contact vendors and work with them, or use public evaluation versions of relatively simple products. I had a huge number of tasks, and needed the vendors' support to cut the learning curve of understanding how to optimize the tools. I was determined to meet my deadline, didn't have any time to spare, and was willing to make certain sacrifices to meet my goals.

As for why specific vendors were not included, this is the short answer:
NTO: I only managed to get in touch with NTO about two weeks before the benchmark publication. I had no luck contacting the people I had worked with in previous benchmarks, but was eventually contacted by Kim Dinerman. She was nice and polite, and apologized for the time the process took. After I explained what timeframe they would have for enhancing the product (something other commercial vendors did as well, in order to prepare for the publicly known tests of the benchmark), they decided that the timeframe and circumstances did not provide an even opportunity and chose not to participate.
I admit that by the time they contacted me, I was so loaded with tasks that I was somewhat relieved, even though I was curious and wanted to assess their product. That being said, I had decided prior to the benchmark that I would respect the decisions of vendors, even if it meant not reaching a round number of scanners.

N-Stalker: I finally received a valid N-Stalker license one day before the publication of the benchmark - a couple of days after the final deadline I had set for accepting any tool. I decided to give it a shot, just in case it turned out to be a simple process; however, with my luck, I immediately discovered a bug that prevented me from properly assessing the product and its features, and unlike the rest of the tests, which were performed with a sufficient timeframe... this time I had no time to find a workaround. I decided not to publish the partial results I had (I did not want to create the wrong impression or hurt anyone's business), and notified the vendor of the bug and of my decision.
The vendor, for their part, thanked me for the bug report and promised to look into the issue. Sorry guys... I wanted to test it too... next benchmark.

Cenzic: the story of Cenzic is much simpler. I simply didn't manage to get in touch with them, and even though I did have access to a license, I had decided prior to the benchmark not to take that approach. As I mentioned earlier, I decided to respect vendors' decisions, and not to assess a product without its vendor's support.

24. Appendix A – List of Tools Not Included In the Test
The following commercial web application vulnerability scanners were not included in the benchmark, due to deadlines and time restrictions on my part, and, in the case of specific vendors, for other reasons.
Commercial Scanners not included in this benchmark
·         N-Stalker Commercial Edition (N-Stalker)
·         Hailstorm (Cenzic)
·         NTOSpider (NTO)
·         McAfee Vulnerability Manager (McAfee / Foundstone)
·         Retina Web Application Scanner (eEye Digital Security)
·         SAINT Scanner Web Application Scanning Features (SAINT co.)
·         WebApp360 (NCircle)
·         Core Impact Pro Web Application Scanning Features (Core Impact)
·         Parasoft Web Application Scanning Features (a.k.a WebKing, by Parasoft)
·         MatriXay Web Application Scanner (DBAppSecurity)
·         Falcove (BuyServers ltd, currently Unmaintained)
·         Safe3WVS 13.1 Commercial Edition (Safe3 Network Center)
The following open source web application vulnerability scanners were not included in the benchmark, mainly due to time restrictions, but might be included in future benchmarks:
Open Source Scanners not included in this benchmark
·         Vanguard
·         WebVulScan
·         SQLSentinel
·         XssSniper
·         Rabbit VS
·         Spacemonkey
·         Kayra
·         2gwvs
·         Webarmy
·         springenwerk
·         Mopset 2
·         XSSFuzz 1.1
·         Witchxtoolv
·         PHP-Injector
·         XSS Assistant
·         Fiddler XSSInspector/XSRFInspector Plugins
·         GNUCitizen JAVASCRIPT XSS SCANNER - since WebSecurify, a more advanced tool from the same vendor, is already tested in the benchmark.
·         Vulnerability Scanner 1.0 (by cmiN, RST) - since the source code contained references to remotely downloaded RFI lists hosted at locations that no longer exist.
The benchmark focused on web application scanners that are able to detect either Reflected XSS or SQL Injection vulnerabilities, can be locally installed, and are also able to scan multiple URLs in the same execution.
As a result, the test did not include the following types of tools:
·         Online Scanning Services – online services that remotely scan web applications, including (but not limited to) Appscan On Demand (IBM), Click To Secure, QualysGuard Web Application Scanning (Qualys), Sentinel (WhiteHat), Veracode (Veracode), VUPEN Web Application Security Scanner (VUPEN Security), WebInspect (online service - HP), WebScanService (Elanize KG), Gamascan (GAMASEC – currently offline), Cloud Penetrator (Secpoint), Zero Day Scan, DomXSS Scanner, etc.
·         Scanners without RXSS / SQLi detection features:
o   Dominator (Firefox Plugin)
o   fimap
o   lfimap
o   DotDotPawn
o   lfi-rfi2
o   LFI/RFI Checker (astalavista)
o   CSRF Tester
o   etc
·         Passive Scanners (response analysis without verification):
o   Watcher (Fiddler Plugin by Casaba Security)
o   Skavanger (OWASP)
o   Pantera (OWASP)
o   Ratproxy (Google)
o   etc
·         Scanners of specific products or services (CMS scanners, Web Services Scanners, etc):
o   WSDigger
o   Sprajax
o   ScanAjax
o   Joomscan
o   wpscan
o   Joomlascan
o   Joomsq
o   WPSqli
o   etc
·         Web Application Scanning Tools which are using Dynamic Runtime Analysis:
o   PuzlBox (the free version was removed from the web site, and is now sold as a commercial product named PHP Vulnerability Hunter)
o   Inspathx
o   etc
·         Uncontrollable Scanners - scanners that can’t be controlled or restricted to scan a single site, since they either receive the list of URLs to scan from a Google dork, or go on to scan external sites that are linked to the tested site. This list currently includes the following tools (and might include more):
o   Darkjumper 5.8 (scans additional external hosts that are linked to the given tested host)
o   Bako's SQL Injection Scanner 2.2 (only tests sites from a Google dork)
o   Serverchk (only tests sites from a Google dork)
o   XSS Scanner by Xylitol (only tests sites from a Google dork)
o   Hexjector by hkhexon – also falls into other categories
o   d0rk3r by b4ltazar
o   etc
·         Deprecated Scanners - incomplete tools that were not maintained for a very long time. This list currently includes the following tools (and might include more):
o   Wpoison (development stopped in 2003 and the new official version was never released, although the 2002 development version can be obtained by manually composing the SourceForge URL, which does not appear on the web site: http://sourceforge.net/projects/wpoison/files/)
o   etc
·         De facto Fuzzers – tools that scan applications in a similar way to a scanner, but where a scanner attempts to conclude whether or not the application is vulnerable (according to some sort of “intelligent” set of rules), the fuzzer simply collects abnormal responses to various inputs and behaviors, leaving the task of drawing conclusions to the human user (a simplified sketch of this distinction appears at the end of this appendix).
o   Lilith 0.4c/0.6a (both versions 0.4c and 0.6a were tested, and although the tool seems to be a scanner at first glance, it doesn’t perform any intelligent analysis of the results).
o   Spike proxy 1.48 (although the tool has XSS and SQLi scan features, it acts more like a fuzzer than a scanner – it sends partial XSS and SQLi payloads, and does not verify that the context of the returned output is sufficient for execution, or that the error presented by the server is related to database syntax, leaving the verification task to the user).
·         Fuzzers – scanning tools that lack the independent ability to conclude, through some sort of verification method, whether a given response represents a vulnerable location (this category includes tools such as JBroFuzz, Firefuzzer, Proxmon, st4lk3r, etc.). Fuzzers that verify at least one type of exposure were included in the benchmark (Powerfuzzer).
·         CGI Scanners: vulnerability scanners that focus on detecting hardening flaws and version specific hazards in web infrastructures (Nikto, Wikto, WHCC, st4lk3r, N-Stealth, etc)
·         Single URL Vulnerability Scanners - scanners that can only scan one URL at a time, or can only scan URLs obtained from a Google dork (uncontrollable).
o   Havij (by itsecteam.com)
o   Hexjector (by hkhexon)
o   Simple XSS Fuzzer [SiXFu] (by www.EvilFingers.com)
o   Mysqloit (by muhaimindz)
o   PHP Fuzzer (by RoMeO from DarkMindZ)
o   SQLi-Scanner (by Valentin Hoebel)
o   Etc.
·         Vulnerability Detection Assisting Tools – tools that aid in discovering a vulnerability, but do not detect the vulnerability themselves; for example:
o   Exploit-Me Suite (XSS-Me, SQL Inject-Me, Access-Me)  
o   XSSRays (Chrome add-on)
·         Exploiters - tools that can exploit vulnerabilities but have no independent ability to automatically detect vulnerabilities on a large scale. Examples:
o   MultiInjector
o   XSS-Proxy-Scanner
o   Pangolin
o   FGInjector
o   Absinth
o   Safe3 SQL Injector (an exploitation tool with scanning features (pentest mode) that are not available in the free version).
o   etc
·         Exceptional Cases
o   SecurityQA Toolbar (iSec) – various lists and rumors include this tool in the collection of free/open-source vulnerability scanners, but I wasn’t able to obtain it from the vendor’s web site, or from any other legitimate source, so I’m not really sure it fits the “free to use” category.
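
To make the difference between the "de facto fuzzers" above and the scanners included in the benchmark a bit more concrete, here is a minimal, hypothetical Python sketch (it is not code taken from any of the tools mentioned in this document): the fuzzer-style check flags any response that echoes a trace of the probe, while the scanner-style check only reports a finding when the probe is reflected unencoded, i.e. in a context where it could actually execute.

import html

XSS_PROBE = "<script>alert(31337)</script>"

def fuzzer_flags(body):
    # Fuzzer-style logic: flag the response if any trace of the probe appears,
    # encoded or not - drawing conclusions is left to the human user.
    return "31337" in body

def scanner_flags(body):
    # Scanner-style logic: report reflected XSS only if the probe is echoed
    # back unencoded, i.e. in a context where the browser could execute it.
    return XSS_PROBE in body

if __name__ == "__main__":
    # Two simulated server responses to the same injected probe:
    encoded_page = "<p>You searched for: %s</p>" % html.escape(XSS_PROBE)
    raw_page = "<p>You searched for: %s</p>" % XSS_PROBE
    for name, body in (("encoded reflection", encoded_page),
                       ("raw reflection", raw_page)):
        print(name, "-> fuzzer:", fuzzer_flags(body), "| scanner:", scanner_flags(body))

Running the sketch shows the fuzzer flagging both responses, while the scanner only reports the raw (executable) reflection - which is exactly the kind of verification that the tools above leave to the user.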


14 comments:

  1. I am a security guy, too. While planning a pen test, I found your excellent article. I really appreciate your work!

  2. Hello, I am a Co-Founder of Orvant. I think our Securus vulnerability scanner would make a worthy addition to the list. One thing that is unique about Securus is that we leverage many of these tools and add our own special sauce on top. Our intent is to provide you with the greatest test and threat coverage possible, as well as the flexibility to decide which tools are worth running; being able to run a side-by-side comparison also helps.

    Replies
    1. Will take a look at the next benchmark, somewhere around May.

    2. Thanks, you can contact me via email at dan - orvant.com if you have any questions or comments when you take a look.

  3. Shay,
    Your research is comprehensive and was really helpful for me in evaluating both commercial and open-source tools. Your selection of assessment criteria was useful for the majority of vulnerabilities/features and it makes comparing the results a bit easier.

    One recent update that I found was regarding ZAP, which extended the results using ZAP 2.0.0 (released in January 2013) against WAVSEP, as reported in the following link:

    http://code.google.com/p/zaproxy/wiki/TestingWavsep

    I look forward to reading your updates and analysis of this research, and the conclusions you will reach.

    Thanks,

    Itay

  4. Hello,

    Thank you for your excellent article. Do you have a benchmark or opinion on source code security analyzers (HP Fortify Static Code Analyzer, IBM Security AppScan Source, FindBugs, ...), and which product would you recommend?

    Thanks
    Hocine

  5. Shay,
    Excellent analysis. I was starting out looking for the same answer: is it value for money to have a commercial web vulnerability scanner rather than an open source one? Comparing scanners is like going to a dance and meeting very attractive people; picking one is hard. The long-term future is a decider. Keeping up to date with the forks is also difficult. ZAP is a fork of version 3.2.13 of the open source variant of Paros. Vega looks good. IronWASP is impressive. Tough choices. The bit I liked is your ability to put yourself in the consultant's role - scanning an unrestricted number of IP addresses. Commercial suppliers have trouble with this role.
    Thanks

  6. Shay,

    Thank you for this extremely in-depth analysis of the different types of web application security scanners available. I personally prefer Veracode for application security testing (which is #20 on Forbes' list of the most promising companies in America) because of their dynamic analysis tool and clear reporting. Black Diamond Solutions is actually offering a free application security scan on the Veracode platform. Hope this helps!

  7. It is so good that I found this post. Now I have an idea of how to check my site's security.

    Thank you.

  8. Shay,

    My name is Riaan Gouws and I am the CTO of Quatrashield. First, I think you deserve much credit for the important service that you provide our industry. This detailed article is testament to your passion in this field.

    I would like to ask you to also consider including our web application vulnerability scanner – QuatraScan - in your next benchmark study. Based on our own testing, we believe that our false positive rate puts us in the first tier of vendors and we are hopeful that sectooladdict can validate this as well.

    I am happy to provide as much info as is needed.

    Thanks, Riaan.

  9. Hi Shay,

    Great article! I have a question about interpreting the list correctly. How do the accuracy scores relate to the WIVET score? For example, for w3af: does the 35.29% accuracy refer to the whole application, or only to the 19% covered by WIVET?

    Hope you understand my question :) Thanks a lot!

    Replies
    1. Hi Thomas,
      first of all - a newer and more up-to-date benchmark was published last week - you can access it through the following link:
      http://sectooladdict.blogspot.co.il/2014/02/wavsep-web-application-scanner.html

      The WIVET score is a good indicator of how well the scanner will identify the structure of the application *automatically* - in the worst-case scenario.

      So if, for example, the WIVET score is 10%, the application has 100 web pages that are all vulnerable in ways the scanner can detect, and crawling the application is very difficult due to the technology used,
      then the scanner will only be able to crawl about 10% of the pages and scan them for vulnerabilities... all the rest will not be tested.

      Please take into consideration that this explanation *highly* simplifies the meaning of the WIVET score for the purpose of attaching a value to it; in reality, the scanner may crawl anything from 0% to 100%, depending on the technology. WIVET is a great score for measuring how well a scanner adapts to different technologies - it isn't directly related to accuracy, but rather to coverage.
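
      To put that simplification in numbers, here is a purely hypothetical Python sketch; the figures come from the example above (and the 35.29% from the question), not from real benchmark results, and the helper function is my own illustration of this (over)simplified model rather than how the scores are actually computed:

def expected_findings(vulnerable_pages, wivet_coverage, detection_accuracy):
    # Pages the scanner manages to reach automatically (worst-case estimate).
    crawled = vulnerable_pages * wivet_coverage
    # Of those, the share of vulnerable entry points it correctly detects.
    return crawled * detection_accuracy

# 100 vulnerable pages, 10% WIVET coverage, 35.29% detection accuracy:
print(expected_findings(100, 0.10, 0.3529))  # roughly 3.5 issues found out of 100

      In other words, under this simplified model the accuracy figure applies only to the pages the scanner actually reaches, not to the whole application.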
