Machine Learning For Web Vulnerability Detection: The Case of Cross-Site Request Forgery
We propose a methodology to leverage machine learning (ML) for the detection of web application
vulnerabilities. We use it in the design of Mitch, the first ML solution for the black-box detection
of cross-site request forgery vulnerabilities. Finally, we show the effectiveness of Mitch on real
software.
2 May/June 2020 Copublished by the IEEE Computer and Reliability Societies 1540-7993/20©2020IEEE
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
weak syntactic structure, and custom programming practices abound. For example, there are many different plausible ways to implement a “like” button for some content identified by the unique string 3aa5bf, including

■ a GET request to the page like.php with a single parameter id = 3aa5bf
■ a GET request to the page manage.php with a parameter id = 3aa5bf and a parameter action = like
■ a POST request to the page manage.php including a JavaScript Object Notation object {id: 3aa5bf, action: upvote}.

All of these requests look semantically similar to experienced security testers, yet they are syntactically different, and it might be hard to identify all of the most common ways to encode the same information in the wild.

1. Use supervised learning to automatically train a classifier that partitions selected web objects of interest, such as HTTP requests, HTTP responses, or cookies, based on the web application semantics. For example, in the case of CSRF detection, the classifier would be used to identify security-sensitive HTTP requests.
2. For each possible class returned by the classifier, define a heuristic for vulnerability detection. Even trivial heuristics marking every object in a given class as nonvulnerable are plausible. For example, insensitive requests cannot be exploited for CSRF; hence, they can be immediately marked as nonvulnerable.
3. Use the classifier to choose the appropriate vulnerability detection heuristic to run on each web object of interest, such as part of a browser extension.
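The three steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the HttpRequest type, the keyword rule that stands in for a trained classifier, and the two heuristic functions are hypothetical and are not taken from Mitch. The sample requests are the three “like” encodings discussed above, plus one unrelated page.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class HttpRequest:
    """Hypothetical, simplified view of an HTTP request."""
    method: str
    path: str
    params: Dict[str, str] = field(default_factory=dict)


# Step 1 (stand-in): a real system would train a classifier on labeled
# requests. This keyword rule is only a placeholder for that model.
SENSITIVE_HINTS = {"like", "upvote"}  # hypothetical vocabulary


def classify(req: HttpRequest) -> str:
    # Collect the page name without extension and all parameter values.
    page = req.path.rsplit("/", 1)[-1].split(".")[0].lower()
    tokens = {page} | {value.lower() for value in req.params.values()}
    return "sensitive" if tokens & SENSITIVE_HINTS else "insensitive"


# Step 2: one vulnerability detection heuristic per class.
def mark_nonvulnerable(req: HttpRequest) -> str:
    return "nonvulnerable"


def schedule_csrf_test(req: HttpRequest) -> str:
    return "run CSRF detection"


HEURISTICS: Dict[str, Callable[[HttpRequest], str]] = {
    "insensitive": mark_nonvulnerable,
    "sensitive": schedule_csrf_test,
}


# Step 3: let the classifier pick the heuristic for each object.
def analyze(req: HttpRequest) -> str:
    return HEURISTICS[classify(req)](req)


requests = [
    HttpRequest("GET", "/like.php", {"id": "3aa5bf"}),
    HttpRequest("GET", "/manage.php", {"id": "3aa5bf", "action": "like"}),
    HttpRequest("POST", "/manage.php", {"id": "3aa5bf", "action": "upvote"}),
    HttpRequest("GET", "/about.php"),
]
results = [analyze(r) for r in requests]
```

The placeholder rule happens to catch all three encodings only because the vocabulary was chosen by hand for this example; the point of training a classifier is to avoid maintaining such rules manually.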
generate new HTML elements in the extension origin, which allows for replaying them. The security tester then authenticates to the website as Bob, and Mitch exploits the generated HTML to automatically replay the detected sensitive requests from a cross-site position, which simulates a CSRF attack. Finally, the responses collected for Alice and Bob are compared: if a response received by Bob “matches” the one received by Alice, it means that Alice was able to forge a valid request for Bob’s session; hence, the attack is considered successful, and Mitch reports a potential CSRF vulnerability.

[Figure: diagram of Mitch. Recoverable labels: credentials; trace generator; traces Ta, T′a, Tb, Tu; ML classifier; sensitive and insensitive HTTP requests; CSRF detection algorithm.]

Challenges
The proposed CSRF detection heuristic is intuitive, yet there are several challenges to solve to make it work in practice. We provide a high-level view of these issues and our proposed solutions.

Changes in HTTP Responses
Defining a suitable notion of “matching” HTTP responses for Alice’s and Bob’s sessions is generally hard, because HTTP responses may include dynamically generated elements, which might realistically differ even when the same idempotent operation is performed multiple times. Thus, Mitch builds on the notion of dissimilar HTTP responses. In general, the dissimilarity of HTTP responses is much easier to check than their similarity, for example, due to the use of different status codes or content types to denote failures (e.g., status codes 401 Unauthorized and 403 Forbidden are typical ways to denote unauthorized access). When Bob’s response is dissimilar from Alice’s response, it is likely that Alice’s request failed in Bob’s session, which might indicate the use of a CSRF protection mechanism.

Changes in Session State
Since the state of Alice and Bob at the website might be different, matching the response received by Bob against the one received by Alice might be an improper way to detect a CSRF vulnerability. For instance, Bob might not be able to perform a sensitive operation because he does not have access to the file foo, yet a CSRF attack would work if it targeted the file bar. When comparing the response received by Bob against the one received
by Alice, Mitch does not immediately consider their dissimilarity as definite evidence that the request of Bob had a different outcome than the one of Alice due to the use of a CSRF protection mechanism. Rather, since different outcomes might come from a difference in the state of Alice’s and Bob’s sessions, Mitch also replays Alice’s original request in a fresh Alice session: if the new response received by Alice is dissimilar to the original one, it is likely that session-dependent information is required to process the request, which might indicate the adoption of an anti-CSRF token.

ML Classifier
The ML classifier used by Mitch was trained on a data set of approximately 6,000 HTTP requests from existing websites, collected and labeled by two human experts. The feature space X of the classifier has 49 dimensions, each one capturing a specific property of HTTP requests. Those can be organized into three categories: structural, textual, and functional.

Structural
This category of features describes structural properties of an HTTP request. More precisely, we define the following set of numerical features:

[...] have a very weak structure, and it is hard to come up with general yet accurate typing techniques for them.

Textual
This category of features captures textual characteristics of HTTP requests and is based on a small manually curated vocabulary of keywords V that may occur in the request, resulting from a manual inspection of sensitive requests from a sample of real-world websites considered in our data set. More specifically, we consider only binary features of the following forms:

Functional
This category of features indicates the HTTP method associated with the request. We consider just the following two binary features:

■ isGET: the HTTP request method is GET
■ isPOST: the HTTP request method is POST.

There are no additional alternatives, because our data set includes only GET and POST requests. All other requests can be easily labeled as sensitive or not just based on their method; for example, OPTIONS requests are always insensitive.
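Once a request has been classified as sensitive, the replay-and-compare check described under Challenges applies. A minimal Python sketch of that decision logic follows. It is an illustration under stated assumptions, not Mitch's implementation: the Response type is hypothetical, dissimilarity is reduced to the two signals named in the text (status code and content type), and the "inconclusive" outcome is our own label for the case the text leaves open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """Hypothetical, simplified HTTP response."""
    status: int
    content_type: str  # assumed already normalized, e.g. "text/html"


def dissimilar(a: Response, b: Response) -> bool:
    # Dissimilarity is easier to test than similarity: a different status
    # code or content type is taken as a sign of a different outcome.
    return a.status != b.status or a.content_type != b.content_type


def csrf_verdict(alice: Response, bob: Response, alice_fresh: Response) -> str:
    """Compare Alice's original response, Bob's replayed response, and the
    response to the same request replayed in a fresh Alice session."""
    if not dissimilar(alice, bob):
        # Alice's request was also accepted in Bob's session.
        return "potential CSRF"
    if dissimilar(alice, alice_fresh):
        # Even Alice cannot replay it: session-dependent data is needed,
        # which might indicate an anti-CSRF token.
        return "likely protected"
    # Assumption: the difference may stem from Alice's and Bob's state.
    return "inconclusive"


ok = Response(200, "text/html")
forbidden = Response(403, "text/html")

verdicts = (
    csrf_verdict(alice=ok, bob=ok, alice_fresh=ok),
    csrf_verdict(alice=ok, bob=forbidden, alice_fresh=forbidden),
    csrf_verdict(alice=ok, bob=forbidden, alice_fresh=ok),
)
```

A practical check would likely need more signals than these two fields, since a website can report a failure with status 200 and an error message in the body.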
websites. To estimate this important aspect, we keep track of all of the sensitive requests returned by the ML classifier embedded into Mitch, and we focus our manual testing on those cases. This is a reasonable choice to make the analysis tractable, because we first showed that the classifier performs well using standard validity measures.

Assessment on Existing Websites
To test how effective Mitch is on existing websites, we sampled 20 websites from the Alexa Top 10,000 ranking. We considered only websites with single sign-on access via a major social network website so that we could leverage just two existing social accounts to perform our security testing.

Overall, Mitch found 191 sensitive requests and reported 47 potential CSRF vulnerabilities: we were able to immediately exploit 35 of them, exposing major security issues in a few cases. We estimated only seven false negatives in total, which means that our heuristics are accurate enough to capture most of the vulnerabilities. The full breakdown of the individual websites is shown in Table 1 and is discussed in the following sections.

Many of the attacks we found targeted the social functionalities of the websites we tested, such as casting votes on public contents, adding or removing items from favorite lists, and posting comments under the identity of the victim. Therefore, most of these attacks may affect recommender systems, lead to social embarrassment, and compromise user reputation. Worse, we were also able to find a number of attacks that seriously compromised the website functionality; we responsibly disclosed all of the vulnerabilities to the respective website owners. We discuss a few interesting cases here.

Bombas
Bombas is an e-commerce website selling socks. It provides a functionality to store a list of shipping addresses to simplify purchases, so that shipping details do not need to be entered for each transaction. The form used to store a new shipping address is vulnerable to CSRF, so an attacker can force any address into the victim’s account to hijack deliveries. By default, the latest added
address is the one that is used, which makes the attack even worse in terms of practical impact.

Bombas is a customer of Shopify, which is a major e-commerce platform, so this attack may also affect many other websites. We reported the issue to Shopify, which acknowledged the attack and is working on a fix but marked our report as duplicate due to the existence of a previous independent disclosure.

Indeed
Indeed is one of the largest websites hosting job offers. Registered users can send their CVs and apply to different open positions around the world. We found three CSRF vulnerabilities that give an attacker the ability to fully manage job offers associated with the account, including the possibility of storing new offers and archiving existing ones. Indeed also suffers from a CSRF vulnerability on the form used to set user preferences, which can severely affect the visibility of job offers. An attacker can exploit this vulnerability to hide job offers, for instance, by restricting the search radius and changing the desired publication date for displayed offers.

We find these vulnerabilities particularly interesting, because Indeed is making wide use of anti-CSRF tokens, and all of the vulnerable forms have their own token. However, it seems that not all of the tokens are correctly checked by the website, which may suggest a manual, error-prone placement of the tokens. More generally, this shows that checking the presence of anti-CSRF tokens is not sufficient to say that a website is protected against CSRF and that the actual website behavior should be tested instead. The security team of Indeed acknowledged the issue and rewarded us with US$100 for the finding.

Starnow
Starnow is an Australian website designed to discover new talents, such as singers and actors. Users who are interested in pursuing an artistic career can register on the website to get access to a number of auditions and job interviews. The first two CSRFs we found allow an attacker to arbitrarily manipulate the watchlists of authenticated users, thus compromising a functionality offered by the website.

However, there are two much worse attacks. A CSRF vulnerability affects the form used to store the phone number associated with user profiles; this can be used for scams or to disrupt the functionality of the website, such as by making it impossible to contact the victim for an audition. The request used to set the phone number contains an anti-CSRF token, but it is not checked by the website, confirming that this type of mistake is not confined to Indeed but is apparently more widespread.

The last CSRF vulnerability is definitely the most severe, because it affects the form used to set the email address of user profiles. By exploiting this vulnerability, the attacker can set the victim’s email address to the attacker’s own address and then use the password reset functionality of Starnow to receive a fresh password for the victim in the attacker’s inbox, thus taking possession of the victim’s account.

Assessment on Production Software
As a second set of experiments, we decided to run Mitch on the testbed of open source web applications used to evaluate Deemon, a state-of-the-art automated detection tool for CSRF vulnerabilities [15]. Since Deemon works only on Hypertext Preprocessor (PHP) applications whose source code is available for dynamic analysis, we could not test it on the closed source websites from our first set of experiments. Out of the 10 applications considered in the original testbed, we were able to find only three applications at the same version: Oxid e-shop, Prestashop, and Simple Machine Forums. No CSRF vulnerability was detected by Deemon on these applications, according to the experimental evaluation by Pellegrino et al. [15]. The results of the analysis performed by Mitch on the applications in their default configuration are shown in Table 2.

Mitch was extremely effective on the tested applications, because it reported only two false positives and it was able to catch three CSRF vulnerabilities on Oxid e-shop that were not reported by Deemon [15]. These vulnerabilities allow an attacker to corrupt the integrity of the shopping cart, force the use of vouchers, and change the preferred payment method. All of the corresponding functionalities are supposed to be protected by an anti-CSRF token, which, however, is not checked by the Oxid back end. We reported the issues to the Oxid security team, who acknowledged the problem and worked on a fix.

Freeware and Open Source Software
Penetration testers have been using a range of different tools to detect CSRF vulnerabilities in web applications. Based on extensive research on blogs, forums, and resources for security practitioners, including the OWASP Testing Guide, we classified existing tools in the following categories.

■ Intercepting proxies allow penetration testers to intercept and modify arbitrary HTTP traffic, which can be used for an essentially manual detection of web vulnerabilities, including CSRF. Popular tools in this category are Burp, ZAP, and WebScarab.
■ Exploit generators simplify the generation of proofs of concept for attack finding, based on human guidance on the set of HTTP requests that need to be tested for CSRF. Example tools in this category include CSRFTester and pinata-csrf-tool.
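To make the exploit generator category concrete, the following Python sketch builds a proof-of-concept page for one request chosen by the tester: an HTML form that submits itself when loaded from another origin. It is a generic illustration and does not reproduce the behavior of CSRFTester or pinata-csrf-tool; the function name, the target URL, and the parameter values are all hypothetical.

```python
from html import escape
from typing import Mapping


def csrf_poc(action_url: str, params: Mapping[str, str], method: str = "POST") -> str:
    """Return an HTML page with a self-submitting form for one request."""
    # Escape every attribute value so that quotes in the data cannot
    # break out of the generated markup.
    inputs = "\n".join(
        f'      <input type="hidden" name="{escape(name, quote=True)}" '
        f'value="{escape(value, quote=True)}">'
        for name, value in params.items()
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <body>\n"
        f'    <form id="poc" action="{escape(action_url, quote=True)}" '
        f'method="{escape(method, quote=True)}">\n'
        f"{inputs}\n"
        "    </form>\n"
        '    <script>document.getElementById("poc").submit();</script>\n'
        "  </body>\n"
        "</html>\n"
    )


# Hypothetical request selected by the tester.
page = csrf_poc(
    "https://shop.example/account/address",
    {"city": "Venice", "note": 'a"b'},
)
```

The tester still decides which requests are worth replaying, which is the manual step that the tools in this category leave to human guidance.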
Table 2
Web application              | Sensitive requests | Detected CSRFs | False positives | False negatives
Oxid e-shop 4.9.8            | 21                 | 4              | 1               | 0
Prestashop 1.6.1.2           | 12                 | 1              | 1               | 0
Simple Machine Forums 2.0.12 | 9                  | 0              | 0               | 0
Total                        | 42                 | 5              | 2               | 0
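As a worked check of the totals in the table, the short Python sketch below recomputes them and derives precision and recall. Two points are assumptions on our part rather than statements from the article: true positives are taken to be detected CSRFs minus false positives, and precision and recall are used as summary measures.

```python
# Rows: (sensitive requests, detected CSRFs, false positives, false negatives)
rows = {
    "Oxid e-shop 4.9.8": (21, 4, 1, 0),
    "Prestashop 1.6.1.2": (12, 1, 1, 0),
    "Simple Machine Forums 2.0.12": (9, 0, 0, 0),
}

# Column-wise totals, in the same order as the tuple fields.
sensitive, detected, false_pos, false_neg = (
    sum(column) for column in zip(*rows.values())
)

# Assumption: every detection that is not a false positive is a true positive.
true_pos = detected - false_pos

precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)
```

Under that assumption, the three true positives match the three Oxid e-shop vulnerabilities mentioned in the text, giving a precision of 3/5 and a recall of 3/3 on this small testbed.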
6. A. Doupé, M. Cova, and G. Vigna, “Why Johnny can’t pentest: An analysis of black-box web vulnerability scanners,” in Proc. 7th Int. Conf. Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2010), Bonn, Germany, July 8–9, 2010, pp. 111–131.
7. A. Barth, C. Jackson, and J. C. Mitchell, “Robust defenses for cross-site request forgery,” in Proc. 2008 ACM Conf. Computer and Communications Security (CCS 2008), Alexandria, VA, pp. 75–88. doi: 10.1145/1455770.1455782.
8. M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning. Cambridge, MA: MIT Press, 2012.
9. M. W. Kattan, D. A. Adams, and M. S. Parks, “A comparison of machine learning with human judgment,” J. Manage. Inf. Syst., vol. 9, no. 4, pp. 37–57, Mar. 1993. doi: 10.1080/07421222.1993.11517977.
10. D. A. Ferrucci, “Introduction to ‘This is Watson’,” IBM J. Res. Develop., vol. 56, no. 3.4, pp. 1:1–1:15, May 2012. doi: 10.1147/JRD.2012.2184356.
11. D. Silver et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016. doi: 10.1038/nature16961.
12. M. Bugliesi, S. Calzavara, R. Focardi, and W. Khan, “CookiExt: Patching the browser against session hijacking attacks,” J. Comput. Security, vol. 23, no. 4, pp. 509–537, 2015. doi: 10.3233/JCS-150529.
13. S. Calzavara, G. Tolomei, A. Casini, M. Bugliesi, and S. Orlando, “A supervised learning approach to protect client authentication on the web,” ACM Trans. Web, vol. 9, no. 3, pp. 15:1–15:30, 2015. doi: 10.1145/2754933.
14. S. Calzavara, M. Conti, R. Focardi, A. Rabitti, and G. Tolomei, “Mitch: A machine learning approach to the black-box detection of CSRF vulnerabilities,” in Proc. IEEE European Symp. Security and Privacy (EuroS&P 2019), Stockholm, Sweden, June 17–19, 2019, pp. 528–543. doi: 10.1109/EuroSP.2019.00045.
15. G. Pellegrino, M. Johns, S. Koch, M. Backes, and C. Rossow, “Deemon: Detecting CSRF with dynamic analysis and property graph,” in Proc. 2017 ACM SIGSAC Conf. Computer and Communications Security (CCS 2017), Dallas, TX, Oct. 30–Nov. 3, 2017, pp. 1757–1771. doi: 10.1145/3133956.3133959.
16. S. Calzavara, M. Conti, R. Focardi, A. Rabitti, and G. Tolomei, “mitch,” GitHub. Accessed on: Jan. 15, 2020. [Online]. Available: https://github.com/alviser/mitch
17. Portswigger Web Security, “Using Burp to Test for Cross-Site Request Forgery (CSRF),” Knutsford, UK. [Online]. Available: https://support.portswigger.net/customer/portal/articles/1965674-using-burp-to-test-for-cross-site-request-forgery-csrf-
18. A. Riancho, “w3af,” GitHub. Accessed on: Jan. 15, 2020. [Online]. Available: https://github.com/andresriancho/w3af/issues/120

Stefano Calzavara is a tenure-track assistant professor at Università Ca’ Foscari Venezia, Italy. His research interests include formal methods and web security. Calzavara received a Ph.D. in computer science from the Università Ca’ Foscari Venezia, Italy, in 2013. Contact him at [email protected].

Mauro Conti is a full professor at the University of Padua, Italy. His research interests include computer security and privacy. Conti received a Ph.D. in computer science from Sapienza University of Rome, Italy, in 2009. Contact him at [email protected].

Riccardo Focardi is a full professor at Università Ca’ Foscari Venezia, Italy. His research interests include computer security and formal methods. Focardi received a Ph.D. in computer science from the University of Bologna, Italy, in 1999. Contact him at focardi@unive.it.

Alvise Rabitti is a security officer at Università Ca’ Foscari Venezia, Italy. His research interests include web security and privacy. Rabitti received a B.S. in computer science from the Università Ca’ Foscari Venezia, Italy, in 2013. Contact him at [email protected].

Gabriele Tolomei is an associate professor at Sapienza University of Rome, Italy. His research interests include machine learning and web search. Tolomei received a Ph.D. in computer science from the Università Ca’ Foscari Venezia, Italy, in 2011. Contact him at [email protected].