Testing and Securing Web Applications
ISBN 0367333759, 9780367333751
By Ravi Das
and Greg Johnson
First edition published 2020
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The authors
and publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or
contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. For works that are not available on CCC please contact [email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Contents
Acknowledgments
About the Authors
1 Network Security
Introduction
A Chronological History of the Internet
The Evolution of Web Applications
The Fundamentals of Network Security – The OSI Model
The OSI Model
What Is the Significance of the OSI Model to Network Security?
The Classification of Threats to the OSI Model
The Most Probable Attacks
Assessing a Threat to a Web Application
Network Security Terminology
The Types of Network Security Topologies Best Suited for Web Applications
The Types of Attack That Can Take Place against Web Applications
How to Protect Web Applications from DDoS Attacks
Defending against Buffer Overflow Attacks
Defending against IP Spoofing Attacks
Defending against Session Hijacking
Defending Virus and Trojan Horse Attacks
Viruses
How a Virus Spreads Itself
The Different Types of Viruses
Defending Web Applications at a Deeper Level
The Firewall
Types of Firewalls
Blacklisting and Whitelisting
How to Properly Implement a Firewall to Safeguard the Web Application
Site Validity
Proving Your Web App Is What It Says It Is
Testing Your Web App’s Confidentiality and Trust
What Kind of Trust?
Spoofing and Related Concerns
Conclusion
Resources
References
2 Cryptography
An Introduction to Cryptography
Message Scrambling and Descrambling
Encryption and Decryption
Ciphertexts
Symmetric Key Systems and Asymmetric Key Systems
The Caesar Methodology
Types of Cryptographic Attacks
Polyalphabetic Encryption
Block Ciphers
Initialization Vectors
Cipher Block Chaining
Disadvantages of Symmetric Key Cryptography
The Key Distribution Center
Mathematical Algorithms with Symmetric Cryptography
The Hashing Function
Asymmetric Key Cryptography
Public Keys and Public Private Keys
The Differences Between Asymmetric and Symmetric Cryptography
The Disadvantages of Asymmetric Cryptography
The Mathematical Algorithms of Asymmetric Cryptography
The Public Key Infrastructure
The Digital Certificates
How the Public Key Infrastructure Works
Public Key Infrastructure Policies and Rules
The LDAP Protocol
The Public Cryptography Standards
Parameters of Public Keys and Private Keys
How Many Servers?
Security Policies
Securing the Public Keys and the Private Keys
Message Digests and Hashes
Security Vulnerabilities of Hashes
A Technical Review of Cryptography
4 Threat Hunting
Not-So-Tall Tales
Nation-State Bad Actors: China and Iran
Threat Hunting Methods
MITRE ATT&CK
Technology Tools
The SIEM
EDR
EDR + SIEM
IDS
When 1 + 1 + 1 = 1: The Visibility Window
Threat Hunting Process or Model
On Becoming a Threat Hunter
Threat Hunting Conclusions
Resources
I would like to thank John Wyzalek, our editor, for his guidance to the comple-
tion of this book. Many special thanks go to Greg Johnson, co-author, and David
Pearson, contributor.
Ravi Das
As my friend Ravi Das will attest, writing a book is a painstaking labor of love
which is only accomplished by the love, support, and assistance of many. The
phrase, “It takes a village” is so true here. There are many who played a role and
without whom this work would not have come about.
First and foremost, the majority of the credit goes to my supportive family, par-
ticularly my wife Kelly of 35 years who has supported me in so many endeavors – those
that failed as well as those that didn't – and who rarely complains about anything.
Nobody's perfect, but I have to say she's pretty darn close!
Next, this work couldn't possibly have materialized without Curt Jeppson,
dear friend and colleague of many years, consultant, and VP of engineering for
Webcheck Security, as well as his own consultancy, Cyrilliant. Curt is brilliant,
honest, kind, a hard worker, willing, and oh, did I say brilliant? It's easy to build a
technology business when you're surrounded by people like Curt, without whom
my current success in business would be nil.
Significant gratitude goes to Secuvant Security for the support and encouragement
of my dear colleagues, EVP and advisor Jeff Smith, and CEO Ryan Layton,
and also to the prolific SOC manager and cyclist, Eric Peterson, along with the
brilliance of senior analyst, Chris Signorino. You guys are smart, cooperative, and
it's no wonder Secuvant is so successful with you two leading the SOC.
Finally, nothing happens in life without many doors and windows having been
opened by the divine. I am thankful to a loving Heavenly Father who really is
there, who hears and answers prayers, and who blesses me with so many undeserved
opportunities and blessings.
Greg Johnson
About the Authors
Ravi Das is a business development specialist for the AST Cybersecurity Group,
Inc., a leading cybersecurity content firm located in the greater Chicago area.
Ravi holds a Master of Science degree in agribusiness economics (thesis in international
trade) and a Master of Business Administration in management information
systems.
Ravi has authored five books, with two more upcoming ones on artificial intelligence
in cybersecurity and cybersecurity risk and its impact on cybersecurity
insurance policies.
◾ PCI
◾ HIPAA
◾ ISO 27001
◾ NIST
◾ SOC 1 and SOC 2
◾ GDPR/CCPA
◾ FedRAMP
When Greg is not providing cyber solutions for his clients, he can be found
spending time with his amazing wife Kelly, playing with his grandchildren, or
rehearsing or performing with the world-renowned Tabernacle Choir on Temple
Square.
Having used Wireshark ever since it was Ethereal, David Pearson has been
analyzing network traffic for well over a decade. He has spent the majority of his
professional career understanding how networks and applications work. David
holds computer security degrees from the Rochester Institute of Technology (BS)
and Carnegie Mellon University (MS).
Chapter 1
Network Security
Introduction
Everybody remembers the 1990s quite well. The stock market was at an all-time
high back then, and jobs were plentiful. I even remember the comment that one
recruiter made: “If this candidate can even breathe, he is hired.” Those were good
times for sure. But probably the one thing that will be remembered the most is the
era of the .com businesses. It seemed that any new idea that popped up
had to be branded as such.
Of course, the .com domain extension was also quite popular. In fact, during that
time frame, a .com domain could easily fetch $20,000 or more if the actual
domain name was in huge demand. It seemed like venture capitalists and angel
investors were pumping money into newly founded companies when
they didn’t even have a business plan or even a business model. If a company had a .com in
its name, it was well-funded, and no further questions were asked. Because of this,
even the NASDAQ reached record highs. Also, who could forget that famous
slogan by Sun Microsystems: “We’re the dot in .com”? There were other marketing
advertisements like this, and all of the tech companies were riding high.
For example, tech giants like Microsoft, Cisco, Adobe, and Oracle all prospered
greatly. With Microsoft, all of its software platforms saw a huge uptick,
especially the Exchange, Office, and SQL Server product lines. Even its certifications
were in huge demand, most notably the Microsoft Certified Systems
Engineer (MCSE).
Cisco benefited primarily from its network offerings, Adobe was most noted
(and still is) for its Portable Document Format (PDF), and Oracle was probably
the most widely used and deployed database for all sorts of software applications.
The .com boom also gave birth to a new concept: rather than having to go to a
brick-and-mortar store to buy products and services, one could now purchase these
easily via online commerce.
Of course, this concept was still in its infancy, and it was nowhere near
where it is today, when some of the giants of the retail industry have
the largest e-commerce storefronts ever imagined. Prime examples of this
are Walmart, Costco, and Ace Hardware. The primary advantages of online shopping
became quite obvious to the customer.
First, the customer did not have to waste time traveling to a store
to purchase the products that they needed. With just a few simple clicks of the
mouse, they could select whatever they chose and enter their credit card information.
Within just a matter of minutes, the customer would be checked out, and
there was no waiting in line at the checkout lane. This would become the second
primary advantage of online shopping.
The third primary advantage that would be realized from these online stores
was that the products could be delivered straight to the door of the customer
who purchased them. There was no need to carry heavy and large boxes in
shopping carts to the trunks of their cars; in just a matter of days, they would all appear
on the doorstep.
The fourth advantage of the online store as it evolved was that the products that were
purchased could be sent directly to another recipient. For example, this became a
huge boon during the holiday season. Once again, rather than having to fight in
checkout lines during last-minute Christmas shopping, the products (or gifts)
that were purchased online could be sent to the recipient directly, even completely
gift wrapped. So yes, the online store, or as it would eventually become
known, electronic commerce (or simply e-commerce), would grow to become
a powerful asset among the marketing tools of any retailer, from the largest
of the Fortune 500 all the way down to the smallest of the mom-and-pop
stores.
But keep in mind that these e-commerce storefronts were quite simple in design
from a technical perspective. Simply put, while they could handle a large volume
of shopping and financial transactions, they were nothing at all like they are
today.
These e-commerce sites had just a simple front end, which was what the
end user would see. This included pictures and pricing of the various product lines,
as well as any downloadable brochures or catalogs. Then there was the back end,
which was essentially the database. This was where either a SQL Server or an Oracle
database was designed, implemented, and made use of.
Essentially, these databases would contain the personally identifiable information
(PII) of the customer, their transaction history, and, if applicable, even
their respective credit card and/or banking information so that the customer
would not have to repeatedly enter this each and every time they visited an
e-commerce site. Back then, these applications were still small
and did not occupy a large share of an organization’s entire information technology
(IT) infrastructure.
Also, the processing power and the bandwidth that were required were far
lower than what is required today. In addition, these various e-commerce sites were
referred to as “web applications.” After all, they were still applications residing on a
server somewhere (whether it was on-premises or hosted through a cloud provider)
and could only be accessed through an Internet connection (for example, back
then, it was either a dial-up modem or Ethernet) – thus, the term “web” became
quite applicable as well.
But today’s web applications (or “apps” for short) have become vastly more
complex in nature. For example, millions of business
transactions can take place within a matter of just a few seconds, and the
database size has exploded in terms of the capacity of data and information that it
can store. This is in part due to the fact that today, a business can use the tools of
data warehousing and even big data not only to track the buying trends of customers
today but even to predict their future buying patterns using the tools of artificial
intelligence (AI) and machine learning (ML).
The web applications today have become so sophisticated that they can even
create and customize an automatic shopping experience for each customer every
time they visit. For example, if a customer likes products in the XYZ category,
then the e-commerce site will advertise not only the products in that category but
related ones as well, and even offer various coupons and discounts – some are even
delivered straight to the customer’s smartphone.
It should also be noted that the days of using a traditional computer to launch
and view these kinds of web applications are pretty much a thing of the past;
nearly everything is now delivered straight to the smartphone or wireless device, which
places far greater demands on the user interface and user experience
(also known as UI/UX) in the design and development of a particular web
application.
Because of all of this, web applications now occupy a much larger space within
the IT infrastructure of a business or a corporation. They no longer touch just
a front end and a back end; today’s web apps affect just about every corner of it.
Keep in mind that back in the late 1990s, as these e-commerce sites started to evolve,
nobody really paid too much attention to a topic that has become critical today:
cybersecurity.
For example, the issues of Bitcoin, cryptojacking, ransomware, phishing, business
email compromise (BEC), data leaks, data hacks, etc., were simply not heard
of or even conceived of back then. True, there were other cyberattacks that were
known, such as the traditional Trojan horses and SQL injection attacks, but they
did not rise to the level of severity that we know today.
Back then, if an e-commerce site simply had Secure Sockets Layer (SSL)
installed, that was good enough. But as mentioned, today’s web apps have become
extremely complex, which has made them a prime target for the sophisticated cyberattacker
of today. As a result of all this, the web apps of today have to be
tested from the inside out in terms of security before they can be deployed and
launched to the public for business transactions to occur.
That is the primary objective of this book – to address those specific areas that
have to be tested before a web app can be considered and deemed to be 100%
secure. As mentioned, since the web apps of today occupy a much larger space with
regard to the IT infrastructure, the number of areas that have to be tested has also
increased greatly.
In this regard, specifically, five key areas need to be targeted:
1. Network Security:
This encompasses the various network components that are involved in order
for the end user to access the particular web app, from the server where it is
stored to the device to which it is transmitted, whether that is a physical computer
or a wireless device (such as a smartphone).
2. Cryptography:
This area includes not only securing the lines of network communications
between the server upon which the web app is stored and the device from which it
is accessed but also ensuring that all PII (most notably the financial
information that is being used, such as credit card numbers) that is stored
remains in a ciphertext format and that its integrity remains intact while in
transmission.
3. Penetration Testing:
This involves breaking apart a web app from the external environment
and going inside of it in order to discover all weaknesses and vulnerabilities
and making sure that they are patched before the actual web app is
launched into a production state of operation.
4. Threat Hunting:
This is similar to penetration testing, but instead it involves completely
breaking down a web app from the internal environment to the external one
in order to discover all security holes and gaps.
5. The Dark Web:
This is the part of the Internet that is not openly visible to the public. As
its name implies, this is the “sinister” part of the Internet and is, in fact, where
much of the PII that is hijacked in a web app cyberattack is sold to other
cyberattackers in order to launch more covert and damaging threats against
potential victims.
Since this first chapter deals with network security, obviously, its major
component is that of the Internet. Thus, it is imperative to take a chronological
look at its history.
A Chronological History of the Internet
1965:
Two computers at the MIT Lincoln Lab communicate with one another
using packet-switching technology.
1968:
Bolt, Beranek and Newman, Inc. (BBN) unveils the final version of the Interface
Message Processor (IMP) specifications. The work on ARPANET now starts.
1969:
On October 29, UCLA’s Network Measurement Center, the Stanford Research
Institute (SRI), University of California-Santa Barbara, and University of Utah
install various network nodes. The first message is “LO,” which was an attempt
by student Charles Kline to “LOGIN” to the SRI computer from the university.
However, the message failed because the SRI system crashed.
1972:
BBN’s Ray Tomlinson introduces network email. The International Network
Working Group (INWG) forms to address the need for establishing standard
protocols.
1973:
Global networking becomes a reality as University College London
(England) and NORSAR (Norway) connect to the
ARPANET. The term Internet is now born.
1974:
The first Internet service provider (ISP) comes into being with the introduction
of a commercial version of ARPANET known as Telenet.
1974:
Vinton Cerf and Bob Kahn publish “A Protocol for Packet Network
Intercommunication,” which details the design of the Transmission Control Protocol
(TCP).
1979:
USENET forms to host news and discussion groups.
1981:
The National Science Foundation (NSF) provides a grant to establish the
Computer Science Network (CSNET) to provide networking services to
university-based computer scientists.
1982:
TCP and Internet Protocol (IP), as the protocol suite commonly known as
TCP/IP, emerge as the main protocol for ARPANET. TCP/IP remains the
standard protocol for the Internet.
1983:
The Domain Name System (DNS) establishes the domain extensions .edu,
.gov, .com, .mil, .org, .net, and .int as the system for naming websites.
1985:
Symbolics.com, the website for Symbolics Computer Corporation in
Massachusetts, becomes the first registered domain.
1986:
The NSF’s NSFNET goes online to connect supercomputer centers at
56,000 bits per second – the speed of a typical dial-up computer modem. The
NSFNET was essentially a network of networks that connected academic
users along with ARPANET.
1987:
The number of server hosts on the Internet exceeds 20,000. Cisco ships its
first router.
1989:
World.std.com becomes the first commercial provider of dial-up access to the
Internet.
1990:
Tim Berners-Lee, a scientist at CERN, the European Organization for
Nuclear Research, develops Hypertext Markup Language (HTML).
1991:
CERN introduces the World Wide Web to the public for the very first time.
1992:
The first audio and video are downloaded over the Internet. The phrase “surfing
the Internet” is now born.
1993:
The White House and United Nations go online.
1994:
Netscape Communications is born. Microsoft creates a web browser for
Windows 95, known as “Internet Explorer.”
1995:
CompuServe, America Online, and Prodigy begin to provide Internet
access.
1996:
A 3D animation dubbed “The Dancing Baby” becomes the first video online
to go viral.
1998:
The Google search engine is born.
IP version 6 is introduced to allow for future growth of Internet addresses.
The current most widely used protocol is version 4. IPv4 uses 32-bit addresses,
allowing for 4.3 billion unique addresses; IPv6, with 128-bit addresses, will
allow 3.4 × 10^38 unique addresses, or 340 trillion trillion trillion.
1999:
Peer-to-peer file sharing is born with the launch of Napster.
2000:
One of the first high-profile cyberattacks is launched, as Yahoo! and eBay are hit by a large-scale
distributed denial of service (DDoS) attack.
2003:
The SQL Slammer worm is launched and spreads itself worldwide in just 10
minutes.
The blog publishing platform WordPress is launched.
2004:
Facebook goes online, and the era of social networking is now launched.
2005:
YouTube is launched.
2006:
Twitter is officially launched.
2010:
The social media sites Pinterest and Instagram are launched.
2013:
Fifty-one percent of U.S. adults report that they bank online, according to a
survey conducted by the Pew Research Center.
2015:
Instagram, the photo-sharing site, reaches 400 million users, outpacing
Twitter, which would go on to reach 316 million users by the middle of the
same year.
2016:
Google launches Google Assistant, joining virtual personal assistants (VPAs) such as Amazon’s Alexa,
Siri from Apple, and Cortana from Microsoft.
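The address counts quoted in the 1998 entry follow directly from the address widths. As a quick check of that arithmetic (the variable names are illustrative only), JavaScript's BigInt type can compute both totals exactly:

```javascript
// Address-space sizes implied by the address widths quoted in the 1998 entry.
const ipv4Addresses = 2n ** 32n;   // 32-bit addresses
const ipv6Addresses = 2n ** 128n;  // 128-bit addresses

console.log(ipv4Addresses.toString());
// 4294967296 (about 4.3 billion)
console.log(ipv6Addresses.toString());
// 340282366920938463463374607431768211456 (about 3.4 × 10^38)

// "340 trillion trillion trillion" is 340 × (10^12)^3 = 3.4 × 10^38.
const trillion = 10n ** 12n;
const approxIpv6 = 340n * trillion ** 3n;

// Integer division gives 1n: the rounded figure agrees with the exact count
// to its leading digits.
console.log(ipv6Addresses / approxIpv6);
```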
A lot of web apps now demand some degree of end-user interaction. In many
instances, this simply involves inputting some data, primarily in the way of a con-
tact form. For example, most static sites include the following line of code when
creating their contact page:
In the case of a data validation error, this meant responding with the same form
page. Because Hypertext Transfer Protocol (HTTP) is stateless, the values that
the end user entered would be lost. A common solution for this was to populate
the value attributes of the form fields with the submitted data when constructing the
response. As a result, when the user loaded the form the first time, a given input
might look like this:
The end user would then fill out the form and include a business or personal
email address. They would then enter the information, press Submit, and if there
was some sort of data validation error, the form would then be re-created with the
following:
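A minimal sketch of that round trip, written in JavaScript for illustration rather than in the server-side languages of the era. The function name renderEmailInput, the field name email, and the placeholder address are assumptions and are not taken from the original listing. The first call models the initial page load, and the second models the form being re-created after a failed validation.

```javascript
// Hypothetical server-side helper: builds the email field of a contact form.
// The submitted value (if any) is copied into the value attribute so the user
// does not have to retype it after a validation error.
function renderEmailInput(submittedEmail) {
  const value = submittedEmail === undefined ? "" : submittedEmail;
  // Naive string concatenation with no encoding, as was common at the time.
  return '<input type="text" name="email" value="' + value + '" />';
}

// First load: nothing has been submitted yet, so the field is empty.
console.log(renderEmailInput());
// <input type="text" name="email" value="" />

// After a failed validation: the response echoes back what the user typed.
console.log(renderEmailInput("mic@example.com"));
// <input type="text" name="email" value="mic@example.com" />
```

Note that this version writes the submitted value into the page verbatim, which is convenient but unsafe whenever the value can contain markup characters.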
This approach was used for more than 10 years, even through the usage of web
domain-specific languages like PHP and classic ASP.
In some cases, the HTML responses were constructed with simple concatenation,
but as technologies progressed, often they used inline code:
But there was a big security problem with this. For example, the emailAddr
of [mic" onclick="alert(1);] would be inlined as follows (square brackets are only
included to delineate the user-supplied input):
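To make the problem concrete, the sketch below inlines that emailAddr value with plain concatenation. The helper name renderUnsafe is hypothetical, and JavaScript stands in for whatever server-side language produced the page.

```javascript
// Hypothetical helper that inlines user input into markup without any encoding.
function renderUnsafe(emailAddr) {
  return '<input type="text" name="email" value="' + emailAddr + '" />';
}

// The user-supplied value from the text (without the delimiting square brackets).
const emailAddr = 'mic" onclick="alert(1);';

console.log(renderUnsafe(emailAddr));
// <input type="text" name="email" value="mic" onclick="alert(1);" />
```

The first double quote in the input closes the value attribute early, so the remainder of the input is parsed as a new onclick attribute whose script runs when the field is clicked.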
Of course, this could be made safe with input sanitization and/or output encoding,
but the primary caveat was that the developer had to remember to apply
this protection in each and every location where it was required in the source
code that powered the web application.
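As a minimal sketch of output encoding (escapeHtml and renderSafe are illustrative names, not part of any framework), the characters that are significant in HTML are replaced with entities at the point where the value is written into the page:

```javascript
// Encode the five characters that are significant in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")   // must run first so later entities are not double-encoded
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical helper: an email form field whose value is encoded on output.
function renderSafe(userInput) {
  return '<input type="text" name="email" value="' + escapeHtml(userInput) + '" />';
}

console.log(renderSafe('mic" onclick="alert(1);'));
// <input type="text" name="email" value="mic&quot; onclick=&quot;alert(1);" />
```

The encoded quotes stay inside the value attribute, so no new attribute is created. The caveat from the text still applies: the call to escapeHtml has to be repeated at every place where user input is written out.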
This was one factor, but certainly not the only one, that catalyzed the move
toward server-side web-specific templates such as ASP.NET WebForms and JavaServer
Pages (JSP):
As web applications further evolved, especially on the client side, there was the
controller, which handled logic related to presentation, for example, event handlers
for user interactions such as button clicks.
There was also a viewmodel, a JavaScript object with a two-way binding to
the view, which was generally a template or partial template that was rendered
on the client side. Two-way binding means that a change to the viewmodel is
reflected in the view, and a change made in the view is written back to the
viewmodel. As a JavaScript object, the viewmodel looks like this in terms of code:
// Viewmodel: a plain object whose fields are bound to the form inputs
var user = {
  userId: 1,
  name: 'mic',
  email: '[email protected]'
};
This source would then be bound to a contact form, which resembled this:
<form>
  <input type="hidden" name="id" value="{{userId}}" />
  Username: <input type="text" name="username" value="{{name}}" />
  Email: <input type="text" name="email" value="{{email}}" />
  <button name="save">Save</button>
</form>
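The binding itself can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular framework implements it: renderTemplate, bindViewModel, onInputChanged, and the onRender callback are hypothetical names, and a string stands in for the rendered DOM.

```javascript
// Replace each {{key}} placeholder with the matching viewmodel property.
// (A production template engine would also HTML-encode the inserted values.)
function renderTemplate(template, viewModel) {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, key) =>
    key in viewModel ? String(viewModel[key]) : placeholder
  );
}

// Wrap the viewmodel in a Proxy so that every property write re-renders the view.
function bindViewModel(viewModel, template, onRender) {
  const bound = new Proxy(viewModel, {
    set(target, key, value) {
      target[key] = value;                         // update the model
      onRender(renderTemplate(template, target));  // model -> view
      return true;
    }
  });
  onRender(renderTemplate(template, viewModel));   // initial render
  return bound;
}

const nameTemplate = 'Username: <input type="text" name="username" value="{{name}}" />';
const renderedViews = [];
const boundUser = bindViewModel({ userId: 1, name: "mic" }, nameTemplate,
  (html) => renderedViews.push(html));

// View -> model: an input event handler would write the typed value back.
function onInputChanged(field, typedValue) {
  boundUser[field] = typedValue;
}

onInputChanged("name", "michael");
console.log(boundUser.name);       // michael
console.log(renderedViews.length); // 2 (initial render plus one update)
```

The Proxy covers the model-to-view direction, while the input handler covers view-to-model. Frameworks differ widely in how they detect changes, so the Proxy here is only one possible design choice.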
As of today, the three most popular front-end frameworks are React, Vue, and
Angular. The general trend today in web application development is to build the UI/UX
as self-contained components, often with hierarchical nesting.
Here is a more detailed history with regard to the evolution of the specific technologies
and tools that are used in creating web applications today:
1990:
HTML is launched.
1993:
Table-based websites are born. They offer a better content arrangement and
navigation style.
1994:
The World Wide Web Consortium (W3C) is born in an effort to create a
common set of standards.
1996:
Cascading Style Sheets (CSS) and Macromedia Flash (Macromedia was eventually bought by Adobe)
are launched.
1997:
HTML Version 4.0 comes out.
1998:
CSS2 and PHP 3 (PHP: Hypertext Preprocessor) are released.
2000:
The usage of web content editors and content management systems is born.
In this regard, the design components are actually specified in the CSS
source code rather than in the HTML itself.
2001–2002:
In a major move for web application development, the navigation bars start
to move to the top of websites and drop-down menus become the de facto
standard.
2003:
WordPress (a content management system) officially launches.
JavaScript is now used for animation purposes without using Flash.
2005:
Git (a version control system for source code development) is born, and web
applications now start to move to smartphones.
2006:
jQuery (a JavaScript library) is launched.
2008:
For web application appearance, thin and tall layouts are preferred over wide
and short layouts.
2009:
Node.js (which is an open-source server environment) is released.
2010:
AngularJS (a structural framework for building dynamic web applications)
is launched.
NPM (Node Package Manager, an online repository for the publishing of
open-source Node.js projects) is launched.
The use of WordPress-based plugins and themes gains widespread
popularity.
2011:
Bootstrap (an HTML, CSS, and JS framework for developing responsive,
mobile-based web applications) is launched.
Laravel (a PHP web framework) is launched.
2012:
Webpack (a module bundler) is launched.
Grunt (a build/task manager written on top of Node.js) is launched.
Composer (a tool for dependency management in PHP) is launched.
2013:
React (a JavaScript library created by Facebook) is released.
2014:
Vue.js (a progressive framework for building UI/UX interfaces) is launched.
2017:
Yarn (a new JavaScript package manager built by Facebook, Google,
Exponent, and Tilde) is released.
In today’s software development world, gone are the days when a developer
would be sitting with others in front of a computer screen simply churning out
code in order to create a particular web application. Rather, the web apps of
today take large project management teams, which are located worldwide and
virtually.
Thus, the use of project management methodologies is now the norm, and it is
important to look at the timeline of their evolution and how they have
contributed to creating efficient and large-scale web applications.
The 1950s:
This era saw the birth of structured programming. Block structures,
subroutines, and FOR and WHILE loops are extensively used in creating source
code.
The 1960s:
This era saw the launch of the waterfall methodology. This is a sequential,
noniterative process that has the following steps:
– Requirement Analysis
– Design
– Implementation
– Source Code Verification
– Maintenance
The 1970s:
This era saw the rise of iterative and incremental methodology. The purpose
of this framework is to develop source code through repeated cycles. This is
done in smaller chunks at a time in order to make sure that any development
issues have been resolved in previous iterations.
The 1980s:
During this time frame, three major software development methodologies
were created, as follows:
1. Prototyping:
This involves creating various prototypes of software applications before
they are launched into a production environment. The steps include the
following:
• Identify the basic requirements
• Develop the initial prototype
4. Extreme Programming:
Short software development cycles and quick releases are the norm here,
as well as many checkpoints in order to confirm the validity of the source
code.
5. Crystal:
This is designed to be a lightweight approach, with a specific set of
policies, procedures, and processes in the source code development.
6. Feature-Driven Development:
The goal here is to deliver source code modules in a repetitive and timely
manner. It consists of the following components:
• Develop the overall model
• Build the feature list
• Plan by the feature
• Design by the feature
• Build the feature
7. The Agile Unified Process:
This methodology applies to the following:
• Test-driven development (TDD)
• Agile modeling (AM)
• Agile change management (ACM)
• Database refactoring
8. Disciplined Agile Delivery:
This is a process-driven framework in which a decision-making process is
enabled around incremental and iterative solution delivery.
9. The Scaled Agile Framework (SAFe):
This is a software development methodology that consists of integrated
patterns meant for enterprise-scaled lean-agile–based source code
development.
10. Large-Scale Scrum (LeSS):
This is the scrum methodology, but applied to enterprise-scale web
application development.
Layer: Application
Description: This layer interfaces directly to the applications and performs
common services for the application processes.
Protocols: POP, SMTP, DNS, FTP, Telnet

Layer: Physical
Description: This supports the physical properties of the various
communications media, as well as the electrical properties and the
interpretation of the exchanged signals. These refer to the network interface
card (NIC), Ethernet cabling, etc.
Protocols: IEEE 1394, DSL, ISDN
1. The Data:
After the data packets leave a network infrastructure, they are extremely
vulnerable in terms of interception and even loss of integrity by a malicious third
party, such as a cyberattacker.
2. The Network Connection Points:
Anywhere that devices are connected together via a network medium
(whether it is hard wired or wireless) is prone to a cyberattack. As a result,
these weak spots must obviously be protected to the greatest extent possible,
and this is where the role of endpoint security and threat hunting comes into
play. This will be reviewed in much greater detail in Chapter 4 of this book.
3. The Individuals Involved:
Individuals, especially those employed by a business or a corporation, can
pose the greatest security threat. This can be through sheer ignorance of
the security policies that have been set, having a malicious intent (in this
instance, insider attacks are very common and very hard to detect), or even
just by simple, nonintentional errors.
1. Intrusion:
This category includes the various forms of cyberattacks that are meant to
breach the lines of defense and gain unauthorized access to a particular
system.
2. Blocking:
This category includes cyberattacks that are designed to prevent legitimate
end-user access to a particular system. These kinds of cyberattacks are also
known as denial-of-service (DoS) attacks or, when launched from many
machines at once, distributed denial-of-service (DDoS) attacks. The purpose
of this kind of cyberattack is not to cause damage to your network
infrastructure per se, but simply to block legitimate end users from accessing
the shared resources that are available on one or more servers.
3. Malware:
This is a general, all-purpose term for a piece of software that has any
malicious intent built into it. For instance, this can include viruses, Trojan horses,
and even spyware. Malware has been deemed to be the most common threat to
any network infrastructure, largely because many forms of it have been designed
to spread on their own, quickly and covertly. The following are the most
common forms of malware:
– Viruses:
These are specifically defined as “… a program that can infect other
programs by modifying them to include a possibly evolved copy of itself.”
The most common method by which viruses are spread is via email, when
the end-user’s address book has been hacked into by the cyberattacker.
– Trojan Horses:
As a brief background, this specific term is borrowed from an ancient tale.
The city of Troy was under attack for a long time, but the intruders for
some reason or another could not gain entrance through the main gates.
Because of this, the attackers constructed a large wooden horse and
left it one night outside the city. The citizens of the city assumed that this
was some sort of gift and rolled it into the main square. But unbeknownst to
these citizens, the horse actually contained a few of the intruders who
were trying to gain access. When the citizens were not looking,
these intruders left the horse and opened up the gates so that
all of the intruders could enter the city of Troy. This is how the electronic
version of the Trojan horse works. For example, the intended victim is
offered an enticing gift that gets installed onto their computer or wireless
device. But in the end, this is an actual piece of malicious software.
– Spyware:
This is yet another form of a Trojan horse, but it is more devastating and
covert. Probably the simplest example of this is the cookie. This is
a simple text file that your web browser creates and stores on your hard
drive. Cookies are designed so that a website can remember you between
visits, rather than having you re-enter the same details
over and over again. But in order for this to actually work, this text file
must be read by the website in question, and tracking cookies set by third
parties can be read across the many websites that embed them. Because of that,
your browsing history can be tracked, posing a network security risk.
– Key Loggers:
This is another form of malware, but it is a piece of malicious software
that covertly records your keystrokes. It can even take secret screenshots
of your computer or wireless device. The information and data that are
recorded are subsequently sent back to the cyberattacker. It is important
to note here that every single thing you type on your computer is
recorded.
4. Intrusions:
These are cyberattacks that are actually trying to intrude into a system in
your network infrastructure. It is important to note that it could be a hacker
breaking in from the external environment, or it very well could be an insider
attack from within your business or corporation. These kinds of cyberattacks
run the gamut from simply denying users access to a particular system (this is
known as “blocking”) to those kinds of hacks that are not too focused, such
as viruses and worms, as previously reviewed. Those types of intrusion
attacks that are much more targeted toward a specific system are typically
referred to as “hacking.” However, the cyberattacker has their own term for
this, and it is called “cracking,” which simply means invading a particular
system without any sort of explicit permission. In most of these cases,
the idea is to exploit some kind of software flaw or vulnerability in order to
gain covert access. However, another form of intrusion that does not require
that much technology is known as social engineering. In these instances, the
cyberattacker gains preliminary information and data about the target
organization. From this point, the goal is to use this knowledge in order to gain
more information by tricking the employees. The bottom line is that social
engineering is based upon how well the cyberattacker can manipulate people
and actually has very little to do with possessing deep levels of technological
skills. Another example of an intrusion attack is known as “war driving.” This
kind of scenario takes advantage of the weaknesses and vulnerabilities
that are found in a wireless network. War driving is actually an offshoot
of “war dialing.” This is when a cyberattacker sets up a computer to call
phone numbers in a sequential fashion until another computer picks up
the call, in order to gain access to that system. War driving applies the same
idea to detecting vulnerable wireless networks. Wireless connections
are not totally safe either, as their signals can extend well beyond 100 feet.
5. Distributed Denial of Service:
As was reviewed earlier, the cyberattacker in this kind of scenario does not
access a particular system, but simply blocks access for legitimate end users.
Typically, servers are the primary target. This kind of attack can be
specifically defined as follows: “It is characterized by an explicit attempt by a
cyberattacker to prevent legitimate end users of a service from using that shared
resource.” One form of a DDoS attack is when the cyberattacker floods the
targeted system with millions of false connection requests such that the
servers’ processing and computing power are brought down.
In compiling the total risk factor score, the first two categories are added
together, and the third category is subtracted from the summation of the first two.
Thus, the risk factor scores will range from –8 (which represents a very low risk and
high security type of web application) to 19 (which represents a very high risk and
low security type of web application). In other words, the lower the number, the
less vulnerable the web application is to a cyberattack, but the higher the number,
the greater the risk.
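The arithmetic described above can be sketched in a few lines. The function name and the sample category scores below are hypothetical; only the formula (first plus second minus third) and the overall range of –8 to 19 come from the text.

```python
MIN_SCORE = -8   # lowest total stated in the text (very low risk)
MAX_SCORE = 19   # highest total stated in the text (very high risk)


def total_risk_score(first, second, third):
    """Add the first two category scores and subtract the third."""
    total = first + second - third
    if not MIN_SCORE <= total <= MAX_SCORE:
        raise ValueError(f"total {total} is outside the documented range")
    return total


# Made-up category scores, for illustration only.
print(total_risk_score(7, 6, 4))
```

The range check is a design choice for this sketch: a total outside –8 to 19 would suggest that one of the category scores was entered incorrectly.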
1. The Firewall:
In general terms, a firewall can be defined as a “… barrier between a
network and the outside world.” It is important to note that a firewall can be a
stand-alone server, a router, or even just a software application.
But whatever form it takes, the primary objective of a firewall
remains the same: to filter the network traffic that is entering and exiting an
entire IT infrastructure. Very often a firewall will be situated just behind
what is known as a “proxy server.” The proxy server masks all of the IP
addresses that are currently being used in a network infrastructure and presents
just a single IP address to the outside world. Firewalls can also be used in
conjunction with intrusion detection systems (IDSs) in order to spot malicious
and/or anomalous activity.
2. Access Control:
This can be specifically defined as “… the aggregate of all measures that are
taken to limit access to resources.” Examples of this typically include logon
procedures that are set forth in the security policies of the business or
corporation, encryption (which will be covered in more detail in Chapter 2), or
any other defined method that is designed to prevent unauthorized access
to a shared, network-based resource. A subset of access control is
“authentication,” and this is defined as “… the process of determining whether the
credentials given by a user are authorized to access the network resource in
question.”
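As an illustration of checking whether the credentials given by a user are authorized, here is a minimal sketch that stores salted password hashes and verifies a logon attempt against them. The in-memory store, the function names, and the iteration count are assumptions made for this example; a production system would use a hardened credential store.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt):
    """Derive a salted hash so the plain-text password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(store, username, password):
    """Record a new user with a random salt and the derived hash."""
    salt = os.urandom(16)
    store[username] = (salt, hash_password(password, salt))


def authenticate(store, username, password):
    """Return True only if the supplied credentials match the stored record."""
    record = store.get(username)
    if record is None:
        return False
    salt, expected = record
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, hash_password(password, salt))


users = {}
register(users, "alice", "correct horse")
print(authenticate(users, "alice", "correct horse"))
print(authenticate(users, "alice", "wrong guess"))
```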
3. Nonrepudiation:
This can be defined as a method or a technique “… that is used to ensure that
someone performing an action on a computer, wireless device, or any other
1. The DDoS Attack:
The goal here is to deprive legitimate end users of access to a target server
on which the web applications reside. This type of attack really is not designed
to infiltrate a server per se, or even capture confidential information and data.
Rather, the primary goal of this kind of cyberattack is to prevent end users from
gaining access to shared resources. This kind of cyberattack is one of the most
common ones to occur on an almost daily basis. The concept behind the DDoS
attack is based upon the fact that any device, whether it is a computer, server,
wireless device, etc., has its own set of operational limits in terms of workload
capacity. This can be defined in a variety of ways, which include the
following variables:
– The number of end users that can be served at the same
time
– The file size of the shared resources
– The speed of the network data transmission (whether it is hard wired or
wireless)
– The amount of information and data that are stored which need to be
accessed
Probably one of the best ways to understand how a DDoS attack actually
works is to simulate this on an actual computer. In order to do this, you will
need to have access to a web server service. You can use either Microsoft
Internet Information Services (IIS) or the Apache HTTP Server. Make sure
that you install either of these two web servers on an older computer or
server in order to fully grasp how a DDoS really
works. The premise behind this is that the older the machine is, the more quickly
it will succumb to a simulated DDoS attack because of its slower
processing and computational power.
For purposes of this chapter, we will examine a simulated DDoS attack
using the Apache HTTP Server.
If you are using a Windows operating system:
– Download Apache for Windows at www.apache.org.
– Once it is downloaded, look for the following directory structure:
C:\Program Files\Apache Group\Apache2\conf
This will contain the httpd.conf file that you need.
/etc/httpd/conf
/etc/init.d/httpd start
/var/www/html
/etc/init.d/httpd start
/etc/init.d/httpd stop
Now, you are ready to launch your simulated DDoS attack. To do this, you
will execute the “ping” command. Follow these steps:
– Type in ping /?.
This particular command will show all of the options that are available
for the ping command, which include the following:
• -w: This option determines how many milliseconds the Ping utility
will wait for a response from your server.
For purposes of this demonstration, set this option to 0, so that the
Ping utility will not have to wait at all.
• -t: This instructs the Ping utility to keep sending data packets until
it is explicitly told to stop.
• -l: With this option you can change the size of the
data packets that are sent. It is important to keep in mind that an
IP-based data packet must be a finite size.
– At the command prompt (the options shown here are those of the Windows
version of ping), type in the following:
• ping <IP Address of target server goes here> -l 6500 -w 0 -t
What is happening now is that your Apache HTTP Server is being flooded
with an excessive number of ping requests, and eventually the processing
power will reach its limits, and the Apache Server should eventually crash
and not respond to any more requests. Generally speaking, the methods that
are used by a cyberattacker to launch a real-world DDoS attack are far more
sophisticated and covert. For example, he or she could very easily craft
a virus whose only purpose is to initiate a large-scale Ping attack against a
selected server or group of servers. This virus in turn could spread itself to
other servers on a global basis, causing a grave, cascading effect. As
mentioned previously, DDoS attacks are one of the most popular threat vectors
for the cyberattacker, for two primary reasons:
– It is easy to do.
– The cyberattacker can launch a DDoS attack from another computer or
server, thus masking their identity. If they launched it from their own
infrastructure, the data packets could be traced back to it.
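The idea of an operational limit can be illustrated with a small calculation. The sketch below assumes a server with a fixed capacity in requests per second that serves requests in arrival order without any prioritization; the function name and every number are hypothetical, chosen only to show how legitimate users are squeezed out as flood traffic grows.

```python
def legitimate_share_served(capacity, legitimate, flood):
    """Fraction of legitimate requests served per second.

    Simplifying assumption: when demand exceeds capacity, legitimate and
    flood requests are dropped in the same proportion.
    """
    total = legitimate + flood
    if total <= capacity:
        return 1.0
    return capacity / total


CAPACITY = 1000    # requests per second the server can handle (assumed)
LEGITIMATE = 200   # normal load from real end users (assumed)

for flood in (0, 800, 9800, 99800):
    share = legitimate_share_served(CAPACITY, LEGITIMATE, flood)
    print(f"flood={flood:>6} req/s -> {share:.1%} of legitimate requests served")
```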
2. The SYN Flood:
The SYN flood is a more sophisticated type of DDoS attack. In order to
launch this particular threat vector, the cyberattacker must have a deep
knowledge of how network connections are established and maintained to a
server that hosts a web application. When a session is initiated between the
client and the server using the TCP/IP protocol, a small buffer in memory
is created in order to hold what is known as a “handshaking” exchange of
messages, which actually establishes the network session.
This type of session establishing includes what is known as a SYN field, which
specifically identifies the sequences in the message exchange. The goal of the SYN
flood is to actually subvert this entire process. In other words, the cyberattacker
sends a number of connection requests in a rapid-fire fashion and then
never follows up on the reply that is sent back by that server. As a result,
this has the effect of leaving the network connection on the server half-open, and
the buffer memory that is allocated for these connections is thus reserved and not
available to other web applications that are hosted on that server.
Although the data packets that are in the memory buffer are discarded
after about three minutes without a reply from the sender (in this case, the
cyberattacker), the effect of sending hundreds or even thousands of requests
all at once makes it difficult for the legitimate requests for a network session
to get fully established.
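The effect of half-open connections can be modeled without sending any network traffic. The following sketch is a simplified model, not an actual TCP stack: the class name, the table size of 128, and the addresses are assumptions, while the 180-second timeout reflects the roughly three minutes mentioned above.

```python
class SynBacklog:
    """A fixed-size table of half-open connections with a timeout."""

    def __init__(self, size, timeout_seconds):
        self.size = size
        self.timeout = timeout_seconds
        self.half_open = {}  # client (ip, port) -> time its SYN arrived

    def _expire(self, now):
        # Discard entries that have waited longer than the timeout.
        self.half_open = {
            client: started
            for client, started in self.half_open.items()
            if now - started < self.timeout
        }

    def on_syn(self, client, now):
        """Reserve a slot for a new SYN; return False if the table is full."""
        self._expire(now)
        if len(self.half_open) >= self.size:
            return False
        self.half_open[client] = now
        return True

    def on_ack(self, client, now):
        """Complete the handshake for a client and free its slot."""
        self._expire(now)
        return self.half_open.pop(client, None) is not None


backlog = SynBacklog(size=128, timeout_seconds=180)

# The attacker sends 128 SYNs and never answers the server's replies.
for port in range(128):
    backlog.on_syn(("203.0.113.50", port), now=0)

print(backlog.on_syn(("198.51.100.7", 40000), now=1))    # table is full
print(backlog.on_syn(("198.51.100.7", 40000), now=181))  # old entries expired
```

In this model the legitimate client is turned away for as long as the attacker keeps the table full, which is why the defenses that follow concentrate on shrinking or avoiding the per-connection reservation.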
One of the primary reasons why SYN flooding is so popular is that any
server which engages in a TCP/IP-based network connection is vulnerable to
it, and pretty much all servers use the TCP/IP protocol in order to
establish a network connection between the end user and the web
application. But there are numerous ways in which to defend against a SYN flood
attack, and these are as follows:
– Using Micro Blocks:
This method changes the way in which the server allocates the
memory space for any connection request that it might receive. For example,
instead of allocating a complete connection, the server can be altered so
that it only allocates what is known as a “micro-record.” Newer methods
of micro blocks can allocate as little as 16 bytes for the incoming SYN
object.
– Using Bandwidth Throttling:
This is when the firewall, router, or IDS detects any sort of excessive
network-based traffic coming from one or more IP addresses. If this is
detected, the bandwidth to the server is drastically restricted and scaled
back.
– Using SYN Cookies:
With this method, the server does not immediately create a buffer space
in its memory in order to initiate the handshaking process. Instead, it first
sends a SYNACK message. This is an acknowledgement signal that
actually initiates the handshaking process. The SYNACK consists of a very
carefully crafted cookie, which is generated as a hash that contains the
port number, the IP address, and any other information and data coming
in from the computer of the end user that is requesting a network
connection. When the computer of the end user responds with an ACK (or
acknowledgement message), the information and data generated by that
cookie will be verified by the server to which the network connection will
be established. However, using this kind of defense mechanism is intense
in terms of processing power, and because of that, this method is not as
commonly used. In other words, this particular defense mechanism
illustrates the fact that there is a trade-off between performance and security.
– Using RST Cookies:
With this method, the server intentionally sends a wrong SYNACK
message back to the computer of the end user. In response to this, the client
computer will then generate an RST (or Reset) data packet that notifies
the server that there is something wrong in establishing a network
connection. Because of this, the server now understands that this is
actually a legitimate request that is coming from the computer of the end
user and will thus subsequently establish the network connection that
is requested. But a firewall or a router could very well block the return
SYNACK data packet.
– Using Stack Tweaking:
This method involves altering the TCP stack on the server so that it will
take less time to time out in the instance that a SYN connection still
remains incomplete. But when compared to the other defensive methods
just reviewed, this is the most complex one.
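Of the defenses in the list above, the SYN cookie lends itself to a short sketch. The version below is deliberately simplified: it derives a 32-bit value from the connection details with a keyed hash and checks it when the ACK arrives, so that no per-connection memory is reserved in between. Production implementations also encode extra details in the cookie, which are omitted here. The function names, the secret handling, and the time_slot parameter are assumptions for illustration.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # server-side secret (assumed to be rotated periodically)


def make_cookie(client_ip, client_port, server_port, time_slot, secret=SECRET):
    """Derive the SYNACK sequence number from the connection details."""
    message = f"{client_ip}|{client_port}|{server_port}|{time_slot}".encode()
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")  # fits a 32-bit sequence number


def ack_is_valid(ack_number, client_ip, client_port, server_port, time_slot,
                 secret=SECRET):
    """A genuine client acknowledges the cookie value plus one."""
    cookie = make_cookie(client_ip, client_port, server_port, time_slot, secret)
    return ack_number == (cookie + 1) % 2**32


cookie = make_cookie("198.51.100.7", 40000, 443, time_slot=12)
print(ack_is_valid((cookie + 1) % 2**32, "198.51.100.7", 40000, 443, time_slot=12))
print(ack_is_valid(cookie, "198.51.100.7", 40000, 443, time_slot=12))
```

Because the server can recompute the cookie from the ACK alone, a flood of SYNs that are never followed up costs it one hash each rather than a reserved buffer, which is the performance and security trade-off described in the list.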
Probably the most efficient way in which to defend against a DDoS attack is
to use a combination of these defense mechanisms. For example, using both
SYN cookies and RST cookies in conjunction with stack tweaking is deemed
to be the best defensive mechanism thus far.
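Bandwidth throttling, one of the defenses reviewed for the SYN flood, is commonly implemented with a token bucket per source address. The sketch below shows one such approach under stated assumptions; the class names, the rate of one request per second, and the burst of five are hypothetical, and the technique is a general one rather than the behavior of any specific firewall or router.

```python
class TokenBucket:
    """Allow `rate` requests per second with bursts of up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill in proportion to the time elapsed, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerAddressThrottle:
    """Keep one token bucket per source IP address."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, source_ip, now):
        bucket = self.buckets.get(source_ip)
        if bucket is None:
            bucket = self.buckets[source_ip] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)


throttle = PerAddressThrottle(rate=1, burst=5)
flood = [throttle.allow("203.0.113.50", now=0.0) for _ in range(100)]
print(flood.count(True), "of 100 requests from the flooding address admitted")
print(throttle.allow("198.51.100.7", now=0.0))  # a different address is unaffected
```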
3. The Smurf Attack:
This is yet another form of a DDoS attack. In this kind of cyberattack, an
Internet Control Message Protocol (ICMP)–based data packet is
transmitted to the broadcast address of a network infrastructure. But
its source address has been altered in such a way that it matches
one of the IP addresses on the network, namely that of the targeted server.
In return, all of the computers, workstations, and wireless devices will then
respond back by pinging this server.
It should be noted that ICMP is normally used to
transmit error and diagnostic messages over the Internet. But because
these data packets are sent to a broadcast address, they are
echoed to all of the hosts that reside on
that network infrastructure, which in turn will send their replies to the
spoofed source address.
By constantly sending out all of these data packets, the cyberattacker causes
the network infrastructure to perform a DDoS attack on one of its own
servers. This is actually a rather sophisticated attack, but the main difficulty is in
launching the data packets onto a target network infrastructure. But this can
be accomplished by making use of a virus or a Trojan horse to execute the
start of the flow of the data packets.
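The amplification at the heart of the smurf attack comes from the broadcast address. The sketch below uses Python's ipaddress module to compute that address and the number of hosts that could reply to a single spoofed request, and adds a check a router could apply before forwarding. The example network 192.0.2.0/24 and the function names are assumptions for illustration.

```python
import ipaddress


def smurf_amplification(network_cidr):
    """Return the broadcast address and the maximum number of replying hosts."""
    network = ipaddress.ip_network(network_cidr)
    # The network and broadcast addresses themselves are not hosts.
    max_hosts = max(network.num_addresses - 2, 0)
    return network.broadcast_address, max_hosts


def is_broadcast_destination(destination_ip, network_cidr):
    """True if a packet is addressed to the network's broadcast address."""
    network = ipaddress.ip_network(network_cidr)
    return ipaddress.ip_address(destination_ip) == network.broadcast_address


broadcast, hosts = smurf_amplification("192.0.2.0/24")
print(f"One spoofed request to {broadcast} can trigger up to {hosts} replies")
print(is_broadcast_destination("192.0.2.255", "192.0.2.0/24"))
```

A router that drops packets for which is_broadcast_destination is true removes the amplification, which is why declining to forward broadcast-based packets is an effective countermeasure.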
In a smurf attack, the network infrastructure performs a DDoS attack on
itself. But there are two ways in which you can protect your network
infrastructure from a smurf attack, which are as follows:
– You can configure all of your routers in such a way that they do not
forward any sort of broadcast-based data packets. If they are transmitted,
the smurf attack is then very often contained to a subnetwork (which is
just a portion of the entire network infrastructure).
– You can simply protect your network infrastructure by guarding against
Trojan horses. But this is actually easier said than done, because your
the internal routers. But once again, this is an extreme measure and should only be
used as a last resort.
Other steps you can take to protect your organization from a DDoS attack are:
within the network infrastructure, indicating that the message is coming from a
totally different IP address than where it is originating from.
If the primary intent of the cyberattacker is to gain unauthorized access to the
shared network resources, it is quite likely that the spoofed IP address will be that
of a system which is considered to be a trusted host. In order to successfully launch
such a cyberattack, the cyberattacker must first locate and determine the IP address
of a server that is deemed to be that of a trusted host.
Once this has been accomplished, the cyberattacker can then modify the
headers of the data packets during their transmission across the medium from the web
application to the end user’s computer, and vice versa. This will ultimately appear
as if these data packets are coming from the trusted host.
Quite surprisingly, IP spoofing was known by computer scientists in the
academic sector, at least on a theoretical level, before it was even launched as a “true”
cyberattack. This goes as far back as the early 1980s. It continued to remain on a
theoretical level until a computer scientist, Robert Morris, discovered a security
flaw in the TCP protocol. This was technically known as “sequence prediction,”
and more details of this weakness can be found in the scientific paper by
Steven Bellovin entitled “Security Problems in the TCP/IP Protocol Suite.”
It should be noted that IP spoofing attacks are becoming much less frequent,
primarily because the threat vehicles that are used to launch them are becoming much
more secure. However, any cyberattacker can still launch this type of attack on a whim.
In order to protect against an IP spoofing attack, the following is recommended:
◾ Do not in any way reveal the internal IP addresses of either your IT or
network infrastructures.
◾ You must always monitor for incoming data packets that are malicious or are
indicative of an IP spoofing attack. Thus, it is always important to use the
necessary software applications in these situations. They monitor the influx
of data packets that originate from the external environment but contain
both source and destination addresses from within your IT and network
infrastructures. As a result, any data packet that “claims” to be coming from your
internal infrastructure will be immediately tagged when the evidence clearly
shows that it is coming from the external environment.
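The monitoring rule in the last recommendation can be sketched as a simple ingress check: a packet that arrives on the external interface but claims an internal source address is tagged as spoofed. The internal address ranges and the function name below are assumptions; an actual deployment would list its own networks and apply the rule in the router or firewall.

```python
import ipaddress

# Assumed internal address ranges for this sketch.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_spoofed(source_ip, arrived_on_external_interface,
               internal_networks=INTERNAL_NETWORKS):
    """Tag packets that claim an internal source but arrive from outside."""
    source = ipaddress.ip_address(source_ip)
    claims_internal = any(source in network for network in internal_networks)
    return arrived_on_external_interface and claims_internal


print(is_spoofed("10.1.2.3", arrived_on_external_interface=True))
print(is_spoofed("203.0.113.5", arrived_on_external_interface=True))
```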
One of the biggest dangers of IP spoofing is that not all firewalls examine for
data packets that look like they are coming from an internal IP address. Thus, the
routing of data packets through various filtering routers is quite possible if they
are not specifically configured to filter for incoming data packets when the source
address resides clearly in the local domain.
Some typical examples of those router configurations that are vulnerable to an
IP spoofing attack include the following:
◾ Routers that interface to an external network which supports multiple types
of internal interfaces
◾ Proxy firewalls in which the proxy applications make use of the source IP
address for authentication purposes
◾ Those routers that have two interfaces which allow for the subnetting of an
internal-based network
◾ Those types of routers that do not filter for data packets in which their source
address clearly resides in the local domain.
that come to it. As a result of this, the cyberattacker can then put his or her own
server in front of that unresponsive endpoint. In other words, the primary goal of
this cyberattack is to exploit any vulnerabilities or weaknesses that are detected
in the network lines of communication and gain access to the target server upon
which the web application resides.
In the end, the only true way to prevent this kind of cyberattack is to utilize
some sort of encrypted transmission, such as a VPN. It is also important to
keep in mind that a cyberattacker can also make use of what is known as a data
sniffer. This is a software-based tool that can very easily intercept data packets
going across a wireless network and copy the data packets that are sent and
received.
1. The Macro Virus:
These kinds of viruses affect the macros found in Office-based documents.
Macros are essentially mini programs that are often found in Microsoft Office
products, most notably Word and Excel. Although these macros have
been designed to help automate routine processes on a very basic level, they
can also be crafted to make them into viruses. In both Microsoft Word and
Excel, a scripting language known as Visual Basic for Applications (VBA) is
used to develop these macros, whether it is for malicious purposes or not.
2. The Boot Sector:
As its name implies, this kind of virus does not actually affect the operating
system (OS) of the server upon which the web application resides, but rather,
it attacks the boot sector of the hard drive in the server. As a result, this
makes these viruses much harder to detect and remove with the traditional forms of
antivirus software packages. Because these kinds of viruses can be launched
outside of the operating system, the boot sector virus can be used as a covert
and stealthy form of cyberattack. This has the potential to be a very nasty
kind of virus, as it not only can affect the boot sector but can also delete
mission-critical files that are needed to run the server upon which the web
application resides.
3. The Stealth Virus:
This is deemed to be one of the largest groups of viruses present in the
cyberthreat landscape of today. This category can be classified as a general one, in
that the main intention of it is to avoid detection (thus its name, “stealth”).
Typical examples of this kind of virus include the following:
– The polymorphic virus:
This virus changes its form and structure routinely in order to avoid
detection by antivirus software applications. A much more advanced
form of this is known as the metamorphic virus. This kind of virus can
totally change its attack pattern in order to avoid detection.
– The sparse infector:
This kind of virus avoids detection in the sense that it delivers its
malicious payload only on a very random basis, which makes it difficult to
predict even for the most seasoned and experienced penetration testing
and threat hunting teams. With this type of virus, the “symptoms” of it
appear in an on and off cycle. For example, it may not be launched until
the tenth or even the thirtieth time that the server that hosts the web
application reboots itself. Or the opposite can happen. For example, it
may demonstrate a huge burst of activity and then lie dormant for quite
some time. The bottom line with this virus is that its primary goal is to
avoid detection by keeping up with random launches that are very
difficult to predict or even guess. Another example of this kind of virus is
known as the fragmented payload. This one is split into various sorts of
modules, with the main one being the loader module. The objective of
this is to download all of the other fragments of the virus. When this has
been accomplished, all of the modules will then be reassembled by the
loader and the malicious payload will be launched.
Network Security ◾ 33
– Ransomware:
At the present time, this is probably one of the most prevalent forms
of viruses out there. The ultimate goal of this is to lock up the screen
of the server upon which the web application resides, as well as the
source code and other related files that are associated with it. In order
to unlock the server and the files, the business or corporation must
pay a ransom, usually in some sort of virtual currency, such as
Bitcoin. But even when this is paid, there is no guarantee that the
cyberattacker will send the decryption keys needed to unlock
the server and the associated files. Probably the most famous and
deadliest form of ransomware was known as WannaCry, and in 2017 it
attacked the healthcare systems of the United Kingdom, including those
in England and Scotland. Actually, ransomware has been around for a
long time, going as far back as 1989, with the PC Cyborg Trojan. It is
important to keep in mind that some ransomware, such as WannaCry,
first spreads as a worm and only then delivers its malicious payload.
– The Trojan horse:
So far, the Trojan horse has been one of the prime examples that we have
used in this chapter. Just to review, it may look benign to the end user,
but it has a dangerous and malicious payload right behind it. The Trojan
horse can be very tricky to detect at first, because it is usually first
downloaded as an application (such as a game, or even a utility-based program).
But once its payload is executed, any of the following repercussions could
happen:
• It can download harmful software from any website that you may
visit.
• It can install a keylogger or any other form of spyware onto the server.
• It can delete mission-critical files.
• It can open up a backdoor in the source code of the web application
for the cyberattacker to enter into.
The Firewall
For the purposes of this book, a firewall can be defined specifically as follows: “A
barrier between a server and an internal network from the outside world and/or the
Internet.” On a more technical level, a firewall is also referred to as a separation
between the network that sits behind the demilitarized zone (DMZ) and the part of
the Internet that is made available to the public (remember, the entire Internet is
not made available to the public – a lot of this consists of the Dark Web, which
will be covered in a later chapter of this book).
A typical firewall can be configured as follows in order to protect the server
upon which the web application resides:
◾ Packet filtering
◾ Stateful packet filtering
◾ User authentication
◾ Client application authentication
At the bare minimum level, a firewall should be able to filter the incoming data
packets based on key variables such as the actual size of the data packet, the source
IP address, any associated network protocols, and the destination port number.
The most common types of firewalls are examined in the next subsections of this
chapter.
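The bare-minimum filtering described above can be sketched in a few lines of
Python. This is only an illustration: the Packet class, the blocked address
range, the allowed protocol and port pairs, and the size limit are all
hypothetical values chosen for the example, not settings taken from any
particular firewall product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    """Header fields only; a basic packet filter never looks at the payload."""
    src_ip: str
    protocol: str   # e.g. "tcp", "udp", "icmp"
    dst_port: int
    size: int       # total size in bytes


# Hypothetical policy values, chosen only for illustration.
BLOCKED_SOURCES = [ip_network("203.0.113.0/24")]
ALLOWED_SERVICES = {("tcp", 80), ("tcp", 443)}
MAX_SIZE = 1500


def allow(packet: Packet) -> bool:
    """Judge one packet on its own header fields (no history is kept)."""
    if packet.size > MAX_SIZE:
        return False
    source = ip_address(packet.src_ip)
    if any(source in network for network in BLOCKED_SOURCES):
        return False
    return (packet.protocol, packet.dst_port) in ALLOWED_SERVICES
```

Because each call to allow() sees only a single packet, this sketch shares the
main limitation of simple packet filtering: it has no memory of earlier traffic.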
Types of Firewalls
The following types of firewalls are used most often when it comes to securing web
applications:
1. The Packet Filtering Firewall:
Although it can be powerful to use when securing the server upon which the
web application is hosted, there are a few disadvantages to it, which include
the following:
– Because no history is kept of other data packets that have been allowed to
enter into the network infrastructure, there is no baseline profile against
which to make a comparison. Thus, it is prone to either a Ping flood or
a SYN flood type of cyberattack.
– There is no user authentication, so it is quite easily accessible.
– It only examines the data packet header; it does not examine what the
data packet actually consists of.
– It cannot detect any unusual behavior in the flow of network traffic.
2. The Stateful Inspection Firewall:
This kind of firewall not only examines data packets that are in the current
stream of network communications but also maintains a history of the data
packets that have been allowed to enter the network infrastructure previously.
This simply means that it is aware of the technical context in which previous
data packets were sent. Thus, it is far less vulnerable to Ping or
SYN flood attacks than the packet filtering firewall.
It also has the following advantages:
– It can ascertain if a data packet is actually a subset of a much larger stream
of data packets that exhibit abnormal or malicious types of behavior.
– It can determine if a particular data packet possesses a source IP address
that appears to come from within the confines of the network infrastructure.
If this is the case, then more than likely an IP spoofing attack is in
progress.
– It can also examine the actual contents of a data packet and determine if
it consists of any sort of malicious payload.
– It can examine the state of the data packet as it relates to the entire
IP-based conversation between the web application and the computer,
workstation, or wireless device of the end user that is accessing it.
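To make the idea of “state” more concrete, the following Python sketch keeps a
small table of the conversations it has already seen and rejects packets that
do not belong to any of them. It is a deliberate simplification: the class
name, the internal address range, and the reduction of TCP to three flag
values are assumptions made for the example, and a real firewall tracks the
full handshake, sequence numbers, and timeouts.

```python
from ipaddress import ip_address, ip_network

# Hypothetical internal address range, used only for the spoofing check.
INTERNAL_NET = ip_network("10.0.0.0/8")


class StatefulFilter:
    def __init__(self):
        # History of accepted conversations:
        # (src_ip, src_port, dst_ip, dst_port) tuples.
        self.established = set()

    def inspect(self, src_ip, src_port, dst_ip, dst_port, flags,
                from_outside=True):
        # An internal source address arriving from outside suggests spoofing.
        if from_outside and ip_address(src_ip) in INTERNAL_NET:
            return "drop: spoofed source"
        key = (src_ip, src_port, dst_ip, dst_port)
        if flags == "SYN":
            self.established.add(key)       # remember the new conversation
            return "allow: new connection"
        if key in self.established:
            if flags == "FIN":
                self.established.discard(key)   # conversation is over
            return "allow: known connection"
        # A non-SYN packet with no history belongs to no known conversation.
        return "drop: no matching state"
```

The point of the sketch is the lookup against self.established: the decision
for a packet depends on what was seen before it, which is exactly what a
packet filtering firewall cannot do.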
3. The Application Gateway Firewall:
This is actually a software application that runs on a firewall. It can work with
other kinds of firewalls from within the network infrastructure in
order to ascertain if a certain group of data packets should be allowed to enter
a network infrastructure or not. In technical terms, this is also known as
“negotiation” because a process of authentication and verification is utilized.
This software application will also carefully examine the flow of data packets
from the web application as well as the server with which a connection is
being attempted. Thus, it will fully ascertain if the end user’s computer,
workstation, or wireless device is allowed to penetrate into the network
infrastructure. The software application that runs on this kind of firewall is
also known as a gateway or a proxy server.
Of course, as one can see, this poses many security concerns, and thus is not
utilized often, only when extreme circumstances dictate its use.
With whitelisting, the web application server is made available to only certain
employees within the organization, and this is most likely just the IT staff and
related security personnel.
earlier in this chapter) and will simply block certain types of data packets
based upon the network protocol that is being used, the port number,
source IP address, and destination IP address. Also, the various port
numbers work in the transport layer of the OSI model, thus offering yet
another method for data packet filtering.
Finally, as you and your IT security staff determine the types of firewalls
you need and the various configurations, it is very important that you do not
develop the mentality that “more is better.” In other words, do not simply
deploy 10 or 15 firewalls in the hopes that they will defend your web
application server. There are two primary disadvantages with this:
– You are simply increasing the attack surface for the cyberattacker.
– You are spending a lot of money out of your precious IT funds.
Rather, as the CIO or CISO, you need to first conduct an assessment as to
where the firewalls should be strategically placed, and from there, spend the
money appropriately in order to procure and deploy them.
It is a device that has been designed to detect signs that a cyberattacker
is attempting to breach a system in the network infrastructure
and to alert the IT security staff that suspicious or anomalous behavior
is taking place.
The NIDS inspects and examines all of the inbound as well as outbound port
activity on a web application. By doing this, it looks for suspicious patterns that
could indicate that a cyberattack is about to be launched or is currently underway.
For instance, if the NIDS determines that a series of data packets were sent to each
port in a sequential fashion from the same source IP address, this is highly
indicative that the network infrastructure is being maliciously scanned for any
vulnerabilities and weaknesses that may exist.
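The sequential port probing just described lends itself to a very small
detection routine. The Python sketch below is only one possible heuristic: the
function name, the event format, and the threshold of five consecutive ports
are assumptions made for illustration, and production NIDS products use far
richer signatures.

```python
from collections import defaultdict

# Hypothetical cut-off: how many consecutive ports count as a scan.
SCAN_THRESHOLD = 5


def find_scanners(events, threshold=SCAN_THRESHOLD):
    """events: iterable of (src_ip, dst_port) pairs in arrival order.

    Returns the set of source addresses that probed at least `threshold`
    ports in strictly ascending, consecutive order.
    """
    last_port = {}
    run_length = defaultdict(int)
    scanners = set()
    for src, port in events:
        if src in last_port and port == last_port[src] + 1:
            run_length[src] += 1      # the sequence continues
        else:
            run_length[src] = 1       # start counting a new sequence
        last_port[src] = port
        if run_length[src] >= threshold:
            scanners.add(src)
    return scanners
```

A source that touches ports 20 through 25 in order would be flagged, while an
end user who alternates between ports 80 and 443 would not.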
Also, a key advantage of a NIDS is that it can quite easily and efficiently detect
an abnormally huge influx of data packets from the same IP address in just a very
short period of time. Actually, the primitive NIDS was just a hub. Then it became
a network switch.
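Detecting an abnormal influx of data packets from one IP address usually comes
down to counting arrivals inside a short time window. The Python sketch below
shows that idea under stated assumptions: the FloodDetector class, the limit
of 100 packets, and the one-second window are hypothetical and would need to
be tuned to the actual traffic of the web application server.

```python
from collections import defaultdict, deque


class FloodDetector:
    def __init__(self, max_packets=100, window_seconds=1.0):
        self.max_packets = max_packets
        self.window = window_seconds
        # For each source address, the timestamps of its recent packets.
        self.arrivals = defaultdict(deque)

    def observe(self, src_ip, timestamp):
        """Record one packet; return True if src_ip now exceeds the limit."""
        recent = self.arrivals[src_ip]
        recent.append(timestamp)
        # Forget packets that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) > self.max_packets
```

Each call to observe() supplies the source address and the arrival time in
seconds; the return value tells the caller whether an alert should be raised.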
With this approach, once a data packet has traveled all the way from the end
user’s device to the server that is hosting the web application, it makes its way to the
subnet to which the web application is connected. Once this has been
accomplished, the MAC address of the server is used to locate the server, and from that
point onwards, the data packets would then be directed towards the server.
Thus, all of the web application servers on a particular subnet could see those
particular data packets, and any server whose own MAC address did not match
the destination MAC address would then discard them.
This gave rise to the data packet sniffer (as discussed previously), but then
people started to realize that if the contents of a data packet could be collected, they
could also be analyzed in order to detect any malicious signs or abnormal network
behavior.
Preemptive Blocking
This is also referred to in technical terms as banishment vigilance. The primary goal
of a NIDS-based system is to prevent a malicious intrusion from occurring before
it explodes into a large-scale cyberattack. This can be done by detecting the early
footprinting stages of a pending intrusion and, from there, blocking the IP address
that is at the root of this malicious behavior. Although this may sound like an easy
task to accomplish, it can actually be a very complex process to undertake.
The primary reason for this is that it can be quite difficult at times to
distinguish legitimate network traffic from malicious network traffic. This can
result in an escalation of false positives. This occurs when the NIDS mistakenly
identifies legitimate network traffic as anomalous behavior. If this were to happen,
the NIDS would then shut down that flow of data packets to the web application
server, no matter what.
It should be noted at this point that the use of AI and ML tools is becoming of
prime importance. Specifically, AI can be defined as follows:
1. Machine Learning:
This automates analytical model building. It uses methods from neural
networks, statistics, operations research, and physics to find hidden insights
in data without explicitly being programmed for where to look or what to
conclude.
2. The Neural Network:
This is a type of machine learning that is made up of interconnected units
(like neurons) that process information by responding to external inputs,
So as one can see from these definitions and concepts, the use of both machine
learning and artificial intelligence can be a huge boon to the IT security staff when
determining which alerts coming in from the firewalls and the NIDS are false
positives and which data packets are actually malicious. It can take a human being
many hours to accomplish these particular tasks, but with the use of both artificial
intelligence and machine learning, this can be done within a matter of seconds.
This is especially crucial when the cyberthreat landscape is changing on literally
a minute-by-minute basis. However, as noted, both artificial intelligence and
machine learning tools require the use of live information and data feeds so that
they can learn about the profiles from past threat vectors and predict those that
could be harmful to a web application server.
Anomaly Detection
This technique makes use of specific software in order to detect intrusion attempts
on the web application server and to alert the IT security staff when these types of
incidents actually take place. With this, any type of activity that does not match the
pattern of normal data traffic to the web application is thus noted and logged. This
is achieved by comparing actual, observed activity against expected data packets to
the web application server.
This kind of activity can also be referred to as a “traceback” because once an
anomaly is detected, either the artificial intelligence or the machine learning tool
tries to ascertain where this malicious activity first originated. Some of the ways in
which a particular anomaly can be detected include the following:
◾ Threshold Monitoring:
This process establishes a certain baseline of acceptable behavior and makes
observations if any type of activity has actually exceeded the baseline that
has been set forth. However, establishing what is deemed to be an acceptable
level of risk to the web application server just based on this method can be
quite challenging, because this involves much more of a qualitative judgment
as to what the particular level of baseline should be, as opposed to using a
quantitative-based approach.
◾ Resource Profiling:
This methodology considers and measures the system-wide use of shared
resources, and from there, historic profiles are thus created. This can be used
to help determine any malicious or anomalous behavior that is taking place.
But once again, since this is more of a macro-level key performance indicator
(KPI) of a network infrastructure, false positives can be generated as well.
For example, increased usage of a certain part of a network infrastructure
may not necessarily mean that an attack is underway against a web application
server; it may just indicate that the web application in question is getting
increased traction from the various end users that are trying to access it.
◾ Executable Profiling:
This particular technique measures and quantifies how the various software
packages in a web application server use the services that are available from it.
But the key difference here is that it tracks those kinds and types of activities
that cannot be traced back to a particular end user. This includes the
cyberattacks of viruses, Trojan horses, worms, trapdoors, etc. This is accomplished
by specifically profiling how web application server objects are accessed from
both the internal confines of the network infrastructure and the data packet
flow and network communications that are coming to it from the external
environment. This allows for the NIDS to help confirm any type of cyberattack
that could be a grave risk to the web application server.
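Of the three methods above, threshold monitoring is the simplest to express in
code. The Python sketch below derives a threshold from historical observations
and flags anything above it. The choice of “mean plus k standard deviations”
and the default of k = 3 are assumptions made for this example; they stand in
for the qualitative judgment that the text says is needed when setting a
baseline.

```python
from statistics import mean, stdev


def build_threshold(history, k=3.0):
    """Threshold = mean of past observations plus k standard deviations.

    `history` needs at least two values; `k` is a tunable judgment call.
    """
    return mean(history) + k * stdev(history)


def exceeds_baseline(observed, history, k=3.0):
    """Return True when the observed value is above the derived threshold."""
    return observed > build_threshold(history, k)


# Hypothetical requests-per-minute figures for a quiet period.
history = [100, 110, 90, 105, 95]
```

With these sample figures, a reading of 112 stays under the threshold while a
reading of 500 exceeds it. Lowering k makes the monitor more sensitive, at the
cost of more false positives.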
processes and subcomponents are involved in order to ensure its efficient and
effective operations. Examples of this include the following:
1. The Activity:
This is that part of the data packet that is of primary interest to the firewall
or NIDS.
2. The Administrator:
This is the particular individual who is responsible for the overall security of
the web application server.
3. The Sensor:
This is a specific NIDS component that collects the information and data
about the inflow of data packets to the web application server.
4. The Alert:
This is a specific message from the analyzer of the NIDS indicating that some sort
of “interesting” activity has been detected from within the network infrastructure.
5. The Manager:
This is the management component of the NIDS.
6. The Notification:
This is the process by which the NIDS alerts the IT security staff of the
“interesting” activity that is taking place.
7. The Event:
This is the specific occurrence that a suspicious or malicious activity is
underway.
8. The Data Source:
This is the raw information and data that are stored in the data packet.
9. The Active NIDS:
This is also known as an intrusion prevention system (IPS). This type of
system will stop any and all network communications flow to the web
application server that is deemed to be malicious or suspicious in nature.
10. The Passive NIDS:
This system just logs all network activity coming into the web application
server.
11. The HIDS:
This stands for a host-based intrusion detection system. As its name implies,
it monitors activity on a single host (such as the web application server
itself) rather than on the network infrastructure as a whole.
12. The HIPS:
This stands for a host-based intrusion prevention system. This type of
system also runs on a single host, but in addition to monitoring it, it
actively blocks activity on that host that is deemed to be malicious.
technology was very expensive and only the Fortune 500 companies could afford
to deploy it. But as the technology has quickly advanced and matured over time,
the price of it has greatly come down, and in fact, a small to medium sized
business (SMB) can even conduct a basic Google search to see which type of VPN will
work best for them, pay on a subscription basis, and download and deploy it in a
few minutes.
A VPN can be specifically defined as follows:
As one can see from this definition, a virtual private network creates a private
network connection over the Internet in order to create a highly secure connection
between the web application server and the device of the end user that is
accessing it. Instead of using a dedicated connection, which can be easily detected by a
data packet sniffer (such as that of an unencrypted Wi-Fi connection), the VPN
makes use of what are known as virtual connections specifically routed through the
Internet from the end user’s device to the web application server.
The rest of this subsection is devoted to the VPN and how it can be specifically
used to protect not only the lines of communication but the flow of data packets to
the web application server.
1. The PPTP:
This is an acronym that stands for Point to Point Tunneling Protocol. This
is actually a specific tunneling protocol that makes use of an older network
connection protocol known as Point to Point Protocol (PPP). PPTP
enables the data packets to be encapsulated (and, optionally, encrypted) over
the IP protocol and thus have the ability to be forwarded to any IP-based
network in which the web application server is interfaced, or linked to. Point to
Point Tunneling Protocol is actually one of the first and oldest forms of VPN
connections to be created. It was first made public in 1996 by a consortium
known as the PPTP Forum. PPTP is still widely used today, and
one of its main benefits is that it operates at layer 2 of the OSI model (as
reviewed earlier in this chapter), which is the data link layer. Another primary
advantage of the PPTP is that it can support the encrypted transmission of
older forms of data packets, such as those of IPX, NetBEUI, and others. It is
important to note at this point that PPTP supports two kinds of tunneling
mechanisms, which are as follows:
– Voluntary Tunneling:
In this type of scenario, the device of the end user that is attempting to
establish a connection with the web application server first connects to
the network backbone of an Internet service provider (ISP), and from
this point, the VPN is then launched in order to create the secure PPTP
session. With this setup, the end user actually selects the type and level
of encryption and authentication that he or she wishes to use, hence the
name.
– Compulsory Tunneling:
In this type of configuration, the web application automatically selects
the encryption and authentication protocols that are to be used.
1. EAP:
This is an acronym that stands for Extensible Authentication Protocol. This
was specifically designed to work with PPTP and provides the framework
and baseline for several other different authentication protocols to be used as
well. This includes the use of RSA tokens, and the public key infrastructure
(this will be reviewed in much more detail in Chapter 2, which is about
cryptography).
2. CHAP:
This is an acronym that stands for Challenge Handshake Authentication
Protocol. This technique is actually a three-way handshaking process in
order to fully authenticate the end user. Here is how this process specifically
works:
– Once the network lines of communication have been established between
the device of the end user and the web application server, the server
actually sends a challenge message to the device of the end user.
– This device in turn responds to this challenge by transmitting a specific
value which has been computed using a one-way mathematical hashing
function.
– In return, the web application server checks this response against the
hash value that it has computed, and if these two values match up, the
end user is then fully authenticated into the web application server. But
if the values do not match up, the network connection that has been
established between the device of the end user and the web application server
is immediately terminated.
A primary advantage with this technique is that this three-way handshaking
process is repeated periodically throughout the session, thus creating new
hash values on a dynamic and real-time basis.
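The three steps of the CHAP handshake can be sketched in Python as follows.
The function names and the sample shared secret are hypothetical. The hash
input (a one-byte identifier, then the shared secret, then the challenge,
hashed with MD5) follows the layout given in RFC 1994; MD5 is shown only
because that is what the protocol specifies, and it is no longer considered a
strong hashing function.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: a one-byte identifier plus a random challenge value."""
    return secrets.randbelow(256), secrets.token_bytes(16)


def chap_response(identifier, shared_secret, challenge):
    """Client side: one-way hash over identifier + secret + challenge."""
    data = bytes([identifier]) + shared_secret + challenge
    return hashlib.md5(data).digest()


def verify(identifier, shared_secret, challenge, response):
    """Server side: recompute the hash and compare it with the response."""
    expected = chap_response(identifier, shared_secret, challenge)
    return hmac.compare_digest(expected, response)


# Walk through one handshake with a hypothetical shared secret.
secret = b"hypothetical-shared-secret"
ident, challenge = make_challenge()                 # step 1: challenge
response = chap_response(ident, secret, challenge)  # step 2: response
authenticated = verify(ident, secret, challenge, response)  # step 3: check
```

Note that the shared secret itself never crosses the network; only the random
challenge and the resulting hash value do. Issuing a fresh challenge for each
repetition of the handshake is what keeps a captured response from being
replayed later.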
be reviewed in more detail later in this chapter) in order to provide for a robust and
secure VPN.
L2TP supports the following authentication mechanisms:
◾ MS-CHAP:
This authentication mechanism was created by Microsoft exclusively for the
Windows Server operating systems. Its primary objective is to further the
local area networks (LANs) upon which a web application server may reside,
as well as to integrate hashing and encryption mathematical algorithms in
client-server–based network topologies. But there are some exclusive
functionalities of MS-CHAP, which are as follows:
– It is only designed to be interoperable in a Windows-based networking
enterprise.
– Cleartext and passwords that can be reverse-engineered are not supported.
– It provides for authenticator-based retry and automatic password
changing mechanisms.
– It possesses what is known as a reason for failure code system, in that certain
values are returned if the authenticator has failed and, for whatever reason,
certain data packets do not reach the web application server. These are a
specific set of codes that only a Windows Server operating system can
interpret, thus providing for a specific reason for the failure of the authenticator.
◾ The password that is transmitted between the device of the end user and the web
application server is sent over in a cleartext format.
◾ It is meant to work with HTTP and not the Hypertext Transfer Protocol
Secure (HTTPS) protocol.
and retransmits it again in order to gain unauthorized access to the web application
server. The primary reason for this is that the levels of encryption that are afforded
with this specific protocol could be reverse-engineered, given that the cyberattacker
has the right set of tools to do this.
Once the device of the end user attempts to decrypt Message A with the private
key, the password that is in that data packet will be compared to what the end user
has typed in. If the two passwords match up, the decryption process continues; if
not, the session is immediately terminated. Assuming that the two passwords do
correspond exactly, there is yet another message process, which is as follows:
◾ Message C: This consists of the TGT from Message B and the username of
the end user that is requesting access to the web application server
Once Message C and Message D have been received by the TGS, the TGS
then retrieves Message B from Message C. It should be noted that Message B
is decrypted using the TGS private key. This yields what is known
specifically as the client/TGS session key. By making use of this key, the TGS then
decrypts Message D (which is the authenticator), and at this point, the following
two messages are then transmitted:
Once these two messages have been received from the TGS, the device of the end
user that is attempting to make a connection with the web application server can
now be fully authenticated to the service server (SS) that also resides upon the
web application server. Once this has been established, two new messages are sent:
The SS then decrypts the ticket (which is Message E) by making use of its
private key to retrieve the client/server session key. By making use of this same key, the
SS also decrypts the authenticator (which resides on the web application server) and
then transmits the final message:
The device of the end user then decrypts Message H by making use of the
client/server key (as also previously described) and checks if the timestamp is correct.
If this is confirmed, then the end user can trust the validity of the web application
server and start requesting the various services that are available from it.
◾ Transport mode: This is when the entire data packet is encrypted, except
for its header. What this means is that the source address, the destination
address, and the other data that is contained in the header remain unencrypted.
◾ Tunnel mode: This encrypts the entire data packet, that is, both the header
and the data that resides in it, and then encapsulates the result in a new
data packet.
It is important to note here as well that there are types of protocols that make
IPSec the tool of choice for a VPN. One of these is known as the Internet Key
Exchange (IKE). This is used to help fortify the existing security functionalities
that a VPN has to offer. For example, a security association (SA) is created by the
two endpoints of a VPN tunnel, and this ultimately will determine what information
and data get encrypted and authenticated.
With regard to this, the following steps take place:
1. The first exchange between the two VPN endpoints establishes the technical
aspects of the security policy that the VPN will use.
2. The initiator then suggests the various encryption and authentication
algorithms that can be used for the VPN.
3. The responder decides on the specific algorithms that will be used.
4. The second exchange between the two VPN endpoints passes on what
are known as the Diffie-Hellman public keys (this will also be covered in
Chapter 2).
5. The shared secret that is derived from these public keys will then be used to
encrypt the data packets that will be sent between the two endpoints of the VPN.
6. The third exchange between the two endpoints of the VPN then authenticates
the Internet Security Association and Key Management Protocol (ISAKMP)
session, and this process is technically known as the main mode.
7. Once this step has been initiated, the IPSec-based negotiation is then
triggered (this is specifically known as the quick mode).
8. The quick mode then further negotiates the SA for the level of encryption
that will be used and does all of the key management for the IPSec
protocol.
9. Finally the secure connection between the endpoints of the VPN is firmly
established, and the transmission and flow of data packets between the device
of the end user and the web application (and vice versa) is started and
continues until the VPN is no longer needed.
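The Diffie-Hellman exchange in step 4 is what lets the two endpoints arrive at
the same secret without ever sending it across the network. The Python sketch
below illustrates only the arithmetic. The tiny prime and generator are
textbook toy values chosen so the numbers can be checked by hand, and the
function names are hypothetical; real IKE implementations use standardized
groups with very large primes (or elliptic curves).

```python
import secrets

# Toy public parameters for illustration only; far too small for real use.
P = 23   # prime modulus, known to both endpoints
G = 5    # generator, known to both endpoints


def keypair():
    """Pick a random private value and derive the matching public value."""
    private = secrets.randbelow(P - 2) + 1   # in the range 1 .. P-2
    public = pow(G, private, P)              # G^private mod P
    return private, public


def shared_secret(own_private, peer_public):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, own_private, P)


# Each endpoint generates a key pair and sends only the public half.
initiator_private, initiator_public = keypair()
responder_private, responder_public = keypair()

# Both sides compute the same value without it ever crossing the wire.
initiator_view = shared_secret(initiator_private, responder_public)
responder_view = shared_secret(responder_private, initiator_public)
```

For a hand-checkable case, private values of 4 and 3 give public values of 4
and 10, and both endpoints then derive the shared secret 18. In practice this
shared value is fed into a key derivation step rather than used directly as an
encryption key.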
1. The device of the end user transmits to the web application server the type of
SSL that is being used, the specific cipher-based settings, and the data packets
that will be transmitted.
2. In return, the web application server sends its own set of this same
information back to the device of the end user in order to establish the baseline, or
profile, that will be needed to establish a secure connection.
3. The device of the end user then uses this information to confirm the “identity”
of the web application server, taking into account the following parameters:
– If the issuer of the SSL or TLS certificate originates from a trusted
certificate authority
– The expiration date of the SSL or TLS certificate
– If the SSL or TLS certificate has been revoked
If for some reason the web application server cannot be positively
identified, then a secure connection cannot be guaranteed. But the end user still
has the option of continuing onto the next step.
4. Assuming that a secure connection can actually be established, the
device of the end user then transmits to the web application server a specialized
secret value (known as the pre-master secret) that has been encrypted with the
public key of the web application server.
5. If the web application server accepts this value, the device of the end
user then transmits over its own SSL or TLS certificate.
6. Based upon this, the web application server will attempt to authenticate the
device of the end user. If this cannot be done, the network lines of
communication between the two are then terminated. But if this can be done (meaning
the device of the end user can be authenticated), the web application server
will decrypt the pre-master secret that was originally sent over to it.
7. From this, a “master secret” is then formulated in order to establish a pair of
session keys. These are “symmetric” based keys that are then used to encrypt
and decrypt the data packets that are being sent from the device of the end
user to the web application server (and vice versa).
8. Once these symmetric-based keys have been received by the device of the
end user, all of the data packets that will be transmitted are then 100%
confirmed to be encrypted as they traverse their way across the network line of
communication from the device of the end user to the web application server
(and vice versa).
9. In return, the web application will also confirm that any data packets that are
transmitted from it will also be 100% encrypted.
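In practice, application code rarely performs these steps by hand; a TLS
library carries out the handshake, and the programmer's job is to make sure
the certificate checks from step 3 are switched on. The Python sketch below
uses the standard library's ssl module to build such a client configuration.
The function name is hypothetical, and the choice to require TLS 1.2 or later
is an assumption made for this example.

```python
import ssl


def make_client_context():
    """Client-side settings that enforce the server identity checks."""
    # Loads the system's trusted certificate authorities and enables
    # certificate validation and hostname checking by default.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse SSL and early TLS versions.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

A socket would then be secured with context.wrap_socket(sock,
server_hostname=...), at which point the handshake runs and a certificate that
is expired or issued by an untrusted authority causes the connection to fail.
Note that this default configuration does not check revocation lists; that
would need to be set up separately.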
With threat hunting, the primary goal is to break down the walls of defense
starting internally with the network infrastructure, and from there, going into the
external environment. The objectives of threat hunting are the same as penetration
testing, but in reverse. The objective is to see if there are any covert pieces of
malware that could exist from within the network infrastructure and determine how
they got there in the first place from the external environment.
Penetration testing and threat hunting will be covered in Chapter 3 and Chapter 4,
respectively. But there is yet another methodology that can be used to gauge the
level of security as it relates to the web application in question and the server that
it resides upon. This is known as a risk assessment; it offers a more quantitative
approach and utilizes various rating factors in order to determine just how much
risk the web application and its server face. We will turn to this next.
◾ Avoidance: This means that there is, on a theoretical basis, no security risk
that is posed to the web application server.
◾ Transference: This refers to shifting the risk that is faced by the web
application to another entity. A prime example of this is cybersecurity insurance. For
example, if the web application and the server that it resides upon have been
destroyed, the business or corporation will then file a claim with the respective
insurance company. As a result, the insurance company then assumes the risk,
because that entity will then have to process the claim and issue the funds.
◾ Acceptance: This applies when the statistical probability of a cyberattack
against a web application is extremely remote, and in fact, the cost of taking
preventative measures exceeds the financial impact that a cyberattack could pose to
the business or corporation.
Each of these variables, in turn, is then assigned a specifc numerical value as follows:
1. The Level of Attractiveness:
This will receive a rating of 1 if the web application has no value to the
cyberattacker, but will receive a value of 10 if it is very attractive to the cyberattacker.
2. The Sensitivity Nature of the PII:
This will receive a rating of 1 if there is no PII stored on the web application,
but will receive a value of 10 if there is PII stored on the web application.
3. The Level of Security:
This will receive a rating of 10 if there are layers of security to guard the web
application server, but will receive a value of 1 if there are no layers of security
to guard the web application server.
In order to compute the specific value of risk that is posed to the web
application and the server that it resides upon, the following mathematical formula is used,
considering the previously mentioned variables:
The level of attractiveness + The sensitivity nature of the PII – The level of security =
The risk posed to the web application
Once this value has been computed, it can then be applied to the following
rating scale:
1 No Impact
10 Catastrophic Impact
◾ Patches
◾ Ports
◾ Protection
◾ Probing
◾ Physical
1. Patches:
As most cybersecurity professionals can attest, applying software patches
and updates (even the firmware) to the web application is one of the most
fundamental and even crucial aspects that you and your IT security staff
should be embarking upon. In this regard, you should confirm that there
3. Protection:
As its name implies, this part of the initial assessment involves making sure
that the entire network infrastructure is protected and well-fortified, using
the latest security software applications and hardware. This includes the
following:
– Firewalls and routers are put into place (especially making use of those
that are stateful packet inspection based).
– Antivirus and antimalware software packages are deployed onto the web
application server.
– Network intrusion devices are also deployed.
– Proxy servers are being used (this will mask all of the IP addresses that are
internal to the network infrastructure).
– All lines of network communications from the web application server to
the external environment are encrypted and made secure through use of
a VPN.
It is equally important to conduct regular security audits (at least once a
quarter) in order to make sure that these functionalities are fully optimized in
order to fortify the lines of defense of the business or corporation.
4. Physical:
Apart from hardening your entire network infrastructure and all of the IT
assets that reside within it, you must also ensure that the business or
corporation has deployed and implemented equal amounts of security from
the physical access entry perspective as well. This simply means that only
authorized personnel are able to physically access the data center where
the web application server resides. This should be a key and fundamental part of
your overall security policy as well. Because so much attention is paid to network
security, this is an often-forgotten aspect. This realm of security also
includes providing strong layers of physical security to other aspects as well,
such as backup tapes and other key documentation as it relates to the web
application and the server that it resides upon. Also, access to any type of
security tools should be highly restricted. This includes the routers, firewalls,
switches, hubs, network intrusion devices, etc. Any company devices that
are issued to employees should have some sort of mark engraved into them,
so that they can easily be identified, and they should be inventoried and
accounted for on a quarterly basis. Just like in network security, you must
also implement multiple layers of security when it comes to physical access
entry. In this aspect, the use of biometric technology, such as hand geometry
recognition, iris recognition, and fingerprint recognition, is a robust modality
for any business or corporation to implement. These techniques should
be used for the main entry access both externally and internally within
the organization. Also, keep in mind that maintaining strong levels of
physical access entry security will even help mitigate the risk of an
insider attack. These are very difficult to detect, and even a
legitimate employee who has access to a web application can cause serious
damage to it. Therefore, in order to be proactive about this, the organization
must maintain a 24 × 7 × 365 confidential hotline so that any suspicious
activity can be reported immediately.
5. Probing:
This typically involves conducting a deep scan of the network infrastructure
in order to discover any unknown security weaknesses and gaps. In this
instance, this is where penetration testing and threat hunting become absolutely
critical (these topics will be covered much more extensively in Chapters 3
and 4, respectively). Network infrastructure probing on a macro level typically
involves the following:
– Port scanning: This typically involves scanning all of the network ports
that are most commonly used. But in order to conduct a much more thorough
search, all of the ports that reside in your network infrastructure
should be scanned.
– Enumerating: This is where either the penetration testing or threat hunting
teams adopt the mind-set of a real-world cyberattacker and compromise
such items as employee access accounts, shared resources and
folders, printers, and other hardware items associated with the web application
server.
– Vulnerability assessment: This is the use of any and perhaps even all available
network sniffing and probing tools in order to assess both known
and unknown vulnerabilities. This is typically decided upon before the
threat hunting and penetration teams engage in their exercises.
to study all of the vulnerabilities and weaknesses, and once they are ready, they
then “move in for the kill.” But a key differentiating factor now is that rather than
using the “try and get all” approach, the goal of the cyberattacker is to move in
slowly, in small increments.
The primary goal here is to stay inside the confines of the victim organization
for as long as possible, going unnoticed. But at this point, it is now very important
to distinguish what the terms hacking and attacking really mean, as there is a
distinct differentiation between the two, which are specifically defined as follows:
◾ Hacker: “In computing, a hacker is any skilled computer expert that uses
their technical knowledge to overcome a problem.”
◾ Attacker: “In computer and computer networks an attack is any attempt to
destroy, expose, alter, disable, steal or gain unauthorized access to or make
unauthorized use of an asset. Thus, an attacker is the individual or organization
performing these malicious activities.”
Thus, as one can see from these two definitions, a hacker is an individual
(or group of individuals) whose primary interest is that of just sheer curiosity. He or
she has particular expertise in a certain area of IT and wants to apply that in order
to learn more about what they are targeting, especially about its security weaknesses
and vulnerabilities.
In fact, there are three distinct types of hacker, which are as follows:
1. Passive Searching:
In most cases, the cyberhacker will start here. In this technique, the goal is
to gather as much information and data as possible about the target system,
and it does not involve any sort of direct interaction with it by the cyberhacker.
Note in this situation, we use the term “target” and not “victim,” as
there is no damage or harm that is being caused yet. In order to accomplish
this task, the cyberhacker will use various searching tools (examples of this
include www.netcraft.com and www.archive.org) in order to get this information
and data. The cyberhacker may even conduct some social engineering
exercises to a very limited degree in order to learn about the people that are
associated with the target that is being studied.
2. Active Scanning:
As its name implies, this technique requires that the cyberhacker have a direct
connection with the intended target to some degree in order to get the information
and data that they seek. Examples of typical scans include the following:
– Port scanning: This involves conducting a deep scan to see what ports are open
on the web application server. The port number typically reveals the specific
service that is listening on each port. In this regard, one of the most widely
used tools to conduct a port scan is Nmap. It is a downloadable tool that is
available from www.nmap.org. By using this particular tool, various “flags”
can be implemented in order to mark those pieces of information and data
that the cyberhacker is seeking. The flags in Nmap include the following:
• -O: Operating System Detection
• -sP: The Ping Scan
• -sT: The TCP Connect Scan
• -sS: The SYN Scan
• -sF: The FIN Scan
• -sN: The NULL Scan
• -sU: The UDP Scan
• -sO: The Protocol Scan
• -sA: The ACK Scan
• -sW: The Window Scan
• -sR: The RPC Scan
• -sL: The List/DNS Scan
• -sI: The Idle Scan
• -P0: The Don’t Ping (-Pn in current Nmap releases)
• -PT: The TCP Ping
• -PS: The TCP SYN Ping
• -PM: The ICMP Netmask
• -oN: The Normal Output
• -oX: The XML Output
• -oG: The Greppable Output
• -oA: The All Output
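To make the idea behind the TCP connect scan (-sT) concrete, the following is a minimal sketch in Python that uses only the standard library. It is not how Nmap itself is implemented; the function name `tcp_connect_scan` and the port list are illustrative assumptions. A port is reported as open when a full TCP connection to it succeeds. Only run such a scan against hosts that you own or are explicitly authorized to test.

```python
import socket


def tcp_connect_scan(host: str, ports, timeout: float = 0.5):
    """Return the ports on `host` that accept a full TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Illustrative: check a few common ports on the local machine.
    print(tcp_connect_scan("127.0.0.1", [22, 80, 443, 8080]))
```

Because a connect scan completes the TCP handshake, it is easy for the target to log, which is one reason the other scan types in the list above exist.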
After this command has been executed, the cyberhacker can then execute the
following command to determine what kind of operating system and related ser-
vices the web application server is using:
– Once this has been accomplished, you can move over to the Windows
System32 directory structure to back up what is known as the
magnify application by renaming it, issuing the following commands:
cd /mnt/windows/Windows/System32
mv magnify.exe magnify.bck
– Then make a copy of the cmd.exe (this is the command prompt) and from
here, change its respective name to magnify.exe, by issuing the following
command:
– Now you can reboot into Windows and log into the Windows-based web
application server by selecting the Accessibility and Magnifier options.
2. SQL Injection:
SQL is an acronym that stands for Structured Query Language, and SQL injection
is still one of the most widely used threat vectors employed by the cyberattacker. The
prime target here is the login screen of the web application, in which the end
user must enter a username and password. This combination is, of course,
checked against a database of all issued usernames and passwords in order to
confirm that what has been entered by the end user is actually valid. Here is
how this attack can be simulated:
– In the source code of the web application, the SQL statement that checks
the username and password is enclosed in quotation marks in order to
separate it from the rest of the source code, demonstrated as follows:
This query then tells the SQL-based database as well as the web application
to allow for a malicious login, X=X
– In order to delete records in a SQL database, the cyberattacker issues the
following command:
X’; DROP TABLE users; --
– In order to hack into a SQL database that consists of email addresses, the
cyberattacker issues the following command:
X’ UPDATE members SET email = ’[email protected]’
WHERE email = ’[email protected]’
It is important to note that these SQL-based commands are only the
basic ones. They can get far more complex, covert, and stealthy, based upon
the skill level of the cyberattacker.
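The following minimal sketch, written in Python with the standard library's sqlite3 module, shows why this class of attack works and how parameterized queries stop it. The table layout, the function names, and the sample credentials are all hypothetical and chosen only for illustration; a real application would also store password hashes rather than plaintext passwords.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    # Plaintext password for demonstration only.
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn, username: str, password: str) -> bool:
    # UNSAFE: user input is pasted directly into the SQL text, so a quote
    # character in the input can change the meaning of the query.
    query = ("SELECT COUNT(*) FROM users WHERE username = '" + username +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(conn, username: str, password: str) -> bool:
    # SAFER: placeholders keep user input as data, never as SQL code.
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0


if __name__ == "__main__":
    conn = make_db()
    malicious = "x' OR 'x'='x"
    print(login_vulnerable(conn, "x", malicious))     # injection succeeds
    print(login_parameterized(conn, "x", malicious))  # injection fails
```

The always-true condition appended by the attacker only has an effect in the first function, because only there is the input interpreted as part of the SQL statement.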
3. Cross-Site Scripting (XSS):
This type of cyberattack deals specifically with the HTML-based pages
that have been created for the specific web application in question. It can be
defined specifically as follows:
Data Confidentiality
Data confidentiality, as mentioned earlier, is security related to the actual transmission
of data between different aspects of a system using the network. This is
critically important to the overall security of the web app, because an attacker (or a
curious user of your application) has several points at which they can insert themselves
to inspect the traffic flowing between the front-end and various back-end
pieces. If any of those insertion points contain data that is not encrypted (known
as plaintext), the eavesdropper would at least be able to read the data and possibly
be able to change the data to support their goals. Figure 1.1 shows what plaintext
submission of a web search for “cybersecurity” looks like using http://web.mit.edu
when viewed with the popular network analysis tool Wireshark. Note that every
aspect of the communication is completely visible to the attacker. If credentials
were being provided instead of a search term, such a disclosure could be disastrous.
As we start our exploration of data confidentiality, we’ll first discuss some of the
most common technical layouts for web applications. This will allow us to compare
and contrast the advantages and disadvantages associated with each model.
TLS
TLS is a protocol built on top of TCP that provides the ability to perform a “handshake”
between the client (your end user) and the server (wherever your web app is
hosted) that establishes a secure connection using encryption. The specifics – such
as which cipher suite will be used to generate a strong symmetric key for the session
being set up – are negotiated during the steps of the handshake.
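From the client's side, the handshake can be observed with a few lines of Python using the standard ssl module. This is only a sketch under stated assumptions: the helper names are illustrative, the choice of TLS 1.2 as the minimum version is ours, and the host passed to `negotiated_parameters` is whatever server you are permitted to connect to.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_parameters(host: str, port: int = 443):
    """Complete a handshake and report the protocol version and cipher suite."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # wrap_socket performs the TLS handshake on top of the TCP connection.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()
```

Calling `negotiated_parameters` with a host name returns the agreed protocol version together with a tuple describing the selected cipher suite, which are the outcomes of the negotiation described above.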
Figure 1.6
Figure 1.7
Figure 1.8
Figure 1.9
would typically be the name of the page where your web app is accessible. In
this case, we see that we’re visiting dropbox.com (Figure 1.9), which matches
the name of the site where the Dropbox web app is hosted and the name on
its certificate (Figure 1.10; more on that later).
Certificate
The next message in the handshake (Figure 1.11) carries the server’s certificate. This
information is used to attest that the site being connected to is in fact the same as
the site we requested.
Figure 1.10
Figure 1.11
The server’s certificate will provide information about the issuer of the certificate,
which includes:
◾ Country name (here, US)
◾ Organization (DigiCert, Inc., a trusted certificate authority)
◾ Organizational unit
◾ Common name
Because Dropbox is the subject of this certificate, we’ll see similar information
identifying both it and the website. Moreover, this certificate is only valid for a
certain period of time (in this case, a little more than two years), which means that
a new certificate will need to be requested before the end date.
We’ll talk more about certificates in the next section, but for now know that
using a trusted certificate authority like DigiCert is a requirement for deploying a
web app that you wish people to trust.
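The issuer fields and the validity period can also be read programmatically. The sketch below works on the dictionary format that Python's `SSLSocket.getpeercert()` returns; the helper names are illustrative, and `SAMPLE_CERT` is hypothetical data invented for this example rather than the certificate shown in the figures.

```python
import ssl
from datetime import datetime, timezone


def issuer_fields(cert: dict) -> dict:
    """Flatten the nested issuer tuples from getpeercert() into a plain dict."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


def days_until_expiry(cert: dict, now: datetime) -> int:
    """Number of whole days between `now` and the certificate's end date."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    return (expires - now).days


# Hypothetical certificate data in the getpeercert() layout.
SAMPLE_CERT = {
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Example CA, Inc."),),
        (("organizationalUnitName", "www.example-ca.test"),),
        (("commonName", "Example CA Server CA"),),
    ),
    "notAfter": "Jan  1 12:00:00 2030 GMT",
}

if __name__ == "__main__":
    print(issuer_fields(SAMPLE_CERT))
    print(days_until_expiry(SAMPLE_CERT, datetime.now(timezone.utc)))
```

A check like `days_until_expiry` can be scheduled to warn well before the end date, so that a new certificate is requested in time.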
suite is used to generate cryptographic information that will be used in creating the
session key. Because this step varies based on the selected algorithm and delves deep
into cryptographic principles, it is beyond the scope of this book.
Figure 1.12
Figure 1.12 is the capture of the parameters (cipher suite plus the culmination
of the key exchange steps) that can be used to validate the client and communicate
with it for as long as the ticket is valid and the client provides it.
The final messages – Change Cipher Spec – are sent by both client and server
and are intended to alert the other side that the rest of the communications will be
encrypted using the session key mutually identified earlier.
For more in-depth resources regarding TLS and encrypted communications,
we recommend reading Bulletproof SSL and TLS: Understanding and Deploying
SSL/TLS and PKI to Secure Servers and Web Applications by Ivan Ristić.2
Site Validity
Now that we know how to set up the web app’s infrastructure and ensure that the
data in flight is encrypted, it’s time to talk about site validity – helping the user to
trust that they are using the right web app.
Figure 1.13
Commonly, errors can indicate that there is a third party intercepting and/or trying
to manipulate the data (a man-in-the-middle attack; see Figure 1.13), that the site owner
has forgotten to renew their certificate (Figure 1.14), that some aspect of the certificate
was deemed to no longer be trustworthy (Figure 1.15), or that the strength of the certificate
isn’t good enough per the requirements set out by the client (Figure 1.16).
Yet even if there is not an error, that doesn’t mean that the certificate should be trusted
in all cases. The term certificate authority is intended to refer to a relatively small number
of organizations where trust for the majority of Internet sites (and therefore web apps)
is granted. These CAs have an incentive to carefully control the certificates they create
because a mis-certification can, at the least, cause negative publicity3 and, at the worst,
destroy their business.4 Additionally, there are many cases where developers (especially
during initial testing) will use self-signed certificates. In these cases, the trust of the underlying
site and web app is only conferred by the local system (whether that is an internal
server or a hosting provider). Because these organizations (i.e., the developer or the hosting
provider) are not explicitly focused on safeguarding these certificates, issues – such
as the inability to revoke a compromised certificate, or the ease of impersonating a site –
abound. In fact, most common browsers will actually produce an error (Figure 1.17).
Figure 1.14
Figure 1.15
Unfortunately, with that comes confusion for the average user. While certificates
for most of the history of the Internet conferred some additional level of trust
(on average), this is no longer the case.
Another option is to leverage an Extended Validation (EV) certificate, which
actually requires manual steps to guarantee that an entity7:
These certificates are much harder to fake, but they are also much less commonly
used and are more expensive than standard certificates.
Figure 1.16
Figure 1.17
Let’s Encrypt statistics.6
Figure 1.18
Figure 1.19
clone your site to look and behave identically to the real thing (at least superficially).
The Medium blog post8 from 2017 by Sebastian Conijn discusses in detail
how he pretended to be Dropbox to compromise his colleague.
In addition, because the number of top-level domains that can be registered for
a given domain has exploded in recent years, it is extremely easy to register a look-
alike domain name (like yourwebapp.info) that you may not have thought to reserve.
Conclusion
In this section, we discussed web app security from the lens of network security. In
doing so, we discussed two major topics: data confidentiality and site validity. In
the former, we talked in detail about the importance of encrypting data in flight
and how different physical and virtual layouts can leave vantage points for different
kinds of attackers. In the latter section, we discussed the importance of using
certificates to confer trust to the users of your web app. We also explored the topic
of extended validation certificates for stronger levels of trust and how standard
certificates – combined with clever attackers – can easily imitate your web app.
Resources
https://www.livescience.com/20727-internet-history.html
https://blog.secureideas.com/2018/04/a-brief-evolution-of-web-apps.html
https://www.digitalinformationworld.com/2018/11/infographic-the-short-history-of-website-building.html
https://intetics.com/blog/a-brief-history-of-software-development-methodologies
https://www.sas.com/en_us/insights/analytics/what-is-artificial-intelligence.html
https://www.sas.com/en_us/insights/analytics/machine-learning.html
https://searchnetworking.techtarget.com/definition/virtual-private-network
https://www.beyondtrust.com/blog/entry/difference-between-a-threat-actor-hacker-attacker
https://www.owasp.org/index.php/Cross-site_Scripting_(XSS)
Easttom, Chuck. Network Defense and Countermeasures, 3rd Edition. Published by
Pearson Education, Inc.
References
1. https://www.ietf.org/rfc/rfc5246.txt
2. https://www.feistyduck.com/books/bulletproof-ssl-and-tls/
3. https://www.esecurityplanet.com/browser-security/google-hit-again-by-unauthorized-ssltls-certificates.html
4. https://www.wired.com/2011/09/diginotar-bankruptcy/
5. https://www.ssllabs.com/ssltest/index.html
6. https://letsencrypt.org/stats/
7. https://www.digicert.com/ev-ssl-certification/
8. https://medium.com/hike-one-digital-product-design/how-i-used-phishing-to-get-my-colleagues-passwords-this-is-how-i-did-it-73b9215689f1
Chapter 2
Cryptography
When one thinks of a web application, very often it is the front end that is conjured
up. When we talk about this “front end,” it is very often the first part that
you see in a website after typing in the domain or the uniform resource locator
(URL). Likewise, the front end can also be the shopping site of an online merchant,
from whom you can purchase various goods and products. But as we alluded to in
Chapter 1, today, much more is involved with a web application than just managing
the front end.
For example, there is the back end, which is the database. Here, all
sorts of information and data are stored, such as the personally identifiable information
(PII) of customers, all of the transactions that occur from the device
of the end user to the web application (and vice versa), and all of the mission-critical
files that are needed to run the web application in an efficient and seamless
fashion.
The other consideration in a web application is the security perspective.
Given the rapidly changing cyberthreat landscape of today and just how prevalent,
covert, and stealthy cyberattacks have become, securing all angles of
the web application must be one of the highest priorities for a business or a
corporation.
For example, the lines of network communications that are used to communicate
back and forth between the device of the end user and the web application must
be made as secure as possible, and in fact, invisible to the outside world. This is so
that any confidential information and data cannot be easily intercepted by a malicious
third party, such as that of a cyberattacker.
This is what was reviewed at great length in Chapter 1 of this book, which
was all about the network security issues that a web application and the server
that it resides upon face. In particular, the following topics were examined in
great detail:
An Introduction to Cryptography
Cryptography is a science that dates all the way back to the times of Julius Caesar.
In its simplest terms, the science of cryptography is merely the scrambling and the
descrambling of text, or written messages, between two individual parties. These
individual parties can also be referred to as the sender and the receiver. The sender
creates the text or the written message that needs to be sent, and the receiver (as
the name implies) receives the text or the written message and then reads it and
appropriately responds.
Ciphertexts
When the decrypted message is once again encrypted into a state of context
that is totally incomprehensible and undecipherable, this is known as ciphertext.
So, to illustrate all of this, with the previous example, when the sending
party creates the written message of “I LOVE YOU,” this is the plaintext or the
cleartext.
Once this message is encrypted into the format of “UYO I VEOL,” and while
it is in transit, it becomes known as the ciphertext. Then, once the receiving party
gets this ciphertext and then decrypts it into a comprehensible and understandable
form of “I LOVE YOU,” this message then becomes the plaintext or the cleartext
once again.
At this point, the question that often gets asked is how does the sending party
actually encrypt the message and how does the receiving party then actually decrypt
the ciphertext? Well, in its simplest form, the written message is encrypted via a
special mathematical formula. This formula is specifically known as the encryption
algorithm. Because the ciphertext is now encrypted by this special mathematical
algorithm, it would be rendered useless to a third party with malicious intent due
to its totally garbled nature.
As the receiving party receives this ciphertext, it remains in its garbled format
until it is descrambled. To do this, a “key” is used, which is only known by the
sending party and the receiving party. In terms of cryptography, this key is also
known as the cipher, and it is usually a short string of characters, which is needed
to break the ciphertext.
As will be examined later in this chapter, interestingly enough, the encryption
algorithm is actually publicly known and is available for everyone to use. Therefore,
the key (or the cipher) must remain a secret between the sending party and the
receiving party.
In order to send the ciphertext between the sending party and the receiving
party, as well as to share the keys that are needed to encrypt and decrypt the
ciphertext, specific cryptographic systems are needed. Today, two such types of
systems exist. They are known as symmetric key systems and asymmetric key
systems.
in the Caesar cipher, since it specifies how the plaintext should be encrypted
over into the ciphertext.
With the Caesar cipher, some 25 different combinations, or key values, can
be used. An improvement over the Caesar cipher came with a newer technique
known as the monoalphabetic cipher. What distinguishes this from the Caesar cipher is
that although one letter of the alphabet can still be replaced with another, no exact
mathematical sequencing is required. Rather, the letters in the plaintext can be
substituted at random in order to create the ciphertext. So once again, for example,
the plaintext message of “I LOVE YOU” can be written at will and at random as
“UYO VOLI E.” With the monoalphabetic cipher, more pairings of letters are possible.
For example, there are 26! (on the order of 10^26) possible letter pairings versus only the
25 available with the Caesar cipher.
Thus, if a hacker were to attempt a brute-force attack on a monoalphabetic
cipher (which is just the sheer guessing of the ciphertext for any type of pattern
in order to decipher the plaintext), it would obviously take a much longer time to
crack versus the Caesar cipher.
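The difference in key space can be made concrete with a short Python sketch. The function names are illustrative, and the shift of k = 3 used in the example is an arbitrary choice for demonstration. The sketch encrypts with a Caesar cipher, shows that all 25 keys can simply be tried one after another, and prints the far larger number of possible monoalphabetic pairings.

```python
import math
import string

ALPHABET = string.ascii_uppercase


def caesar_encrypt(plaintext: str, k: int) -> str:
    """Replace each letter with the one k positions later, wrapping around."""
    shift = k % 26
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    # Characters that are not letters (such as spaces) are left unchanged.
    return plaintext.upper().translate(str.maketrans(ALPHABET, shifted))


def caesar_decrypt(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k by shifting in the opposite direction."""
    return caesar_encrypt(ciphertext, -k)


def caesar_brute_force(ciphertext: str):
    """Try all 25 possible keys; a human (or a dictionary) picks the readable one."""
    return [(k, caesar_decrypt(ciphertext, k)) for k in range(1, 26)]


if __name__ == "__main__":
    secret = caesar_encrypt("I LOVE YOU", 3)
    print(secret)
    for k, guess in caesar_brute_force(secret):
        print(k, guess)
    # Number of possible letter pairings for a monoalphabetic cipher: 26!
    print(math.factorial(26))
```

Trying 25 keys is trivial, whereas 26! is roughly 4 × 10^26, which is why attacks on monoalphabetic ciphers rely on patterns in the ciphertext rather than on exhaustive guessing.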
1. Ciphertext-only attack: With this type of attack, only the ciphertext is known
to the attacker. But if this particular individual is well trained in statistics,
he or she can use various statistical techniques to convert the ciphertext back
into the plaintext.
2. Known-plaintext attack: This occurs when the hacker knows some aspect of
the letter pairings; thus, they can consequently convert the ciphertext back
into the plaintext.
3. Chosen-plaintext attack: With this type of attack, the hacker can choose
a plaintext message and obtain its corresponding ciphertext form as it is
transmitted across the network medium, and from these pairs, attempt to
figure out the specific encryption scheme.
Polyalphabetic Encryption
Over time, improvements were made to both the Caesar cipher and the monoalphabetic
cipher. The next step up from these two techniques was another technique
known as polyalphabetic encryption. With this, multiple types of Caesar ciphers
are used, but these ciphers are used in a specific sequence, which repeats once the
overall cipher has reached its logical end the first time, in order to finish the completion
of the encryption of the plaintext message.
This means that the wrap-around technique is also prevalent in this type of
scenario. Let us illustrate this example once again with “I LOVE YOU.” Building
upon the example used previously, suppose that two types of Caesar ciphers
are being utilized, such as where k = 1, and k = 2 (“k” once again denotes the
actual Caesar cipher, or the sequential spacing of the number of letters later in the
alphabet).
The following chart demonstrates this in order to make it clearer:
Plaintext:                           ABCDEFGHIJKLMNOPQRSTUVWXYZ
First Caesar Cipher, where k = 1:    BCDEFGHIJKLMNOPQRSTUVWXYZA
Second Caesar Cipher, where k = 2:   CDEFGHIJKLMNOPQRSTUVWXYZAB
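The chart can be turned into a minimal Python sketch. The function name is illustrative, and two details are assumptions made for this example: the sequence simply alternates between the k = 1 and k = 2 ciphers, and spaces are copied through without using up a position in that sequence.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase


def polyalphabetic_encrypt(plaintext: str, shifts=(1, 2)) -> str:
    """Encrypt with several Caesar ciphers applied in a repeating sequence."""
    keys = cycle(shifts)  # k = 1, k = 2, k = 1, k = 2, ...
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            k = next(keys)
            # The modulo provides the wrap-around from Z back to A.
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)


if __name__ == "__main__":
    print(polyalphabetic_encrypt("I LOVE YOU"))
```

Note that the two occurrences of the letter O in the plaintext are encrypted with the same cipher only by coincidence of position; in general, the same plaintext letter can map to different ciphertext letters, which is what weakens simple frequency analysis.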
Block Ciphers
Using a method of transposition, the plaintext message is then encrypted into its
scrambled format. Let us illustrate this again with our previous example, but this
time, let us assume a block of three characters, mathematically represented as 3 bits,
or where k = 3.
Note that an extra character was added at the end, which is the letter “X.” This
was added so that a complete plaintext block can be formed. As a rule of thumb,
if the total number of characters in the plaintext is not divisible by the block size
permutation (in this instance, where k = 3), it can be safely assumed that extra
characters will be needed in order for the last block of plaintext to be considered
complete. This is known as padding. It should be noted that the most widely used
block is where k = 8 bits long.
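Padding and block-wise transposition can be sketched in a few lines of Python. The function names are illustrative, and the specific transposition used here, reversing the characters inside each block, is an assumption inferred from the ciphertext “OLI YEV XUO” that this example arrives at.

```python
def pad(text: str, block_size: int, pad_char: str = "X") -> str:
    """Strip spaces and append pad characters until the length fits the block size."""
    letters = text.replace(" ", "").upper()
    remainder = len(letters) % block_size
    if remainder:
        letters += pad_char * (block_size - remainder)
    return letters


def split_blocks(text: str, block_size: int):
    """Cut the padded text into consecutive blocks of equal length."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


def transpose_encrypt(plaintext: str, block_size: int = 3) -> str:
    """Toy block cipher: reverse the order of the characters within each block."""
    blocks = split_blocks(pad(plaintext, block_size), block_size)
    return " ".join(block[::-1] for block in blocks)


if __name__ == "__main__":
    print(split_blocks(pad("I LOVE YOU", 3), 3))
    print(transpose_encrypt("I LOVE YOU"))
```

The eight letters of the message do not divide evenly into blocks of three, so a single “X” is appended before the blocks are scrambled.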
As we can see, even with the simple example provided earlier, block ciphers are
a very powerful tool for symmetric key cryptographic systems. After all, they go
through a set number of iterations of scrambling in order to come up with a rather
well-protected ciphertext. But despite these strong advantages, block ciphers
do suffer from an inherent weakness, which if discovered by a hacker, can cause
rather detrimental damage, with irrevocable results. This vulnerability is that two
blocks can contain the exact same data. Let us examine this with our previous
example once again. As it was illustrated, the ciphertext block was formulated as
“OLI YEV XUO.” But, of course, depending upon the actual written context of
the plaintext, it is possible that the ciphertext block can contain two or more exact
blocks of the same data.
Initialization Vectors
Continuing with our example, it would look like this: “OLI OLI YEV.” To alleviate
this weakness, a system of initialization vectors (IVs) is used. Although it sounds
complex, simply put, this involves creating some further scrambling, or randomness,
within the ciphertext block itself. However, it should be noted that it is not the
IV itself that further encrypts the ciphertext blocks.
1. The IV is created first.
2. Through a mathematical process known as XOR (which stands for eXclusive
OR, and is used quite frequently to determine if the bits of two strings of data
match or not), the first created IV is XOR’ed with the first block of ciphertext data.
3. The first chunk of data that has been XOR’ed is further broken down by
another layer of encryption.
4. This process is then continued until all of the blocks of ciphertext have been
XOR’ed and enveloped with another layer of encryption.
Thus, this is how cipher block chaining (CBC) got its name. For instance, steps 1–4 create the first
loop or chain, the second loop or chain is then next initiated, and so on, until the
ciphertext has been fully analyzed and encrypted by this methodology.
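The chaining idea can be sketched in Python as follows. This is a toy under stated assumptions and is not secure: `toy_encrypt_block` is a made-up stand-in for a real block cipher, the key and IV are fixed sample values (a real IV should be random and fresh for each message), and the chaining follows the standard definition of CBC, in which each block is XOR’ed with the previous ciphertext block (or with the IV for the first block) before it is encrypted.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings position by position."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    # Toy stand-in for a real block cipher: XOR with the key, then reverse.
    return xor_bytes(block, key)[::-1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    # Inverse of toy_encrypt_block: undo the reversal, then XOR with the key.
    return xor_bytes(block[::-1], key)


def split_blocks(data: bytes, size: int):
    return [data[i:i + size] for i in range(0, len(data), size)]


def ecb_encrypt(plaintext: bytes, key: bytes):
    """No chaining: identical input blocks give identical output blocks."""
    return [toy_encrypt_block(b, key) for b in split_blocks(plaintext, len(key))]


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes):
    """Chaining: mix each block with the previous ciphertext block first."""
    previous, out = iv, []
    for block in split_blocks(plaintext, len(key)):
        previous = toy_encrypt_block(xor_bytes(block, previous), key)
        out.append(previous)
    return out


def cbc_decrypt(blocks, key: bytes, iv: bytes) -> bytes:
    previous, out = iv, []
    for block in blocks:
        out.append(xor_bytes(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)


# Sample values for illustration only.
KEY = bytes([0x13, 0x37, 0x42])
IV = bytes([0x01, 0x02, 0x03])
MESSAGE = b"OLIOLIYEV"  # the first two blocks are identical

if __name__ == "__main__":
    print(ecb_encrypt(MESSAGE, KEY))      # repeated block is visible
    print(cbc_encrypt(MESSAGE, KEY, IV))  # repetition is hidden by chaining
```

Without chaining, the repeated “OLI” block produces two identical output blocks; with chaining, each output block depends on everything before it, so the repetition no longer shows.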
1. Key distribution
2. Key storage and recovery
3. Open systems
With regard to the frst one, key distribution, symmetric cryptography requires
the sharing of secret keys between the two parties (sending and receiving), which
requires the implicit trust that this key will not be shared with any other outside
third party. Te only way that any type of secrecy can be achieved in this regard
would be to establish a secure channel.
While this works very well in theory, in practicality, it is not a feasible solution.
For instance, the typical organization would not be able to aford implementing
and deploying such a secure channel, except for the very large corporations and
government entities. Tus, the only other solution available in this circumstance
would be the use of a so-called designated “controller.”
This third party would have to be very highly trusted by both the sending and
the receiving parties. But this methodology of trust to create a secure channel can
prove to be a very cumbersome task. For example, imagine a place of business.
Suppose that the chief executive officer (CEO) decides to share the keys of the
business with the employees who need access to it at irregular hours. Rather than
trusting the employees explicitly, the CEO could decide to utilize a manager to
whom the employees must give the key when they are done with their job duties,
and from there, this same manager would then give this key to the next employee
who needed access.
Already, one can see that this is a very tedious and time-consuming process,
and to compound this problem even more, the designated controller, in this case
the manager, cannot be trusted either, because in between the distribution of keys
to the employees, this manager could very well give these keys to a malicious
third party. As a result, this method does not guarantee the secrecy of the key
that is needed to encrypt and decrypt the plaintext message.
In terms of key storage and recovery, let us take the example of a very large
organization, such as a multinational corporation. The problem of using the
principles of symmetric cryptography becomes quite plain. First, since there will
be many more lines of communication between the sending and the receiving
parties, the need to implement that many more controllers becomes totally
unrealistic as well as infeasible. Thus, the distribution of the keys can become a
virtual nightmare.
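To put a number on this growth: if every pair of parties needs its own secret key, then n parties need n(n − 1)/2 keys. The short Python sketch below (the function name is chosen for illustration and does not come from the text) shows how quickly that count climbs.

```python
def symmetric_keys_needed(parties: int) -> int:
    """Number of distinct pairwise secret keys for the given number of parties."""
    return parties * (parties - 1) // 2

for n in (2, 10, 1000):
    print(n, "parties ->", symmetric_keys_needed(n), "keys")
```

Two parties need a single key, ten parties need 45, and a thousand parties already need 499,500.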
Second, all of the private keys associated with symmetric cryptography
have to be securely stored somewhere, primarily in a database that resides in
a central server. As is well known, primary and central servers are often prone
to worms, viruses, and other types of malicious software. Compounding this
problem even more is the fact that the larger the number of private keys stored
onto this central server, the greater the chances of the central server being
hacked into.
A way that these private keys can be stolen is if a piece of malicious code is
injected into the intranet of the corporate network, which in turn reaches the
database. This malicious code then covertly hijacks these private keys and sends
them back to the hacker.
Third, when companies and organizations become large, the chances that
employees will require remote access to the corporate intranet and network
resources become even greater. As a result, the private keys that are used to
communicate between the sending and receiving parties can also be hijacked very
quickly and easily by a hacker who has enough experience and knowledge.
Finally, with an open system, private or symmetric cryptography works best
only when it is used in a very closed or “sterile” environment, where there are, at
best, only just a few sending and receiving parties. But this is not the case with
“open” or public environments, such as our example of the very large corporation.
In these situations, there is simply no way to confirm the authenticity or the
integrity of the private keys and their respective ciphertext messages.
So, as one can see, private keys and symmetric cryptography simply are
inflexible, too costly, and do not scale well for most types of environments. For
example, “solutions that are based on private-key cryptography are not sufficient
to deal with the problem of secure communications in open systems where parties
cannot physically meet, or where parties have transient interactions.”2
Although there will never be a perfect 100% solution that will correct the flaws
of symmetric cryptography, there is a partial solution known as the key distribution
center (KDC), which is reviewed next.
It should be noted that these passwords are also encrypted. Now, if one end
user wishes to communicate with another end user on a different computer system,
the sending party enters their password into the KDC, using specialized software
called Kerberos. When the password is received by the KDC, Kerberos then uses a
special mathematical algorithm, which adds the receiving party’s information and
converts it over to a cryptographic key.
Once this cryptographic key has been established, the KDC sets up other keys
for the encryption of the communication session between the sending and the
receiving party. These other keys are also referred to as the tickets. These tickets
have a time expiration associated with them, so a ticket will expire at a
predetermined point in time in order to prevent unauthorized use; after that
point, it is rendered useless even if it is stolen, hijacked, or intercepted by a
third party.
Although the KDC system does provide a partial solution to the shortcomings
of symmetric key cryptography, the KDC also by nature has some major security
flaws, such as:
1. Because the KDC contains all of the master keys and the access rules needed
for encrypted communication, the server that contains the KDC system must
be both logically and physically protected all the time. If an attack is success-
ful on the KDC, the entire communications channel within the organization
will completely break down. Also, personnel who have access to the KDC
can easily decrypt the ciphertext messages between all of the sending and
receiving parties.
2. The KDC process presents a single point of failure for the organization. If
the server containing the KDC crashes, all kinds of secure communications
become impossible, at least on a temporary basis. Also, since all of the end
users will be hitting the KDC at peak times, the processing demands placed
onto the KDC can be very great, thus heightening the chances that very slow
communications between the sending and the receiving parties, or even a
breakdown of the communications system, can also happen.
within the organization who appear to be offline. For example, if the sending
party sends a ciphertext message to the receiving party, and after sending
the message they go offline, the KDC system could just literally “hang”
and maintain an open session indefinitely until the sending party comes
back online again. With this particular algorithm, this problem is averted
by immediately terminating the communication session once either party
goes offline.
2. The Data Encryption Standard algorithm (DES): This mathematical algorithm
was developed in 1975, and by 1981, it became the de facto algorithm
for symmetric cryptography systems. This is a powerful algorithm, as it puts
each block of the message through 16 rounds in order to ensure full encryption.
3. The Triple Data Encryption Standard algorithm (3DES): This mathematical
algorithm was developed as an upgrade to the DES algorithm. The primary
difference between the two of them is that 3DES applies the DES algorithm
three times in succession, and thus puts the message through three times as
many rounds as the DES algorithm.
4. The International Data Encryption Algorithm (IDEA): This is a newer
mathematical algorithm than 3DES and repeatedly mixes the contents of the
message around until it is decrypted by the receiving party. It is described as
being three times faster than the DES algorithms just reviewed, and as a
result, it does not consume as much processor power as the DES algorithms do.
5. The Advanced Encryption Standard algorithm (AES): This is the latest
symmetric cryptography algorithm. It was selected in 2000 and published as a
federal standard in 2001, and was primarily designed for use by the federal
government.
ciphertext between the sending and the receiving parties. Now, in the next
section, we look at an entirely different methodology: asymmetric key
cryptography. With this type of methodology, not just one key is used, but
rather, two keys are used.
Now, this brings up a very important point: The public key is literally “public,”
meaning that anybody can use it, even all of the hackers in the world. So, how does
asymmetric cryptography remain secure? It remains so based on the privacy of the
private key (sk) which is being utilized. In these cases, it is then up to the receiving
party not to share the private key (sk) with any other party, no matter how much
they are trusted.
If the privacy of the sk is compromised in any way, then the security scheme
of asymmetric cryptography is totally compromised. In order to help ensure that
the private keys remain private, asymmetric cryptography uses the power of prime
numbers. The basic idea here is to create a very large number as the product of
multiplying two very large prime numbers together.
Mathematically speaking, the basic premise is that it will take a hacker an
infeasibly long time to figure out the two prime factors of a very large product,
which is several hundred digits long, and thus give up in frustration. It should
be noted that the two primes are not independent secrets: once one of them is
known, the other follows from a single division, so the security rests entirely on
how hard it is to find either one.
As a result, only the public portion of the (pk, sk) pair is known to the hacker,
and the asymmetric cryptography technique utilized by the sending and the
receiving parties still remains intact and secure. In other words, the hacker cannot
reverse-engineer one key to get the other key to break the ciphertext. It should
also be noted that in asymmetric key cryptography, the same public key can be
used by multiple, different sending parties to communicate with the single
receiving party, thus forming a one-to-many, or 1:N, mathematical relationship.
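A toy example may make the role of the two primes more concrete. The sketch below uses the RSA construction with deliberately tiny primes (61 and 53) so that the numbers can be followed by hand; real keys use primes that are hundreds of digits long. The variable names and the public exponent 17 are choices made for this illustration, and `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
# Toy RSA key generation with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                    # public modulus: the "very large product"
phi = (p - 1) * (q - 1)      # can only be computed if p and q are known
e = 17                       # public exponent, part of the public key (pk)
d = pow(e, -1, phi)          # private exponent, the private key (sk)

message = 65
ciphertext = pow(message, e, n)    # anyone holding (n, e) can encrypt
recovered = pow(ciphertext, d, n)  # only the holder of d can decrypt
```

An attacker who could factor `n` back into `p` and `q` could recompute `phi` and then `d`, which is why the size of the primes matters.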
shared with anybody else or even intercepted by a third party. But with asymmetric
cryptography, the public key can be shared virtually indiscriminately, without the
fear of compromising security.
Second, symmetric cryptography utilizes the same secret key for the encryption
and decryption of the ciphertext, but with asymmetric cryptography, two different
keys (namely the public and the private keys) are used for the encryption and the
decryption of the ciphertext.
In other words, in asymmetric cryptography, the roles of the sender and the
receiver are not interchangeable with one another, like with symmetric
cryptography. This means that with asymmetric cryptography, the communication
is only one way. As discussed, because of this, multiple senders can send their
ciphertext to just one receiver, but in symmetric cryptography, only one sending
party can communicate with just one receiving party.
Also, asymmetric cryptography possesses two key advantages: (1) It allows for
the sending party(ies) and the receiving party to communicate with one another,
even if their lines of communication are being observed by a third party; and
(2) because of the multiple-key nature, the receiving party needs to keep only one
private key to communicate with the multiple sending parties.
1. The RSA algorithm
2. The Diffie-Hellman algorithm
3. The elliptical wave theory algorithm (more commonly known as elliptic curve
cryptography)
In terms of the RSA algorithm, this is probably the most famous and widely
used asymmetric cryptography algorithm. In fact, this very algorithm will serve
as the foundation for the discussion on biocryptography later in this chapter. The
RSA algorithm was commercialized by the RSA Data Security Corporation, and is
named after the inventors who created it: Ron Rivest, Adi Shamir, and Leonard
Adleman.
The RSA algorithm uses the power of prime numbers to create both the public
key and the private key. But using such large keys to encrypt large amounts of
data is totally infeasible from the standpoint of processing power and central server
resources. Instead, ironically, the encryption of the data is done using symmetric
algorithms (such as the ones reviewed previously), and then the secret key used by
the symmetric algorithm is itself encrypted with the receiving party’s public key.
Once the receiving party obtains the ciphertext from the sending party, the
secret key generated by the symmetric cryptography algorithm is decrypted with
the receiving party’s private key, and then this secret key can be subsequently
used to decrypt the rest of the ciphertext.
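This combination of a fast symmetric cipher for the data with an asymmetric cipher for the key can be sketched as follows. Everything here is a simplified stand-in chosen for illustration: the RSA key pair reuses tiny textbook primes, and the "symmetric cipher" is a keystream derived from SHA-256 that is XOR'ed with the data. Neither would be acceptable in a real system.

```python
import hashlib
import secrets

# Receiving party's toy RSA key pair (tiny primes, illustration only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

def keystream_xor(data: bytes, session_key: int) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

# Sending party: encrypt the data symmetrically, then wrap the symmetric key.
session_key = secrets.randbelow(n - 2) + 2
body = keystream_xor(b"quarterly payroll file", session_key)
wrapped_key = pow(session_key, e, n)   # encrypted with the receiver's public key

# Receiving party: unwrap the symmetric key, then decrypt the data.
unwrapped_key = pow(wrapped_key, d, n)  # decrypted with the private key
plaintext = keystream_xor(body, unwrapped_key)
```

Only the short session key passes through the expensive asymmetric operation; the bulk data never does.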
In terms of the Diffie-Hellman asymmetric algorithm, it is named after its
inventors as well: Whitfield Diffie and Martin Hellman. It is also known as the DH
algorithm. But interestingly enough, this algorithm is not used for the encryption
of the ciphertext; rather, its main concern is to address the problem of
establishing a shared key when the only channel available is not secure.
Here is a summary of how it works on a very simple level:
1. The receiving party, as usual, has the public key and the private key that they
have generated, but this time, they both are created by the DH algorithm.
2. The sending party receives the public key generated by the receiving party
and uses this DH algorithm to generate another set of public keys and private
keys, but on a temporary basis.
3. The sending party now takes this newly created temporary private key and
the public key sent by the receiving party to generate a secret number; this
is known as the session key.
4. The sending party uses this newly established session key to encrypt the
plaintext message and sends the resulting ciphertext to the receiving party,
along with the public key that they have temporarily generated.
5. When the receiving party finally receives the ciphertext from the sending
party, the same session key can now be derived mathematically from the
sending party’s temporary public key and the receiving party’s own private key.
6. Once the previous step has been completed, the receiving party can now
decrypt the rest of the ciphertext.
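The steps above can be traced with very small numbers. In the sketch below, the public parameters (the prime 23 and the base 5) and the private values 6 and 15 are toy values picked for illustration; real deployments use parameters that are thousands of bits long and private values drawn from a secure random source.

```python
# Public parameters known to everyone (toy values).
P, G = 23, 5

# Step 1: the receiving party's key pair.
receiver_private = 6
receiver_public = pow(G, receiver_private, P)

# Step 2: the sending party's temporary key pair.
sender_private = 15
sender_public = pow(G, sender_private, P)

# Step 3: the sending party derives the session key from its temporary
# private key and the receiving party's public key.
sender_session_key = pow(receiver_public, sender_private, P)

# Step 5: the receiving party derives the same session key from the
# sending party's temporary public key and its own private key.
receiver_session_key = pow(sender_public, receiver_private, P)
```

Both sides arrive at the same value without that value ever being transmitted, which is the whole point of the exchange.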
Finally, elliptical wave theory, which is more commonly known as elliptic curve
cryptography, has as its main advantage that it is very quick and does not
require a lot of server overhead or processing time, because it reaches a given
level of security with much shorter keys than RSA does. As its name implies, it
first starts with an elliptic curve (rather than a parabola) drawn on a normal x,y
coordinate Cartesian plane.
Two points on the curve are then “added” geometrically: a line is drawn through
them, the third point at which this line intersects the curve is found, and that
point is mirrored across the x-axis. This process is repeated many times, beginning
from a publicly known starting point on the curve.
Once this process has been completed, the number of times the addition was
repeated serves as the private key, and the X and Y coordinates of the point that
was finally reached serve as the public key. The trick is that a third party who
knows the shape of the elliptic curve, the starting point, and the final point still
cannot feasibly work out how many additions were performed, whereas the
receiving party, who knows that number, can complete the calculation directly.
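For readers who want to see the point arithmetic, here is a minimal sketch over a very small finite field. The curve y^2 = x^3 + 2x + 2 modulo 17 and the starting point (5, 1) are a common classroom example and are assumptions of this illustration, not parameters taken from the text; production curves use primes of roughly 256 bits. `None` stands for the "point at infinity", the identity element of the addition.

```python
P_MOD, A, B = 17, 2, 2     # toy curve: y^2 = x^3 + A*x + B (mod P_MOD)
G = (5, 1)                 # publicly known starting point

def is_on_curve(point) -> bool:
    x, y = point
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0

def add(p1, p2):
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None        # a point plus its mirror image
    if p1 == p2:           # tangent line (point doubling)
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:                  # chord through two distinct points
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def multiply(k: int, point):
    """Add `point` to itself k times using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

private_key = 7                        # number of additions (kept secret)
public_key = multiply(private_key, G)  # resulting point (published)
```

Two parties with private values 3 and 7 reach the same shared point, since `multiply(3, multiply(7, G))` equals `multiply(7, multiply(3, G))`.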
This entity is usually an outside third party that hosts the technological
infrastructure needed to initiate, create, and distribute the digital certificates. In a
very macro view, the PKI consists of the following components:
1. The certificate authority (CA): This is the outside third party who issues the
digital certificates. This is also the entity that the end user would go to in
case he or she needed to have a digital certificate verified.
2. The digital certificate: This is issued by the CA and binds the identity of its
owner to the owner’s public key; the matching private key is kept secret by
the owner. These digital certificates are typically kept in the local computer
of the employee, or even the central server at the organization.
3. The LDAP or X.500 directories: These are the databases that collect and
distribute the digital certificates from the CA.
4. The registration authority (RA): If the organization is very large (such as a
multinational corporation), this entity usually handles the requests for the
required digital certificates and then transmits those requests to the CA to
process and create the required digital certificates.
In terms of the CA, in extremely simple terms, it can be viewed as the main
governing body, or even the “king” of the PKI. In order to start using the PKI to
communicate with others, it is the CA that issues the digital certificates, which
are tied to the public and private keys.
7. The Subject Distinguished Name: This is the name that specifies the digital
certificate owner.
8. The Subject Alternate Name Email: This specifies the digital certificate
owner’s email address (this is where the actual digital certificates go to).
9. The Subject Name URL: This is the web address of the organization to whom
the digital certificates are issued.
these rules and policies, the following is just a sampling of some of the topics that
need to be addressed:
1. Where and how the records and the audit logs of the CA are to be kept,
stored, and archived
2. The administrative roles for the CA
3. Where and how the public keys and the private keys are to be kept, stored,
and backed up
4. The length of time for which the public keys and the private keys will be
stored
5. If public or private key recovery will be allowed by the CA
6. The length of the validity period for both the public keys and private keys
7. The technique whereby the CA can delegate the responsibilities to the RA
8. If the digital certificates issued by the CA will be used for applications and
resources
9. If the digital certificates issued by the CA will be used for the sole purpose of
encrypting the ciphertext
10. If there are any types of applications that should be refused digital certificates
11. When a digital certificate is initially authorized by the CA, if there will be a
finite period when the digital certificate will be subject to revocation
As one can see, based upon the establishment of the many rules and policies
that need to be set in place, the actual deployment and establishment of a PKI can
become quite complex, depending upon the size and the need of the particular
business or organization.
when they are first issued, they can also be revoked for any reason at any time by
the PKI administrator.
In order to accomplish this specific task, a certificate revocation list (CRL) is
used. This list is composed of the digital certificate serial numbers that have been
assigned by the CA. But looking up this type of information and data can be very
taxing on system resources and processes. Therefore, it is much easier to
reissue the digital certificates as they expire, rather than revoking them and
having to reissue them again, which would also mean that the PKI system
administrator would then have to update the CRL.
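At its simplest, the check that a relying party performs against a CRL is a lookup of the serial number, combined with a check of the expiry date. The sketch below is a hypothetical illustration only: the serial numbers, the function name, and the in-memory set are invented for the example and do not correspond to any real CRL format or library.

```python
from datetime import datetime, timezone

# Hypothetical revoked serial numbers, as they might be read from a CRL.
revoked_serials = {"04A1", "1F3C"}

def certificate_is_usable(serial: str, not_after: datetime, now: datetime) -> bool:
    """A certificate is usable only if it is neither revoked nor expired."""
    if serial in revoked_serials:
        return False
    return now <= not_after

now = datetime(2020, 6, 1, tzinfo=timezone.utc)
expiry = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(certificate_is_usable("0B77", expiry, now))   # not revoked, not expired
print(certificate_is_usable("04A1", expiry, now))   # revoked
```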
In the world of the PKI, it should be remembered that the public keys and the
private keys (to which the digital certificates are tied) are created instantaneously
and all of the time. In fact, public keys and private keys are everywhere in a PKI,
even when one establishes a Secure Sockets Layer (SSL) connection over the
Internet with their particular brand of web browser (this typically uses 128-bit
encryption).
In fact, there are even public keys and private keys in the PKI that are only used
once, terminated, and discarded. These types of public keys and private keys are
known more commonly as session keys. Public keys and private keys are nothing
more than computer files.
Also, in order to keep hackers at bay, it is equally important that not all of the
public keys and the private keys be used all the time in the communication process
between the sending and the receiving parties. It is also important to keep the
public keys and the private keys fresh, or in other words, it is important to
introduce randomness into the PKI.
Such randomness is known as entropy, and this entropy is created by what
are known as random number generators and pseudo-random number generators.
Also, in a PKI, there are different classes of public keys and private keys. Here is a
listing of just some of these classes:
1. Signing Keys: These are the keys to create the digital signatures.
2. Authentication Keys: These are the keys that are created to authenticate
computers, servers, and the receiving parties and the sending parties with one
another.
3. Data Encryption Keys: These are the keys that are used to encrypt the files.
4. Session Keys: These are the keys that are used to help secure a channel across
an entire network for only a very short period.
5. Key Encryption Keys: These types of keys literally wrap other keys to
provide further protection between the sending and the receiving parties.
6. Root Key: This is the master key that is used for signing all of the other public
keys and private keys which originate specifically from the CA.
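As a brief illustration of where the randomness behind a short-lived key such as a session key can come from, Python's standard `secrets` module draws on the operating system's cryptographically secure random number generator. The 16-byte length below is an assumption chosen to match a 128-bit key.

```python
import secrets

# Draw 16 random bytes (128 bits) from the operating system's secure generator.
session_key = secrets.token_bytes(16)

# A hex form is convenient for logging key identifiers (never the key itself).
key_id = secrets.token_hex(8)
```

The general-purpose `random` module is a pseudo-random generator intended for simulations and should not be used to create keys.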
Security Policies
The exact mechanism for establishing a specific security policy is
beyond the scope of this book, but the security policy should cover, at a minimum,
the following key issues as they relate to public key and private key generation and
distribution:
1. The individuals who are authorized to access the key server (assuming that
the place of business or organization has one)
2. Who within the organization is even allowed to use the public keys and the
private keys at all
3. What types of ciphertexts and content messages and even corporate
data can use the encryption methods provided by the public and the
private keys
4. If the public key and private key generation and distribution processes are
outsourced to a third party, who at the organization has the authority to issue
public keys and private keys after they have been generated
5. The specific requirements for the employees within the organization to obtain
both public keys and private keys
1. Key escrow
2. Key recovery
Key escrow refers to the storage of the public keys and the private keys at a safe
location, and key recovery refers to breaking up the public keys and the private keys
at the point of origin (which would be when the ciphertext is sent from the send-
ing party) and putting them back together once again at the point of destination
(which would be when the receiving party receives the ciphertext).
RTYHDHDHjjjdd8585858hd0909344jdjdjdjMNGDfsweqwecbthrdn*&^%gh$
This helps ensure the integrity of the ciphertext while it is in transit across
the network medium. In other words, this is proof positive for the receiving
party who receives the ciphertext from the sending party that the ciphertext has
remained intact while it has been in transit and it has not been altered in any way,
shape, or form.
The hash or the message digest can be viewed as a fingerprint of the ciphertext.
The hash is actually created at the point of origination by the sending party, and
it is then calculated again via the same mathematical algorithm by the receiving
party. If the mathematical algorithm used by the receiving party generates the
same garbled data message, such as the one shown in the example, the receiving
party can be confident that the ciphertext they have received from the sending
party is the original ciphertext at the point of origination and has remained intact.
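The comparison described above can be sketched with Python's standard `hashlib` module. SHA-256 is used here as an assumed choice of hashing algorithm, since the text does not name one, and the byte strings are placeholders.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 message digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Sending party: compute the hash at the point of origination.
sent = b"ciphertext bytes as sent"
sent_hash = digest(sent)

# Receiving party: recompute the hash and compare it with the one received.
received_ok = digest(b"ciphertext bytes as sent") == sent_hash
received_tampered = digest(b"ciphertext bytes as sent!") == sent_hash
```

Changing even a single character of the input yields a completely different digest, so `received_tampered` is false.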
a result, the receiving party is fooled into believing that this new, altered cipher-
text and the new altered hash are the originals sent by the sending party, while
the hacker keeps the actual ciphertext and hash that were generated the first time
around.
To fix this major security vulnerability, the ciphertext is combined with a “secret
key” at the point of origination first, and then the hash is created. As a result, this
hash will depend upon the secret key itself. In turn, the receiving party, who also
holds the secret key, can be further assured that the ciphertext they have received
is the original sent by the sending party. This is so because even if the ciphertext
and the hash were to be intercepted, there is very little that a hacker can do to
alter the ciphertext and produce a matching hash, because they would need the
secret key, which is, of course, something they should never gain access to.
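A keyed hash of this kind is commonly implemented as an HMAC. The sketch below uses Python's standard `hmac` and `hashlib` modules; the choice of SHA-256, the helper names, and the sample key are assumptions made for illustration.

```python
import hashlib
import hmac

def make_tag(secret_key: bytes, ciphertext: bytes) -> bytes:
    """Compute a keyed hash (HMAC-SHA256) over the ciphertext."""
    return hmac.new(secret_key, ciphertext, hashlib.sha256).digest()

def verify(secret_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(make_tag(secret_key, ciphertext), tag)

secret_key = b"shared-secret-for-illustration"
ciphertext = b"ciphertext bytes as sent"
tag = make_tag(secret_key, ciphertext)
```

A hacker who alters the ciphertext cannot produce a valid tag without the key, so `verify(secret_key, b"altered", tag)` returns `False`.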
1. Confusion:
This is the part of the DES in which the relationship between the key that is
being used and the ciphertext that it is protecting becomes invisible to the
outside world. This type of association can be typically found wherever the
substitution principle is needed in securing the confidential information and
data that resides upon the web application server.
2. Diffusion:
This is an encryption process where the influence of each character in the
plaintext message is spread over many of the ciphertext symbols that are used
to mask it from the outside world. A common element that is used here is
known as bit permutation.
in order to build a strong cipher that cannot be easily broken into by the
cyberattacker. Combining both confusion and diffusion is known as the product
cipher.
In fact, any modern block cipher, but especially DES, possesses high-level
properties that are based upon diffusion. For example, changing a single bit of a
plaintext message will result in the change of, on average, half of the output bits.
In other words, the second ciphertext that is created looks statistically
independent of the first ciphertext. This is a crucial aspect of the DES.
The DES is a cipher that encrypts blocks of plaintext that are 64 bits in
length and possesses a key size of 56 bits. It is also what is known as a symmetric
cipher, which means that the same key is used for both encrypting and decrypting
the plaintext message. It is essentially an iterative mathematical algorithm,
which simply means that for each block of a plaintext message that needs to be
protected, the encryption is accomplished in 16 distinct rounds.
In each of these 16 separate rounds, a unique subkey is created and utilized
(which can be denoted as “ki”) from the main key (which can be denoted as “k”).
The DES is built upon what is known technically as the Feistel network. One of
the key advantages of utilizing this is that both the encryption and the decryption
processes can be conducted by using the same operation.
For example, the decryption of the Feistel network only needs what is known as
a reversed key schedule. This can provide strong advantages in terms of protecting
both the web application and its server. Here is how the Feistel network carries out
its operations:
$$ L_i = R_{i-1} $$
$$ R_i = L_{i-1} \oplus f(R_{i-1}, k_i) $$
where i runs over all of the rounds from the starting point of 1 to the ending
point of 16.
It is important to note that the final permutation, which is applied after
round 16, can be represented as $IP^{-1}$. This is actually the last step of
processing from within the DES. The key schedule also derives the subkey for each
of the 16 rounds.
Also, the Feistel network provides encryption in only one half of the input bits
in each of the successive 16 rounds, which is the left side of the particular input.
The right side of the Feistel network is not encrypted in that round; it is literally
copied to the next round (which at first would be round 2), where it becomes the
left side.
Finally, the confusion and diffusion properties, as described before, can only be
achieved from within the f-function. The Feistel network actually becomes
more secure as the processing of each of the 16 rounds is completed.
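The claim that decryption needs nothing more than a reversed key schedule can be checked with a small sketch. This is not DES: the round function below is a hypothetical stand-in built from SHA-256, the subkeys are arbitrary integers, and there are no initial or final permutations. Only the Feistel structure itself is being illustrated.

```python
import hashlib

MASK32 = 0xFFFFFFFF

def f(right: int, subkey: int) -> int:
    """Hypothetical round function (NOT the DES f-function)."""
    data = right.to_bytes(4, "big") + subkey.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block: int, subkeys) -> int:
    """Run one pass of a Feistel network over a 64-bit block."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in subkeys:
        # L_i = R_(i-1);  R_i = L_(i-1) XOR f(R_(i-1), k_i)
        left, right = right, left ^ f(right, k)
    # Swap the halves at the end so that the same routine, run with the
    # subkeys in reverse order, undoes the encryption.
    return (right << 32) | left

subkeys = list(range(1, 17))                     # 16 toy round keys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, subkeys)
decrypted = feistel(ciphertext, subkeys[::-1])   # reversed key schedule
```

Note that `f` never has to be inverted, which is why any function, however complicated, can be used inside a Feistel round.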
The f-Function
This functionality plays a very critical role in securing the DES. For instance, in
round “i” (of the 16 total rounds that are available, as previously described), it
takes the right half $R_{i-1}$ from the output of the previous round, as well as
the current round key that is being used, “ki,” as the input. The output of the
f-function is then XOR’ed with the left half from the previous round, which is
denoted as $L_{i-1}$, in order to encrypt it.
Inside the f-function, the 32-bit input is first expanded out to 48 bits. This is
done by partitioning the input into eight different 4-bit blocks, and from there,
expanding each of the blocks into 6 bits by borrowing one bit from each of the
neighboring blocks.
In this partitioning, the first block consists of 4 bits (which are 1, 2, 3, and 4,
respectively). In succession, the second block consists of 4 bits (which are
5, 6, 7, and 8, respectively).
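The expansion from 32 to 48 bits follows a regular pattern that can be generated rather than typed in: in the standard DES expansion table, each 4-bit block is widened to 6 bits by prepending the last bit of the preceding block and appending the first bit of the following block, wrapping around at the ends. The sketch below builds that table; the function names are chosen for illustration, and bit positions are numbered from 1 to match the text.

```python
def expansion_table() -> list:
    """Build the 48-entry DES expansion table (1-based bit positions)."""
    table = []
    for block in range(8):
        first = 4 * block + 1                      # first bit of this block
        before = 32 if block == 0 else first - 1   # wraps around to bit 32
        after = 1 if block == 7 else first + 4     # wraps around to bit 1
        table.extend([before, first, first + 1, first + 2, first + 3, after])
    return table

def expand(bits32: list) -> list:
    """Expand a list of 32 bits into 48 bits using the table."""
    return [bits32[i - 1] for i in expansion_table()]

table = expansion_table()
print(table[:6])    # first 6-bit block
print(table[-6:])   # last 6-bit block
```

Because half of the input bits appear twice in the output, a change in one input bit can reach two different S-boxes, which contributes to diffusion.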
Once the result of 48 bits has been achieved, the expansion is then XOR’ed with
the round key for the current round, which is represented as “ki.” From here, the
eight 6-bit blocks are fed into what are known as substitution boxes (S-boxes), of
which there are eight in total, and each of which maps its 6-bit input to a 4-bit
output. This is what actually defines the overall cryptographic strength of the
DES. In fact, the S-box is the only nonlinear component that exists in the DES
algorithm.
This nonlinearity can be expressed mathematically as follows:
$$ S(a) \oplus S(b) \neq S(a \oplus b) $$
Interestingly, if this nonlinearity did not exist in the DES, it is quite possible that
it could be hacked into by a cyberattacker. In the end, the 32-bit output that has
been created is permuted in a bit-wise fashion. The mathematical diffusion that is
introduced by the S-box is known technically as the avalanche effect.
◾ In the rounds where i = 1, 2, 9, and 16, the two halves are shifted to the
left by one bit.
◾ In the other rounds, in which i ≠ 1, 2, 9, and 16, the two halves are shifted
to the left by two bits.
These parameters apply to each of the two 28-bit halves. Thus, the total
number of the rotation positions can be mathematically represented as follows:
$$ 4 \cdot 1 + 12 \cdot 2 = 28 $$
or, equivalently, each 28-bit half is back in its starting position after round 16:
$$ C_{16} = C_0, \quad D_{16} = D_0 $$
This mathematical property is actually needed for the decryption process of the
DES algorithm to occur.
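This property is easy to confirm with a few lines of Python. The sketch below models a 28-bit half as a list of position labels and applies the per-round left rotations; the names are chosen for illustration.

```python
# One-bit rotation in rounds 1, 2, 9, and 16; two-bit rotation otherwise.
SHIFTS = [1 if i in (1, 2, 9, 16) else 2 for i in range(1, 17)]

def rotate_left(half: list, n: int) -> list:
    return half[n:] + half[:n]

c0 = list(range(28))          # stand-in for the 28 bits of one key half
c = c0
for shift in SHIFTS:
    c = rotate_left(c, shift)

print(sum(SHIFTS))            # total number of rotation positions
print(c == c0)                # the half is back where it started
```

Since the half returns to its starting position, the round keys can be regenerated in reverse order for decryption simply by rotating to the right.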
In order to compute the key of “k15,” the intermediate variables of C15 and
D15 are needed. These can also be mathematically derived from (C16, D16). But it
should be noted that this can only be done via a right shift (by one bit position,
denoted $RS_1$, because round 16 uses a one-bit left shift), and this is
mathematically represented as:
$$ k_{15} = PC\text{-}2(C_{15}, D_{15}) $$
$$ = PC\text{-}2(RS_1(C_{16}), RS_1(D_{16})) $$
$$ = PC\text{-}2(RS_1(C_0), RS_1(D_0)) $$
These rounds keep continuing, via the right-shift approach, until the key of
“k1” is reached.
round 15, etc. This cycle keeps continuing until the reversal of the encryption in
round 1 has been reached. This is mathematically represented as:
where:
$$ L_0^d = R_{16} $$
$$ R_0^d = L_{16} = R_{15} $$
It is the last mathematical equation that is the most crucial. For example, the
same f-function output is XOR’ed twice to $L_{15}$. As a result, both of these
operations cancel each other out, so that $R_1^d = L_{15}$ in the end. Because of
this, we now have proven that the first decryption round reverses or cancels out
the last encryption round. As mentioned previously, this subprocess will keep
going on until all of the remaining 15 decryption rounds have been examined.
Mathematically, this can be represented as follows:
$$ L_i^d = R_{16-i} $$
$$ R_i^d = L_{16-i} $$
where i = 0, 1, ..., 16 is the number of decryption rounds completed.
Finally, once all 16 decryption subprocesses have been completed, the initial
permutation is reversed. The mathematical equation for this is as follows:
$$ IP^{-1}(R_{16}^d, L_{16}^d) = IP^{-1}(L_0, R_0) = IP^{-1}(IP(x)) = x $$
where x = the plaintext that was used as the input initially in the DES algorithm.
◾ The differential cryptanalysis
◾ The linear cryptanalysis
Also, the DES algorithm, in a manner similar to that of the AES algorithm, can
compute any round key or subkey as (Ko, K1, K2, etc.).
◾ Finite fields
◾ Prime fields
◾ Extension fields
◾ Sophisticated addition and subtraction
◾ Sophisticated multiplication
◾ Inversions in GF (2^m)
A field with order ‘m’ only exists if ‘m’ is a prime power, where m = p^n.
In this case, ‘n’ is a positive integer, and ‘p’ is a prime integer. ‘p’ is also
referred to as the characteristic of the finite field.3
In other words, while there are finite fields that consist of 11, 81 (where 81 =
3^4), or 256 elements (where 256 = 2^8), there is no finite field with just 12
elements, because 12 = 2^2 * 3, and 12 is not a prime power.
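The rule that a finite field exists only when its order is a prime power can be tested directly. The helper names in this Python sketch are chosen for illustration.

```python
import math

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

def is_prime_power(m: int) -> bool:
    """True if m = p^n for some prime p and positive integer n."""
    for p in range(2, m + 1):
        if is_prime(p) and m % p == 0:
            # p is the smallest prime factor; strip out every copy of it.
            while m % p == 0:
                m //= p
            return m == 1   # nothing left over means m was a power of p
    return False

for order in (11, 12, 81, 256):
    print(order, is_prime_power(order))
```

The orders 11, 81, and 256 pass the test, whereas 12 fails because a factor of 3 remains after the factors of 2 are removed.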
2. The Prime Fields:
Probably the best examples of the finite field are those where the fields are
of prime order, where n = 1. The elements in this kind of field can
be denoted as the integers 0, 1, ..., p − 1. The two operations that can be conducted
in this field are known as modular integer addition and integer multiplication
with the modulo p. This is based upon the following mathematical theorem:
In other words, if the integer ring Z_m consists of the integers 0, 1, ..., m − 1 with
modular addition and multiplication, and if ‘m’ is a prime
number, then Z_m can also be considered as a finite field. In order to do any
sort of mathematical calculations in the prime field, the following integer
ring rules must be followed:
– Any addition or multiplication operations are carried out with the modulo p,
where the additive inverse of an element a satisfies:
$a + (-a) \equiv 0 \bmod p$
– The multiplicative inverse of any nonzero element a satisfies:
$a \cdot a^{-1} \equiv 1 \bmod p$
It should be noted that another crucial prime field in this instance is that
of GF(2) = {0, 1}, where any mathematical calculations are done with the
modulo 2.
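As a hedged illustration of the two inverse rules for a prime field, the sketch below uses the small prime 7, which is chosen only for readability and does not come from the text. The helper names are assumptions, and the three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
P = 7  # small prime modulus, chosen only for illustration


def additive_inverse(a, p=P):
    """Return -a in GF(p), i.e. the element that sums with a to 0 mod p."""
    return (-a) % p


def multiplicative_inverse(a, p=P):
    """Return a^-1 in GF(p), i.e. the element that multiplies with a to 1 mod p."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, -1, p)  # modular inverse (Python 3.8+)


a = 3
print((a + additive_inverse(a)) % P)        # additive rule: result is 0
print((a * multiplicative_inverse(a)) % P)  # multiplicative rule: result is 1
```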
3. The Extension Fields:
In the AES algorithm, the finite field consists of 256 elements, and
this can be represented as “GF (2^8).” In this, each of the elements are
In this instance, 256 distinct polynomial groupings are possible, and can be
stored as an 8-bit vector as follows:
5. Sophisticated Multiplication:
Multiplication in the AES algorithm is used by the MixColumn transformation
and is carried out on polynomials in a finite
field. The product of two polynomials A(x) and B(x) is mathematically represented as:
$C'(x) = A(x) \cdot B(x) = c'_{2m-2} x^{2m-2} + \cdots + c'_1 x + c'_0$
where:
$c'_0 = a_0 b_0 \bmod 2$
$c'_1 = a_0 b_1 + a_1 b_0 \bmod 2$
$\vdots$
$c'_{2m-2} = a_{m-1} b_{m-1} \bmod 2$
It is also important to note that the product is reduced with what is
known as an irreducible polynomial. The one used by the AES algorithm is mathematically represented as
follows:
$P(x) = x^8 + x^4 + x^3 + x + 1$
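A minimal sketch of this multiplication, assuming each element of GF(2^8) is stored as an integer whose bits are the polynomial coefficients. The function name `gf256_mul` is an illustrative assumption; the constant 0x11B is simply P(x) written as a bit pattern. The multiply and the reduction are interleaved rather than done in two separate passes.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1 as a bit pattern


def gf256_mul(a, b):
    """Multiply two elements of GF(2^8) modulo the AES polynomial P(x)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition of coefficients mod 2 is XOR
        a <<= 1                # multiply a by x
        if a & 0x100:
            a ^= AES_POLY      # degree reached 8: reduce modulo P(x)
        b >>= 1
    return result


# Example pair taken from the AES specification (FIPS 197).
print(hex(gf256_mul(0x57, 0x83)))
```

Because coefficient addition is carried out modulo 2, no carries occur, which is why XOR is sufficient.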
6. Inversions in GF (2^m):
This is considered a crucial property in the AES algorithm. This can also be
mathematically represented as follows:
It should be noted at this point that the AES algorithm is actually a byte-oriented
cipher, unlike the DES algorithm, which makes extensive usage of bit
permutations, thus giving it a bit-oriented structure.
These will now be reviewed in further detail.
S(Ai) = Bi
where state byte Ai is replaced by another state byte, Bi. It should be noted
that the S-box is the only component in the AES algorithm that is not linear
in nature. For example:
The technical mapping of the S-box substitution (as just described) is done via
bijective mapping. In other words, each of the 2^8 = 256 inputs is matched
up on a one-to-one basis to each output element. This unique feature allows
the S-box to be reversed, which is a crucial component that is needed in order
to decrypt the AES algorithm. Another key component of the AES algorithm
is the GF (2^8) functionality. Because of its nonlinearity properties, it
affords a strong level of protection against cyberthreats that are posed to the
AES algorithm.
2. The Diffusion Layer:
This layer of the AES algorithm consists of two different sublayers, which are
described as follows:
– The ShiftRows Sublayer: Using a cyclical approach, this mechanism
shifts the second row of the state matrix by 3 bytes to the right, the third
row by 2 bytes to the right, and finally, the fourth row by 1 byte
to the right (equivalently, by 1, 2, and 3 bytes to the left). The first row is
not shifted. The ultimate goal of these shifting processes is to further
enhance the diffusion properties of the AES algorithm.
– The MixColumn Sublayer: This is actually a linear-based transfor-
mation which integrates, or mixes, each column of the state matrix.
In this particular instance, each and every input byte has a direct
influence over the four corresponding out bytes. This sublayer is
deemed to be the major diffusion component of the AES algorithm.
In fact, it is the combination of these two sublayers that makes it
possible that, after just three rounds, each and every byte of the
state matrix depends on all 16 plaintext bytes. The MixColumn sublayer can be represented
as follows:
MixColumn (B) = C
where:
B = Input State
C = Output State
Keep in mind that, as discussed in the last subsection, the concept of diffusion
is literally spreading the influence of individual bits
over an entire state. This sublayer does not consist of an S-box that is nonlinear
in nature; rather, it makes use of state matrices.
3. The Key Addition Layer:
This internal part of the AES algorithm takes the input key that was
first derived (these are the 128, 192, and 256 bits as discussed earlier),
and from there, derives the subkeys that are used in the AES algorithm.
The XOR process is utilized here, and it can also be referred to as “key
whitening.” In this situation, the total number of subkeys is equal to the
total number of rounds plus one.
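The ShiftRows step described above can be sketched in a few lines. This is an illustrative assumption about data layout rather than the book's own code: the state is held as a list of four rows of four bytes, and the function name `shift_rows` is hypothetical.

```python
def shift_rows(state):
    """Cyclically shift rows 2, 3, 4 of a 4x4 state by 3, 2, 1 bytes to the right."""
    shifts = [0, 3, 2, 1]  # the first row is not shifted
    # row[-s:] + row[:-s] rotates a row s positions to the right (s = 0 leaves it as is).
    return [row[-s:] + row[:-s] for row, s in zip(state, shifts)]


state = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
for row in shift_rows(state):
    print(row)
```

In a four-byte row, a right rotation by 3 is the same as a left rotation by 1, which is why ShiftRows is often described in terms of left shifts of 1, 2, and 3 bytes.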
The total number of rounds for each key size is as follows:
– 128 Bits: 10 rounds are needed (Nr = 10) and 11 subkeys are generated,
each of which has a length of 128 bits.
– 192 Bits: 12 rounds are needed (Nr = 12) and 13 subkeys are generated,
each of which has a length of 128 bits.
– 256 Bits: 14 rounds are needed (Nr = 14) and 15 subkeys are generated,
each of which has a length of 128 bits.
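The relationship between key size, rounds, and subkeys for standard AES can be sketched as follows. The rule Nr = (key length in 32-bit words) + 6 is taken from the AES specification (FIPS 197) rather than from the text above, and the function name is an illustrative assumption.

```python
def aes_parameters(key_bits):
    """Return (rounds, subkeys, words) for an AES key of 128, 192, or 256 bits."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits long")
    key_words = key_bits // 32   # one word is 32 bits
    rounds = key_words + 6       # Nr as defined in FIPS 197
    subkeys = rounds + 1         # one extra subkey for the initial key addition
    words = 4 * subkeys          # each 128-bit subkey occupies four words
    return rounds, subkeys, words


for bits in (128, 192, 256):
    print(bits, aes_parameters(bits))
```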
These subkeys are produced recursively, and it is important to note that the
AES key schedule is what is known as “word-oriented.” In this case, one word
is equal to 32 bits. There are different schedules for the 128-bit, 192-bit, and
256-bit keys, which are as follows:
– The Key Schedule for the 128-Bit Key:
The 11 subkeys are stored in a key expansion array, with the individual
elements represented as follows:
W[0], ..., W[43]
The first subkey (k0) is actually the original AES algorithm key, and this
is then transferred to the first four elements of the key array (W). The
other elements are calculated as follows, based upon this mathematical
formula:
These representations keep going until all eight iterations are completed.
– The Key Schedule for the 256-Bit Key:
This particular schedule has 15 subkeys, each 128 bits long, which are derived from the
256-bit key. It consists of seven distinct iterations, which are calculated in the same fashion
as just described. There are also seven round coefficients, which are
represented as follows:
RC[1], ..., RC[7]
Also, the order of the subkeys is reversed. Because the last round of encryption in
the AES algorithm does not calculate the MixColumn layer, the first decryption
round does not contain the corresponding inverse layer. All of the other rounds contain
all of the AES layers.
the four input bytes (C4, C5, C6, C7) in a repetitive fashion. The constant
values in the state matrix are in a hexadecimal format, which is mathematically
represented as follows:
$A_i = S^{-1}(B_i) = S^{-1}[S(A_i)]$
The second step, the inverse in the Galois field, is also calculated, with the following
formula:
$A_i = (B'_i)^{-1} \in GF(2^8)$
plaintext into the ciphertext, and the private key is used to decrypt the ciphertext
back into a format that is comprehensible and easy to understand.
Thus in this regard, it is the sending party that uses the public key and the
receiving party that uses the private key. Because of this, the asymmetric cryptography
approach offers many more layers of security when compared to the
symmetric cryptography approach. But the primary disadvantage here is that by
using two separate keys, this process requires much more computational and processing
power.
Asymmetric cryptography is also known as “public key cryptography,” and it
can be defined as:
1. Key Establishment:
Numerous cryptographic algorithms can be used in this regard, such as
the Diffie-Hellman Key Exchange (DHKE) and the RSA Key Transport
Protocols. These algorithms are very robust in nature, and are used quite
often in this kind of infrastructure.
2. Nonrepudiation:
Using asymmetric cryptography (or public key cryptography) provides
very strong levels of ciphertext integrity by using the previously mentioned
algorithms.
3. Identification:
The authenticity of both the sending and receiving parties can be confirmed
easily by making use of a challenge–response system along with digital
signatures.
This is where other security protocols are used in conjunction with the earlier
mentioned protocols as well, such as SSL/TLS and IPSec, as reviewed extensively
in Chapter 1.
It should be noted that another disadvantage of asymmetric cryptography is
that it requires the use of very long keys. While this affords very strong layers
of protection, such long keys can actually greatly slow down both the encryption
and the decryption processes. Apart from the two cryptographic algorithms just
described, there are other important cryptographic algorithms that need to be
reviewed as well, as follows:
Each of these can be used to further enhance the mechanisms of the public key
and private key establishment and provide nonrepudiation via the use of digital
signatures. The following table depicts the recommended bit key lengths for all of
the asymmetric cryptography algorithms just discussed:3
Algorithm Family        Cryptosystems       Security Level
                                            80 bit      128 bit     192 bit     256 bit
Integer Factorization   RSA                 1024 bit    3072 bit    7680 bit    15360 bit
Discrete Logarithm      DH, DSA, Elgamal    1024 bit    3072 bit    7680 bit    15360 bit
Elliptic Curves         ECDH, ECDSA         160 bit     256 bit     384 bit     512 bit
Symmetric Key           AES, 3DES           80 bit      128 bit     192 bit     256 bit
Based upon this table, the computational complexity of these algorithms grows roughly with the
cube of the bit length. With the RSA algorithm, if the bit length is increased
from 1024 bits to 3072 bits, this will result in processing that is 3^3 (which
equals 27) times slower than before. As one can see, this can cause a huge constraint on
both the web application and the server that it resides upon.
◾ The Euclidean algorithm
◾ The extended Euclidean algorithm
◾ The Euler Phi function
◾ Fermat’s little theorem
1. The Euclidean Algorithm:
This algorithm calculates what is known as the greatest
common divisor (GCD). It is represented by the following formula:
$\gcd(r_0, r_1) = \cdots = \gcd(r_l, 0) = r_l$
where the variables “s” and “t” represent the integer coefficients. This is
also often referred to as the Diophantine equation. In turn, the variables
“s” and “t” are computed by the following mathematical formula:
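The relationship between the coefficients and the GCD is usually written as gcd(r0, r1) = s * r0 + t * r1. A minimal sketch of the extended Euclidean algorithm that produces “s” and “t” follows; the function and variable names are illustrative assumptions and not the book's notation.

```python
def extended_gcd(r0, r1):
    """Return (g, s, t) such that g == gcd(r0, r1) == s * r0 + t * r1."""
    s0, s1 = 1, 0
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        # Standard Euclidean step on the remainders ...
        r0, r1 = r1, r0 - q * r1
        # ... mirrored on the coefficients so the invariant r = s*a + t*b holds.
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return r0, s0, t0


g, s, t = extended_gcd(973, 301)
print(g, s, t, s * 973 + t * 301)
```

When the GCD is 1, the coefficient “t” (reduced modulo r0) is the modular inverse of r1, which is how the private exponent in RSA can be obtained.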
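$\Phi(m)$
$\Phi(m) = \prod_{i=1}^{n} \left( p_i^{e_i} - p_i^{e_i - 1} \right)$
where $m = p_1^{e_1} \cdot p_2^{e_2} \cdots p_n^{e_n}$ is the factorization of m into distinct primes.
This factorization process is very important for the RSA algorithm, which
facilitates the decryption of the ciphertext.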
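As a sketch of how the Euler Phi function follows from the prime factorization, the code below factors m by trial division and multiplies the terms p^e − p^(e−1). The function name `euler_phi` is an illustrative assumption, and trial division is only practical for small inputs.

```python
def euler_phi(m):
    """Compute Euler's Phi function from the prime factorization of m."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            e = 0
            while m % p == 0:   # strip the prime p and count its exponent e
                m //= p
                e += 1
            result *= p**e - p**(e - 1)
        p += 1
    if m > 1:                   # one prime factor with exponent 1 is left over
        result *= m - 1
    return result


print(euler_phi(240))  # 240 = 2^4 * 3 * 5
```

For large RSA moduli this computation is only feasible when the factors are already known, which is exactly the knowledge the key owner has and the attacker lacks.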
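4. Fermat’s Little Theorem:
This theorem is useful for primality testing and for many other computations in a public key
cryptography infrastructure. For an integer a and a prime p, it is represented as follows:
$a^p \equiv a \pmod{p}$
and, when a is not divisible by p:
$a^{p-1} \equiv 1 \pmod{p}$
This theorem can be further extended into what is known as Euler’s theorem,
which, for integers a and m with gcd(a, m) = 1, is mathematically described as follows:
$a^{\Phi(m)} \equiv 1 \pmod{m}$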
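Both theorems are easy to check numerically with Python's built-in modular exponentiation. This is a hedged sketch: the helper names and the small values are illustrative assumptions, and the value of Phi(m) is passed in rather than computed.

```python
from math import gcd


def fermat_holds(a, p):
    """Check a^(p-1) = 1 (mod p) and a^p = a (mod p) for a prime p not dividing a."""
    return pow(a, p - 1, p) == 1 and pow(a, p, p) == a % p


def euler_holds(a, m, phi_m):
    """Check a^Phi(m) = 1 (mod m); only meaningful when gcd(a, m) == 1."""
    return gcd(a, m) == 1 and pow(a, phi_m, m) == 1


print(fermat_holds(2, 7))      # p = 7
print(euler_holds(5, 12, 4))   # m = 12, Phi(12) = 4

# A useful consequence: a^(p-2) is the multiplicative inverse of a modulo p.
print((2 * pow(2, 7 - 2, 7)) % 7)
```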
upon, primarily because of the robust security features that it possesses. It is used
primarily to encrypt data while it is in transit (for example, from the device of the
end user to the web application server and vice versa) and for creating digital
signatures.
It is important to note at this point that the RSA algorithm has not been
designed to replace symmetric-based ciphers; rather, its main purpose is to
securely exchange the keys that are used by a symmetric-based cipher.
In this particular instance, the RSA algorithm is often used in association
with the AES algorithm, which serves as the symmetric-based cipher.
The mathematical premise behind the RSA algorithm is integer factorization. For
example, multiplying two very large prime numbers and obtaining their product is
quite easy to compute. But factoring this product is very difficult to do, which gives
the RSA algorithm one of its key strengths. In fact, Euler’s theorem and Euler’s Phi
function (which were reviewed in the last subsection) are also quite heavily used
with the RSA algorithm. The encryption process in the RSA algorithm is mathematically
represented as follows:
$y = e_{k_{pub}}(x) \equiv x^e \bmod n$
where:
Kpub = the public key (n,e)
X = the plaintext
The decryption process is mathematically represented as follows:
$x = d_{k_{pr}}(y) \equiv y^d \bmod n$
where:
Kpr = the private key (d)
Y = the ciphertext
It should be noted at this point that x, y, n, and d are extremely large numbers,
about 1024 bits in total length, and even larger. The variable “e” is also referred
to as the encryption or public exponent, and the variable “d” is also referred to as
the decryption or private exponent. Further, the general requirements for the RSA
algorithm are as follows:
◾ Since a cyberattacker potentially has access to the public key, it must be infeasible
to compute the private key given the values of “e” and “n.”
◾ One cannot encrypt more than “l” bits of plaintext at a time, where “l” is
the bit length of “n.”
◾ x^e mod n must be easy to calculate for the purposes of encryption, as well as
y^d mod n for the purposes of decryption.
◾ For any given value of “n,” there should be many possible public key and private
key pairs in order to avoid what is known as a brute-force attack.
◾ First, two extremely large prime numbers are chosen, denoted by the variables of
“p” and “q.”
◾ Second, the computation where n = p*q is done.
◾ Third, Φ(n) is computed by (p-1)*(q-1).
◾ Fourth, the public exponent “e” is selected, where e ∈ {1, 2, ..., Φ(n) − 1}, so
that the GCD condition holds as follows:
$\gcd(e, \Phi(n)) = 1$
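The key generation steps can be sketched with deliberately tiny numbers. The final step, computing the private exponent “d” so that d * e ≡ 1 mod Φ(n), is not shown in the list above and is assumed here from the standard description of RSA. The function name and the toy values (p = 3, q = 11, e = 3) are illustrative only and offer no security.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Toy RSA key generation from two given primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi and gcd(e, phi) == 1):
        raise ValueError("e must satisfy gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)          # private exponent: d * e = 1 mod phi(n) (Python 3.8+)
    return (n, e), d


k_pub, d = rsa_keygen(3, 11, 3)
n, e = k_pub

x = 4                            # plaintext, must be smaller than n
y = pow(x, e, n)                 # encryption: y = x^e mod n
recovered = pow(y, d, n)         # decryption: x = y^d mod n
print(k_pub, d, y, recovered)
```

Real deployments use primes of 1024 bits or more together with a padding scheme; this sketch only shows that decryption inverts encryption.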
where:
SQ = the Squaring Component
MUL = the Multiplication Component
The values for the exponents noted in this equation are typically chosen in the
bit range of 1024 to 3072, and where it is needed, even larger than this. This is also
known as the square-and-multiply algorithm. One of the key advantages to this
approach is that it provides the means to compute “x^H” by using only squarings
and multiplications by “x.”
Also, this algorithm works by literally examining the bits of the exponent from the
left to the right. In each and every iteration, the current result is
squared. If, and only if, the exponent bit that is being examined has a numerical value
of 1, the current result is then also multiplied by “x,” following
the squaring computation.
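A minimal sketch of the left-to-right square-and-multiply procedure just described follows. The function name is an illustrative assumption; the operation counters are included only to make the cost of different exponents visible.

```python
def square_and_multiply(x, h, n):
    """Compute x^h mod n by scanning the bits of h from left to right (h >= 1).

    Returns (result, number_of_squarings, number_of_multiplications).
    """
    if h < 1:
        raise ValueError("exponent must be at least 1")
    bits = bin(h)[2:]            # binary string; the leftmost bit is always 1
    result = x % n               # accounts for the leading 1 bit
    squarings = multiplications = 0
    for bit in bits[1:]:
        result = (result * result) % n   # always square
        squarings += 1
        if bit == "1":                   # multiply only when the bit is 1
            result = (result * x) % n
            multiplications += 1
    return result, squarings, multiplications


for e in (3, 17, 2**16 + 1):
    print(e, square_and_multiply(5, e, 1000003))
```

Exponents with few 1 bits need few multiplications, which is why short public exponents with only two 1 bits are popular.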
Public Key e    e as Binary String          #MUL + #SQ
3               11 (2)                      2
17              10001 (2)                   5
2^16 + 1        10000000000000001 (2)       17
It is important to note at this point that exponentiation with a short public key exponent
in the RSA algorithm is considered to be mathematically a very fast
process, which requires far less computational and processing
power than exponentiation with a full-length exponent. But even with this technique, the
RSA algorithm still slows down when the private key with a value of
“d” is utilized for decryption purposes, because “d” remains a full-length exponent.
$x_p \equiv x \bmod p$
$x_q \equiv x \bmod q$
$y_p = x_p^{d_p} \bmod p$
$y_q = x_q^{d_q} \bmod q$
$d_p \equiv d \bmod (p-1)$
$d_q \equiv d \bmod (q-1)$
Once this has been done, the values of “dp” and “dq” are bounded by
the values of “p” and “q.” This is also true of the values of “yp” and “yq.”
3. The Inverse Transformation
The very last step to be done is taking the values of “yp” and “yq” and
converting them back into a single value modulo “N.” This is mathematically done as:
N = p*q
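The three steps of the acceleration technique (reduce, exponentiate modulo each prime, recombine) can be sketched as follows. The recombination formula with the coefficients c_p = q^(-1) mod p and c_q = p^(-1) mod q is the standard Chinese Remainder Theorem form and is assumed here, since the formula itself does not appear above. The function name and toy values are illustrative.

```python
def rsa_crt_power(x, d, p, q):
    """Compute x^d mod (p*q) using the Chinese Remainder Theorem."""
    n = p * q
    # Step 1: transform the input into the two smaller domains.
    x_p, x_q = x % p, x % q
    # Step 2: exponentiate with the reduced exponents.
    d_p, d_q = d % (p - 1), d % (q - 1)
    y_p, y_q = pow(x_p, d_p, p), pow(x_q, d_q, q)
    # Step 3: inverse transformation back to the modulus n.
    c_p = pow(q, -1, p)          # q^-1 mod p (Python 3.8+)
    c_q = pow(p, -1, q)          # p^-1 mod q
    return (q * c_p * y_p + p * c_q * y_q) % n


p, q, d = 11, 13, 103            # toy parameters; d matches e = 7
print(rsa_crt_power(15, d, p, q), pow(15, d, p * q))
```

The two printed values agree; the benefit is that both exponentiations work with numbers half the length of the modulus.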
The prime numbers that are used in this calculation must contain half
of the bit length of “N.” For instance, if we assign the bit value of 1024 to
“N,” then the values of “p” and “q” should each be about 512 bits in length.
In order to pick any large prime number, the general approach is to generate
these integers at random, which is done by a tool known as the random number
generator (RNG). The RNG must be nonpredictable in nature, because
if a cyberattacker can guess one of these two large prime numbers, the RSA
algorithm can then be broken quite easily. Further, two questions must
also be answered:
◾ How many random integers need to be generated and tested before it is deter-
mined that an integer is actually prime or not?
◾ How quickly can this process actually be achieved?
The answer to the first question is actually dependent upon the probability laws
of statistics. In this case, the prime number theorem is used. For a randomly chosen odd integer p, this is mathematically
defined as follows:
$P(p \text{ is prime}) \approx \frac{2}{\ln(p)}$
FOR i = 1 TO x
    Choose random a ∈ {2, 3, ..., p-2}
    IF a^(p-1) ≠ 1 mod p
        RETURN ("p is a composite number")
RETURN ("p is likely prime")
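The pseudocode above translates almost line by line into Python. In this hedged sketch, the function name and the default number of rounds are illustrative assumptions, and the `random` module is used only for demonstration (a cryptographic setting would call for a secure random source).

```python
import random


def fermat_test(p, rounds=20):
    """Fermat primality test: False means composite, True means 'likely prime'."""
    if p < 5:
        return p in (2, 3)
    for _ in range(rounds):
        a = random.randint(2, p - 2)
        if pow(a, p - 1, p) != 1:    # Fermat's little theorem is violated
            return False
    return True


print(fermat_test(97), fermat_test(221))

# 561 = 3 * 11 * 17 is composite, yet it satisfies the Fermat condition
# for every base that shares no factor with it, for example a = 2.
print(pow(2, 560, 561))
```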
One of the advantages of this kind of test is that it can be used for all
large candidate numbers that need to be tested. For example, if a^(p-1) ≠ 1 mod p,
then p is deemed not to be a prime number. But the disadvantage is that
this process does not work in the reverse fashion. In other words, there
could be other numbers that are not detected which could very well be
FOR i = 1 TO s
    Choose random a ∈ {2, 3, ..., p-2}
    z = a^r mod p
    IF z ≠ 1 AND z ≠ p-1
        j = 1
        WHILE j ≤ u-1 AND z ≠ p-1
            z = z^2 mod p
            IF z = 1
                RETURN (“p is a composite number”)
            j = j + 1
        IF z ≠ p-1
            RETURN (“p is a composite number”)
RETURN (“p is likely a prime number”)
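A runnable sketch of the Miller-Rabin test follows. It assumes the usual decomposition p − 1 = 2^u * r with r odd, which is where the variables “u” and “r” in the pseudocode come from. The function name, the default number of rounds, and the use of the `random` module are illustrative assumptions.

```python
import random


def miller_rabin(p, s=20):
    """Miller-Rabin primality test: False means composite, True means 'likely prime'."""
    if p < 5:
        return p in (2, 3)
    if p % 2 == 0:
        return False
    # Write p - 1 as 2^u * r with r odd.
    u, r = 0, p - 1
    while r % 2 == 0:
        r //= 2
        u += 1
    for _ in range(s):
        a = random.randint(2, p - 2)
        z = pow(a, r, p)
        if z != 1 and z != p - 1:
            for _ in range(u - 1):
                z = pow(z, 2, p)
                if z == p - 1:
                    break            # this base does not prove compositeness
            else:
                return False         # never reached p - 1: p is composite
    return True


print([n for n in range(2, 60) if miller_rabin(n)])
print(miller_rabin(561))  # the Carmichael number that fools the Fermat test
```

Unlike the Fermat test, there are no composite numbers for which most bases are misleading, so repeating the test quickly drives the error probability down.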
1. It is deterministic in nature:
A specific piece of plaintext message is always mapped to the same piece of
associated ciphertext as long as the same key is used. As a result, it is likely that the
cyberattacker will be able to derive certain relationships between the plaintext and
the ciphertext, and from there, launch the cyberattack.
2. The inverse numerical values:
In this instance, the plaintext values x=0, x=1, and x= −1 produce
ciphertext values that are equal to 0, 1, and −1, respectively.
3. Small values:
If, for some reason, small numerical values are used in the RSA algorithm, it
could prove vulnerable to a cyberattack. That is why very large prime
integers are used.
4. Malleability:
This occurs when a cyberattacker can take a ciphertext and transform it into
another variant of a ciphertext, which in turn leads to a known transformation of the plaintext.
Mathematically, this is represented as:
k − |M| − 2|H| − 2
2. Concatenation then occurs, where a single byte is created with a known
hexadecimal value of 0x01, and a data block, known as “DB,” is created with a
length of k − |H| − 1 bytes. This is represented as:
DB = Hash(L) || PS || 0x01 || M
EM = 0x00 || maskedSeed || maskedDB
◾ Protocol attacks
◾ Mathematical attacks
◾ Side-channel attacks
1. Protocol Attacks:
These kinds of cyberattacks exploit both the known and the unknown
vulnerabilities that are in the RSA algorithm. In this regard, the most
common threat vehicles are those that target its malleability. But the concept
of padding (reviewed in the last subsection) can be used to help
mitigate the risk of this kind of cyberattack from happening.
2. Mathematical Attacks:
The main type of cyberattack that occurs here is factoring the modulus.
If the cyberattacker succeeds in factoring “n” into “p” and “q,” the value of Φ(n) can be calculated
easily, and the plaintext can be recovered. It can be done in the following three-step
mathematical process:
$\Phi(n) = (p-1) \cdot (q-1)$
$d^{-1} \equiv e \bmod \Phi(n)$
$x \equiv y^d \bmod n$
In order to avoid this kind of cyberattack, the value of the modulus must
be a very large integer. The value here should be at least 1024 bits,
and is typically in the range of 2048 bits to 4096 bits.
3. Side-Channel Attacks:
These types of cyberattacks further exploit the weaknesses and the vulnerabilities
that are found in the private key. This is typically accomplished
using physical-based channels in the asymmetric (public key cryptography)
infrastructure. One mathematical method that is used to avoid this
is to execute a multiplication with various kinds of dummy variables that
are associated with an exponent bit value of “0.”
1. A prime number, known as “p,” is generated where 2^1023 < p < 2^1024.
2. A prime divisor, known as “q,” is then found of p-1 where 2^159 < q < 2^160.
3. An element “A” is then found where ORD(A) = q.
4. A random number of “d” is then selected where 0 < d < q.
5. A value known as “B” is then calculated where B = A^d mod p.
6. The generated public key and private key are now as follows:
Kpub = (p, q, A, B)
Kpr = (d)
The basis of the DSA is that two cyclic mathematical groups are involved.
One of these is the larger group that is represented as “Zp*,” the order of which has a
length of 1024 bits. The second grouping is actually a subgroup of “Zp*,” the order of which is
160 bits long.
$r \equiv (A^{k_E} \bmod p) \bmod q$
$s \equiv (SHA(x) + d \cdot r) \cdot k_E^{-1} \bmod q$
The two digital signature values (r, s), once computed in the earlier process, are then
verified by making use of this methodology:
$w \equiv s^{-1} \bmod q$
$u_1 \equiv w \cdot SHA(x) \bmod q$
$u_2 \equiv w \cdot r \bmod q$
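The signing and verification formulas can be exercised with toy parameters. In the sketch below, the small group (p = 59, q = 29, A = 3), the private key d = 7, and the integer 26 standing in for SHA(x) are illustrative assumptions. The final comparison v = (A^u1 * B^u2 mod p) mod q against r is the standard last step of DSA verification and is assumed here, because it is not shown above.

```python
p, q, A = 59, 29, 3      # toy domain parameters: q divides p - 1 and ord(A) = q
d = 7                    # private key, 0 < d < q
B = pow(A, d, p)         # public key component B = A^d mod p


def dsa_sign(h, k_e):
    """Sign the hash value h with the ephemeral key k_e (0 < k_e < q)."""
    r = pow(A, k_e, p) % q
    s = ((h + d * r) * pow(k_e, -1, q)) % q   # modular inverse needs Python 3.8+
    return r, s


def dsa_verify(h, r, s):
    """Return True if (r, s) is a valid signature for the hash value h."""
    w = pow(s, -1, q)
    u1 = (w * h) % q
    u2 = (w * r) % q
    v = (pow(A, u1, p) * pow(B, u2, p) % p) % q
    return v == r


h = 26                   # stand-in for SHA(x)
r, s = dsa_sign(h, k_e=10)
print((r, s), dsa_verify(h, r, s), dsa_verify(h + 1, r, s))
```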
Find a prime number denoted as "q" where 2^159 < q < 2^160
FOR i = 1 TO 4096
    Generate a random integer known as "M" with 2^1023 < M < 2^1024
    Mr = M mod 2q
    p-1 = M - Mr
    IF the value "p" is prime
        RETURN (p,q)
    i = i+1
GOTO the first step
$d = \log_A B \bmod p$
Finally, in order to help mitigate these two types of cyberattacks, the NIST
recommends the following bit lengths for the DSA 3:
It should be noted that the 1024-bit length provides a reasonable layer of security,
whereas the 2048- and the 3072-bit lengths provide the maximum level of
protection for the DSA. Also, when it comes to using the DSA, a new,
randomly generated ephemeral key must be used each and every time that a signature is created.
◾ The bit lengths for the ECDSA algorithm are typically in the range of 160 to
256 bits; this is the equivalent to the 1024 to 3072 bits in the RSA algorithm.
Thus, the same level of security is provided with much shorter keys.
◾ Because of these shorter bit lengths, the ECDSA algorithm results in greatly reduced
processing times, as well as a reduced need for computational resources.
The Generation of the Public Key and the Private Key Using
the ECDSA Algorithm
The mathematical framework is actually embedded in the prime number field
known as “Zp” and the Galois field represented as “GF (2^m).” The Galois field
was reviewed in detail earlier in this chapter. The process for generating both the
public key and the private key is as follows:
Through this process, the public key and private key are now created, which are
as follows, respectively:
Kpub = (p, a, b, q, A, B)
Kpr = (d)
The public key and private key will thus be 160 bits, which is the minimum
level of security that is accepted by the ECDSA algorithm.
$s \equiv (h(x) + d \cdot r) \cdot k_E^{-1} \bmod q$
$w \equiv s^{-1} \bmod q$
$u_1 \equiv w \cdot h(x) \bmod q$
$u_2 \equiv w \cdot r \bmod q$
The following matrix shows the recommended bit lengths for the ECDSA algorithm,
as well as its level of security:3
Because of these limitations, the best solution is to create just one short digital
signature for a message of any length. This is where hashing comes into play. It
can be specifically defined as follows:
Hashing is the transformation of a string of characters into a usually shorter
fixed-length value or key that represents the original string. Hashing is used to
index and retrieve items in a database because it is faster to find the item using the
shorter hashed key than to find it using the original value. It is also used in many
encryption algorithms.5
Some important characteristics of hashing include the following:
◾ Preimage resistance
◾ Another type of preimage resistance (second preimage resistance)
◾ Collision resistance
1. Preimage Resistance:
This is very often referred to as one-wayness. This procedure is computed
as follows:
$[e_k(x), sig_{k_{pr},B}(z)]$
At this point, if the hash function is not one-way in nature, the plaintext
message can be very easily computed by the cyberattacker. But if it is
one-way, the plaintext message cannot be recovered from its hash
value.
2. Another Type of Preimage Resistance:
This is also referred to as weak collision resistance. In other words, if
two digital signatures are present, it is imperative that the corresponding
plaintext messages do not compute to the same hash function value.
In terms of mathematics, for two unique plaintext messages
where x1 ≠ x2, it should not be feasible to find equal hash function values
where:
$z_1 = h(x_1) = h(x_2) = z_2$
Given this example of the birthday paradox, the statistical probability for
a hash function to produce no colliding values is calculated as follows:
It is also important to note that the typical output length for a hash function
is 128 bits, if not longer. The following matrix depicts the various
hash function output lengths that are needed in order to avoid any sort of collision:
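A hedged numerical sketch of the birthday calculation is given below. It assumes the standard product form, P(no collision among t inputs) = (1 − 1/N)(1 − 2/N)...(1 − (t−1)/N) for N equally likely outputs, since the formula itself does not appear in the text. The function name is illustrative.

```python
def prob_no_collision(t, n_outputs):
    """Probability that t random inputs map to t distinct values out of n_outputs."""
    prob = 1.0
    for i in range(1, t):
        prob *= 1 - i / n_outputs
    return prob


# Classic birthday setting: 23 people and 365 possible birthdays.
print(1 - prob_no_collision(23, 365))

# The same effect for a (far too short) 32-bit hash output:
# about 2^16 inputs already give a sizeable collision probability.
print(1 - prob_no_collision(2**16, 2**32))
```

The second case shows why the number of inputs needed for a collision is on the order of the square root of the number of possible outputs, which is the reason hash outputs must be roughly twice as long as the desired security level.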
Hash Output Lengths
A 128 Bit 160 Bit 256 Bit 384 Bit 512 Bit
An important concept that should be noted here at this point is the Merkle-Damgård
construction. This is when a hash function processes an input message of arbitrary length “x,” and
from there, creates a fixed-length output. This is very often done by segmenting the
input into a sequence of blocks that are of equal size and are processed sequentially.
In this scenario, the value of the hash function can be thought of as the output of the last
iteration of a compression function.
should not be used with some of the other encryption algorithms, most
notably the AES algorithm, as reviewed earlier in this chapter. The primary
reason for this is that this specific algorithm has a security level in the range
of 128 to 256 bits.
The NIST also came out with three more variants of the SHA-1, and these
are as follows:
– SHA-256
– SHA-384
– SHA-512
Their hash function digest lengths are 256, 384, and 512 bits, respectively. These
three variants are grouped into another family known as SHA-2.
The following matrix illustrates the main parameters of all of these hash
functions just described:3
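The digest lengths listed above can be confirmed with Python's standard `hashlib` module. This is a small sketch; the helper name `digest_bits` and the sample message are illustrative assumptions.

```python
import hashlib


def digest_bits(name):
    """Return the output length in bits of the named hash function."""
    return hashlib.new(name).digest_size * 8


for name in ("sha1", "sha256", "sha384", "sha512"):
    print(name, digest_bits(name))

# The output length is fixed regardless of the size of the input.
print(len(hashlib.sha256(b"a short message").hexdigest()) * 4)
```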
Algorithm    Output    Input    Rounds    # of Collisions Detected
One of the techniques that are used to create the hash function from a block cipher
is mathematically demonstrated as follows:
$H_i = e_{g(H_{i-1})}(x_i) \oplus x_i$
This is known as the Matyas-Meyer-Oseas hash function. There are other
techniques as well, which are as follows:
$H_i = H_{i-1} \oplus e_{x_i}(H_{i-1})$
All of these hash function techniques just described need to have initial values
that are assigned to the variable “H0.” The commonality between all of these hash
functions is that their bit size is equivalent to the block width of the cipher that
it is associated with. These techniques can also be used to create even larger hash
function message digests, which produce an output length that is twice the
block length “b” of the cipher.
In this instance just described, the “Hirose construction” technique can be used.
This consists of a 128-bit hash function output and a divided block size of 64 bits.
◾ Preprocessing
◾ The Hash computation
1. Preprocessing:
Preprocessing uses the concept of padding, which was described in detail
earlier in this chapter. In order to process the message in 512-bit
pieces, a single “1” bit and “k” zero bits are appended, where “k” is computed as follows:
$k \equiv 512 - 64 - 1 - l$
$\;\; = 448 - (l + 1) \bmod 512$
where:
X = the ciphertext message
L = the bit length of X (which is appended as a 64-bit binary representation)
After this has been accomplished, each 512-bit piece is then divided with
the following computation, before the compression functionality is actually
applied:
2    20 ... 39    K2 = 6ED9EBA1    f2(B,C,D) = B ⊕ C ⊕ D
4    60 ... 79    K4 = CA62C1D6    f4(B,C,D) = B ⊕ C ⊕ D
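The padding rule can be sketched for byte-oriented messages as follows. This is an illustrative sketch of the preprocessing step only, not a full SHA-1 implementation; the function name is an assumption, and the message length is assumed to be a whole number of bytes.

```python
def sha1_pad(message: bytes) -> bytes:
    """Pad a message to a multiple of 512 bits as in SHA-1 preprocessing."""
    l = len(message) * 8                 # message length in bits
    k = (448 - (l + 1)) % 512            # number of zero bits to append
    # A single '1' bit followed by k '0' bits. Because l is a multiple of 8,
    # k + 1 is too, so the bits fit exactly into (k + 1) // 8 bytes.
    padding = b"\x80" + b"\x00" * ((k + 1) // 8 - 1)
    # Finally append the 64-bit binary representation of l.
    return message + padding + l.to_bytes(8, "big")


padded = sha1_pad(b"abc")
print(len(padded) * 8, padded[-8:].hex())
```

A message whose length already leaves fewer than 65 free bits in its last block is extended by a complete additional block.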
$y_A = e_{k_A}(k_{ses})$
$y_B = e_{k_B}(k_{ses})$
It is important to note that the two ciphertext messages are both encrypted
with the KEKs. Both of them can be considered “long-term” keys, meaning that
their structure and integrity never change; they always remain static in nature.
This is what forms the secret lines of communication between the sender and the
receiver (or, for example, the web application server and the end user).
A number of security issues are associated with KDCs, which are as follows:
◾ The replay attack
◾ The key confirmation attack
◾ Communications requirements
◾ A single point of failure
◾ No forward secrecy
1. The Replay Attack:
This can happen when an old private key is reused again and it is not refreshed
over a period of time. This situation can further worsen if older private keys
are used that have already been covertly compromised.
2. The Key Confirmation Attack:
In this instance, the KDC is manipulated into thinking that a legitimate end
user is requesting to establish a secure session, when it is really a cyberattacker
that has initiated this.
3. Communications Requirements:
Each and every time a new secure session is requested between the end user
and the KDC, new lines of communication must be established. As a result,
◾ It specifies a certain lifetime, denoted with the variable “T” for the private key.
◾ A timestamp is also created and provided, which displays the recentness of
the private key. This merely provides a sense of assurance to the receiver that
the ciphertext message has not been compromised in any way.
From this definition, the concept of CAs is addressed. This is at the heart of a
PKI, and it can be defined as follows8:
One of the most common types of certificates is the X.509 certificate. This, too,
is used by many web applications and the end users that they communicate with. It
consists of the following properties:
1. The Certificate Algorithm:
This is where the type of hash function that is being used is specified. For
example, this can be the SHA-1 or even the SHA-2.
2. The Issuer:
This specifies the entity that originally issued the digital certificate in
question.
3. The Period of Validity:
It is important to remember that the public key has a finite lifetime. The primary
reason for this is that the private key that is associated with the public
key could become prey for the cyberattacker.
4. The Subject:
This includes the relevant information and data about the individuals or entities
that have requested that a specific digital certificate be created.
5. The Subject’s Public Key:
This contains the relevant information and data about the public key that was
created and issued. The particulars about the cryptographic algorithm that
was used to create the public key are also stored here.
Because a PKI can be quite large and complex, with many CAs residing within
it, it is also a prime target for the cyberattacker. Because of this, any public
key, private key, or even digital certificate must be disabled as quickly as possible
if there is any sign of trouble lurking. In order to accomplish this task, CRLs are
widely deployed.
As its name implies, this is merely a listing of all of the public keys, private keys,
and digital certificates that have expired, so that they cannot be used again. Even
live ones can be placed on this list as well in order to render them useless in the face
of a cyberattack that is occurring.
But keep in mind that CRLs are not comprehensive in nature; rather, they only
contain a listing of the most recently disabled keys or certificates. This is known as
a delta CRL. The primary reason for having this is that the CRL, if it was comprehensive,
would become too large to the point where it would almost be impossible
to process in real time.
Resources
1 Computer Networking: A Top Down Approach, Kurose, J.F. & Ross, K.W., Pearson
Education Group, 2008, p. 683.
2 Computer Networking: A Top Down Approach, Kurose, J.F. & Ross, K.W., Pearson
Education Group, 2008, p. 687.
3 https://www.fedidcard.gov/faq/what-pki-public-key-infrastructure-and-why-do-i-need-it
4 https://www.globalsign.com/en/ssl-information-center/what-is-public-key-cryptography/
5 https://searchsqlserver.techtarget.com/definition/hashing
6 https://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-install/What-is-Kerberos-and-How-Does-it-Work_003f.html
7 https://www.ssl.com/faqs/what-is-a-certificate-authority/
8 Paar, Christof and Jan Pelzl. “Understanding Cryptography: A Textbook for Students
and Practitioners”. 2010, Springer-Verlag Heidelberg.
Chapter 3
Penetration Testing
Introduction
“I’m just a keystroke away from downloading their entire database,” said the expe-
rienced hacker! Fortunately, this was an ethical hacker and an expert penetration
tester in my company performing an authorized test commissioned by a client,
while carefully documenting the results to present to said client.
Unfortunately, there are plenty of bad actors who would download the
“entire database” and sell or post the contents on the Dark Web or to other
bad actors. Performing penetration tests is an excellent way to determine how
vulnerable your systems, applications, and organizational assets are. In fact,
although cybersecurity is truly multilayered and multifaceted, frequent penetra-
tion testing is a quick way to really understand what I would call infrastructural
blind spots.
The intent of this chapter is to clearly define penetration testing, as well as
elaborate on its requirement by multiple cyber-compliance standards and frameworks.
I will also spend some time elaborating on the methodologies and elements
of a thorough vs. mediocre penetration test.
Clients will often call and say, “I want a penetration test.” I always make it a
point to ensure we’re on the same page by ascertaining whether the client wants a
mere vulnerability or web app scan, or a proper penetration test. So how would you
define the differences?
Clearly stated, a penetration test is a real-world, simulated attack performed by
certified and qualified engineers, using both automated and manual attack techniques.
They find and appropriately exploit every vulnerable attack vector, then professionally
document all findings with clear remediation advice, including multiple screenshots.
Scans, by contrast, are fully automated, and in fact will be employed in a test
much like a tool in an auto mechanic’s toolbox. The mechanic uses torque wrenches,
manual and hydraulic ratchets, Phillips and flat-head screwdrivers, volt meters, and a
myriad of other tools, all suited for the purpose of diagnosing and fixing your vehicle.
In the same way, scanners like Nessus and OpenVAS, tools like Burp Suite, Kali
Linux, and dozens more are the wrenches and screwdrivers in your pen test engineer’s
toolkit, which, wielded with expertise, will contribute to a quality report.
True Stories
Here are some practical examples of how this process works. These true scenarios
illustrate how one vulnerability or attack vector might lead to another until the
compromise includes potential system takeover or admin rights of certain servers or devices.
early in the morning with a full workload ahead, only to have a message appear on
the screen, “This computer and all others in the firm have been encrypted. Please
pay 5,000 Bitcoin to this address (bitcoin address) and the code will be given to
release your machines.” (This actually happened to a company I am familiar with.)
In the case of the law firm being described, this didn’t happen but could have very
easily, based on the system vulnerabilities discovered. The reason for this was a video
camera system that was vulnerable. It had been installed without changing default
passwords. Common brute-force tools can easily determine such passwords, thus granting
system entry. In this case, the law firm was quite shocked when the engineer delivered
the report with screenshots of their filing cabinets, boardroom, and other camera angles.
Such vulnerability in an auxiliary system is what led to the much-publicized
breach of Target Corporation’s point-of-sale (POS) systems, resulting in the expo-
sure of 40 million credit and debit card numbers and 70 million records of personal
information.1 If a bad actor can gain access to a system, it is only a matter of time
until he pivots into other systems. And that is one big factor on the side of the
attacker – time. Incidentally, the auxiliary system in question here was a heating,
ventilation, and air conditioning (HVAC) system.
It should be noted here that the average dwell time of a hacker, once a system
has been compromised, is 180 to 416 days, depending on the research you read.
That means the average attacker may have access to a system for anywhere from six
months to more than a year before being detected.2 In the Target case, there was plenty
of time to install malware, tie into internal File Transfer Protocol (FTP) servers, and
begin exfiltrating data out of the organization.
Finally, in the case of our law firm, confidentiality and integrity of the firm’s
private data were compromised via access to sensitive camera feeds. The two camera
feeds in particular that gave access to the most confidential data were the file
cabinet room and the server room. The overall risk identified to the firm as a result
of the penetration test was high. A direct path from external attacker to sensitive
corporate devices was obtained. The administrative access could be used for further
compromise, such as knowing internal Internet Protocol (IP) address schemes or
allowing for a physical compromise of the building undetected.
A law firm’s vulnerable camera system or the HVAC system of a major corporation
both point to the capacity of bad actors to do – well – bad things, as well as the
critical importance of frequent penetration testing performed by an independent
party that is unafraid to poke, prod, explore, and document.
Internal Testing
The next scenario involves internal penetration testing. Internal testing will be
reviewed later but is a best practice scenario (and required for PCI, ISO 27001,
NIST, and many other standards). This is where the tester is connected behind the
firewall or is otherwise authenticated so as to view internal IP addresses. Again, due
to attacker dwell times, this allows potential “showstoppers” or risky vulnerabilities
to be discovered so that if an attacker does gain access, she won’t find immediate,
easy compromise of systems.
The client in question here happened to be a manufacturer with international
offices. External testing had discovered minimal vulnerabilities to exploit, but the
internal testing of a particular international office in the company’s system was a
different story.
Report Narrative
This international office scenario illustrates several ways by which systems could be
compromised once the bad actor is inside the network. I emphasize that in many
instances, the only thing separating the external from the internal is time, and time
is always on the bad actor’s side.
Internal testing is so important that I have elected to show another scenario
illustrating why it should be part of a comprehensive penetration test.
Report Narrative
Webcheck Security tested the discovered open ports and saw that quite a
few services were replicated across different servers. A sampling of those
services was tested for access, but all failed. Other ports and protocols
This particular scenario shows that once inside the network, hackers would have
inevitably found inroads through an unpatched system, as well as administrative
portals for printers that were using default credentials.
Although asset enumeration is an important part of a cybersecurity strategy, it
is easy to see how just one machine or device missed on an update or log table can
lead to disastrous breach capability, hence strengthening the need for assurance
provided by internal testing.
Additional Detail
Email enumeration possible (See screenshot below)
*Using a message like the following is suggested: To reset your password (or, if
you haven’t yet, activate your account), check your inbox and look for our email.
So in this scenario, if the application reveals that the “email is already in use,”
the hacker knows the account exists and that all she has to do is crack the password.
That broadens the attack surface, since multiple emails can now be attempted and either
validated or invalidated, yielding a very large pool of potential credentials to
compromise.
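One way to act on the suggested message is to return the same response whether or not the address is registered. The following is a minimal Python sketch under stated assumptions: the user store `REGISTERED_EMAILS`, the `send_reset_email` helper, and the function names are hypothetical stand-ins, not part of the tested application.

```python
# Hypothetical user store and mail helper, for illustration only.
REGISTERED_EMAILS = {"alice@example.com", "bob@example.com"}
SENT = []  # addresses that would actually receive a reset email

GENERIC_MESSAGE = (
    "To reset your password (or, if you haven't yet, activate your account), "
    "check your inbox and look for our email."
)


def send_reset_email(address: str) -> None:
    """Stand-in for a real mail call; it only records the address."""
    SENT.append(address)


def request_password_reset(address: str) -> str:
    """Return the same message for known and unknown addresses."""
    normalized = address.strip().lower()
    if normalized in REGISTERED_EMAILS:
        send_reset_email(normalized)
    # Identical response either way, so the form cannot be used to
    # confirm which email addresses have accounts.
    return GENERIC_MESSAGE
```

A real implementation would also need to keep response times similar for both cases, since a noticeable timing difference can leak the same information.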
Web applications can be so fraught with vulnerabilities that I felt it helpful to
include this example of how multiple attack vectors in one application might lead
to total system compromise.
1. Internal Git index files are accessible to the outside world. They give a full
list of all software installed via Git and give significant sensitive information
to an external attacker.
a. Severity – HIGH
5. SSL cookies are not set with the “secure” flag. This can lead to session ID
compromise should a user go from an HTTPS session to an HTTP session on
a shared public network.
a. Severity – MEDIUM
Informational:
Eight email addresses and 12 usernames were discovered as part of the assessment.
This is just informational, to make XYZ Company aware and to make
sure they have robust passwords. (All usernames are then listed in the report.)
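The cookie finding in the excerpt above is usually remediated where the session cookie is issued. As a hedged illustration only, the sketch below uses Python's standard `http.cookies` module to build a `Set-Cookie` value with the Secure attribute set; the cookie name `session_id` and the function name are assumptions, and the HttpOnly attribute is an extra hardening step not mentioned in the report.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value with Secure and HttpOnly set."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id          # hypothetical cookie name
    cookie["session_id"]["secure"] = True      # browser sends it over HTTPS only
    cookie["session_id"]["httponly"] = True    # not readable from page scripts
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()
```

With the Secure attribute present, the browser will not send the session ID if the user drops from HTTPS to plain HTTP, which is the exposure the finding describes.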
This scenario outlines the multiple risks that can be lurking in web application
code, leading to critical data or total system compromise. It is interesting to
note how much information about an application was found from a public source,
namely the GitHub repository, which could prove to be very damaging to a cor-
poration. Reconnaissance, therefore, is an important part of web application and
penetration testing in general. Several pages will be dedicated to the concept of the
reconnaissance process as part of good penetration testing outcomes.
The small excerpt from this last scenario exposes a vulnerability that we continue
to find in so many web applications, known as SQL injection. I will discuss
this in more detail as we discuss the Open Web Application Security Project
(OWASP) and the OWASP Top Ten.
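For readers who have not seen SQL injection in code, the following is a small, self-contained Python sketch using an in-memory SQLite database. The table, column names, and function names are hypothetical; the point is only the contrast between building a query by string concatenation and passing user input as a bound parameter.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two demo users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is concatenated straight into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the driver binds the value, so it is never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

Passing the input `x' OR '1'='1` to `lookup_unsafe` turns the WHERE clause into a condition that is always true and returns every row, while `lookup_safe` treats the same string as an ordinary (nonexistent) username.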
SSID Testing
Service set identifier (SSID) or Wi-Fi testing is another best practice to ensure
systems are secure internally. Stated another way, businesses typically have one or
more wireless routers, which also provide obvious potential points of entry for bad
actors.
Here is a scenario typical of one we recently found:
In this case, the passwords were good, but one router had weak encryption and
bad passwords enabled, making it easily “hackable.”
All of these are typical of scenarios that will be encountered by qualified
penetration testers. They underscore the importance of a multifaceted approach
to testing.
compromise. Also indicative of whether users are apt to click on “bad” links,
which may allow malware or ransomware to be downloaded into the system.
Mobile Application Testing – Testing iOS- or Android-installed applications
to find vulnerabilities, loopholes, or other methods of compromise.
Comprehensive penetration testing will often include all of this in larger
organizations, at a price tag starting at $20,000 for one web app and just a handful of
external and internal IPs and SSIDs. Please note: The investment in a penetration
test and other cybersecurity controls is a small price to pay when compared
to the average cost of a data breach, which currently sits at $3.9 million.3 Even a
comprehensive pen test with a larger scope and a $50,000 price tag
pales in comparison to the potential losses posed by a data breach.
Critical – Test findings that will most likely lead to effective total compromise
and have critical impact on the organization. This might involve administrative
or root system–level access on a network, server, or servers, or provide
access to sensitive information and subsequent data exfiltration. In compliance
situations, it could also signify a severe lack of compliance with regulatory
bodies, which could lead to fines, penalties, or loss of contractual status
and resulting business.
High – Test findings that indicate a high impact on the enterprise if compromised.
Indicates potential for compromise of information systems on a
network or servers containing information or documents. In compliance
situations, it could also signify a high lack of compliance with regulatory bodies,
which could lead to fines, penalties, or loss of contractual status and resulting
business.
Medium – Test findings that can immediately lead to compromise of nonpublic
data or have the potential to lead to the compromise of data through further
exploitation. May document a lack of compliance with industry best practices
or standards that could lead to possible logical or physical exploitation,
penalties, fines, or other monetary or legal actions as regulatory requirements
become stricter, or an operational deficiency that would weaken the company’s
ability to ensure the confidentiality, integrity, and/or availability of
information.
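When findings are tracked in a script or spreadsheet export, ratings like these are typically modeled as an ordered scale so the report can list the most serious items first. The sketch below is one possible way to do that in Python, assuming only the three levels described above; the `Finding` class and the `triage` function are hypothetical names, and the two sample findings are taken from the report excerpt earlier in the chapter.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Ordered rating scale; only the levels described above are modeled."""
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity


def triage(findings: list) -> list:
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)


report = triage([
    Finding("Cookies missing the secure flag", Severity.MEDIUM),
    Finding("Git index files exposed", Severity.HIGH),
])
```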
◾ PCI*
◾ HIPAA
◾ ISO 27001*
◾ SOC 1/SOC 2
◾ FedRAMP*
◾ NIST*
◾ CIS*
◾ COBIT*
◾ HITRUST*
For example, let’s say a test reveals that you are running a vulnerable operating
system (OS) or software. Upon further review you might realize that your IT
inventory process as well as patch management may not be up to par, spurring you
to implement effective changes.
One word on NIST, mentioned earlier. There is a groundswell movement in
the U.S. government to protect itself from attacking nation-states, particularly
China, Iran, and Russia. In 2020 a new element is being introduced to the NIST
framework in that all defense and government contractors who may have access
to nonsensitive, nonclassified data will still have to be Cybersecurity Maturity
Model Certification (CMMC) certified. CMMC will be primarily based (for most
businesses) on the NIST SP 800-171 or Defense Federal Acquisition Regulation
Supplement (DFARS). My point here is that such certification and business
continuance will require penetration testing.
Now to my point. These top ten common web application vulnerabilities are
only the tip of the iceberg. Penetration testing is an art, and only by hiring a skilled
and experienced tester, who will test without organizational bias, can you identify
issues such as these, have them documented, and enjoy sound discussion on how to
remediate the problems.
Reconnaissance Phase – Research and company study critical to a great pen test
Social Media – LinkedIn, Twitter, Facebook, and job search sites for open positions
and tools used
Scanning Phase – Applying the right tools for various results in the process
Exploit Phase – Widening the cracks to find more threats for clients to address
No specific tools here; this is different enough each time that only manual
effort is used.
As you can see, a good penetration tester will, on an external test only, run
many recon tools such as ARIN and DNSRecon, then Nmap, then OpenVAS, then
several other tools in addition to attempted manual exploits.
6. Run Nikto on both the HTTP version and HTTPS version of the URL:
a. “nikto -host {Domain name of site - ex: www.google.com} > nikto_http_{site name}.txt”
b. “nikto -host {Domain name of site - ex: www.google.com} -ssl -port 443 >
nikto_HTTPS_{site name}.txt”
7. Run an OpenVAS scan on the site (this will not get you much, but we are
looking for web application software versions for later attacks).
a. Add a target of the site IP address, then run the scan. Wait until it finishes
(this can take a long time).
b. Check out the scan results; sometimes you get lucky and find a Heartbleed
vulnerability.
8. Load up Burp Suite Professional and route web traffic through it via
proxy settings. Make sure to tie the site to a save file, do not use a temp
session.
9. Visit the site and manually click through every major page and drop-down.
THIS IS IMPORTANT! Burp misses things when it spiders the site unless
you give it a lot to work with first.
10. Once this is done, if there is a login page, log in to it and click through any
remaining new sections of the site that come up.
a. When you log in, look through the HTTP history right after and note
the variable for username and password
11. Put the username and password variable into the Options section of the
Spider tab in Burp Suite Professional.
12. Spider the site. If you have put in the variables correctly into the Options tab,
very few forms should pop up to fill out. If forms keep popping up asking for
username and password, double-check the variable names.
13. Look at your Nikto and OpenVAS scan results and note the software versions
displayed.
14. Run a content brute-force check on the URL. Limit the subdirectories to five
or six, leave the other options the same, and add the proper vulnerable list of
pages for the web application version you discovered earlier. The vulnerability
lists are located here:
a. /usr/share/wordlists/dirb/vulns/
b. Let the brute force run all night; it can take a long time. If you notice
that after a couple of minutes the brute force slows way down, you are likely
being throttled by a WAF. Pause the brute force and adjust the delay in
milliseconds. Take it 10× slower (so 200 milliseconds instead of 20). See
if things steady back out.
c. Sometimes the brute-force content scanner gets into directory loops due
to bad website design. Limiting the subdirectories can help with this, but
you can sometimes have tens to hundreds of thousands of pages discovered.
You will have to scan through the directory tree structure and delete
the trees that are repeats.
15. Once you finally have the brute force done, do an active scan of the entire website.
Usually I leave the default options, but it depends on whether the site has 10,000+ web
pages. That will take too long, especially if you have to throttle the scanning speed for
a WAF. If this happens, unselect the pages without any input fields; that usually
is the majority of the scan. Do the scan and see if you find anything good. If you
don’t, then do another scan of just the pages without any input fields.
iii. http://niiconsulting.com/checkmate/2014/01/from-sql-injection-
to-0wnage-using-sqlmap/
iv. Manual SQL injection steps without SQLMap: https://medium.
com/@hninja049/step-by-step-sql-injection-ed1bb97b3eae
v. More manual examples: https://resources.infosecinstitute.com/
anatomy-of-an-attack-gaining-reverse-shell-from-sql-injection/
d. XSS (reflected or stored) results:
i. Load each one into the Repeater tab in Burp.
ii. Attempt to reproduce the page (sometimes the session ID has
expired). If need be, revisit in the browser and send the new page
from the HTTP history to the Repeater tab.
iii. Attempt to get a working proof of concept of the XSS vulnerability.
Once you do, note the results and take a screenshot. Create reproduc-
tion steps for the developers.
e. Local/remote file inclusion:
i. If you find this, you have found a reverse shell entry point. Follow the
instructions on how to go from LFI/RFI to reverse shell. Here are
some guides that can help:
ii. https://blog.techorganic.com/2012/06/21/lets-kick-shell-ish-
part-1-directory-traversal-made-easy/
iii. https://blog.techorganic.com/2012/06/26/lets-kick-shell-ish-
part-2-remote-fle-inclusion-shell/
iv. https://awakened1712.github.io/oscp/oscp-lf-rf/
v. https://www.adampalmer.me/iodigitalsec/2013/08/15/php-local-
and-remote-fle-inclusion-lf-rf-attacks/
f. CSRF
i. Send this one to the Repeater tab and attempt to confirm. If you can,
take notes and screenshots.
g. CORS
i. You will find a lot of these. It depends if you want to report this one;
most of the time the client does not understand it whatsoever and the
risk is low.
h. Access to files outside of individual user’s rights
i. Look at the site tree and see if any of the folders or addresses look like
they are a number sequence or other procedurally generated number/
letter combinations.
ii. Load one of the pages into the Intruder tab. Clear all fields and then
select a field that you want to brute force. Load up a brute-force list
or iterate through numbers, then kick off the attack. Look for HTTP
200 pages or for large changes in content amount.
i. Login pages
i. These are fun. First, look at the version of the software of the
login page and see if it has a default username/password. Try that
As you can see from this thorough process (most of which is geared to web
applications; items 1 to 4 and 7 can apply to external IP testing), there is much
work to do for a thorough pen test to be effective, and of course this explains the
cost of proper penetration testing, as well as clear differences between merely
running scans versus employing pen test techniques.
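Some of the manual cleanup in the script lends itself to small helper scripts. For example, step 14c describes deleting repeated directory trees by hand; a first pass can be automated with a simple heuristic. The Python sketch below is only an illustration under stated assumptions: the function names and the threshold `max_repeats` are hypothetical, and the rule used (flag any URL whose path repeats the same segment more than a set number of times) is a rough approximation of a directory loop, not how Burp Suite or any other tool detects them.

```python
from collections import Counter
from urllib.parse import urlsplit


def is_probable_loop(url: str, max_repeats: int = 2) -> bool:
    """Flag URLs whose path repeats one segment more than max_repeats times."""
    segments = [part for part in urlsplit(url).path.split("/") if part]
    if not segments:
        return False
    return max(Counter(segments).values()) > max_repeats


def prune_loops(urls: list, max_repeats: int = 2) -> list:
    """Keep only the URLs that do not look like directory loops."""
    return [u for u in urls if not is_probable_loop(u, max_repeats)]


discovered = [
    "https://example.com/",
    "https://example.com/a/b/",
    "https://example.com/a/b/a/b/a/b/",
]
kept = prune_loops(discovered)
```

A heuristic like this will occasionally discard legitimate deep paths, so the pruned entries are worth a quick manual look before they are dropped from the scan scope.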
At a high level, we break down our process into four critical components illus-
trated in the following graphic:
Intelligence. During the Intelligence phase, data is gathered not just about the
targets being tested through recon tools, but also research on the company itself is
performed. This research may reveal potential user IDs, password possibilities, and
in many cases other unprotected IP data that is “hanging out there” and might be
exploited.
Exploit. This is, of course, a critical component of penetration testing. As
observed in the test script on the previous page, the ability to find and then further
exploit the vulnerabilities, or “peeling the next onion layer,” is how loopholes
and problems are found. My associate Curt always says that good pen testers
are problem solvers. They enjoy a challenge and seek out nuggets to exploit and
document.
Documentation. This phase may be just as critical as the actual exploitation.
After all, it won’t help your client if you can’t describe (1) what you found, (2) how
you got there, and (3) how they fix it! Clear and concise writing with good, easy-to-
follow vulnerability descriptions and remediation advice will be critical.
Including screenshots in a professionally formatted report is also a critical com-
ponent of the documentation phase. I’ve included a sample in the ensuing page to
illustrate the “nice touch” that a sharp report can provide.
You will notice in the table of contents that the report has an Executive
Summary, a Conclusion and Risk Rating section, and a clear Recommendations
section. Each of these has a critical purpose for a client, but of these, the Executive
Summary and Recommendations are the most important.
The client’s clients will often ask for proof of a penetration test, and the Executive
Summary is what can be sent rather than the comprehensive detail found in the
full report. Second, a clear Recommendations page will help the client understand
exactly what needs to be done in order to remediate the findings (Figure 3.1).
Discussion. This is the “extra mile” phase. It is in place to ensure that clients
not only get the consultation they need and deserve to understand test results but
are also clear on exactly what steps or direction must be taken to resolve key findings.
Many penetration testers are tasked by their organization with cranking out
reports, or are themselves focused on little else, and so miss the opportunity
to enhance the customer experience and really add value to the penetration test
deliverable and outcome.
It could be said that this phase flows throughout the test, since important
findings really should be communicated to clients promptly in order to facilitate
prompt action and quickly reduce existing vulnerabilities.
Chapter Takeaways
Now you should have greater insight into penetration testing, tools, methods,
and standards that require it. I started the chapter with some scenarios where real
exploitable attack vectors were found. Let me finish with one as well.
Figure 3.1 Webcheck Security title and contents pages. All Rights Reserved.
In my first real cybersecurity gig years ago, a client, a small but successful
Mexican restaurant in the Southwest, was hacked and for at least three months had
been bleeding its clients’ credit card data. They only came to my company because
a digital forensic investigation was required through their card processor by
MasterCard, who had triangulated the data loss and fraud to their restaurant.
The outcome was not good. MasterCard levied a fine of $80,000, assessed
through the acquiring bank or processor. In this scenario, if you choose not to pay,
your account is garnished until the fine is paid and your processing privileges are shut
off. We all know how annoying it is to go to pay for food only to be told “We accept cash
only.” Hence not a good option.
The final result was worse still. Not long after the forensic investigation
was completed, the restaurant went out of business. Multiply that little
$80,000 fine by 10 or even 100. If it’s not uncommon for an average data breach
cost to be $3.9 million in the United States, those numbers could be catastrophic.
Contrast the $80,000 to what may have been less than $5,000 for a one- or
two-IP address penetration test – a test that would certainly have uncovered the
vulnerabilities, which in this case the hacker used to deposit malware and a rogue
FTP server on the POS server (a Remote Desktop Protocol [RDP] desktop was
open, by the way).
This final narrative in this chapter serves to summarize the key message here,
which is that annual or semi-annual penetration testing is a small fee compared to the
alternative.
Resources
1 arXiv:1701.04940v1 [cs.CR] 18 Jan 2017; https://arxiv.org/pdf/1701.04940.pdf
2 M-Trends 2019: Celebrating 10 Years of Incident Response Reporting; https://www.
fireeye.com/blog/executive-perspective/2019/03/mtrends-2019-celebrating-ten-
years-of-incident-response-reporting.html
3 IBM in conjunction with the Ponemon Institute https://databreachcalculator.myblue-
mix.net/
4 Source: Curt Jeppson, VP Engineering, Webcheck Security and expert cyber
practitioner.
5 Ibid, Curt Jeppson.
Chapter 4
Threat Hunting
Threat hunting is reducing bad actor dwell time in assets and systems of
an organization from the average of 200 days to less than a week. This
is accomplished by trained security analysts using advanced tools and
cyber triage* knowledge.
*Triage – The ability to effectively identify, classify, and report on
cyber threats.
Without threat hunters armed with the appropriate training and tools, cyber
events can wreak havoc on businesses, schools, government, nonprofits – any kind
of organization.
A SOC’s automated tools will miss things. Malicious actors use new con-
cepts and attack vectors every day, and the rules and algorithms raising alerts
sometimes take time to catch up. Threat hunting catches events that are
missed by automated alerts, and these can be turned into use cases to modify
the SOC rules so an alert is automatically generated for similar occurrences
in the future.
Not-So-Tall Tales
First some threat hunting stories based on real events. These are stories I have
documented over the years from various sources, though the sources, as well as the
names of people and companies involved, are not identified here for obvious reasons.
Some embellishment of the narrative has been added to better illustrate the
core events:
Oh, and by the way, your executives’ SSN numbers are as follows (he
lists their Social Security numbers.)
Joe and his team were flabbergasted. “Is this some kind of hoax? How do we address
this?” After careful discussion with his team, they called Adam in IT and asked
him to fix things. Adam toiled all day but without success. Alas, the team didn’t
have proper backups. As 5:00 rolled around and the RFP deadline came and went,
no data was able to be restored. They thought that perhaps over time they might
restore some of the data from personal hard drives….
…two weeks later, another email came in from the hacker:
Dear Mr. CEO and Executive Team: I see from your emails that you have
not taken me seriously nor has my ransom been paid. It’s a shame you
missed out on that $50 million RFP, isn’t it? Oh – and tell Sally she
can’t have Wednesday off as she requested (see – I am in your emails).
Because you didn’t take me seriously, I downloaded an additional 40 GB
of product designs and data from your design folder. I’ve downloaded
15 years of employee bank account numbers. Oh, and Joe, you, Fred, and Jim
need to stop golfing and pay attention to my demands. BTW – here are your
salaries: (States executives’ salaries to prove he has been in payroll.)
Not such a good outcome for this company. This is an example of a company without a
threat hunting team or service. Here are several condensed narratives showing what
can happen to prevent disasters such as this one:
What would Harry have done as the IT director had the company
been severely damaged in this scenario due to ransomware, data
exfiltration, brand tarnishing, or legal and other implications? Clearly,
the systems in place along with this threat hunter reduced dwell time
and, even better, completely circumvented a potential tragedy within
1.5 hours.
Similarly, these ensuing bullet-point narratives from two different SOCs show
an analyst’s perspective (the threat hunter) and demonstrate not only insight into the
threat hunting process but the clear benefits of threat hunting in an organization:
As one can see from these narratives, threat hunters wear capes (or should).
They are truly superheroes and can save the day in many instances. The events
described here saved the organizations millions of dollars in legal liability,
business disruption, and brand tarnishment.
Clearly the nation-state of Iran was poking at more than just the few clients of
this particular SOC. Nation-states often have complete call center–type environ-
ments full of young eager hackers, banging away at thousands of U.S. IP addresses
looking for loopholes and vulnerabilities to exploit.
Which leads us to China. Also while preparing the manuscript for this book,
I was invited to attend an FBI briefing on the China threat. It was fascinating and
quite an eye-opener. To preface the “China cyber threat,” it should be noted that
the annual cost to the U.S. economy of counterfeit goods, pirated software, and
theft of trade secrets is $225 to $600 billion!1
Also according to this report, the Made in China 2025 Plan “lists 10 domestic
Chinese industries in which China seeks to significantly reduce its reliance on foreign-
produced technology and develop 70% of the components for these projects in China”:
◾ Information technology
◾ Computer numerical control machine tools and robotics
◾ Aerospace equipment
◾ Electric power equipment
◾ Marine engineering equipment and high-tech ships
◾ Agricultural equipment
◾ Advanced rail transportation equipment
◾ New materials
◾ Energy-efficient and new-energy automobiles
◾ Biomedicine and high-performance medical instruments
In the presentation I attended, one might sum up China’s strategy using the 3 Rs:
◾ Rip off
◾ Replicate
◾ Replace
In other words, the main goal here is to steal IP, technology, and critical
information, replicate it, and then replace all American
tech with the stolen and replicated tech.
Targeted hacking, of course, represents one of the main forms of information
theft, and hence threat hunting is a critical defense activity.
Trend Based – This method involves looking at trends from weekly reports,
monthly reports, etc., and looking for anomalies. A spike in failed logins or a
change in weekly admin activity can give us a thread to pull on and dig into.
This method offers threat hunting that is more personalized to the customer’s
environment and can detect things that may look like normal traffic to the
other methods mentioned earlier.
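As a rough illustration of the trend-based idea, the Python sketch below flags weeks in which failed logins jump well above the running baseline. Everything here is an assumption made for the example: the function name, the sample counts, the four-week warm-up, and the rule of "mean plus three standard deviations" are illustrative choices, not a method prescribed by any particular SOC or SIEM product.

```python
from statistics import mean, stdev


def find_spikes(weekly_counts: list, threshold: float = 3.0, warmup: int = 4) -> list:
    """Return indexes of weeks that exceed the baseline of all prior weeks.

    A week is flagged when its count is more than `threshold` standard
    deviations above the mean of the weeks before it.
    """
    spikes = []
    for i in range(warmup, len(weekly_counts)):
        history = weekly_counts[:i]
        baseline = mean(history)
        spread = max(stdev(history), 1.0)  # avoid a zero-width band
        if weekly_counts[i] > baseline + threshold * spread:
            spikes.append(i)
    return spikes


# Hypothetical weekly failed-login totals for one customer.
failed_logins = [12, 15, 11, 14, 13, 96, 12]
suspicious_weeks = find_spikes(failed_logins)
```

In this sample only the sixth week (index 5) stands out. Note that once a spike enters the history it inflates the baseline for later weeks, which is one reason an analyst still reviews the trend rather than relying on the rule alone.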
MITRE ATT&CK
The MITRE ATT&CK framework is a globally accessible knowledge base of bad
actor tactics and techniques based on real-world observations. The ATT&CK
knowledge base is used as a foundation for the development of specific threat models
and methodologies in the private sector, in government, and in the cybersecurity
product and service community.2
The result is that SIEM tools, manual threat-hunting tactics, and reporting can
become more thorough, uniform, and even automated using a universally accepted
framework. It will also make the reporting process easier.
Attack techniques and incidents can generally be categorized into one of the
following categories: initial access, execution, persistence, privilege escalation, defense
evasion, credential access, discovery, lateral movement, collection, exfiltration, command
and control, and tools (general and specific). To view a comprehensive table,
please see https://attack.mitre.org/.
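To show how a shared set of categories can make reporting more uniform, here is a minimal sketch that tallies alerts by category. The category names are taken from the list in the text; the alert records and the "tactic" field are hypothetical and do not reflect any particular SIEM's export format.

```python
from collections import Counter

# Category names as given in the text (lowercased for matching).
TACTICS = {
    "initial access", "execution", "persistence", "privilege escalation",
    "defense evasion", "credential access", "discovery", "lateral movement",
    "collection", "exfiltration", "command and control",
}

def tally_by_tactic(alerts):
    """Count alerts per category; unknown or missing labels go to 'uncategorized'."""
    counts = Counter()
    for alert in alerts:
        tactic = alert.get("tactic", "").strip().lower()
        counts[tactic if tactic in TACTICS else "uncategorized"] += 1
    return counts

# Made-up alerts for illustration only.
alerts = [
    {"id": 1, "tactic": "Credential Access"},
    {"id": 2, "tactic": "Exfiltration"},
    {"id": 3, "tactic": "Credential Access"},
    {"id": 4},
]
for tactic, count in sorted(tally_by_tactic(alerts).items()):
    print(f"{tactic}: {count}")
```

Grouping findings this way lets weekly or monthly reports use the same vocabulary regardless of which tool raised the alert.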
Technology Tools
It will also be helpful to define some terms and list some of the core tools utilized
in the threat hunting process. In doing so please read on with a forgiving heart, as
the literally thousands of cyber tools, vendors, techniques, and products would fill
volumes, and in identifying some I am leaving many out. This is also not an
advertisement or endorsement of any particular technology.
The SIEM
I recently participated in an evaluation of SIEM products. Some of the products
reviewed were (listed in no particular order):
◾ Fortinet FortiSIEM
◾ Netsurion EventTracker
◾ Rapid7’s SIEM
◾ AT&T AlienVault
◾ TrendMicro Cysiv
◾ SolarWinds Threat Monitor
◾ AlertLogic SIEM
And that is the tip of the iceberg. One can literally quadruple that product list.
Each has its merits and strengths, but for the purposes of this chapter, suffice it to
say that a SIEM is at the core of the threat hunt, as logs are stored, categorized, and
can be cross-correlated to lead to more findings. Hence, at least in the context of
threat hunting in a SOC environment, the SIEM is at the core.
So let’s start with SIEM. The SIEM ingests logs from multiple sources: firewalls,
virtual machines, cloud app logs such as Office 365, wireless access points, routers,
servers of all kinds (web, DNS, authentication, application, Windows, Linux, FTP,
mail, etc.), and many other types of devices.
SIEM is at the heart of managed detection and response (MDR). Another concept
of MDR is “eyes-on-glass,” meaning someone’s got your back. Someone is
“watching the shop” or in other words looking at critical alerts. This is critical,
since most organizations have operational IT, and the nature of operational IT is
that of keeping the business running. Hence, since they don’t have the bandwidth
to monitor the security of the organization 24/7/365, a layer of qualified personnel
doing MDR is critical. MDR is threat hunting and responding.
Threat hunters don’t go it alone, however. Literally millions of events can occur
in one day on just one network alone, and all of those logs are sent to and indexed
by the SIEM. Because the aforementioned SIEM products (and all others not
mentioned) have complex anomaly detection algorithms and heuristics, alerts get
generated automatically, and threat hunters then dive into these, verifying whether
each event is a false positive or an actual event of concern.
It should be noted here that many of the compliances listed in Chapter 3 are
fulfilled by the log collection and monitoring provided by SIEM systems. I would
caution, however, that log collection without advanced SIEM analytics and threat
hunters is merely a checkbox-compliance exercise. It is ineffective for true security
purposes. As you learned from the scenarios at the beginning of the chapter, and as
you will see from the thorough threat hunting process documented further on, log
monitoring without SIEM and trained analysts is not security.
EDR
Firewall and server logs don’t tell all, however. As demonstrated by the stories at the
beginning of this chapter, unwitting users may click on bad stuff. Great technologies
are available that can often detect and block bad behavior on user machines. This started
decades ago as antivirus protection, but has evolved into a new category of protection
known as endpoint protection and endpoint detection and response (EDR).
EDR also involves sending alerts to analysts to review and incorporates not just
antivirus and antimalware but also advanced heuristics and often user and entity
behavior analytics (UEBA) algorithms. Software that employs UEBA essentially
analyzes user activity data from logs, network traffic, and endpoints and correlates
such data with threat intelligence to identify activities or behaviors that most likely
indicate a malicious threat in the environment.
EDR + SIEM
EDR products are also plentiful and are incorporating advanced artificial
intelligence. Sophos, CrowdStrike, Cylance, Carbon Black, SentinelOne, TrendMicro,
and others are all examples of advanced EDR technologies. The beauty of them is
that most of the SIEM products ingest the logs or alerts of the EDR products, leading
to two amazing technologies functioning together to increase what I call the incident
visibility window.
IDS
Now imagine the aforementioned visibility window widened by yet another tech-
nology: an IDS. Many products in this category also exist – and you guessed it –
can forward logs and alerts to the SIEM.
Other terms for IDS are bitstream analytics, deep or full packet inspection,
partial packet or metadata inspection, and sometimes NetFlow. Regardless of the
technology, the concept is that packets are inspected as they come in from the
firewall, both ingress and egress, so that bad stuff or data exfiltration can be detected
and alerts sent to the threat hunter.
Common examples of IDS products are Snort, Suricata, Verizon’s
ProtectWise, Bro, SecurityOnion, and most of the SIEM systems already
mentioned. Fortinet, Palo Alto, Checkpoint, Sonic, Cisco, and most firewalls also have
IDS built into them at either basic or very advanced levels.
hunters, there still remains a strong argument for the comprehensive, quickly
searchable index of logs (think threat incidents) retained by the SIEM, including
the implications for digital forensic investigations. Further, the advanced evo-
lution of the SIEM also implies a robust platform for cross-correlative threat
hunting.
7. Drawing conclusions
a. No conclusive results? Ticket not raised.
b. Threat? Ticket raised and triage begins.
As you will observe, there could be many other events, and more depth to the
discovered events, than was originally found. Threat actors are sneaky and desperate,
though not always impatient. Hence, the additional threat hunting activity that
might occur, spurred by the initial threat hunting procedure, can yield more
three-dimensional data on the state of the enterprise threat posture. Stated another way,
the subsequent secondary or expanded threat process may uncover additional
indicators of compromise and lead to more critical findings. Here’s a simple example:
Let’s say malware is discovered and isolated, and based on IDS and SIEM
data, the source IP is identified. Then using telemetry from the first procedures,
it is determined that the source IP also made several successful or unsuccessful
logins to the user’s Office 365 account. That might spawn a search in the SIEM
of any successful logins by that IP address to any Office 365 account. The results
could be astonishing and result in a quick blocking of said IP, as well as an urgent
notification to the users of any successful logins to change their passwords, turn
on multifactor authentication (MFA), and ensure the new password is sufficiently
long and includes a mix of caps, numbers, characters, etc. Day saved by the threat
hunter in the cape!
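The secondary search described above can be sketched in a few lines. This is only an illustration, assuming the login events have already been exported from the SIEM as simple records; the field names (user, src_ip, action, outcome), the sample accounts, and the IP address (taken from a range reserved for documentation) are all hypothetical.

```python
def accounts_with_successful_logins(events, source_ip):
    """Return the accounts that the given IP address logged in to successfully."""
    return sorted({
        event["user"]
        for event in events
        if event["src_ip"] == source_ip
        and event["action"] == "login"
        and event["outcome"] == "success"
    })

# Made-up login events standing in for a SIEM export.
events = [
    {"user": "alice@example.com", "src_ip": "203.0.113.7", "action": "login", "outcome": "failure"},
    {"user": "alice@example.com", "src_ip": "203.0.113.7", "action": "login", "outcome": "success"},
    {"user": "bob@example.com", "src_ip": "203.0.113.7", "action": "login", "outcome": "success"},
    {"user": "carol@example.com", "src_ip": "198.51.100.20", "action": "login", "outcome": "success"},
]

# Every account returned here would need a password change and MFA enabled.
print(accounts_with_successful_logins(events, "203.0.113.7"))
```

In practice the same question would be asked in the SIEM's own query language; the point is that one confirmed indicator (the source IP) becomes the pivot for a wider search.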
1. Pattern recognition
2. Data analytics
3. Malware analysis
4. Data forensics
5. Communication
I would say that an aptitude for problem solving and an analytical mind will
lay the foundation to enhance or acquire the skills enumerated here, but of them
all, communication is the most important. Here, the analogy of the brilliant doctor
comes to mind. I don’t care how brilliant my doctor is. Does he or she care enough
to explain a condition to me in a way that helps me? Does he or she think of ways to
solve my problems and provide the best care? This aptly applies to the threat hunter.
Does she or he care enough to carefully research, diagnose, and then articulately
broadcast the problem, following up to ensure the cyber safety of the organization?
Writing and communication skills are critical in this scenario.
I would also recommend to the reader two additional organizations for learning,
training, and certification: SANS and ISACA (more info can be found at
https://www.sans.org/ and www.isaca.org).
Additionally, (ISC)2, which is responsible for the respected CISSP certification,
is a commendable organization (https://www.isc2.org/). The CISSP itself involves
many aspects of cybersecurity, which help to round out the security analyst’s cyber
view and effectiveness.
The CISSP covers key domains of cybersecurity such as:
Although not all analysts have or need their CISSP, many of the effective analysts
I have worked with over the years have possessed this certification. My observation
is that it gives practitioners a more holistic view of the cyber landscape.
Resources
1 Federal Bureau of Investigation Handout: China, the Risk to Corporate America.
2 See https://attack.mitre.org/
3 Threat Hunting Careers. https://resources.infosecinstitute.com/category/enterprise/threat-hunting/threat-hunting-careers/#gref
Chapter 5
Conclusions
As was mentioned at the beginning of Chapter 1, the web applications of today are far
more complex and sophisticated than when they were first conceived and developed back
during the mid-1990s, during the height of the .com craze. Back then, as long as a web
app had a simple e-commerce front, that was all that was needed to get the attention of
not only venture capitalists but, most importantly, customers and their repeat business.
Today, there is so much more to a web application than just its front-facing site.
There is the back end, which can consist of many complex databases, where some
of the most sensitive information and data are stored. This can consist primarily of
your customers’ financial information, such as credit card numbers, and banking
information, and other relevant forms of personally identifiable information (PII).
Of course, just as important, is the server that the web applications reside on.
The image of them residing in a physical server has now dissipated; rather, they are
stored on virtual machines that are housed in the cloud.
Even the source code that is used to create the web application has become quite
complex, and many of the lines of the relevant code that are needed to create it are
now outsourced to external third parties for rapid creation and development.
So, as one can see, security across a typical web application encompasses a
plethora of areas, ranging from securing the source code, to the virtual machine,
to the Internet connectivity that takes place from the device of the end user to the
server that houses the web application and vice-versa.
Chapter 1, which dealt with the topic of network security for the web applica-
tion, covered the following topics:
Based on this list, network security is thus a very crucial aspect when it comes
to fortifying a secure line of network communications between the virtual machine
that houses the web application and the device of the end user that is accessing this
particular web application.
But as important as this is, it is equally, if not more, important to further secure
the endpoints of this network line of communications, as this is very often over-
looked in the design of a web application. As a result, this becomes a prime target
for the cyberattacker to pounce upon.
As has been reviewed, there are many threats, both internal and external, that
a web application needs to be protected from. As Chapter 1 noted, the flow of network
communications from the device of the end user and the server that houses the web
application (and vice-versa) must be protected. But it is important to keep in mind that
a lot of confidential information and data are transmitted across this network medium.
This kind of data is also known as “personally identifiable information” (PII).
Typical examples of this include credit card information, banking data, Social
Security numbers, driver’s license numbers, etc. Take the example of a business
that makes use of an e-commerce front, also known as the online store. In these
instances, the end user (the customer) will be making purchases from this store-
front, most likely using his or her credit card.
While the actual transmission of this data will be sent over a secure network
connection (primarily that of the Secure Sockets Layer [SSL]), there is a reasonable
probability that the credit card information could still be intercepted
by a malicious third party. Unfortunately, if this were to happen, the credit
card information would still be in a plaintext format.
Thus, it can be used to engage in acts of credit card fraud, and worse yet, to
conduct acts of identity theft, which could take the victim a long time to recover
from. But not only does this credit card information have to be safe during the
network transmission; it must also stay that way when it is stored in the database of the
web application. For example, many databases still use insecure code, and thus are
prime targets for any threat vector, namely that of SQL injection attacks. Probably
the best way to secure against these forms of attack is to use what is known as
encryption, which is a field of cryptography.
The basic point of encryption is to render the PII in a garbled state so that if it
were to be intercepted by a cyberattacker, these datasets would be rendered useless
unless the cyberattacker had the appropriate key to unlock them into a decipherable
format. Thus, Chapter 2 reviewed the essential concepts of both encryption and
cryptography and how they can be used to secure the confidential information and
data that are stored in the database of a web application and to make sure it stays
that way while it is in transit across the network medium.
The topics covered in Chapter 2 included the following:
◾ An introduction to cryptography
◾ Message scrambling and descrambling
◾ Encryption and decryption
◾ Ciphertexts
◾ Symmetric key systems and asymmetric key systems
◾ The Caesar methodology
◾ Polyalphabetic encryption
◾ Block ciphers
◾ Initialization vectors
◾ Cipher block chaining
◾ Disadvantages of symmetric key cryptography
◾ The key distribution center
◾ Mathematical algorithms with symmetric cryptography
◾ The hashing function
◾ Asymmetric key cryptography
◾ Public keys and private keys
◾ The differences between asymmetric and symmetric cryptography
◾ The disadvantages of asymmetric cryptography
◾ The mathematical algorithms of asymmetric cryptography
◾ The public key infrastructure
◾ The digital certificates
◾ Critical
◾ High
◾ Medium
◾ Low
◾ Informational
Next, we discussed the many frameworks and compliances that require pen-
etration testing, such as:
◾ PCI
◾ HIPAA
◾ ISO 27001
◾ SOC 1/SOC 2
◾ FedRAMP
◾ NIST
◾ CIS
◾ COBIT
◾ HITRUST
We then introduced the reader to the very important OWASP organization and
its Top Ten Web Application Flaws list, which included the following:
1. Injection
2. Broken authentication
3. Sensitive data exposure
4. XML external entities (XXE)
5. Broken access control
6. Security misconfiguration
7. Cross-site scripting (XSS)
8. Insecure deserialization
9. Using components with known vulnerabilities
10. Insufficient logging and monitoring
We then listed common tools for the pen tester’s toolkit, followed by a detailed
process for testing a web application. This process spanned several pages and
introduced you to the thoroughness a good tester will employ.
We next suggested that excellence in the penetration test process will also
involve the concepts of:
1. Intelligence gathering
2. Exploitation at a thorough level
3. Documentation that is professional, detailed, and helpful
4. Discussion of findings with the client in a helpful and informational way
◾ Presented China’s 3 Rs and other key info from a recent FBI report
◾ Noted resources on the recent events in Iran
◾ Documented actual cyber occurrences recently visualized in a security opera-
tions center, underscoring the fact that Iran is trying to hack us
◾ Intelligence based
◾ Tools techniques and procedures based
◾ Trend based
We further concluded that although EDR technology is getting better and may
be changing the SOC makeup, the SIEM still remains at the heart of MDR. We
then reviewed the threat hunting process and the subsequent secondary search and
correlation, which can lead to more discoveries. We finished with the attributes of
a good threat hunter, followed by resources in organizations geared to train and
certify, such as SANS, ISACA and (ISC)2.
We concluded that threat hunters and SOC services are needed – the threats
are real, as demonstrated in the nation-state scenarios. Further, nation-states are
involved in espionage and cyberattacks with the intent to acquire data, do harm,
and further their initiatives.