Varnish Book 2019 Framework App
[Figure: Varnish client-side request flow state machine. Requests enter at vcl_recv, are hashed in vcl_hash, and after the cache lookup proceed through vcl_hit, vcl_miss, vcl_pass, vcl_pipe or vcl_purge before ending in vcl_deliver or vcl_synth; restart and synth transitions loop back as permitted.]
[Figure: Varnish backend fetch state machine. vcl_backend_fetch sends the backend request, the response headers are read, and vcl_backend_response decides between deliver, retry and abandon (including 304 conditional fetch handling); backend errors are handled in vcl_backend_error.]
[Figure: Object Lifetime. A timeline of a cached object showing t_origin and the TTL, grace and keep durations; If-Modified-Since conditions apply within the keep period.]
Authors: Francisco Velázquez (Varnish Software), Kristian Lyngstøl, Tollef Fog
Heen, Jérôme Renard
Copyright: Varnish Software AS 2010-2015, Redpill Linpro AS 2008-2009
Versions: Documentation version 4.x-73-g063677c / Not tested on build machine
Date: 2018-02-28
License: The material is available under a CC-BY-NC-SA license. See
http://creativecommons.org/licenses/by-nc-sa/3.0/ for the full license. For
questions regarding what we mean by non-commercial, please contact
[email protected].
Contact: For any questions regarding this training material, please contact
[email protected].
Web: https://info.varnish-software.com/the-varnish-book
Source: http://github.com/varnish/Varnish-Book/
Contents
1 Introduction 20
1.1 What is Varnish? 21
1.1.1 Varnish is Flexible 22
1.2 Varnish Cache and Varnish Plus 23
1.3 Varnish Cache and Varnish Software Timeline 25
1.4 What is new in Varnish 4? 27
2 Design Principles 29
2.1 How objects are stored 31
2.2 Object Lifetime 32
3 Getting Started 33
3.1 Varnish Distribution 34
3.2 Exercise: Install Varnish 35
3.3 Exercise: Configure Varnish 37
3.3.1 VCL Reload 39
3.3.2 Test Varnish Using Apache as Backend 42
3.4 The front-end varnishadm of the Varnish Command Line Interface (CLI) 43
3.5 More About Varnish Configuration 45
3.6 Command Line Configuration 47
3.7 Defining a Backend in VCL 49
3.8 Exercise: Use the administration interface to learn, review and set Varnish parameters 50
3.9 Exercise: Fetch Data Through Varnish 51
4 Examining Varnish Server's Output 52
4.1 Log Data Tools 53
4.2 Log Layout 54
4.3 Transactions 55
4.3.1 Transaction Groups 57
4.3.2 Example of Transaction Grouping with varnishlog 58
4.4 Query Language 59
4.5 Exercise: Filter Varnish Log Records 61
4.6 varnishstat 62
4.6.1 Notable Counters 66
4.7 Exercise: Try varnishstat and varnishlog together 68
5 Tuning 69
5.1 Varnish Architecture 70
5.1.1 The Parent Process: The Manager 72
5.1.2 The Child Process: The Cacher 73
5.1.3 VCL Compilation 74
5.2 Storage Backends 75
5.3 The Varnish Shared memory Log (VSL) 77
5.4 Tunable Parameters 78
5.5 Threading Model 80
5.6 Threading Parameters 81
5.6.1 Details of Threading Parameters 83
5.6.2 Time Overhead per Thread Creation 84
5.7 System Parameters 85
5.8 Timers 86
5.9 Exercise: Tune first_byte_timeout 88
5.10 Exercise: Configure Threading 89
6 HTTP 90
6.1 Protocol Basics 91
6.1.1 Resources and Representations 92
6.1.2 Requests and Responses 93
6.1.3 Request Example 94
6.1.4 Response Example 95
6.2 HTTP Characteristics 96
6.3 Cache-related Headers Fields 97
6.4 Constructing Responses from Caches 98
6.5 Cache Matching 99
6.5.1 Vary 100
6.5.2 ETag 102
6.5.3 Last-Modified 103
6.5.4 If-None-Match 104
6.5.5 If-Modified-Since 105
6.6 Allowance 107
6.6.1 Cache-Control 108
6.6.2 Pragma 110
6.7 Freshness 111
6.7.1 Age 112
6.7.1.1 Exercise: Use article.php to test Age 112
6.7.2 Expires 114
6.8 Availability of Header Fields 115
6.9 Exercise: Test Various Cache Headers Fields with a Real Browser 116
7 VCL Basics 117
7.1 Varnish Finite State Machine 118
7.2 Detailed Varnish Request Flow for the Client Worker Thread 121
7.3 The VCL Finite State Machine 123
7.4 VCL Syntax 124
7.5 VCL Built-in Functions and Keywords 125
7.6 Legal Return Actions 126
7.7 Variables in VCL subroutines 127
7.8 Built-in vcl_recv 129
7.8.1 Exercise: Configure vcl_recv to avoid caching all requests to the URL /admin 131
7.9 Detailed Varnish Request Flow for the Backend Worker Thread 132
7.10 VCL – vcl_backend_response 134
7.10.1 vcl_backend_response 135
7.10.2 The Initial Value of beresp.ttl 136
7.10.3 Example: Setting TTL of .jpg URLs to 60 seconds 138
7.10.4 Example: Cache .jpg for 60 seconds only if s-maxage is not present 139
7.10.5 Exercise: Avoid Caching a Page 140
7.10.6 Exercise: Either use s-maxage or set TTL by file type 141
7.11 Waiting State 142
7.12 Summary of VCL Basics 143
8 VCL Subroutines 144
8.1 VCL – vcl_recv 145
8.1.1 Revisiting built-in vcl_recv 147
8.1.2 Example: Basic Device Detection 148
8.1.3 Exercise: Rewrite URL and Host Header Fields 149
8.2 VCL – vcl_pass 150
8.2.1 hit-for-pass 151
8.3 VCL – vcl_backend_fetch 152
8.4 VCL – vcl_hash 153
8.5 VCL – vcl_hit 154
8.6 VCL – vcl_miss 155
8.7 VCL – vcl_deliver 156
8.8 VCL – vcl_synth 157
8.8.1 Example: Redirecting requests with vcl_synth 159
8.9 Exercise: Modify the HTTP response header fields 160
8.10 Exercise: Change the error message 161
9 Cache Invalidation 162
9.1 Purge - Bans - Cache Misses - Surrogate Keys 164
9.2 HTTP PURGE 166
9.2.1 VCL – vcl_purge 167
9.2.2 Example: PURGE 168
9.2.3 Exercise: PURGE an article from the backend 169
9.2.4 PURGE with restart return action 170
9.3 Softpurge 171
9.4 Banning 172
9.4.1 Lurker-Friendly Bans 175
9.5 Exercise: Write a VCL program using purge and ban 177
9.6 Force Cache Misses 178
9.7 Hashtwo/Xkey (Varnish Software Implementation of Surrogate Keys) 179
9.7.1 Example Using Hashtwo or Xkey 181
10 Saving a Request 183
10.1 Directors 184
10.1.1 Random Directors 186
10.2 Health Checks 187
10.2.1 Analyzing health probes 189
10.2.2 Demo: Health Probes 191
10.3 Grace Mode 192
10.3.1 Timeline Example 194
10.3.2 Exercise: Grace 195
10.4 retry Return Action 196
10.5 Saint Mode 197
10.6 Tune Backend Properties 199
10.7 Access Control Lists (ACLs) 200
10.8 Compression 202
11 Content Composition 204
11.1 A Typical Website 205
11.2 Cookies 206
11.2.1 Vary and Cookies 207
11.2.2 Best Practices for Cookies 208
11.2.3 Exercise: Handle Cookies with Vary and hash_data with HTTPie 209
11.3 Edge Side Includes 210
11.3.1 Basic ESI usage 211
11.3.2 Example: Using ESI 212
11.3.3 Exercise: Enable ESI and Cookies 214
11.3.4 Testing ESI without Varnish 215
11.4 Masquerading AJAX requests 216
11.4.1 Exercise: write a VCL that masquerades XHR calls 217
12 Varnish Plus Software Components 218
12.1 Varnish Administration Console (VAC) 219
12.1.1 Overview Page of the Varnish Administration Console 220
12.1.2 Configuration Page of the Varnish Administration Console 221
12.1.3 Banning Page of the Varnish Administration Console 222
12.2 Varnish Custom Statistics (VCS) 223
12.2.1 VCS Data Model 225
12.2.2 VCS API 228
12.2.3 Screenshots of GUI 230
12.3 Varnish High Availability (VHA) 231
12.4 SSL/TLS frontend support with hitch 233
13 Appendix A: Resources 235
14 Appendix B: Varnish Programs 236
14.1 varnishtop 237
14.2 varnishncsa 238
14.3 varnishhist 239
14.4 Exercise: Try varnishstat, varnishlog and varnishhist 240
14.5 varnishtest 241
14.5.1 The Varnish Test Case (VTC) Language 242
14.5.2 Synchronization in Varnish Tests 244
14.5.3 Running Your Varnish Test Cases 246
14.5.4 Exercise: Test Apache as Backend with varnishtest 247
14.5.5 Setting Parameters in varnishtest 248
14.5.6 Fetch Data with varnishtest 250
14.5.7 Understanding Expires in varnishtest 251
14.5.8 Example of Transactions in varnishtest 252
14.5.9 logexpect 253
14.5.10 Exercise: Assert Counters in varnishtest 255
14.5.11 Understanding Vary in varnishtest 256
14.5.12 Understanding Last-Modified and If-Modified-Since in varnishtest 258
14.5.13 Understanding Cache-Control in varnishtest 260
14.5.14 VCL in varnishtest 262
14.5.15 PURGE in varnishtest 263
14.5.16 Cache Invalidation in varnishtest 265
14.5.17 Understanding Grace using varnishtest 266
14.5.18 Exercise: Handle Cookies with Vary and hash_data() in varnishtest 268
14.5.19 Understanding ESI in varnishtest 269
15 Appendix C: Extra Material 271
15.1 ajax.html 272
15.2 article.php 273
15.3 cookies.php 274
15.4 esi-top.php 275
15.5 esi-user.php 276
15.6 httpheadersexample.php 278
15.7 purgearticle.php 281
15.8 test.php 282
15.9 set-cookie.php 283
15.10 VCL Migrator from Varnish 3 to Varnish 4 284
16 Appendix D: VMOD Development 285
16.1 VMOD Basics 286
16.2 varnishtest script program 287
16.2.1 VTC 288
16.2.2 Run Your Varnish Tests 290
16.3 Hello, World! VMOD 291
16.3.1 Declaring and Documenting Functions 292
16.3.2 Implementing Functions 294
16.3.3 The Workspace Memory Model 295
16.3.4 Headers 296
16.3.5 Exercise: Build and Test libvmod_example 297
16.4 Cowsay: Hello, World! 298
16.4.1 Cowsay Varnish Tests 299
16.4.1.1 Exercise: Add Assertions To Your Varnish Tests 301
16.4.2 vmod_cowsay.vcc 302
16.4.3 vmod_cowsay.c 303
16.5 Resources 304
17 Appendix E: Varnish Three Letter Acronyms 305
18 Appendix F: Apache as Backend 307
19 Appendix G: Solutions 308
19.1 Solution: Install Varnish 309
19.2 Solution: Test Apache as Backend with varnishtest 312
19.3 Solution: Assert Counters in varnishtest 313
19.4 Solution: Tune first_byte_timeout and test it against mock-up server 314
19.5 Solution: Configure vcl_recv to avoid caching all requests to the URL /admin 316
19.6 Solution: Configure Threading with varnishadm and varnishstat 317
19.7 Solution: Configure Threading with varnishtest 318
19.8 Solution: Rewrite URL and Host Header Fields 320
19.9 Solution: Avoid caching a page 322
19.10 Solution: Either use s-maxage or set TTL by file type 323
19.11 Solution: Modify the HTTP response header fields 324
19.12 Solution: Change the error message 325
19.13 Solution: PURGE an article from the backend 327
19.14 Solution: Write a VCL program using purge and ban 330
19.15 Solution: Handle Cookies with Vary in varnishtest 331
19.16 Solution: Handle Cookies with hash_data() in varnishtest 333
19.17 Solution: Write a VCL that masquerades XHR calls 335
Abstract
The Varnish Book is the training material for Varnish Plus courses. This book teaches the
concepts needed to understand the theory behind Varnish Cache 4. Covered are the Varnish finite
state machine, design principles, HTTP, cache invalidation and more. With these foundations,
the book builds practical knowledge on Varnish Configuration Language (VCL), Varnish Test
Code (VTC) and Varnish utility programs such as varnishlog, varnishstat and
varnishtest. Examples and exercises develop the needed skills to administrate and extend
the functionality of Varnish. Also included are appendices that explain how to develop
Varnish Modules (VMODs) and how to use selected modules of Varnish Plus.
Preface
• Course for Varnish Plus
• Learn specific features depending on the course and your needs
• Necessary Background
• How to Use the Book
• Acknowledgments
After finishing this course, you will be able to install and configure the Varnish server, and
write effective VCL code. The Varnish Book is designed for attendees of Varnish Plus courses.
Most of the material presented in this book applies both to the open source Varnish Cache
and to the commercial edition Varnish Cache Plus. Therefore, you can also refer to the Varnish
Cache documentation at https://www.varnish-cache.org/docs/4.0/.
Varnish Plus is a commercial suite by Varnish Software that offers products for scalability,
customization, monitoring, and expert support services. The engine of Varnish Plus is Varnish
Cache Plus, which is the enhanced commercial edition of Varnish Cache. Varnish Cache Plus
should not be confused with Varnish Plus, a product offering by Varnish Software. Varnish
Cache Plus is one of the software components available for Varnish Plus customers.
For simplicity, the book refers to Varnish Cache or Varnish Cache Plus as Varnish when there
is no difference between them. There is more information about differences between Varnish
Cache and Varnish Cache Plus in the Varnish Cache and Varnish Plus chapter.
The goal of this book is to make you confident when using Varnish. Varnish instructors focus
on your area, needs or interests, and Varnish courses are usually flexible enough to make
room for them.
The instructor will cover selected material for the course you take. The System
Administration (Admin) course provides attendees with the necessary knowledge to
troubleshoot and tune common parameters of a Varnish server. The Web Developer
(Webdev) course teaches how to adapt web applications so that they work with Varnish,
which guarantees a fast experience for visitors of any website. Besides that, other courses
may also be taught with this book.
Necessary Background
The Admin course requires that you:
• have expertise in a shell on a Linux/UNIX machine, including editing text files and
starting daemons,
• understand HTTP cache headers,
• understand regular-expressions, and
• be able to install the software listed below.
You do not need a background in the theory or application behind Varnish to complete this
course. However, it is assumed that you have experience and expertise in basic UNIX
commands, and that you can install the following software:
More specific required skills depend on the course you take. The book starts with the
installation of Varnish and navigation of some of the common configuration files. This part is
perhaps the most UNIX-centric part of the course.
How to Use the Book
• Most of the material in this book applies to both Varnish Cache and Varnish Cache Plus.
Parts that apply only to Varnish Cache Plus are clearly stated.
• Varnish caching mechanisms are different from those in other caching technologies. Keep an
open mind and try to think differently when using Varnish.
• The instructor guides you through the book.
• Use the manual pages and help options.
• See Appendix E: Varnish Three Letter Acronyms for a list of acronyms.
The Varnish Book is designed to be used as training material in Varnish Plus courses taught
by a certified instructor. During the course, the instructor guides you and selects the
relevant sections to learn. However, you can also use this book as self-instructional material.
There are almost always many ways to do an exercise. The solutions provided in Appendix G:
Solutions are not necessarily better than yours.
Varnish installs several reference manuals that are accessible through the manual page
command man. You can issue the command man -k varnish to list the manual pages that
mention Varnish in their short description. In addition, the vsl man page explains the
Varnish Shared memory Logging (VSL). This man page does not show up when issuing
man -k varnish, because it does not contain the word varnish in its short description.
The command man varnishd, for example, retrieves the manual page of the Varnish HTTP
accelerator daemon. Also, some commands have a help option to print the usage of the
command. For example, varnishlog -h prints the usage and options of the command with
a short description of them.
In addition, you should refer to the documentation of Varnish Cache and Varnish Cache Plus.
This documentation provides extended details on the topics covered in this book and
more. To access this documentation, please visit
https://www.varnish-software.com/resources.
The Varnish installation described in this book uses Ubuntu Linux 14.04 LTS (trusty);
therefore, most of the commands given in this book are for this Linux distribution. We
point out some differences in how to configure Varnish for other Linux distributions, but you
should consult your Linux distribution's documentation for more details.
The book is written with different formatting conventions. Varnish Configuration Language
(VCL) code uses the mono-spaced font type inside boxes:
vcl 4.0;
backend default {
.host = "127.0.0.1";
.port = "8080";
}
sub vcl_recv {
# Do request header transformations here.
if (req.url ~ "^/admin") {
return(pass);
}
}
The first occurrence of a new term is usually its definition, and appears in italics. File names
are indicated like this: /path/to/yourfile. Important notes, tips and warnings are also
inside boxes, but they use the normal body text font type.
Resources and Errata
• https://varnish-cache.org
• https://varnish-software.com/academy
• #varnish-hacking and #varnish on irc.linpro.net.
• https://github.com/varnish/Varnish-Book/
• https://www.varnish-cache.org/docs/trunk/users-guide/troubleshooting.html
• https://www.varnish-cache.org/trac/wiki/VCLExamples
This book is meant to be understandable to everyone who takes a Varnish Plus course and
has the required skills. If you find something unclear, do not be shy: ask
your instructor for help. You can also contact the Varnish open source community at
https://varnish-cache.org. To book training, please look at
https://varnish-software.com/academy.
Additional examples from different Varnish versions are available at
https://www.varnish-cache.org/trac/wiki/VCLExamples. These examples are maintained by
the community.
For those interested in development, the developers arrange weekly bug washes where
recent tickets and development are discussed. This usually takes place on Mondays around
13:00 CET on the IRC channel #varnish-hacking on irc.linpro.net.
Errata, updates and general improvements of this book are available at its repository
https://github.com/varnish/Varnish-Book.
Acknowledgments
In addition to the authors, the following deserve special thanks (in no particular order):
• Rubén Romero
• Dag Haavi Finstad
• Martin Blix Grydeland
• Reza Naghibi
• Federico G. Schwindt
• Dridi Boukelmoune
• Lasse Karstensen
• Per Buer
• Sevan Janiyan
• Kacper Wysocki
• Magnus Hagander
• Arianna Aondio
• Poul-Henning Kamp
• Guillaume Quintard
• Everyone who has participated on the training courses
Page 20 Chapter 1 Introduction
1 Introduction
Table of contents:
• What is Varnish?
• Benefits of Varnish
• Open source / Free software
• Varnish Software: The company
• What is Varnish Plus?
• Varnish: more than a cache server
• History of Varnish
• Varnish Governance Board (VGB)
vcl 4.0;
backend default {
.host = "127.0.0.1";
.port = "8080";
}
sub vcl_recv {
# Do request header transformations here.
if (req.url ~ "^/admin") {
return(pass);
}
}
Varnish is flexible because you can configure it and write your own caching policies in its
Varnish Configuration Language (VCL). VCL is a domain-specific language based on C. VCL is
translated to C code and compiled, so Varnish executes lightning fast. Varnish
has shown itself to work well both on large (and expensive) servers and on tiny appliances.
Table 1: Topics Covered in This Book and Their Availability in Varnish Cache and Varnish Plus
Varnish Cache is an open source project, and free software. The development process is
public and everyone can submit patches, or just take a peek at the code if there is some
uncertainty about how Varnish Cache works. There is a community of volunteers who help
each other and newcomers. The BSD-like license used by Varnish Cache does not place
significant restrictions on re-use of the code, which makes it possible to integrate Varnish
Cache in virtually any solution.
Varnish Cache is developed and tested on GNU/Linux and FreeBSD. The code base is kept as
self-contained as possible to avoid introducing outside bugs and unneeded complexity.
Therefore, Varnish uses very few external libraries.
Varnish Software is the company behind Varnish Cache. Varnish Software and the Varnish
community maintain a package repository of Varnish Cache for several common GNU/Linux
distributions.
Varnish Software also provides a commercial suite called Varnish Plus with software products
for scalability, customization, monitoring and expert support services. The engine of the
Varnish Plus commercial suite is the enhanced commercial edition of Varnish Cache. This
edition is proprietary and it is called Varnish Cache Plus.
Table 1 shows the components covered in this book and their availability for Varnish Cache
users and Varnish Plus customers. The covered components of Varnish Plus are described in
the Varnish Plus Software Components chapter. For more information about the complete
Varnish Plus offering, please visit https://www.varnish-software.com/what-is-varnish-plus. A list
of supported platforms can be found at
https://www.varnish-software.com/customers/#platforms.
Note
Varnish Cache Plus should not be confused with Varnish Plus, a product offering by
Varnish Software. Varnish Cache Plus is one of the software components available for
Varnish Plus customers.
VG, a large Norwegian newspaper, initiated the Varnish project in cooperation with Linpro.
The lead developer of the Varnish project, Poul-Henning Kamp, is an experienced FreeBSD
kernel hacker. Poul-Henning Kamp continues to bring his wisdom to Varnish in most areas
where it counts.
From 2006 throughout 2008, most of the development was sponsored by VG, API, Escenic
and Aftenposten, with project management, infrastructure and extra man-power provided by
Redpill Linpro. At the time, Redpill Linpro had roughly 140 employees mostly centered
around consulting services.
Today Varnish Software is able to fund the core development with income from service
agreements, in addition to offering development of specific features on a case-by-case basis.
The interest in Varnish continues to increase. An informal study based on the list of most
popular web sites in Norway indicates that about 75% or more of the web traffic that
originates in Norway is served through Varnish.
Varnish development is governed by the Varnish Governance Board (VGB), which thus far
has not needed to intervene. The VGB consists of an architect, a community representative
and a representative from Varnish Software.
As of November 2015, the VGB positions are filled by Poul-Henning Kamp (Architect), Rogier
Mulhuijzen (Community) and Lasse Karstensen (Varnish Software). On a day-to-day basis,
there is little need to interfere with the general flow of development.
The above list tries to summarize the most important changes from Varnish Cache 3 to
Varnish Cache 4. For more information, please visit:
https://varnish-cache.org/docs/4.1/whats-new/index.html
If you want to migrate your VCL code from Varnish 3 to Varnish 4, you may be interested in
looking at the varnish3to4 script. See the VCL Migrator from Varnish 3 to Varnish 4 section
for more information.
2 Design Principles
Varnish is designed to:
The focus of Varnish has always been performance and flexibility. Varnish is designed for
the hardware that you buy today, not the hardware you bought 15 years ago. This is a trade-off
made to gain a simpler design and to focus resources on modern hardware. Varnish is designed to
run on 64-bit architectures and scales almost proportionally to the number of CPU cores you
have available, though CPU power is rarely a problem.
32-bit systems, in comparison to 64-bit systems, allow you to allocate less virtual
memory space and fewer threads. The theoretical maximum address space depends on the
operating system (OS) kernel, but 32-bit systems are usually bounded to 4 GB. In practice, you may
get only about 3 GB, because the OS reserves some space for the kernel.
Varnish uses a workspace-oriented memory-model instead of allocating the exact amount of
space it needs at run-time. Varnish does not manage its allocated memory, but it delegates
this task to the OS because the kernel can normally do this task better than a user-space
program.
Event filters and notifications facilities such as epoll and kqueue are advanced features of
the OS that are designed for high-performance services like Varnish. By using these, Varnish
can move a lot of the complexity into the OS kernel which is also better positioned to decide
which threads are ready to execute and when.
Varnish uses the Varnish Configuration Language (VCL) that allows you to specify exactly
how to use and combine the features of Varnish. VCL is translated to C programming
language code. This code is compiled with a standard C compiler and then dynamically
linked directly into Varnish at run-time.
When you need functionality that VCL does not provide, e.g., looking up an IP address in a
database, you can write raw C code in your VCL. This is called in-line C in VCL. However, in-line C
is strongly discouraged, because in-line C is more difficult to debug, maintain and develop with
other developers. Instead of adding in-line C, you should modularize your C code into Varnish
modules, also known as VMODs.
VMODs are typically written in C. In practice, a VMOD is a
shared library with functions that can be called from VCL code.
The standard (std) VMOD, included in Varnish Cache, extends the functionality of VCL. The std
VMOD includes non-standard header manipulation and complex header normalization, among
other functionalities; other VMODs provide, for example, access to memcached. Appendix D:
VMOD Development explains in more detail how VMODs work and how to develop your own.
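As a minimal sketch of what the std VMOD offers (the header choice and log message here are illustrative assumptions, not examples taken from the book):

```vcl
vcl 4.0;
import std;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Normalize the Host header to lower case, so that differently
    # cased hosts hash to the same cached object.
    set req.http.Host = std.tolower(req.http.Host);

    # Write a custom entry to the Varnish Shared memory Log (VSL).
    std.log("normalized host: " + req.http.Host);
}
```

Both std.tolower() and std.log() are functions of the std VMOD shipped with Varnish Cache.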
The Varnish Shared memory Log (VSL) allows Varnish to log large amounts of information at
almost no cost by having other applications parse the data and extract the useful bits. This
design and other mechanisms decrease lock-contention in the heavily threaded environment
of Varnish.
To summarize: Varnish is designed to run on modern hardware under real work-loads and to
solve real problems. Varnish does not cater to the "I want to make Varnish run on my 486
just because"-crowd. If it does work on your 486, then that's fine, but that's not where you
will see our focus. Nor will you see us sacrifice performance or simplicity for the sake of
niche use-cases that can easily be solved by other means -- like using a 64-bit OS.
Figure 2 shows the lifetime of cached objects. A cached object has an origin timestamp
t_origin and three duration attributes: 1) TTL, 2) grace, and 3) keep. t_origin is the time
when an object was created in the backend. An object lives in the cache until
TTL + grace + keep elapses. After that time, the object is removed by the Varnish daemon.
On a timeline, objects within the time-to-live TTL are considered fresh objects. Stale objects
are those between TTL and TTL + grace. Objects beyond TTL + grace but within keep are
used when applying conditions with the HTTP header field If-Modified-Since.
The VCL – vcl_backend_fetch and VCL – vcl_backend_response sections explain how Varnish
handles backend responses and how these duration attributes affect subsequent actions.
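These three duration attributes can be set per object in vcl_backend_response. A minimal sketch, with durations chosen purely for illustration:

```vcl
sub vcl_backend_response {
    set beresp.ttl = 1h;     # fresh for one hour
    set beresp.grace = 30m;  # may be served stale for 30 more minutes
    set beresp.keep = 1d;    # kept one more day for If-Modified-Since revalidation
}
```

With these values, the object is removed 1h + 30m + 1d after it enters the cache.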
3 Getting Started
In this chapter, you will:
Most of the commands you will type in this course require root privileges. You can get
temporary root privileges by typing sudo <command>, or permanent root privileges by typing
sudo -i.
In Varnish terminology, a backend is the origin server. In other words, it is whatever server
Varnish talks to in order to fetch content. This can be any sort of service as long as it
understands HTTP. Most of the time, Varnish talks to a web server or an application frontend
server. In this book, we use backend, origin server, web server or application frontend server
depending on the context.
• varnishd
• varnishtest
• varnishadm
• varnishlog
• varnishstat
• and more
The Varnish distribution includes several utility programs that you will use in this course. You
will learn how to use these programs as you progress, but it is useful to have a brief
introduction to them before we start.
The central block of Varnish is the Varnish daemon varnishd. This daemon accepts HTTP
requests from clients, sends requests to a backend, caches the returned objects and replies
to the client request. varnishd is further explained in the Varnish Architecture section.
varnishtest is a script driven program used to test your Varnish installation. varnishtest
is very powerful because it allows you to create client mock-ups, fetch content from mock-up
or real backends, interact with your actual Varnish configuration, and assert the expected
behavior. varnishtest is also very useful to learn more about the behavior of Varnish.
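To give a flavor of varnishtest, the following Varnish Test Case (VTC) sketch is a hypothetical example (not from the book): it starts a mock backend, a real varnishd and a mock client, and asserts the expected response.

```vtc
varnishtest "Client, Varnish and mock backend round trip"

# Mock backend that answers a single request.
server s1 {
    rxreq
    txresp -body "Hello, Varnish!"
} -start

# Start varnishd with the built-in VCL plus the s1 backend definition.
varnish v1 -vcl+backend {} -start

# Mock client: fetch once and assert on the response.
client c1 {
    txreq -url "/"
    rxresp
    expect resp.status == 200
    expect resp.bodylen == 15
} -run
```

The varnishtest section and Appendix B explain the VTC language in detail.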
varnishadm controls a running Varnish instance. The varnishadm utility establishes a
command line interface (CLI) connection to varnishd. This utility is the only one that may
affect a running instance of Varnish. You can use varnishadm to:
The front-end varnishadm of the Varnish Command Line Interface (CLI) section explains
this utility in more detail.
The Varnish log provides large amounts of information, thus it is usually necessary to filter it.
For example, "show me only what matches X". varnishlog does precisely that. You will
learn more about varnishlog in the Examining Varnish Server's Output chapter.
varnishstat is used to access global counters. It provides overall statistics, e.g., the number
of total requests, number of objects, and more. varnishstat is particularly useful when
used together with varnishlog to analyze your Varnish installation. The varnishstat
section explains this utility in detail.
In addition, there are other utility programs such as varnishncsa, varnishtop and
varnishhist. Appendix B: Varnish Programs explains them.
• When you are done, verify your Varnish version by running varnishd -V
For official training courses, a varnish-plus package should already be available for
installation. When in doubt, ask your instructor to confirm which package should be installed.
You may skip this exercise if you already have a well-configured environment to test Varnish. In
case you get stuck, you may look at the proposed solution.
[Service]
ExecStart=
ExecStart=/usr/sbin/varnishd -a :80 -T localhost:6082 -f \
/etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
This file overrides the ExecStart option of the default configuration shipped with Varnish
Cache. Run systemctl daemon-reload to make sure systemd picks up the new
configuration before restarting Varnish.
[3] Create a drop-in systemd service file in
/etc/systemd/system/varnishlog.service.d/customexec.conf to customize your
varnishlog configuration.
• /etc/varnish/varnish.params
• Variable substitution in /usr/lib/systemd/system/varnish.service:
-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT}
-T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT}
See Table 3 and locate the Varnish configuration file for your installation. Open and edit that
file to listen for client requests on port 80 and have the management interface on port 1234.
In Ubuntu and Debian, this is configured with the -a and -T options of the DAEMON_OPTS
variable. In CentOS, RHEL, and Fedora, use VARNISH_LISTEN_PORT and
VARNISH_ADMIN_LISTEN_PORT respectively.
In order for changes in the configuration file to take effect, varnishd must be restarted. The
safest way to restart Varnish is by using service varnish restart.
The default VCL file location is /etc/varnish/default.vcl. You can change this location by
editing the configuration file. The VCL file contains the backend definitions.
In this book, we use Apache as backend. Before continuing, make sure you have Apache
installed and configured to listen on port 8080. See Appendix F: Apache as Backend if you do
not know how to do it.
Edit /etc/varnish/default.vcl to use Apache as backend:
backend default {
.host = "127.0.0.1";
.port = "8080";
}
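Before pointing Varnish at it, you can quickly confirm that Apache answers on port 8080. A minimal check, assuming curl is installed and Apache runs locally:

```
curl -I http://127.0.0.1:8080/
```

A 200 OK response with an Apache Server header indicates the backend is reachable.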
Varnish Cache Plus supports SSL/TLS encryption. To encrypt connections between Varnish
and the backend, you specify it as follows:
backend default {
.host = "host.name";
.port = "https"; # Defaults to "https" when SSL is enabled
.ssl = 1; # Turns on SSL/TLS support
.ssl_nosni = 0; # Set to 1 to disable the SNI extension
.ssl_noverify = 1; # Don't verify the peer certificate
}
For Varnish to accept incoming encrypted connections, you need a terminator for encrypted
connections such as hitch (https://github.com/varnish/hitch). Varnish Plus 4.1 has integrated
this functionality, and you can easily configure it as detailed in SSL/TLS frontend support with
hitch.
varnishadm vcl.list
This command does not restart varnishd; it only reloads the compiled VCL code.
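The manual reload sequence can be sketched as follows; the configuration name is arbitrary and this assumes a running varnishd:

```
varnishadm vcl.load mynewconf /etc/varnish/default.vcl   # compile and load the VCL
varnishadm vcl.use mynewconf                             # switch traffic to it
varnishadm vcl.list                                      # verify which VCL is active
```

vcl.load fails with a compile error if the VCL is invalid, leaving the running configuration untouched.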
The result of your configuration is summarized in Table 4.
Warning
If you have Security-Enhanced Linux (SELinux), be aware that SELinux defines ports
6081 and 6082 for varnishd. If you need to use another port number, you need either
to disable SELinux or set the SELinux boolean varnishd_connect_any to 1. You can
do that by executing the command sudo setsebool varnishd_connect_any 1.
Tip
Issue the command man vcl to see all available options to define a backend.
Tip
You can also configure Varnish via the Varnish Administration Console (VAC).
Figure 3: GUI to configure Varnish via the Varnish Administration Console (VAC).
# http -p Hh localhost
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate, compress
Host: localhost
User-Agent: HTTPie/0.8.0
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 3256
Content-Type: text/html
Date: Wed, 18 Mar 2015 13:55:28 GMT
ETag: "2cf6-5118f93ad6885-gzip"
Last-Modified: Wed, 18 Mar 2015 12:53:59 GMT
Server: Apache/2.4.7 (Ubuntu)
Vary: Accept-Encoding
Via: 1.1 varnish-plus-v4
X-Varnish: 32770
You can test your Varnish installation by issuing the command http -p Hh localhost. If
you see the HTTP response header field Via containing varnish, then your installation is
correct.
The X-Varnish HTTP header field contains the Varnish Transaction ID (VXID) of the client
request and, if applicable, the VXID of the backend transaction that stored the delivered
object in cache. X-Varnish is useful for finding the correct log entries in the Varnish log. For a
cache hit, X-Varnish contains both the ID of the current request and the ID of the request
that populated the cache. You will learn more about VXIDs in the Transactions section.
You can also define and test connectivity against any backend in varnishtest. Learn how to
do it by doing the Exercise: Test Apache as Backend with varnishtest.
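As a rough illustration (not the exercise's solution), a minimal varnishtest case that defines a mock backend and checks connectivity could look like the sketch below; the test name and response body are arbitrary:

```
varnishtest "Minimal backend connectivity check"

server s1 {
    rxreq
    txresp -body "hello"
} -start

varnish v1 -vcl+backend { } -start

client c1 {
    txreq -url "/"
    rxresp
    expect resp.status == 200
} -run
```

The -vcl+backend option automatically inserts the declaration for the mock backend s1 into the VCL.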
Varnish has a command line interface (CLI) which can control and change most of the
operational parameters and the configuration of the Varnish daemon, without interrupting
the running service. varnishadm is the front-end to this interface. In practice, it is a
utility program with a set of functions that interact with the Varnish daemon varnishd. If
there are many Varnish instances running on one machine, specify the instance with the -n
option of varnishadm. Keep the following in mind when issuing commands to the Varnish
daemon:
1. Changes take effect on the running Varnish daemon instance without the need to restart it.
2. Changes are not persistent across restarts of Varnish. If you change a parameter and
you want the change to persist after you restart Varnish, you need to store your
changes in the configuration file of the boot script. The location of the configuration file
is in Table 3.
varnishadm uses a non-encrypted key stored in a secret file to authenticate and connect to a
Varnish daemon. You can control access by adjusting the read permission on the secret file.
varnishadm looks for the secret file in /etc/varnish/secret by default, but you can use
the -S option to specify another location. The content of the file is a shared secret: a string
generated during Varnish installation.
varnishadm authenticates with a challenge-response mechanism. The shared secret itself is
therefore never transmitted; only a challenge and the response to that challenge are. This
authentication mechanism offers reasonably good access control, but it does not protect
the data transmitted over the connection.
To protect against eavesdropping, as in a man-in-the-middle attack, we recommend that
you configure the management interface listening address of varnishd to listen only on
localhost (127.0.0.1). You configure this address with the -T option of the varnishd
command.
For convenience, varnishadm has an embedded command line tool. You can access it by
simply issuing varnishadm in the terminal.
Tip
Varnish provides many on-line reference manuals. To learn more about varnishadm,
issue man varnishadm. To check the Varnish CLI manual page, issue
man varnish-cli.
Tip
You can also access the varnishadm via the Varnish Administration Console (VAC). To
do that, you just have to navigate to the CONFIGURE tab and click on the Varnish
server you want to administrate. Then, varnishadm is ready to use in a terminal
emulator right in your web browser.
Figure 4: Access to varnishadm by clicking on the Varnish server that you want to
administrate.
service varnish restart
    Restarts Varnish using the operating system mechanisms. Caches are flushed.

service varnish reload
    Only reloads the VCL. Caches are not affected.

varnishadm vcl.load <configname> <filename> followed by varnishadm vcl.use <configname>
    Can be used to manually reload VCL. The service varnish reload command does this
    for you automatically.

varnishadm param.set <param> <value>
    Can be used to set parameters without restarting Varnish.
There are other ways to reload VCL and make parameter-changes take effect, mostly using
the varnishadm tool. However, using the service varnish reload and
service varnish restart commands is a good habit.
Note
If you want to know how the service varnish-commands work, look at the script that
runs behind the scenes. The script is in /etc/init.d/varnish.
Warning
The varnish script-configuration (located under /etc/default/ or /etc/sysconfig/) is directly
sourced as a shell script. Pay close attention to any backslashes (\) and quotation
marks that might move around as you edit the DAEMON_OPTS environmental variable.
All the options that you can pass to the varnishd binary are documented in the
varnishd(1) manual page (man varnishd). You may want to take a moment to skim over
the options mentioned above.
For Varnish to start, you must specify a backend. You can specify a backend in two ways:
1) declare it in a VCL file, or 2) use the -b option to declare a backend when starting varnishd.
Though they are not strictly required, you almost always want to specify a -s to select a
storage backend, -a to make sure Varnish listens for clients on the port you expect and -T
to enable a management interface, often referred to as a telnet interface.
For both -T and -a, you do not need to specify an IP, but can use :80 to tell Varnish to
listen to port 80 on all IPs available. Make sure you do not forget the colon, as -a 80 tells
Varnish to listen to the IP with the decimal-representation "80", which is almost certainly not
what you want. This is a result of the underlying function that accepts this kind of syntax.
You can specify -p for parameters multiple times. The workflow for tuning Varnish
parameters usually is that you first try the parameter on a running Varnish through the
management interface to find the value you want. Then, you store the parameter and value
in a configuration file. This file is read every time you start Varnish.
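The workflow above can be sketched as follows; the parameter name and value are illustrative, and this assumes a running varnishd:

```
# Try a value on the running daemon; it takes effect immediately:
varnishadm param.set default_ttl 600

# Once satisfied, persist it in the startup options for your platform,
# e.g. in the systemd drop-in or configuration file:
#   ExecStart=/usr/sbin/varnishd ... -p default_ttl=600
```

varnishadm param.show default_ttl lets you confirm the current value and its default.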
The -S option specifies a file which contains a secret to be used for authentication. This can
be used to authenticate with varnishadm -S as long as varnishadm can read the same
secret file -- or rather the same content: The content of the file can be copied to another
machine to allow varnishadm to access the management interface remotely.
Note
Varnish requires at least one backend, which is normally specified in the VCL file. The
VCL file is passed to varnishd with the -f <filename.vcl> option. However, it is
possible to start Varnish without a VCL file. In this case, the backend is passed directly
to varnishd with the -b <hostname:port> option. -f and -b are mutually exclusive.
Tip
Type man varnishd to see all options of the Varnish daemon.
vcl 4.0;
backend default {
.host = "localhost";
.port = "8080";
}
The above example defines a backend named default, where the name default is not
special. Varnish uses the first backend you specify as default. You can specify many
backends at the same time, but for now, we will only specify one to get started.
Tip
You can also add and edit your VCL code via the Varnish Administration Console (VAC).
This interface also allows you to administrate your VCL files.
Figure 6: GUI of Varnish Administration Console (VAC) with command line interface to edit your
VCL code.
-p hH tells HTTPie to print only request and response headers, but not the content. A
typical HTTP response is 200 OK or 404 Not Found. Feel free to try removing some of
the options and observe the effect. For more information about the HTTPie command, type
man http.
Testing Varnish with a web browser can be confusing, because web browsers have their own
cache. Therefore, it is useful to double-check web browser requests with HTTPie or
varnishtest as explained in Fetch Data with varnishtest. For more information about the
Age response header field, refer to the Age subsection.
• log records,
• statistics out from global counters and the Varnish log,
• the log layout,
• transactions,
• the query language, and
• notable counters.
Varnish logs information about requests, cache activity, and responses to the Varnish Shared
memory Log (VSL). Logs are available through the Varnish tools with a short, usually
unnoticeable, delay. The VSL is a circular buffer: when it fills up, it is overwritten from the
beginning.
The memory log overwriting has two effects. On the one hand, there is no historic data, but
on the other hand, there is an abundance of information accessible at a very high speed.
Still, you can instruct Varnish to store logs in files.
Varnish generates very large amounts of data, therefore it does not write logs to disk by
default, but only to memory. However, if you need to enable logging to disk, as when
debugging a crashing Varnish installation, you set VARNISHLOG_ENABLED=1 or
VARNISHNCSA_ENABLED=1 in /etc/default/varnishlog or /etc/default/varnishncsa
respectively. Table 3 shows the location of the configuration file based on different
platforms.
Varnish provides specific tools to parse the content of logs: varnishlog, varnishncsa,
varnishtop, and varnishstat, among others. varnishlog and varnishstat are the two
most commonly used tools.
Tip
All utility programs have installed reference manuals. Use the man command to
retrieve their manual pages.
Statistical tools:
If you have multiple Varnish instances running on the same machine, you need to specify
-n <name> both when starting Varnish and when using the tools. This option specifies the
instance of varnishd, or the location of the shared memory log. All tools (including
varnishadm) accept the -n option.
In this course, we focus on the two most important tools: varnishlog and varnishstat.
Unlike all other tools, varnishstat does not read entries from the Varnish log, but from
global counters. You can find more details about the other Varnish tools varnishncsa,
varnishtop and varnishhist in Appendix B: Varnish Programs.
Varnish logs transactions chronologically, as Figure 7 shows. varnishlog is one of the
most used tools and offers mechanisms to reorder transactions grouped by TCP session,
frontend worker, or backend worker. We talk more about transactions in the next subsection.
The various arguments of varnishlog are mostly designed to help you find exactly what
you want and filter out the noise. On production traffic, the amount of log data that Varnish
produces is staggering, and filtering is a requirement for using varnishlog effectively. The
next section explains transactions and how to reorder them.
varnishtest starts a real varnishd process for each test, and that process also logs to the
VSL. When your Varnish test fails, or you run varnishtest in verbose mode, you can see
the vsl entry for each Varnish log record. You can also use logexpect to assert the expected
behavior in your tests.
4.3 Transactions
$ varnishlog -g <session|request|vxid|raw> -d
A transaction is one work item in Varnish; it is a set of log lines that belong together, e.g.,
a client request or a backend request. Varnish Transaction IDs (VXIDs) are applied to many
different kinds of work items. Each transaction is assigned a unique VXID. The VXID 0 is
reserved for everything Varnish does that is not part of a specific transaction. You can
follow the VXID when you analyze the log through varnishlog or varnishtest.
Transaction types are:
Varnish logs are grouped by VXID by default. For example, when viewing a log for a simple
cache miss, you see logs in the order they end. That is: 1) backend request (BeReq), 2) client
request (Request) and 3) session (Session).
Each transaction has a reason, for example:
• Client request
• ESI request
• restart
• fetch
To learn more about this topic in varnishtest, refer to the section: Example of Transactions
in varnishtest.
Figure 10 shows a client request in a cache miss scenario. In the figure, varnishlog returns
records grouped by request. For simplicity, we use the -i option to include only the Begin
and Link tags.
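The command used for the figure can be sketched as follows, assuming there is recorded log data to read:

```
varnishlog -d -g request -i Begin,Link
```

-g request groups the records per client request, and -i restricts the output to the listed tags.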
For more information about the format and content of all Varnish shared memory logging
(VSL) tags, see the VSL man page by typing man vsl. The workflow of Varnish is detailed in
the VCL Basics chapter.
To reproduce the example, issue http -p hH http://localhost/, and then the
varnishlog command as above. The -d option processes all recorded entries in the Varnish
log. To learn more about the available varnishlog options, enter varnishlog -h or see the
varnishlog man page.
varnishlog accepts any combination of options that is syntactically correct. The output,
however, might differ from your first interpretation. Therefore, you should make sure that
your results make sense.
Options -b and -c display only transactions coming from the backend and client
communication respectively. You can verify the meaning of your results by double checking
the filters, and separating your results with the -b and -c options.
Note
The logexpect command from varnishtest accepts the same arguments as
varnishlog.
The <record selection criteria> determines what kind of records from the transaction
group the expression applies to. The syntax is:
{level}taglist:record-prefix[field]
For example:
Taglists are not case-sensitive, but we recommend that you follow the same capitalization as
declared in man vsl.
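A couple of illustrative query expressions (the URL and header values are made up; see man vsl-query for the exact grammar):

```
# Show transactions whose request URL starts with /admin:
varnishlog -q 'ReqURL ~ "^/admin"'

# Show transactions where the Host request header equals example.com:
varnishlog -q 'ReqHeader:Host eq "example.com"'
```

Here ~ is a regular-expression match and eq is an exact string comparison.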
The grouping and the query log processing all happen in the varnishlog API. This means
that other programs using this API automatically get grouping and the query language, just
as logexpect does. See logexpect to learn more about it.
Tip
man vsl-query shows you more details about query expressions. man vsl lists all
taglists and their syntax.
There are multiple ways to provoke a backend failure. For example, misconfigure your
backend in Varnish or stop your backend.
You can filter and print specific messages from varnishlog in many ways. The purpose
of this exercise is to use the query option -q, but you can also use the include-tag options -i
or -I, and the grep command.
Note
You can also use varnishtest to provoke a Service Unavailable response and assert it
by reading VSL with logexpect.
4.6 varnishstat
Uptime mgt: 1+23:38:08 Hitrate n: 10 100 438
Uptime child: 1+23:38:08 avg(n): 0.9967 0.5686 0.3870
MAIN.cache_hit INFO
Cache hits:
Count of cache hits. A cache hit indicates that an object has been delivered to a client without fetching it from a
backend server.
Name
    The name of the counter.
Current
    The current value of the counter.
Change
    The average per-second change over the last update interval.
Average
    The average value of this counter over the runtime of the Varnish daemon, or a
    period if the counter can't be averaged.
Avg_10
    The moving average over the last 10 update intervals.
Avg_100
    The moving average over the last 100 update intervals.
Avg_1000
    The moving average over the last 1000 update intervals.
varnishstat looks only at counters, which give a good representation of the general health
of Varnish. Counters, unlike the rest of the log, are not directly mapped to a single request,
but represent how many times a specific action has occurred since Varnish started. These
counters are easily found in the VSL, and are typically polled at a reasonable interval to give
the impression of real-time updates.
varnishstat can be used to determine the request rate, memory usage, thread usage,
number of failed backend connections, and more. varnishstat gives you information on
just about anything that is not related to a specific request.
There are over a hundred different counters available. To increase the usefulness of
varnishstat, only counters with a value different from 0 are shown by default.
varnishstat can be used interactively, or it can display the current values of all the
counters with the -1 option. Both methods allow you to specify specific counters using
-f field1 -f field2 .. to limit the list.
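For instance, using counter names from the tables in this chapter (this assumes a running varnishd):

```
# Dump all counters once, non-interactively:
varnishstat -1

# Limit the output to specific counters:
varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
```

The -1 form is convenient for scripting, since it prints each counter on one line and exits.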
In interactive mode, varnishstat has three areas. The top area shows process uptime and
hitrate information. The center area shows a list of counter values. The bottom area shows
the description of the currently selected counter.
Hitrate n and avg(n) are related, where n is the number of intervals. avg(n) measures the
cache hit rate within n intervals. The default interval time is one second. You can configure
the interval time with the -w option.
Since there is no historical data of counter changes, varnishstat has to compute the
averages while it is running. Therefore, when you start varnishstat, the values of
Hitrate n start at 1 and then increase until they reach 10, 100 and 1000. In the above
example, the interval is one second. The hitrate average avg(n) shows data for the last 10,
100, and 438 seconds. The average hitrate is 0.9967 (or 99.67%) for the last 10 seconds,
0.5686 for the last 100 seconds and 0.3870 for the last 438 seconds.
In the above example, Varnish has served 1055 requests and is currently serving roughly
7.98 requests per second. Some counters do not have "per interval" data, but are gauges
with values that increase and decrease. Gauges start with a g_ prefix.
Tip
You can also see many parameters in real-time graphs with the Varnish Administration
Console (VAC).
Tip
If you need to collect statistics from more than a single Varnish server, Varnish
Custom Statistics (VCS) allows you to do that. In addition, VCS allows you to define
your own metrics to collect and analyze aggregated statistics, for example:
• A/B testing
• Measuring click-through rate
MAIN.cache_hit
    Indicates the number of objects delivered to clients without fetching them from the
    backend.
MAIN.cache_hitpass
    Counts how many times a hit-for-pass object has been hit, i.e., Varnish passes the
    request to the backend.
MAIN.cache_miss
    Shows how many requested objects were fetched from the backend.
MAIN.client_req
    Number of parseable client requests received.
MAIN.threads_limited
    Counts how many times varnishd hits the maximum allowed number of threads. The
    maximum number of Varnish threads is given by the parameter thread_pool_max.
    Issue the command varnishadm param.show thread_pool_max to see this parameter.
MAIN.threads_failed
    Increases every time pthread_create() fails. You can avoid this situation by tuning
    the maximum number of processes available with the ulimit -u command. You may
    also look at the threads-max Linux parameter in /proc/sys/kernel/threads-max.
MAIN.thread_queue_len
    Shows the current number of sessions waiting for a thread. This counter was first
    introduced in Varnish 4.
MAIN.sess_queued
    Contains the number of sessions that were queued because no thread was immediately
    available. Consider increasing the thread_pool_min parameter.
MAIN.sess_dropped
    Counts how many times sessions were dropped because varnishd hit the maximum
    thread queue length. Consider increasing the thread_queue_limit Varnish parameter
    to drop fewer sessions.
MAIN.n_lru_nuked
    Number of least recently used (LRU) objects thrown out to make room for new objects.
    If this is zero, there is no reason to enlarge your cache. Otherwise, your cache is
    evicting objects due to space constraints; in this case, consider increasing the size
    of your cache.
MAIN.n_object
    Number of cached objects.
Varnish provides a large number of counters for information and debugging purposes. Table
8 presents counters that are typically important. Other counters may be relevant only for
Varnish developers when providing support.
Counters also provide feedback to Varnish developers on how Varnish works in production
environments. This feedback in turn allows Varnish to be developed according to its real
usage. Issue varnishstat -1 to list all counters with their current values.
Note
If you have many backends, consider to increase the size of the shared memory log.
For that, see the option -l in the man page of varnishd.
Tip
Remember that Varnish provides many reference manuals. To see all Varnish counter
field definitions, issue man varnish-counters.
Counters are also accessible from varnishtest. If you are done with this exercise and still
have time, try to assert some counters as described in Exercise: Assert Counters in
varnishtest.
5 Tuning
This chapter is for the system administration course only
This section covers:
• Architecture
• Best practices
• Parameters
Perhaps the most important aspect of tuning Varnish is writing effective VCL code. For now,
however, we will focus on tuning Varnish for your hardware, operating system and network.
To be able to do that, knowledge of Varnish architecture is helpful.
It is important to know the internal architecture of Varnish for two reasons. First, the
architecture is chiefly responsible for the performance, and second, it influences how you
integrate Varnish in your own architecture.
Several aspects of the design were unique to Varnish when it was originally implemented.
Truly good solutions, whether they reuse ancient ideas or come up with something radically
different, are the aim of Varnish.
Figure 15 shows a block diagram of the Varnish architecture. The diagram shows the data
flow between the principal parts of Varnish.
The main block is the Manager process, which is contained in the varnishd binary program.
The task of the Manager process is to delegate tasks, including caching, to child processes.
The Manager process ensures that there is always a process for each task. The main driver
for these design decisions is security, which is explained in Security barriers in Varnish:
https://www.varnish-cache.org/docs/trunk/phk/barriers.html.
The Manager's command line interface (CLI) is accessible through: 1) varnishadm, as
explained in The front-end varnishadm of the Varnish Command Line Interface (CLI) section,
2) the Varnish Agent vagent2, or 3) the Varnish Administration Console (VAC) (via vagent2).
The Varnish Agent vagent2 is an open source HTTP REST interface that exposes varnishd
services to allow remote control and monitoring. vagent2 offers a web UI as shown in Figure
16, but you can write your own UI since vagent2 is an open interface. Some features of
vagent2 are:
Figure 16: Varnish Agent's HTML interface; designed to showcase the various features of the Varnish
Agent.
For more information about vagent2 and installation instructions, please visit
https://github.com/varnish/vagent2.
Varnish Software has a commercial offering of a fully functional web UI called Varnish
Administration Console (VAC). For more information about VAC, refer to the Varnish
Administration Console (VAC) section.
The Manager checks every few seconds whether the Cacher is still there. If the Manager
does not get a reply within a given interval, defined in ping_interval, the Manager kills the
Cacher and starts it up again. This automatic restart also happens if the Cacher exits
unexpectedly, for example, after a segmentation fault or assert error. You can ping the
Cacher manually by executing varnishadm ping.
Automatic restart of child processes is a resilience property of Varnish. This property ensures
that even if Varnish contains a critical bug that crashes the child, the child starts up again
usually within a few seconds. You can toggle this property using the auto_restart
parameter.
Note
Even if you do not perceive a lengthy service downtime, you should check whether the
Varnish child is being restarted. This is important, because child restarts introduce
extra loading time: every restart leaves varnishd with an empty cache. Automatic restarts
are logged to /var/log/syslog.
To verify that the child process is not being restarted, you can also check its lifetime
with the MAIN.uptime counter in varnishstat.
Varnish Software and the Varnish community at large occasionally get requests for
assistance in performance tuning Varnish that turn out to be crash-issues.
Varnish uses workspaces to reduce contention between threads when they need to
acquire or modify memory. There are multiple workspaces, but in Varnish 4 and later, the
most important ones are workspace_client and workspace_backend. Memory in
workspace_client is used to manipulate data at the frontend side of the Varnish state
machine, i.e., request data. workspace_backend is used to manipulate data fetched from
the backend side. As an example, think about the memory needed to normalize an object's
Host header from www.example.com to example.com before it is stored in the cache.
If you have 5 MB of workspace and are using 1000 threads, the actual memory usage is not
5 GB. The virtual memory usage will indeed be 5 GB, but unless you actually use the
memory, this is not a problem. Your memory controller and operating system will keep track
of what you actually use.
To communicate with the rest of the system, the child process uses the VSL accessible from
the file system. This means that if a thread needs to log something, all it has to do is to grab
a lock, write to a memory area and then free the lock. In addition to that, each worker thread
has a cache for log-data to reduce lock contention. We will discuss more about the Threading
Model later in this chapter.
The log file is usually about 80 MB, and split in two: the first part is counters, the second
part is request data. To view the actual data, a number of tools exist that parse the VSL.
Since the log data is not meant to be written to disk in its raw form, Varnish can afford to be
very verbose. You then use one of the log-parsing tools to extract the piece of information
you want -- either to store it permanently or to monitor Varnish in real-time.
If something goes wrong in the Cacher, it logs a detailed panic message to syslog. For
testing, you can induce panic to varnishd by issuing the command
varnishadm debug.panic.worker or by pressing the Induce Panic button in the Varnish
Agent web interface.
varnishd -C -f <vcl_filename>
Configuring the caching policies of Varnish is done in the Varnish Configuration Language
(VCL). Your VCL is translated by the VCC process to C, which is compiled by a normal C
compiler (typically gcc) and linked into the running Varnish instance. Since the VCL
compilation is done outside of the child process, there is no risk of affecting the running
Varnish instance by accidentally loading an ill-formatted VCL.
As a result, changing configuration while Varnish is running is very cheap. The policies of
the new VCL take effect immediately. However, objects cached under an older configuration
may persist until they have no more old references or the new configuration acts on them.
A compiled VCL file is kept around until you restart Varnish completely, or until you issue
vcl.discard from the management interface. You can only discard compiled VCL files after
all references to them are gone. You can see the number of references to each VCL with the
vcl.list CLI command.
• malloc
• file
• persistent (deprecated)
• mse Varnish Massive Storage Engine (MSE) in Varnish Plus only
The -s <malloc[,size]> option calls malloc() to allocate memory space for every object
that goes into the cache. If the allocated space cannot fit in memory, the operating system
automatically swaps the needed space to disk.
Varnish uses the jemalloc implementation. Although jemalloc emphasizes fragmentation
avoidance, fragmentation still occurs. jemalloc's worst-case memory fragmentation is 20%;
therefore, expect up to this percentage of additional memory usage. In addition to memory
fragmentation, you should consider an additional 5% overhead, as described later in this
section.
Another option is -s <file,path[,size[,granularity]]>. This option creates a file on a
filesystem to contain the entire cache. Then, the operating system maps the entire file into
memory if possible.
The -s file storage method does not retain data when you stop or restart Varnish! For
persistence, use the option -s persistent. The usage of this option, however, is strongly
discouraged mainly because of consistency issues that might arise with it.
The Varnish Massive Storage Engine (MSE) option -s <mse,path[,path...]]> is an
improved storage method for Varnish Plus only. MSE's main improvements are decreased disk
I/O load and lower storage fragmentation. MSE is designed to store and handle over 100 TB
with persistence, which makes it very useful for video on demand setups.
MSE uses a hybrid of two cache algorithms, least recently used (LRU) and least frequently
used (LFU), to manage memory. Benchmarks show that this algorithm outperforms malloc
and file. MSE also implements a mechanism to eliminate internal fragmentation.
The latest version of MSE requires a bookkeeping file. The size of this bookkeeping file
depends on the cache size. Cache sizes in the order of gigabytes require a bookkeeping file
of around 1% of the storage size. Cache sizes in the order of terabytes should have a
bookkeeping file size around 0.5% of storage size.
For detailed instructions on how to configure MSE, please refer to the Varnish Plus
documentation. For more details about its features and previous versions, please visit
https://info.varnish-software.com/blog/varnish-mse-persistence.
When choosing a storage backend, use malloc if your cache will be contained entirely or
mostly in memory. If your cache will exceed the available physical memory, you have two
options: file or mse. We recommend MSE because it performs much better than the
file storage backend.
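These choices map to the -s argument of varnishd. A sketch with example listen/backend addresses, paths and sizes; adjust to your environment, and note that MSE specifics depend on your Varnish Plus version:

```shell
# Cache fits in memory: malloc with an explicit size.
varnishd -a :80 -b 127.0.0.1:8080 -s malloc,4G

# Cache larger than memory: a file mapped into memory by the OS.
varnishd -a :80 -b 127.0.0.1:8080 -s file,/var/lib/varnish/storage.bin,100G

# Varnish Plus only: MSE (path is an example; see the Varnish Plus docs).
varnishd -a :80 -b 127.0.0.1:8080 -s mse,/var/lib/mse/storage
```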
There is a storage overhead in Varnish, so the actual memory footprint of Varnish exceeds
what the -s argument specifies when the cache is full. The current estimated overhead is 1 kB
per object. For 1 million objects, that means 1 GB of extra memory usage. This estimate may
vary slightly between Varnish versions.
In addition to the overhead per object, Varnish requires memory to manage the cache and
handle its own operation. Our tests show that an estimate of 5% of overhead is accurate
enough. This overhead applies equally to malloc, file or mse options.
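As a worked example of these estimates, consider an assumed 10 GB cache holding 1 million objects; the arithmetic can be sketched in the shell:

```shell
objects=1000000      # number of cached objects
cache_mb=10240       # 10 GB cache, as given to -s
per_object_mb=$((objects / 1024))    # ~1 kB of per-object overhead
five_pct_mb=$((cache_mb * 5 / 100))  # ~5% overhead for cache management
echo "expected extra memory: $((per_object_mb + five_pct_mb)) MB"
# prints "expected extra memory: 1488 MB"
```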
For more details about memory usage in Varnish, please refer to
https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage.
Note
As a rule of thumb use: malloc if the space you want to allocate fits in memory, if not,
use file or mse. Remember that there is about 5% memory overhead and do not
forget to consider the memory needed for fragmentation in malloc or the disk space
for the bookkeeping file in mse.
The Varnish Shared memory Log (VSL), sometimes called shm-log or SHMLOG, is used to log
most data. VSL operates on a circular buffer; therefore, it has no start or end, but
you can issue varnishlog -d to see old log entries.
The VSL is 80 MB by default and is not persistent, unless you instruct Varnish to do
otherwise. The VSL is memory space mapped under /var/lib/varnish/. To change the size of
the VSL, see the -l option in the man page of varnishd.
There is not much you have to do with the VSL, except ensure that it does not cause I/O
operations. You can avoid I/O by mounting the VSL directory as temporary file storage
(tmpfs). This is typically configured in /etc/fstab, and the shm-log is normally kept under
/var/lib/varnish/ or an equivalent location. You need to restart varnishd after mounting it
as tmpfs.
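A typical /etc/fstab entry for this looks like the following sketch; the path is the usual default, and the 256M size is an example that should at least match your -l setting:

```shell
# /etc/fstab — keep the Varnish shared memory log off the disk
tmpfs  /var/lib/varnish  tmpfs  rw,nodev,nosuid,size=256M  0  0
```

After adding the entry, mount it (mount /var/lib/varnish) and restart varnishd.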
Warning
Some Varnish distributions set up the file storage backend option -s file by default.
These distributions use a path that puts the storage file in the same directory as the
shm-log. We discourage this practice.
param.show -l
Varnish has many different parameters which can be adjusted to make Varnish act better
under specific workloads or with specific software and hardware setups. They can all be
viewed with param.show in the management interface varnishadm.
You can set up parameters in two different ways. In varnishadm, use the command
param.set <param> <value>. Alternatively, you can issue the command
varnishd -p param=value.
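For example, both ways of setting a parameter, using the real default_ttl parameter with an example value (addresses are illustrative):

```shell
# At runtime, via the management interface: takes effect now, lost on restart.
varnishadm param.set default_ttl 120

# At startup, on the varnishd command line: survives restarts.
varnishd -a :80 -b 127.0.0.1:8080 -p default_ttl=120
```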
Remember that changes made in the management interface are not persistent. Therefore,
unless you store your changes in a startup script, they will be lost when Varnish restarts.
The general advice with regard to parameters is to keep it simple. Most of the defaults are
optimal. Unless you have a very specific need, it is generally better to use the default
values.
A few debug commands exist in the CLI, which can be revealed with help -d. These
commands are meant exclusively for development or testing, and many of them are
downright dangerous.
Tip
Parameters can also be configured via the Varnish Administration Console (VAC) as
shown in the figure below.
Figure 17: GUI to configure parameters via the Varnish Administration Console (VAC).
The child process runs multiple threads in two thread pools. The threads of these pools are
called worker threads. Table 9 presents relevant threads.
When tuning Varnish, think about the expected traffic. The most important thread setting is
the number of cache-worker threads. You may configure thread_pool_min and
thread_pool_max. These parameters are per thread pool.
Although Varnish's threading model allows you to use multiple thread pools, we recommend
that you do not modify this parameter. Based on our experience and tests, we have seen that
2 thread pools are enough. In other words, the performance of Varnish does not increase
when adding more than 2 pools.
Note
If you run across tuning advice that suggests one thread pool per CPU core,
rest assured that this is old advice. We recommend having at most 2 thread pools,
but you may increase the number of threads per pool.
Varnish runs one thread per session, so the maximum number of threads equals the maximum
number of sessions that Varnish can serve concurrently. If you seem to need more
threads than the default, it is very likely that something is wrong in your setup.
Therefore, you should investigate elsewhere before you increase the maximum value.
You can observe if the default values are enough by looking at MAIN.sess_queued through
varnishstat. Look at the counter over time, because it is fairly static right after startup.
When tuning the number of threads, thread_pool_min and thread_pool_max are the most
important parameters. Values of these parameters are per thread pool. The thread_pools
parameter is mainly used to calculate the total number of threads. For the sake of keeping
things simple, the current best practice is to leave thread_pools at the default 2 [pools].
Varnish operates with multiple pools of threads. When a connection is accepted, the
connection is delegated to one of these thread pools. The thread pool then either
delegates the connection to an available thread, queues the request if none is available, or
drops the connection if the queue is full. By default, Varnish uses 2 thread pools, and this has
proven sufficient for even the busiest Varnish servers.
Varnish has the ability to spawn new worker threads on demand, and remove them once the
load is reduced. This is mainly intended for traffic spikes. It is better to keep a few
threads idle during regular traffic than to run on a minimum number of threads and
constantly spawn and destroy threads as demand changes. As long as you are on a 64-bit
system, the cost of running a few hundred extra threads is very low.
The thread_pool_min parameter defines how many threads run for each thread pool even
when there is no load. thread_pool_max defines the maximum number of threads that can
be used per thread pool. With the defaults of 100 [threads] minimum and 5000 [threads]
maximum per pool, and the default of 2 thread pools, you have between 200 and 10,000
threads in total.
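Using the default values (2 pools, 100 minimum and 5000 maximum threads per pool), the totals work out as:

```shell
thread_pools=2
thread_pool_min=100
thread_pool_max=5000
echo "total threads: min $((thread_pools * thread_pool_min)), max $((thread_pools * thread_pool_max))"
# prints "total threads: min 200, max 10000"
```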
Warning
New threads use preallocated workspace, which should be enough for the required
task. If threads do not have enough workspace, the child process is unable to process the
task and terminates. To avoid this situation, evaluate your setup and consider increasing
the workspace_client or workspace_backend parameter.
Varnish can use several thousand threads, and has had this capability from the very
beginning. However, not all operating system kernels were prepared to deal with this
capability. Therefore the parameter thread_pool_add_delay was added to ensure that
there is a small delay between each thread that spawns. As operating systems have
matured, this has become less important and the default value of thread_pool_add_delay
has been reduced dramatically, from 20 ms to 2 ms.
There are a few less important parameters related to thread timing. The
thread_pool_timeout is how long a thread is kept around when there is no work for it
before it is removed. This only applies if you have more threads than the minimum, and it is
rarely changed.
Another less important parameter is thread_pool_fail_delay, which defines how long to
wait before retrying after the operating system fails to create a new thread.
Workspaces are among the things you can adjust with parameters. Sometimes you may
have to increase them to avoid running out of workspace.
The workspace_client parameter states how much memory can be allocated for each
HTTP session. This space is used for tasks like string manipulation of incoming headers.
The workspace_backend parameter indicates how much memory can be allocated to
modify objects returned from the backend. After an object is modified, its exact size is
allocated and the object is stored read-only.
As most of the parameters can be left unchanged, we will not go through all of them.
You can take a look at the list of parameters by issuing varnishadm param.show -l to
get information about what they do.
5.8 Timers
The timeout parameters are generally set to pretty good defaults, but you might have to
adjust them for unusual applications. The default value of connect_timeout is 3.500
[seconds]. This value is more than enough when the Varnish server and the backend are
in the same server room. Consider increasing the connect_timeout value if there is higher
network latency between your Varnish server and backend.
Keep in mind that the session timeout affects how long sessions are kept around, which in
turn affects file descriptors left open. It is not wise to increase the session timeout without
taking this into consideration.
The cli_timeout is how long the management thread waits for the worker thread to reply
before it assumes it is dead, kills it and starts it back up. The default value seems to do the
trick for most users today.
Warning
If connect_timeout is set too high, it does not let Varnish handle errors gracefully.
Note
Another use-case for increasing connect_timeout occurs when virtual machines are
involved as they can increase the connection time significantly.
Tip
More information at
https://info.varnish-software.com/blog/understanding-timeouts-varnish-cache.
For the purpose of this exercise we use a CGI script that waits 5 seconds before responding:
http localhost/cgi-bin/sleep
To check how first_byte_timeout impacts the behavior of Varnish, analyze the output of
varnishlog and varnishstat.
Alternatively, you can use delay in a mock-up backend in varnishtest and assert VSL
records and counters to verify the effect of first_byte_timeout. The subsection Solution:
Tune first_byte_timeout and test it against mock-up server shows you how to do it.
These exercises are for educational purposes, and not intended as an encouragement to
change the values. You can learn from this exercise by using varnishstat, varnishadm and
varnishtest.
If you need help, see Solution: Configure Threading with varnishadm and varnishstat or
Solution: Configure Threading with varnishtest.
6 HTTP
This chapter is for the web-developer course only
This chapter covers:
• Protocol basics
• Requests and responses
• HTTP request/response control flow
• Statelessness and idempotence
• Cache related headers
Varnish is designed to be used with HTTP semantics. These semantics are specified in the
version called HTTP/1.1. This chapter covers the basics of HTTP as a protocol, its semantics
and the caching header fields most commonly used.
Each resource is identified by a Uniform Resource Identifier (URI), as described in Section 2.7
of [RFC7230]. A resource can be anything and such a thing can have different
representations. A representation is an instantiation of a resource. An origin server, a.k.a.
backend, produces this instantiation based on a list of request header fields, e.g.,
User-Agent and Accept-Encoding.
When an origin server produces different representations of one resource, it includes a Vary
response header field. This response header field is used by Varnish to differentiate between
resource variations. More details on this are in the Vary subsection.
An origin server might include metadata to reflect the state of a representation. This
metadata is contained in the validator header fields ETag and Last-Modified.
In order to construct a response for a client request, an algorithm is used to evaluate and
select one representation with a particular state. This algorithm is implemented in Varnish
and you can customize it in your VCL code. Once a representation is selected, the payload
for a 200 (OK) or 304 (Not Modified) response can be constructed.
• A request is a message from a client to a server that includes the method to be applied
to a requested resource, the identifier of the resource, the protocol version in use and
an optional message body
• A method is a token that indicates the action to be performed on a URI
• Standard request methods are: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, or
CONNECT
• Examples of URIs are /img/image.png or /index.html
• Header fields are allowed in requests and responses
• Header fields allow clients and servers to pass additional information
• A response is a message from a server to a client that consists of a response status,
headers and an optional message body
The first line of a request message is called Request-Line, whereas the first line of a response
message is called Status-Line. The Request-Line begins with a method token, followed by the
requested resource (URI) and the protocol version.
A request method informs the web server what sort of request this is: Is the client trying to
fetch a resource (GET), update some data (POST) at the backend, or just get the headers of a
resource (HEAD)? Methods are case-sensitive.
After the Request-Line, request messages may have an arbitrary number of header fields.
For example: Accept-Language, Cookie, Host and User-Agent.
Message bodies are optional, but they must comply with the requested method. For instance, a
GET request should not contain a request body, but a POST request may contain one.
Similarly, a web server cannot attach a message body to the response of a HEAD request.
The Status-Line of a response message consists of the protocol version followed by a
numeric status code and its associated textual phrase. This associated textual phrase is also
called the reason. It is important to know that the reason is intended for the human user. That
means that the client is not required to examine the reason, as it may change, and it should
not affect the protocol. Examples of status codes with their reasons are: 200 OK,
404 File Not Found and 304 Not Modified.
Responses also include header fields after the Status-Line, which allow the server to pass
additional information about the response. Examples of response header fields are: Age,
ETag, Cache-Control and Content-Length.
Note
Requests and responses share the same syntax for headers and message body, but
some headers are request- or response-specific.
GET / HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr; rv:1.9.2.16) \
Gecko/20110319 Firefox/3.6.16
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Cache-Control: max-age=0
The above example is a typical HTTP request that includes a Request-Line, and headers, but
no message body. The Request-Line consists of the GET request for the / resource and the
HTTP/1.1 version. The request includes the header fields Host, User-Agent, Accept,
Accept-Language, Accept-Encoding, Accept-Charset, Keep-Alive, Connection and
Cache-Control.
Note that the Host header contains the hostname as seen by the browser. The above
request was generated by entering http://localhost/ in the browser. Most browsers
automatically add a number of headers.
Some of the headers will vary depending on the configuration and state of the client. For
example, language settings, cached content, forced refresh, etc. Whether the server honors
these headers will depend on both the server in question and the specific header.
The following is an example of an HTTP request using the POST method, which includes a
message body:
ltmpl=default[...]&signIn=Sign+in&asts=
HTTP/1.1 200 OK
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.7
Cache-Control: public, max-age=86400
Last-Modified: Mon, 04 Apr 2011 04:13:41 +0000
Expires: Sun, 11 Mar 1984 12:00:00 GMT
Vary: Cookie,Accept-Encoding
ETag: "1301890421"
Content-Type: text/html; charset=utf-8
Content-Length: 23562
Date: Mon, 04 Apr 2011 09:02:26 GMT
X-Varnish: 1886109724 1886107902
Age: 17324
Via: 1.1 varnish
Connection: keep-alive
[data]
The example above is an HTTP response that contains a Status-Line, headers and message
body. The Status-Line consists of the HTTP/1.1 version, the status code 200 and the reason
OK. The response status code informs the client (browser) whether the server understood the
request and how it replied to it. These codes are fully defined in
https://tools.ietf.org/html/rfc2616#section-10, but here is an overview of them:
HTTP is by definition a stateless protocol meaning that each request message can be
understood in isolation. Hence, a server MUST NOT assume that two requests on the same
connection are from the same user agent unless the connection is secured and specific to
that agent.
HTTP/1.1 persists connections by default. This is contrary to most implementations of
HTTP/1.0, where each connection is established by the client prior to the request and closed
by the server after sending the response. Therefore, for compatibility reasons, persistent
connections may be explicitly negotiated as they are not the default behavior in HTTP/1.0
[https://tools.ietf.org/html/rfc7230#appendix-A.1.2]. In practice, there is a header called
Keep-Alive you may use if you want to control the connection persistence between the
client and the server.
A method is "safe" if it is read-only; i.e., the client request does not alter any state on the
server. GET, HEAD, OPTIONS, and TRACE methods are defined to be safe. An idempotent
method is such that multiple identical requests have the same effect as a single request.
PUT, DELETE and safe requests methods are idempotent.
Cacheable methods are those whose responses are allowed to be stored for future reuse.
RFC7231 specifies GET, HEAD and POST as cacheable. However, responses from POST are
very rarely treated as cacheable. [https://tools.ietf.org/html/rfc7231#section-4.2]
• Expires
• Cache-Control
• ETag
• Last-Modified
• If-Modified-Since
• If-None-Match
• Vary
• Age
A cached object is a local store of HTTP response messages. These objects are stored,
controlled, retrieved and deleted by a subsystem, in this case Varnish. For this purpose,
Varnish uses the caching header fields defined in https://tools.ietf.org/html/rfc7232 and
https://tools.ietf.org/html/rfc7234.
If a matched cached object is valid, Varnish constructs the response from the cache. As a
result, the origin server is freed from creating and transmitting identical response bodies.
When Varnish matches a request with a cached object (aka cache-hit), it evaluates whether
the cache or origin server should be used to construct the response. There are many rules
that should be taken into consideration when validating a cache. Most of those rules use
caching header fields.
This subsection first describes the concepts of cache-hit and cache-miss. After that, it
describes three header fields commonly used to effectively match caches: Vary, ETag and
If-None-Match.
Figure 19 shows the flow diagram of a cache-hit. A cache-hit occurs when the requested
object (URI) matches a stored HTTP response message (cache). If the matched stored
message is valid to construct a response for the client, Varnish constructs the response
and serves it without contacting the origin server.
Figure 20 shows the flow diagram of a cache-miss. A cache-miss happens when Varnish does
not match a cache. In this case, Varnish forwards the request to the origin server.
6.5.1 Vary
If the origin server sends Vary in a response, Varnish does not use this response to satisfy a
later request unless the later request has the same values for the listed fields in Vary as the
original request. As a consequence, Vary expands the cache key required to match a new
request to the stored cache entry.
Vary is one of the trickiest headers to deal with when caching. A caching server like Varnish
does not necessarily understand the semantics of a header, or what part triggers different
variants of a response. In other words, an inappropriate use of Vary might create a very
large number of cached objects, and reduce the efficiency of your cache server. Therefore,
you must be extremely cautious when using Vary.
Caching objects in a way that accounts for every difference among requesters creates a very
fine-grained caching policy. This practice is not recommended, because those cached objects
are most likely retrieved only by their original requester. Thus, fine-grained caching
strategies do not scale well. This is a common mistake when Vary is not used carefully.
An example of wrong usage of Vary is setting Vary: User-Agent. This tells Varnish that for
absolutely any difference in User-Agent, the response from the origin server might look
different. This is not optimal because there are probably thousands of User-Agent strings out
there.
Another example of bad usage is when using Vary: Cookie to differentiate a response.
Again, there could be a very large number of cookies and hence a very large number of
cached objects, which are going to be retrieved most likely only by their original requesters.
The most common usage of Vary is Vary: Accept-Encoding, which tells Varnish that the
content might look different depending on the request Accept-Encoding header. For
example, a web page can be delivered compressed or uncompressed depending on the
client. For more details on how to use Vary for compressions, refer to
https://www.varnish-cache.org/docs/trunk/users-guide/compression.html.
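You can observe this from the command line with two requests that differ only in the Accept-Encoding header; localhost is an example host, and curl's -D -/-o /dev/null options print the response headers while discarding the body:

```shell
# Request the compressed variant.
curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip' http://localhost/

# Request the uncompressed variant.
curl -s -D - -o /dev/null http://localhost/
```

With Vary: Accept-Encoding in the backend response, Varnish stores and serves these as two separate variants of the same object.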
One way to assist Vary is by building the response body from cached and non-cached
objects. We will discuss this further in the Content Composition chapter.
Varnish Test Cases (VTC) in varnishtest can also help you to understand and isolate the
behavior of Vary. For more information about it, refer to the subsection Understanding Vary
in varnishtest.
Note
Varnish can handle Accept-Encoding and Vary: Accept-Encoding, because Varnish
has support for gzip compression.
6.5.2 ETag
Origin servers normally add metadata to further describe the representation of a resource.
This metadata is used in conditional requests. "Conditional requests are HTTP requests that
include one or more header fields indicating a precondition to be tested before applying the
method semantics to the target resource" [RFC7232]. ETag is a validator header field.
The ETag header field provides a state identifier for the requested variant (resource
representation). ETags are used to differentiate between multiple states of a representation
as it changes over time. In addition, ETags are also used to differentiate between
multiple representations based on content negotiation, regardless of their state.
Example of an ETag header:
ETag: "1edec-3e3073913b100"
The response ETag field is validated against the request If-None-Match header field. We
will see the details of If-None-Match later in this subsection, but first we look at the
other validator header field: Last-Modified.
6.5.3 Last-Modified
The Last-Modified response header field is a timestamp that indicates when the variant
was last modified. This header may be used in conjunction with If-Modified-Since and
If-None-Match.
Example of a Last-Modified header:
Last-Modified: Mon, 04 Apr 2011 04:13:41 GMT
ETag and Last-Modified are validator header fields, which help to differentiate between
representations. Normally, origin servers send both fields in successful responses. Whether
you use one, another or both, depends on your use cases. Please refer to Section 2.4 in
https://tools.ietf.org/html/rfc7232#section-2.4 for a full description on when to use either of
them.
6.5.4 If-None-Match
A client that has obtained a response message and stored it locally, may reuse the obtained
ETag value in future requests to validate its local cache against the selected cache in
Varnish. The obtained value from an ETag is sent from the client to Varnish in the request
If-None-Match header field. In fact, a client may have stored multiple resource
representations and therefore a client may send an If-None-Match field with multiple ETag
values to validate.
The purpose of this header field is to reuse local caches without compromising their validity.
If the local cache is valid, Varnish replies with a 304 (Not Modified) response, which does not
include a message body. In this case, the client reuses its local cache to construct the
requested resource.
Example of an If-None-Match header:
If-None-Match: "1edec-3e3073913b100"
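A conditional request is easy to reproduce with curl, reusing the ETag value from the example above against an example URL:

```shell
# -i prints the status line and response headers along with any body.
curl -i -H 'If-None-Match: "1edec-3e3073913b100"' http://localhost/img/image.png
```

If the ETag still matches the selected variant, the response is a bodyless 304 Not Modified; otherwise a full 200 response with the new representation (and a new ETag) is returned.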
6.5.5 If-Modified-Since
A request containing an If-Modified-Since header field indicates that the client wants to
validate one or more of its local caches by modification date. If the requested representation
has not been modified since the time specified in this field, Varnish returns a 304 (Not
Modified) response. A 304 response does not contain a message body. This behavior is
similar to that of If-None-Match.
Example of an If-Modified-Since header:
If-Modified-Since: Mon, 04 Apr 2011 04:13:41 GMT
Tip
The subsection Understanding Last-Modified and If-Modified-Since in varnishtest
explains further these concepts with a practical VTC example.
6.6 Allowance
• How to control which caches can be served?
• Cache-Control and Pragma (for backwards compatibility only)
Varnish allows you to validate whether a stored response (cache) can be reused or not.
Validation can be done by checking that the presented request does not contain the
no-cache directive. This subsection reviews two common header fields used to check
caching allowance: Cache-Control and Pragma.
6.6.1 Cache-Control
The Cache-Control header field specifies directives that must be applied by all caching
mechanisms (from proxy cache to browser cache).
Table 12 summarizes the directives you may use for each context. The most relevant
directives of Cache-Control are:
Note
Cache-Control always overrides Expires.
Note
By default, Varnish does not care about the Cache-Control request header. If you
want to let users update the cache via a force refresh you need to do it yourself.
6.6.2 Pragma
The Pragma request header is a legacy header and should no longer be used. Some
applications still send headers like Pragma: no-cache, but this is for backwards-compatibility
reasons only. Any proxy cache should treat Pragma: no-cache as
Cache-Control: no-cache; it should not be seen as a reliable header, especially when
used as a response header.
6.7 Freshness
• Fresh object: age has not yet exceeded its freshness lifetime
• Stale object: has exceeded its freshness lifetime, i.e., expired object
When reusing a stored response (cached object), you should always check its freshness and
evaluate whether to deliver expired objects or not. A response's freshness lifetime is the
length of time between its generation by the origin server and its expiration time. A stale
(expired) object can also be reused, but only after further validation with the origin server.
As defined in RFC7234 [https://tools.ietf.org/html/rfc7234#section-4.2]:
A response's age is the time that has passed since it was generated by, or successfully
validated with, the origin server. The primary mechanism for determining freshness is for
an origin server to provide an explicit expiration time in the future, using either the
``Expires`` header field or the ``max-age`` response directive.
6.7.1 Age
Consider what happens if you let Varnish cache content for a week. If Varnish does not
calculate the age of a cached object, Varnish might happily inform clients that the content is
fresh, even though the object could be older than the maximum allowed max-age. By age we
mean an estimate of the amount of time since the response was generated or successfully
validated at the origin server.
Client browsers calculate a cache duration based on the Age header field and the max-age
directive from Cache-Control. If this calculation results in a negative number, the browser
does not cache the response locally. Negative cache duration times, however, do not prevent
browsers from using the received object. Varnish does the same, if you put one Varnish
server in front of another.
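Using the response example shown earlier in this chapter (Cache-Control: public, max-age=86400 and Age: 17324), the cache duration a browser computes is:

```shell
max_age=86400   # from Cache-Control: max-age=86400
age=17324       # from the Age response header
echo "cache duration left: $((max_age - age)) seconds"
# prints "cache duration left: 69076 seconds"
```

Had the object's Age exceeded max-age, the result would be negative and the browser would not cache the response locally.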
You might encounter that different browsers have different behaviors. Some of them might
cache content locally, and their behavior when refreshing might be different, which can be
very confusing. This just highlights that Varnish is not the only part of your web-stack that
parses and honors cache-related headers. There might also be other caches along the way
which you do not control, like a company-wide proxy server.
Since browsers might interpret cache headers differently, it is a good idea to control them in
your cache server. In the next chapters, you will learn how to modify the response headers
Varnish sends. This also allows your origin server to emit response headers that should be
seen and used by Varnish only, not in your browser.
When browsers decide to load a resource from their local cache, no request is ever sent.
Therefore, this exercise and this type of test cannot be simulated in varnishtest.
6.7.2 Expires
The Expires response header field gives the time after which the response is considered
stale. Normally, a stale cache item should not be returned by any cache (proxy cache or
client cache). The syntax for this header is:
Expires: HTTP-date
It is recommended not to define Expires too far in the future. Setting it to 1 year is usually
enough. The use of Expires does not prevent the cached object from being updated. For
example, if the name of the resource is updated.
Expires and Cache-Control do more or less the same job, but Cache-Control gives you
more control. The most significant differences between these two headers are:
Tip
To learn more about the behavior of Expires, refer to the subsection Understanding
Expires in varnishtest.
1. Use httpheadersexample.php via your Varnish server to experiment and get a sense of
what it is all about.
2. Copy the PHP file from Varnish-Book/material/webdev/ to /var/www/html/
3. Use varnishstat -f MAIN.client_req -f MAIN.cache_hit and
varnishlog -g request -i ReqHeader,RespHeader to analyze the responses.
4. Try every link several times by clicking on them and refreshing your browser.
5. Analyze the response in your browser and the activity in your Varnish server.
6. Discuss what happens when both the Cache-Control and Expires fields are present, as in the third link.
7. When testing Last-Modified and If-Modified-Since, does your browser issue a
request to Varnish? If the item was in cache, does Varnish query the origin server?
8. Try the Vary header field from two different browsers.
When performing this exercise, try to see if you can spot the patterns. There are many levels
of cache on the web, and you have to consider them in addition to your Varnish installation.
If it has not happened already, it is likely that the local cache of your browser will confuse
you at least a few times through this course. When that happens, pull up varnishlog,
varnishstat and another browser, or use client mock-ups in varnishtest instead of
browsers.
7 VCL Basics
In this chapter, you will learn the following topics:
Tip
Remember that Varnish has many reference manuals. For more details about VCL,
check its manual page by issuing man vcl.
sub vcl_recv {
if (req.method != "GET" && req.method != "HEAD") {
return (pass);
}
return (hash);
}
[Figure 23 (diagram): simplified request flow — vcl_recv, vcl_hash, lookup, vcl_pass, vcl_backend_fetch, vcl_backend_response/vcl_backend_error, vcl_deliver, and vcl_synth.]
VCL is often described as a finite state machine. Each state makes certain parameters available to your VCL code. For example, backend response HTTP headers are only available after the vcl_backend_fetch state.
Figure 23 shows a simplified version of the Varnish finite state machine. This version shows
by no means all possible transitions, but only a typical set of them. Figure 24 and Figure 25
show the detailed version of the state machine for the frontend and backend worker
respectively.
States in VCL are conceptualized as subroutines, with the exception of the waiting state, which is described in the Waiting State section. Subroutines in VCL take neither arguments nor return values. Each subroutine terminates by calling return (action), where action is a keyword that indicates the desired outcome. Subroutines may inspect and manipulate HTTP header fields and various other aspects of each request, and thereby determine how requests are handled.
Subroutine example:
sub pipe_if_local {
if (client.ip ~ local) {
return (pipe);
}
}
To call a subroutine, use the call keyword followed by the subroutine's name:
call pipe_if_local;
Varnish has built-in subroutines that hook into the Varnish workflow. These built-in subroutines are all named vcl_*. Your own subroutines cannot have names starting with vcl_.
7.2 Detailed Varnish Request Flow for the Client Worker Thread
• Figure 24 shows the detailed request flow diagram of the client worker.
• The grayed box is detailed in Figure 25.
[Figure 24 (diagram): detailed client-side state machine — cnt_restart, cnt_recv (vcl_recv), vcl_hash, cnt_lookup (hash lookup, waiting list, vcl_hit), cnt_miss (vcl_miss), cnt_pass (vcl_pass), cnt_pipe (vcl_pipe), cnt_purge (vcl_purge), cnt_deliver (vcl_deliver), cnt_synth (vcl_synth), and V1D_Deliver.]
Figure 24: Detailed Varnish Request Flow for the Client Worker Thread
Before we begin looking at VCL code, we should learn the fundamental concepts behind VCL.
When Varnish processes a request, it starts by parsing the request itself. Later, Varnish
separates the request method from headers, verifying that it is a valid HTTP request and so
on. When the basic parsing has completed, the very first policies are checked to make
decisions.
Policies are a set of rules that the VCL code uses to make a decision. Policies help to answer
questions such as: should Varnish even attempt to find the requested resource in the cache?
In this example, the policies are in the vcl_recv subroutine.
Warning
If you define your own subroutine and execute return (action); in it, control is passed to the Varnish Run Time (VRT) environment. In other words, your return (action); skips the built-in subroutine.
Starting with Varnish 4.0, each VCL file must start by declaring its version with a special
vcl 4.0; marker at the top of the file. If you have worked with a programming language or
two before, the basic syntax of Varnish should be reasonably straightforward. VCL is inspired
mainly by C and Perl. Blocks are delimited by curly braces, statements end with semicolons,
and comments may be written as in C, C++ or Perl according to your own preferences.
Subroutines in VCL neither take arguments, nor return values. Subroutines in VCL can
exchange data only through HTTP headers.
VCL has terminating statements, not traditional return values. Subroutines end execution
when a return(*action*) statement is made. The action tells Varnish what to do next. For
example, "look this up in cache", "do not look this up in the cache", or "generate an error
message". To check which actions are available at a given built-in subroutine, see the Legal
Return Actions section or see the manual page of VCL.
VCL has two directives to use contents from another file. These directives are include and import, and they are used for different purposes.
include is used to insert VCL code from another file. Varnish looks for files to include in the
directory specified by the vcl_dir parameter of varnishd. Note the quotation marks in the
include syntax.
import is used to load VMODs and make available their functions into your VCL code.
Varnish looks for VMODs to load in the directory specified by the vmod_dir parameter of
varnishd. Note the lack of quotation marks in the import syntax.
You can use the include and import in varnishtest. To learn more on how to test your
VCL code in a VTC, refer to the subsection VCL in varnishtest.
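As a minimal sketch of both directives, assuming a hypothetical file name admin.vcl and the standard std VMOD:

```vcl
vcl 4.0;

include "admin.vcl";  # note the quotation marks
import std;           # note the lack of quotation marks
```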
Keywords:
• call subroutine
• return(action)
• new
• set
• unset
All functions are available in all subroutines, except those listed in the table below.
Function Subroutines
hash_data vcl_hash
new vcl_init
synthetic vcl_synth, vcl_backend_error
VCL offers many simple-to-use built-in functions that allow you to modify strings, add bans, restart the VCL state engine, and return control to the Varnish Run Time (VRT) environment. This book describes the most important functions in later sections, so the description at this point is brief.
regsub() and regsuball() take a string str as input, search it with the regular expression regex, and replace the matched part with the string sub. regsub() changes only the first match; regsuball() changes all occurrences.
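A minimal sketch of regsub() in use (the header value in the comment is just an illustration):

```vcl
sub vcl_recv {
    # "www.example.com" becomes "example.com"; regsub() replaces only
    # the first match, while regsuball() would replace every occurrence
    set req.http.Host = regsub(req.http.Host, "^www\.", "");
}
```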
The ban(boolean expression) function invalidates all objects in cache that match the boolean expression. Banning and purging are detailed in the Cache Invalidation chapter.
Table 14: VCL built-in subroutines and their legal returns at the frontend (client) side

subroutine    scope   deliver fetch restart hash pass pipe synth purge lookup
vcl_deliver   client     x            x                       x
vcl_hash      client                                                      x
vcl_hit       client     x      x     x           x           x
vcl_miss      client            x     x           x           x
vcl_pass      client            x     x                       x
vcl_pipe      client                                    x     x
vcl_purge     client                  x                       x
vcl_recv      client                        x     x     x     x     x
vcl_synth     client     x            x
Table 15: VCL built-in subroutines and their legal returns at the backend side, vcl.load, and
vcl.discard
The table above shows the VCL built-in subroutines and their legal returns. return is a built-in keyword that ends execution of the current VCL subroutine and continues to the next action step in the request-handling state machine. The legal return actions are: lookup, synth, purge, pass, pipe, fetch, deliver, hash, restart, retry, and abandon.
Note
In Varnish 4 purge is used as a return action.
Table 17 shows the availability of variables in each VCL subroutine and whether the variables are readable (R) or writable (W). The variables in this table are listed per subroutine and follow the prefixes req., bereq., beresp., obj., or resp.. However, predefined variables do not strictly follow the table; for example, req.restarts is readable but not writable. For the exact description of predefined variables, consult the VCL man page or ask your instructor.
Most variables are self-explanatory, but the way they influence each other is not, so a brief explanation follows. Values of request (req.) variables are automatically assigned to backend request (bereq.) variables. However, those values may differ slightly, because Varnish may modify client requests. For example, HEAD requests coming from clients may be converted to GET requests towards the backend.
Changes in backend response (beresp.) variables affect response (resp.) and object (obj.) variables. Many of the obj. variables are in turn copied into resp. variables, which are sent to the clients.
Additional variable prefixes from Table 17 are: client., server., local., remote., and storage.. These prefixes are accessible from the subroutines at the frontend (client) side. Yet another variable is now, which is accessible from all subroutines.
Support for global variables with a lifespan across transactions and VCLs is provided by the variable VMOD. This VMOD keeps the variables and their values for as long as the VMOD is loaded. Supported data types are strings, integers, and real numbers. For more information about the variable VMOD, please visit
https://github.com/varnish/varnish-modules/blob/master/docs/vmod_var.rst.
Note
Recall that every transaction in Varnish is always in a state, and each state is represented by its corresponding subroutine.
• We will revisit vcl_recv after we learn more about built-in functions, keywords,
variables and return actions
The built-in VCL for vcl_recv is designed to ensure a safe caching policy even with no modifications in VCL. Policies for data that must not be cached are to be defined in your VCL. The built-in VCL code is executed right after any user-defined VCL code, and is always present. You cannot remove the built-in subroutines; however, you can bypass them if your VCL code reaches one of the terminating actions: pass, pipe, hash, or synth. These terminating actions return control from the VRT (Varnish Run-Time) to Varnish.
For a well-behaving Varnish server, most of the logic in the built-in VCL is needed. Consider either replicating all the built-in VCL logic in your own VCL code, or letting your client requests be handled by the built-in VCL code.
7.8.1 Exercise: Configure vcl_recv to avoid caching all requests to the URL
/admin
1. Find and open the built-in.vcl code, and analyze the vcl_recv subroutine
2. Create your VCL code to avoid caching all URLs under /admin
3. Compile your VCL code to C language, and analyze how the built-in.vcl code is
appended
If you need help, see Solution: Configure vcl_recv to avoid caching all requests to the URL
/admin.
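One possible sketch for step 2 (the official solution referenced above may differ):

```vcl
sub vcl_recv {
    # bypass the cache for everything under /admin
    if (req.url ~ "^/admin") {
        return (pass);
    }
}
```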
[Figure 25 (diagram): detailed backend-side state machine — vbf_stp_startfetch (vcl_backend_fetch and vcl_backend_response), vbf_stp_condfetch, vbf_stp_fetch, and vbf_stp_error (vcl_backend_error), with the FETCH, BGFETCH, and RETRY entry points.]
Figure 25 shows that vcl_backend_response may terminate with one of the following actions: deliver, retry, and abandon. The deliver terminating action may or may not insert the object into the cache, depending on the response of the backend. The retry action makes Varnish transmit the request to the backend again by calling the vcl_backend_fetch subroutine. The abandon action discards any response from the backend.
Backends might respond with a 304 HTTP response. 304 responses happen when the requested object has not been modified since the timestamp given in the If-Modified-Since request header. If a request hits a non-fresh object (see Figure 2), Varnish adds the If-Modified-Since header with the value of t_origin to the request and sends it to the backend.
304 responses do not contain a message body. Thus, Varnish builds the response using the
body from cache. 304 responses update the attributes of the cached object.
7.10.1 vcl_backend_response
built-in vcl_backend_response
sub vcl_backend_response {
if (beresp.ttl <= 0s ||
beresp.http.Set-Cookie ||
beresp.http.Surrogate-control ~ "no-store" ||
(!beresp.http.Surrogate-Control &&
beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
beresp.http.Vary == "*") {
/*
* Mark as "Hit-For-Pass" for the next 2 minutes
*/
set beresp.ttl = 120s;
set beresp.uncacheable = true;
}
return (deliver);
}
Note
Varnish 3.x has a hit_for_pass return action. In Varnish 4, this action is achieved by
setting beresp.uncacheable to true. The hit-for-pass section explains this in more
detail.
• 200: OK
• 203: Non-Authoritative Information
• 300: Multiple Choices
• 301: Moved Permanently
• 302: Moved Temporarily
• 304: Not modified
• 307: Temporary Redirect
• 410: Gone
• 404: Not Found
You can cache other status codes than the ones listed above, but you have to set the
beresp.ttl to a positive value in vcl_backend_response. Since beresp.ttl is set before
vcl_backend_response is executed, you can modify the directives of the Cache-Control
header field without affecting beresp.ttl, and vice versa. Cache-Control directives are
defined in RFC7234 Section 5.2.
A backend response may include the s-maxage directive, the maximum age for shared caches. This directive overrides max-age in all Varnish servers in a multiple Varnish-server setup. For example, if the backend sends Cache-Control: max-age=300, s-maxage=3600, all Varnish installations will cache objects with an Age value less than or equal to 3600 seconds. This also means that responses with Age values between 301 and 3600 seconds are not cached by the clients' web browsers, because Age is greater than max-age.
A sensible approach is to use the s-maxage directive to instruct Varnish to cache the
response. Then, remove the s-maxage directive using regsub() in vcl_backend_response
before delivering the response. In this way, you can safely use s-maxage as the cache
duration for Varnish servers, and set max-age as the cache duration for clients.
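A sketch of this approach; the regular expression is an assumption and may need tuning for your backend's exact Cache-Control formatting:

```vcl
sub vcl_backend_response {
    # beresp.ttl has already been derived from s-maxage at this point,
    # so the directive can be dropped before the response is stored
    if (beresp.http.Cache-Control ~ "s-maxage") {
        set beresp.http.Cache-Control =
            regsub(beresp.http.Cache-Control, "\s*,?\s*s-maxage\s*=\s*[0-9]+", "");
    }
}
```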
Warning
Bear in mind that removing or altering the Age response header field may affect how
responses are handled downstream. The impact of removing the Age field depends on
the HTTP implementation of downstream intermediaries or clients.
For example, imagine a serial setup of three Varnish servers. If you remove the Age field in the first Varnish server, then the second Varnish server will assume Age=0. In this case, you might inadvertently deliver stale objects to your clients.
sub vcl_backend_response {
if (bereq.url ~ "\.jpg$") {
set beresp.ttl = 60s;
}
}
The above example caches all URLs ending with .jpg for 60 seconds. Keep in mind that the
built-in VCL is still executed. That means that images with a Set-Cookie field are not
cached.
7.10.4 Example: Cache .jpg for 60 seconds only if s-maxage is not present
sub vcl_backend_response {
if (beresp.http.cache-control !~ "s-maxage" && bereq.url ~ "\.jpg$") {
set beresp.ttl = 60s;
}
}
When trying this out, remember that Varnish keeps the Host header field in req.http.host
and the requested resource in req.url. For example, in a request to
http://www.example.com/index.html, the http:// part is not seen by Varnish at all,
req.http.host has the value www.example.com and req.url the value /index.html. Note
how the leading / is included in req.url.
If you need help, see Solution: Avoid caching a page.
If you need help, see Solution: Either use s-maxage or set TTL by file type.
Tip
Divide and conquer! Most somewhat complex VCL tasks are easily solved when you
divide the tasks into smaller problems and solve them individually. Try solving each
part of the exercise by itself first.
Note
Varnish automatically parses s-maxage for you, so you only need to check if it is there
or not. Remember that if s-maxage is present, Varnish has already used it to set
beresp.ttl.
The waiting state is reached when a request n arrives while a previous identical request 0 is
being handled at the backend. In this case, request 0 is set as busy and all subsequent
requests n are queued in a waiting list. If the fetched object from request 0 is cacheable, all
n requests in the waiting list call the lookup operation again. This retry will hopefully hit the
desired object in cache. As a result, only one request is sent to the backend.
The waiting state is designed to improve response performance. However, a counterproductive scenario, namely request serialization, may occur if the fetched object is uncacheable and so, recursively, is each subsequent request in the waiting list. This situation forces every single request in the waiting list to be sent to the backend in a serial manner.
Serialized requests should be avoided because their performance is normally poorer than
sending multiple requests in parallel. The built-in vcl_backend_response subroutine avoids
request serialization.
VCL provides subroutines that allow you to affect the handling of any single request almost anywhere in the execution chain. As with any other programming language, this flexibility has both pros and cons.
This book is not a complete reference guide to how you can deal with every possible
scenario in VCL, but if you master the basics of VCL you can solve complex problems that
nobody has thought about before. And you can usually do it without requiring too many
different sources of documentation.
Whenever you are working on VCL, you should think about exactly what the line you are writing has to do. The best VCL is built from many independent sections that do not interfere with each other more than they have to.
Remember that there is a built-in VCL. If your own VCL code does not reach a return
statement, the built-in VCL subroutine is executed after yours. If you just need a little
modification of a subroutine, you can use the code from
{varnish-source-code}/bin/varnishd/builtin.vcl as a template.
8 VCL Subroutines
• Typical subroutines to customize: vcl_recv, vcl_pass, vcl_backend_fetch,
vcl_backend_response, vcl_hash, vcl_hit, vcl_miss, vcl_deliver, and vcl_synth
• If your VCL subroutine reaches a return statement, the built-in VCL subroutine is skipped
• The built-in VCL subroutines are always appended to yours
This chapter covers the VCL subroutines where you customize the behavior of Varnish. VCL subroutines can be used to: add custom headers, change the appearance of the Varnish error message, add HTTP redirect features in Varnish, purge content, and define what parts of a cached object make it unique. After this chapter, you should know where to add your custom policies, and you will be ready to dive into more advanced features of Varnish and VCL.
Note
It is strongly advised to let the built-in subroutines run whenever possible. The built-in subroutines are designed with safety in mind, which often means that they handle any flaws in your VCL code in a reasonable manner.
Tip
Looking at the code of built-in subroutines can help you to understand how to build
your own VCL code. Built-in subroutines are in the file
/usr/share/doc/varnish/examples/builtin.vcl.gz or
{varnish-source-code}/bin/varnishd/builtin.vcl. The first location may change
depending on your distro.
vcl_recv is the first VCL subroutine executed, right after Varnish has parsed the client
request into its basic data structure. vcl_recv has four main uses:
1. Modifying the client data to reduce cache diversity. E.g., removing any leading "www."
in the Host: header.
2. Deciding which web server to use.
3. Deciding caching policy based on client data. For example; no caching POST requests
but only caching specific URLs.
4. Executing re-write rules needed for specific web applications.
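For example, a minimal sketch of use 3, deciding caching policy based on client data:

```vcl
sub vcl_recv {
    # POST requests usually carry user-specific data; do not cache them
    if (req.method == "POST") {
        return (pass);
    }
}
```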
Tip
The built-in vcl_recv subroutine may not cache everything you want, but it is often better not to cache some content than to deliver the wrong content to the wrong user. There are exceptions, of course, but if you cannot understand why the default VCL does not let you cache some content, it is almost always worth investigating why instead of overriding it.
sub vcl_recv {
if (req.method == "PRI") {
/* We do not support SPDY or HTTP/2.0 */
return (synth(405));
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    /* ... (rest of the built-in vcl_recv elided) ... */
}
sub vcl_recv {
    if (req.http.User-Agent ~ "iPad" ||
        req.http.User-Agent ~ "iPhone" ||
        req.http.User-Agent ~ "Android") {
        set req.http.X-Device = "mobile";
    }
}
Note
If you do use Vary: X-Device, you might want to send Vary: User-Agent to the
users after Varnish has used it. Otherwise, intermediary caches will not know that the
page looks different for different devices.
1. Copy the Host header field (req.http.Host) and URL (req.url) to two new request
headers: req.http.x-host and req.http.x-url.
2. Ensure that www.example.com and example.com are cached as one, using regsub().
3. Rewrite all URLs under http://sport.example.com to http://example.com/sport/. For example:
http://sport.example.com/index.html to http://example.com/sport/index.html.
4. Use HTTPie to verify the result.
Tip
Remember that man vcl contains a reference manual with the syntax and details of functions such as regsub(str, regex, sub). We recommend that you leave the default VCL file untouched and create a new file for your VCL code. Remember to update the location of the VCL file in the Varnish configuration file and reload it.
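One possible sketch for steps 1-3 of the exercise above (the hostnames are the exercise's own examples; the official solution may differ):

```vcl
sub vcl_recv {
    # step 1: keep copies of the original Host header and URL
    set req.http.x-host = req.http.Host;
    set req.http.x-url = req.url;

    # step 2: cache www.example.com and example.com as one
    set req.http.Host = regsub(req.http.Host, "^www\.", "");

    # step 3: fold sport.example.com into example.com/sport/
    if (req.http.Host == "sport.example.com") {
        set req.http.Host = "example.com";
        set req.url = "/sport" + req.url;
    }
}
```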
sub vcl_pass {
return (fetch);
}
The vcl_pass subroutine is called after a previous subroutine returns the pass action. This action sets the request in pass mode. vcl_pass typically serves as an important catch-all for features you have implemented in vcl_hit and vcl_miss.
vcl_pass may return three different actions: fetch, synth, or restart. When returning the fetch action, the ongoing request proceeds in pass mode. Objects fetched by requests in pass mode are not cached, but are passed to the client. The synth and restart return actions call their corresponding subroutines.
8.2.1 hit-for-pass
Some requested objects should not be cached. A typical example is a page that contains the Set-Cookie response header and therefore must be delivered only to the client that requested it. In this case, you can tell Varnish to create a hit-for-pass object and store it in the cache, instead of storing the fetched object. Subsequent requests are then processed in pass mode.
When an object should not be cached, the beresp.uncacheable variable is set to true. As a result, the cacher process keeps a hash reference to the hit-for-pass object. In this way, lookup operations for requests translating to that hash find a hit-for-pass object. Such requests are handed over to the vcl_pass subroutine, and proceed in pass mode.
Like any other cached object, hit-for-pass objects have a TTL. Once the object's TTL has elapsed, the object is removed from the cache.
sub vcl_hash {
hash_data(req.url);
if (req.http.host) {
hash_data(req.http.host);
} else {
hash_data(server.ip);
}
return (lookup);
}
vcl_hash defines the hash key to be used for a cached object. Hash keys differentiate one
cached object from another. The default VCL for vcl_hash adds the hostname or IP address,
and the requested URL to the cache hash.
One usage of vcl_hash is to add a user-name in the cache hash to identify user-specific
data. However, be warned that caching user-data should only be done cautiously. A better
alternative might be to hash cache objects per session instead.
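As a sketch of the session-based alternative, assuming a hypothetical sessionid cookie (both the cookie name and the regular expression are illustrative assumptions):

```vcl
sub vcl_hash {
    # add the session identifier to the hash, so each session gets
    # its own cache object; the cookie name is an assumption
    if (req.http.Cookie ~ "sessionid=") {
        hash_data(regsub(req.http.Cookie, "^.*sessionid=([^;]*).*$", "\1"));
    }
}
```

Because this subroutine does not return, the built-in vcl_hash still runs afterwards, adding req.url and the Host header to the hash and returning lookup.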
The vcl_hash subroutine returns the lookup action keyword. Unlike other action keywords,
lookup is an operation, not a subroutine. The next state to visit after vcl_hash depends on
what lookup finds in the cache.
When the lookup operation does not match any hash, it creates an object with a busy flag and inserts it in the cache. Then the request is sent to the vcl_miss subroutine. The busy flag is removed once the request is handled, and the object is updated with the response from the backend.
Subsequent similar requests that hit busy-flagged objects are sent to a waiting list. This waiting list is designed to improve response performance, and it is explained in the Waiting State section.
Note
One cache hash may refer to one or many object variations. Object variations are created based on the Vary header field. It is better practice to keep several variations under one cache hash than to create one hash per variation.
sub vcl_hit {
if (obj.ttl >= 0s) {
// A pure unadultered hit, deliver it
return (deliver);
}
if (obj.ttl + obj.grace > 0s) {
// Object is in grace, deliver it
// Automatically triggers a background fetch
return (deliver);
}
// fetch & deliver once we get the result
return (fetch);
}
The vcl_hit subroutine typically terminates by calling return() with one of the following keywords: deliver, restart, or synth.
deliver returns control to vcl_deliver if the TTL + grace time of an object has not elapsed. If the elapsed time is more than the TTL, but less than the TTL + grace time, then deliver triggers a background fetch in parallel with vcl_deliver. The background fetch is an asynchronous call that inserts a fresher copy of the requested object in the cache. Grace time is explained in the Grace Mode section.
restart restarts the transaction and increases the restart counter. If the number of restarts is higher than the max_restarts parameter, Varnish emits a guru meditation error.
synth(status code, reason) returns the specified status code to the client and abandons the request.
sub vcl_miss {
return (fetch);
}
The subroutines vcl_hit and vcl_miss are closely related. It is rare that you customize them, because modification of HTTP request headers is typically done in vcl_recv. However, if you do not wish to send the X-Varnish header to the backend server, you can remove it in vcl_miss or vcl_pass. In that case, use unset bereq.http.x-varnish;.
sub vcl_deliver {
return (deliver);
}
The vcl_deliver subroutine is simple, but it is also a very useful place to modify the output of Varnish. If you need to remove a header, or add one that is not supposed to be stored in the cache, vcl_deliver is the place to do it.
The variables most useful and common to modify in vcl_deliver are:
resp.http.*
Headers that are sent to the client. They can be set and unset.
resp.status
The status code (200, 404, 503, etc).
resp.reason
The HTTP status message that is returned to the client.
obj.hits
The count of cache-hits on this object. Therefore, a value of 0 indicates a miss. This
variable can be evaluated to easily reveal whether a response comes from a cache hit or
miss.
req.restarts
The number of restarts issued in VCL - 0 if none were made.
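A common sketch using obj.hits in vcl_deliver to expose hit/miss status to clients (the header name X-Cache is just a convention, not required by Varnish):

```vcl
sub vcl_deliver {
    # obj.hits is 0 on a miss, greater than 0 on a hit
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
```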
vcl/default-vcl_synth.vcl:
sub vcl_synth {
set resp.http.Content-Type = "text/html; charset=utf-8";
set resp.http.Retry-After = "5";
synthetic( {"<!DOCTYPE html>
<html>
<head>
<title>"} + resp.status + " " + resp.reason + {"</title>
</head>
<body>
<h1>Error "} + resp.status + " " + resp.reason + {"</h1>
<p>"} + resp.reason + {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} + req.xid + {"</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
"} );
return (deliver);
}
You can create synthetic responses, e.g., personalized error messages, in vcl_synth. To call this subroutine, return the synth action from another subroutine, for example: return (synth(405, "Method not allowed"));
Note
From vcl/default-vcl_synth.vcl, note that {" and "} can be used to make multi-line strings. This is not limited to the synthetic() function, but can be used anywhere.
Note
A vcl_synth defined object is never stored in cache, contrary to a
vcl_backend_error defined object, which may end up in cache. vcl_synth and
vcl_backend_error replace vcl_error from Varnish 3.
sub vcl_recv {
if (req.http.host == "www.example.com") {
set req.http.location = "http://example.com" + req.url;
return (synth(750, "Permanently moved"));
}
}
sub vcl_synth {
if (resp.status == 750) {
set resp.http.location = req.http.location;
set resp.status = 301;
return (deliver);
}
}
Redirecting with VCL is fairly easy, and fast. Basic HTTP redirects work when the HTTP response is either 301 Moved Permanently or 302 Found. These responses have a Location header field telling the web browser where to redirect.
Note
The 301 response can affect how browsers prioritize history and how search engines
treat the content. 302 responses are temporary and do not affect search engines as
301 responses do.
If you need help, see Solution: Modify the HTTP response header fields.
9 Cache Invalidation
• Cache invalidation is an important part of your cache policy
• Varnish automatically invalidates expired objects
• You can proactively invalidate objects with Varnish
• You should define your cache invalidation rules before caching objects, especially in production environments
1. HTTP PURGE
2. Banning
3. Force cache misses
4. Surrogate keys
• For websites with the need for cache invalidation at a very large scale
• Varnish Software's implementation of surrogate keys
• Flexible cache invalidation based on cache tags
Table 17: Comparison between purge, soft purge, bans, force cache misses, and surrogate keys (hashtwo/xkey)

• Purge: targets one specific object (with all its variants); frees memory immediately; high scalability; available from VCL but not from the CLI; available in Varnish Cache.
• Soft purge: targets one specific object (with all its variants); frees memory after grace time; high scalability; available from VCL but not from the CLI; available in Varnish Cache.
• Bans: target regex patterns; free memory after the pattern is checked and matched; high scalability if used properly; available from both the CLI and VCL; available in Varnish Cache.
• Force cache misses: target one specific object (with all its variants); do not free memory; high scalability; available from VCL but not from the CLI; available in Varnish Cache.
• Surrogate keys: target all objects with a common hashtwo/xkey key; free memory immediately; high scalability; available from VCL but not from the CLI; available via the hashtwo VMOD in Varnish Plus 4.0 or the xkey VMOD in Varnish Cache 4.1.
Whenever you deal with caching, you eventually have to deal with the challenge of cache
invalidation, or content update. Varnish has different mechanisms to address this
challenge, but which one should you use?
There is rarely a need to pick only one solution, as you can combine several of them.
However, the following guidelines can help you choose:
• If you need to invalidate more than one item at a time, consider using bans or
hashtwo/xkey.
• If it takes a long time to pull content from the backend into Varnish, consider forcing
cache misses by using req.hash_always_miss.
The rest of the chapter teaches you more about these cache invalidation mechanisms.
Note
Purge and hashtwo/xkey work very similarly. The main difference is that they act on
different hash keys.
A purge is what happens when you pick out an object from the cache and discard it along
with its variants. A resource can exist in multiple Vary:-variants. For example, you could
have a desktop version, a tablet version and a smartphone version of your site, and use the
Vary HTTP header field in combination with device detection to store different variants of the
same resource.
Usually a purge is invoked through HTTP with the method PURGE. An HTTP PURGE is just
another request method, like HTTP GET. Actually, you can name the purge method whatever
you like, but PURGE has become the de facto naming standard. Squid, for example, uses the
PURGE method name for the same purpose.
Purges apply to a specific object, since they use the same lookup operation as in vcl_hash.
Therefore, purges find and remove objects really fast!
There are, however, two clear downsides. First, purges cannot use regular expressions, and
second, purges evict content from the cache regardless of the availability of the backend.
That means that if you purge some objects while the backend is down, Varnish ends up with
no copy of the content.
• You may add actions to be executed once the object and its variants are purged
• Called after the purge has been executed
sub vcl_purge {
return (synth(200, "Purged"));
}
Note
Cache invalidation with purges is done by calling return (purge); from vcl_recv in
Varnish 4. The keyword purge; from Varnish 3 has been retired.
sub vcl_recv {
if (req.method == "PURGE"){
return (purge);
}
}
In the example above, return (purge) ends execution of vcl_recv and jumps to
vcl_hash. When vcl_hash calls return(lookup), Varnish purges the object and then calls
vcl_purge.
You can test this code with HTTPie by issuing:
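A sketch of such a test, assuming Varnish listens on localhost port 80 and the article is cached under /article.php (both the host and the path are assumptions, not given here):

```shell
# Prime the cache, then purge the object.
http GET http://localhost/article.php
http PURGE http://localhost/article.php
```

The first GET inserts the object into the cache; the PURGE request then triggers the return (purge) branch in vcl_recv.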
Alternatively, you can test it with varnishtest as in the subsection PURGE in varnishtest.
In order to control the IP addresses that are allowed to send PURGE, you can use Access
Control Lists (ACLs). A purge example using ACLs is in the Access Control Lists (ACLs)
section.
• Send a PURGE request to Varnish from your backend server after an article is published.
You are provided with article.php, which fakes an article. It is recommended to create a
separate php file to implement purging.
article.php
<?php
header("Cache-Control: max-age=10");
$utc = new DateTimeZone("UTC");
$date = new DateTime("now", $utc);
$now = $date->format( DateTime::RFC2822 );
?>
If you need help, see Solution: PURGE an article from the backend.
Tip
Remember to place your php files under /var/www/html/.
acl purgers {
"127.0.0.1";
"192.168.0.0"/24;
}
sub vcl_recv {
# allow PURGE from localhost and 192.168.0...
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Purging not allowed for " + client.ip));
}
return (purge);
}
}
sub vcl_purge {
set req.method = "GET";
return (restart);
}
The restart return action allows Varnish to re-run the VCL state machine with different
variables. This is useful in combination with PURGE, because a purged object can be
immediately replaced by a freshly fetched one.
Every time a restart occurs, Varnish increments the req.restarts counter. If the number of
restarts exceeds the max_restarts parameter, Varnish emits a guru meditation error.
In this way, Varnish safeguards against infinite loops.
Warning
Restarts are likely to cause a hit against the backend, so do not increase
max_restarts thoughtlessly.
9.3 Softpurge
• Sets TTL to 0
• Allows Varnish to serve stale content to users if the backend is unavailable
• Asynchronous and automatic backend fetching to update object
Softpurge is a cache invalidation mechanism that sets the TTL to 0 but keeps the grace
value of a cached object. This is useful if you want to keep building responses from the
cached object while updating it.
Softpurge is a VMOD part of varnish-modules https://github.com/varnish/varnish-modules.
For installation and usage details, please refer to its own documentation
https://github.com/varnish/varnish-modules/blob/master/docs/vmod_softpurge.rst.
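A minimal usage sketch, assuming the VMOD is installed and that softpurge is triggered by the PURGE request method (the method name and response texts here are illustrative assumptions):

```vcl
import softpurge;

sub vcl_hit {
    if (req.method == "PURGE") {
        # Set the object's TTL to 0, but keep its grace value.
        softpurge.softpurge();
        return (synth(200, "Successful softpurge"));
    }
}

sub vcl_miss {
    if (req.method == "PURGE") {
        softpurge.softpurge();
        return (synth(404, "Object not in cache"));
    }
}
```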
Tip
The xkey VMOD has the softpurge functionality too.
9.4 Banning
• Use bans to invalidate cached objects on cache hits
• Frees memory when ban patterns are checked and matched
• Examples in the varnishadm command line interface:
• Example in VCL:
• ban("req.url ~ /foo");
sub vcl_recv {
if (req.method == "BAN") {
ban("req.http.host == " + req.http.host +
" && req.url == " + req.url);
# Throw a synthetic page so the request won't go to the backend.
return(synth(200, "Ban added"));
}
}
Banning in the context of Varnish refers to adding a ban expression that prohibits Varnish
from serving certain objects from the cache. Ban expressions are most useful when they use
regular expressions.
Bans work on objects already in the cache; they do not prevent new content from
entering the cache or being served. Cached objects that match a ban are marked as
obsolete. Obsolete objects are expunged by the expiry thread like any other object with
obj.ttl == 0.
Ban expressions match against req.* or obj.* variables. Think of a ban expression as:
"the requested URL starts with /sport", or "the cached object has a header field with a value
matching lighttpd". You can add ban expressions in three ways: 1) in VCL code, 2) with a
customized HTTP request method, or 3) by issuing commands in the varnishadm CLI.
Ban expressions are inserted into a ban-list. The ban-list contains:
• ID of the ban,
• timestamp when the ban entered the ban-list,
• counter of objects that have matched the ban expression,
• a C flag for completed that indicates whether a ban is invalid because it is duplicated,
• the ban expression.
To inspect the current ban-list, issue the ban.list command in the CLI:
Varnish tests bans whenever a request hits a cached object. A cached object is checked
against bans added after the last ban it was checked against. This means that each object is
checked against a given ban expression at most once.
Bans that match only against obj.* are also checked by a background worker thread called
the ban lurker. The parameter ban_lurker_sleep controls how often the ban lurker tests
obj.* bans. The ban lurker can be disabled by setting ban_lurker_sleep to 0.
Bans can free memory in a very scalable manner if used properly. Bans free memory only
after a ban expression hits an object. However, since bans do not prevent new backend
responses from being inserted into the cache, client requests that trigger the eviction of an
object will most likely insert a new one matching the ban. Therefore, ban lurker banning is
more effective at freeing memory, as we shall see next.
Note
You should avoid ban expressions that match against req.*, because these
expressions are tested only by client requests, not by the ban lurker. In other words, a
req.* ban expression is removed from the ban-list only after a request matches it.
Consequently, you run the risk of accumulating a very large number of ban
expressions, which might impact CPU usage and thereby performance.
Therefore, we recommend you to avoid req.* variables in your ban expressions, and
to use obj.* variables instead. Ban expressions using only obj.* are called
lurker-friendly bans.
Note
If the cache is completely empty, only the last added ban stays in the ban-list.
Tip
You can also execute ban expressions via the Varnish Administration Console (VAC).
Figure 25: Executing ban expressions via the Varnish Administration Console (VAC).
Ban expressions are checked in two cases: 1) when a request hits a cached object, or 2)
when the ban lurker wakes up. The first case is efficient only if you know that the cached
objects to be banned are frequently accessed. Otherwise, you might accumulate a lot of ban
expressions in the ban-list that are never checked. The second case is a better alternative
because the ban lurker can help you keep the ban-list at a manageable size. Therefore, we
recommend you to create ban expressions that are checked by the ban lurker. Such ban
expressions are called lurker-friendly bans.
Lurker-friendly ban expressions are those that use only obj.* and no req.* variables.
Since lurker-friendly ban expressions lack req.*, you might need to copy some of the
req.* contents into the obj structure. This copy operation is a mechanism to preserve the
context of the client request in the cached object. For example, you may want to copy
useful parts of the client context, such as the requested URL, from req to obj.
The following snippet shows an example on how to preserve the context of a client request
in the cached object:
sub vcl_backend_response {
set beresp.http.x-url = bereq.url;
}
sub vcl_deliver {
# The X-Url header is for internal use only
unset resp.http.x-url;
}
Now imagine that you have just changed the blog post template, which requires invalidating
all blog posts that have been cached. For this you can issue a ban such as:
Since it uses a lurker-friendly ban expression, the ban inserted in the ban-list will be
gradually evaluated against all cached objects until all blog posts are invalidated. The
snippet below shows how to insert the same expression into the ban-list in the vcl_recv
subroutine:
sub vcl_recv {
if (req.method == "BAN") {
# Reconstructed: ban on the x-url header copied in vcl_backend_response above
ban("obj.http.x-url ~ " + req.url);
return (synth(200, "Ban added"));
}
}
Setting a request in pass mode instructs Varnish to always ask a backend for content,
without storing the fetched object in the cache. vcl_purge removes old content, but what
if the web server is down?
Setting req.hash_always_miss to true tells Varnish to look the content up in cache, but
always miss. This means that Varnish first calls vcl_miss, then (presumably) fetches
the content from the backend, caches the updated object, and delivers the updated content.
The distinctive behavior of req.hash_always_miss occurs when the backend server is down
or unresponsive. In this case, the current cached object is untouched. Therefore, client
requests that do not enable req.hash_always_miss keep getting the old and untouched
cached content.
Two important use cases for using req.hash_always_miss are when you want to: 1) control
who takes the penalty for waiting around for the updated content (e.g. a script you control),
and 2) ensure that content is not evicted before it is updated.
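The first use case can be sketched as follows, where a custom REFRESH request method (an assumption for illustration, not a standard method) lets a script you control take the penalty of fetching updated content:

```vcl
sub vcl_recv {
    if (req.method == "REFRESH") {
        # Look the object up, but always miss, so a fresh copy
        # is fetched from the backend and inserted into the cache.
        set req.hash_always_miss = true;
        # Forward the request to the backend as a normal GET.
        set req.method = "GET";
    }
}
```

Ordinary client requests, which do not set req.hash_always_miss, keep getting the old cached copy until the refresh completes.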
Note
Forcing cache misses does not evict old content. This causes Varnish to keep
multiple copies of the content in cache. In such cases, the newest copy is always used.
Keep in mind that duplicated objects stay in cache as long as their time-to-live is positive.
The idea is that you can use any arbitrary string for cache invalidation. You can then key
your cached objects on, for example, product ID or article ID. In this way, when you update
the price of a certain product or a specific article, you have a key to evict all those objects
from the cache.
So far, we have discussed purges and bans as methods for cache invalidation. Two important
distinctions between them is that purges remove a single object (with its variants), whereas
bans perform cache invalidation based on matching expressions. However, there are cases
where none of these mechanisms are optimal.
Hashtwo/xkey creates a second hash key that links cached objects based on cache tags.
These hash keys provide the means to invalidate cached objects that share a cache tag.
In practice, hashtwo/xkey creates cache invalidation patterns, which can be tested and
invalidated immediately, just as purges are. In addition, hashtwo/xkey is much more efficient
than bans for two reasons: 1) looking up hash keys is much more efficient than
traversing ban-lists, and 2) every tested ban expression has to be checked against every
object in the cache that is older than the ban itself.
The hashtwo and xkey VMODs are pre-built for supported versions and can be installed using
regular package managers from the Varnish Software repositories. Once your repository is
properly configured, as indicated in Solution: Install Varnish, issue the following commands
to install the hashtwo VMOD:
On Debian or Ubuntu:
Finally, you can use this VMOD by importing it in your VCL code:
import hashtwo;
Tip
The xkey VMOD has a softpurge function as well.
HTTP/1.1 200 OK
Server: Apache/2.2.15
X-HashTwo: 8155054
X-HashTwo: 166412
X-HashTwo: 234323
GET / HTTP/1.1
Host: www.example.com
X-HashTwo-Purge: 166412
import hashtwo;
sub vcl_recv {
if (req.http.X-HashTwo-Purge) {
if (hashtwo.purge(req.http.X-HashTwo-Purge) != 0) {
return (purge);
} else {
return (synth(404, "Key not found"));
}
}
}
On an e-commerce site the backend application adds the X-HashTwo HTTP header field for
every product that is included in a web page. The header for a certain page might look like
the one above. If you use xkey instead of hashtwo, you should rename that header to avoid
confusion.
Normally the backend is responsible for setting these headers. If you were to set them in
VCL, it would look something like this:
sub vcl_backend_response {
set beresp.http.X-HashTwo = "secondary_hash_key";
}
In the VCL code above, the hashtwo key to be purged is the value in the X-HashTwo-Purge
HTTP header. In order to keep the web pages in sync with the database, you can set up a
trigger in your database. In that way, when a product is updated, an HTTP request towards
Varnish is triggered. For example, the request above invalidates every cached object with
the matching hashtwo header in hashtwo.purge(req.http.X-HashTwo-Purge) or
xkey.purge(req.http.X-Key-Purge) for the xkey VMOD.
After purging, Varnish should respond something like:
Warning
You should protect purges with ACLs from unauthorized hosts.
10 Saving a Request
This chapter is for the system administration course only
Table 19 shows what each of these mechanisms saves. This chapter
explains how to make your Varnish setup more robust by using these mechanisms.
10.1 Directors
• Loadable VMOD
• Contains 1 or more backends
• All backends must be known
• Selection methods:
• round-robin
• fallback
• random
vcl 4.0;
import directors;
backend one {
.host = "localhost";
.port = "80";
}
backend two {
.host = "127.0.0.1";
.port = "81";
}
sub vcl_init {
new round_robin_director = directors.round_robin();
round_robin_director.add_backend(one);
round_robin_director.add_backend(two);
}
sub vcl_recv {
set req.backend_hint = round_robin_director.backend();
}
Varnish can have several backends defined, and it can set them together into clusters for
load balancing purposes. Backend directors, usually just called directors, provide logical
groupings of similar web servers by re-using previously defined backends. A director must
have a name.
There are several different director selection methods available, they are: random,
round-robin, fallback, and hash. The next backend to be selected depends on the selection
method. You can specify the timeout before unused backend connections are closed by
setting the backend_idle_timeout parameter. How to tune this and other parameters is
further explained in the Tuning section.
A round-robin director takes only a backend list as argument. This director type picks the
first backend for the first request, then the second backend for the second request, and so
on. Once the last backend has been selected, selection starts again from the top. If
a health probe has marked a backend as sick, the round-robin director skips it.
A fallback director always picks the first backend unless it is sick, in which case it picks
the next backend, and so on. A director is also considered a backend, so you can actually
stack directors. You could, for instance, have directors for active and passive clusters, and
put those directors behind a fallback director.
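The stacking described above can be sketched as follows; the backend addresses and the director names are illustrative assumptions, not part of the original example:

```vcl
vcl 4.0;
import directors;

backend active1 { .host = "127.0.0.1"; .port = "8081"; }
backend active2 { .host = "127.0.0.1"; .port = "8082"; }
backend passive1 { .host = "127.0.0.1"; .port = "8083"; }

sub vcl_init {
    # Each cluster is a director of its own.
    new active = directors.round_robin();
    active.add_backend(active1);
    active.add_backend(active2);

    new passive = directors.round_robin();
    passive.add_backend(passive1);

    # The fallback director prefers the active cluster and only
    # switches to the passive one when the active cluster is sick.
    new cluster = directors.fallback();
    cluster.add_backend(active.backend());
    cluster.add_backend(passive.backend());
}

sub vcl_recv {
    set req.backend_hint = cluster.backend();
}
```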
Random directors are seeded with either a random number or a hash key. The next section
explains their commonalities and differences.
Note
Health probes are explained in the Health Checks section.
Note
Directors are defined as loadable VMODs in Varnish 4. See the vmod_directors man
page for more information and examples.
Warning
If you declare backend servers but do not use them, varnishd returns an error by
default. You can avoid this situation by turning off the runtime parameter
vcc_err_unref. However, this practice is strongly discouraged; instead, we advise you
to declare only what you use.
sub vcl_init {
new h = directors.hash();
h.add_backend(one, 1); // backend 'one' with weight '1'
h.add_backend(two, 1); // backend 'two' with weight '1'
}
sub vcl_recv {
// pick a backend based on the cookie header of the client
set req.backend_hint = h.backend(req.http.cookie);
}
The random director picks a backend randomly. It has one per-backend parameter called
weight, which provides a mechanism for balancing the selection of the backends. The
selection mechanism of the random director may be regarded as traffic distribution if the
amount of traffic is the same per request and per backend. The random director also has a
director-wide counter called retries, which increases every time the director selects a sick
backend.
Both the random and the hash director select a backend randomly. The difference between
the two is the seed they use: the random director is seeded with a random number,
whereas the hash director is seeded with a hash key.
Hash directors typically use the requested URL or the client identity (e.g. session cookie) to
compute the hash key. Since the hash key is always the same for a given input, the output of
the hash director is always the same for a given hash key. Therefore, hash directors always
select the same backend for a given input. This is also known as sticky session load
balancing. You can learn more about sticky sessions in
https://info.varnish-software.com/blog/proper-sticky-session-load-balancing-varnish.
Hash directors are useful to load balance in front of other Varnish caches or other web
accelerators. In this way, cached objects are not duplicated across different cache servers.
Note
In Varnish 3 there is a client director type, which is removed in Varnish 4. This client
director type is a special case of the hash director. Therefore, the semantics of a client
director type are achieved using hash.backend(client.identity).
backend server1 {
.host = "server1.example.com";
.probe = {
.url = "/healthtest";
.timeout = 1s;
.interval = 4s;
.window = 5;
.threshold = 3;
}
}
You can define a health check for each backend. A health check defines a probe to verify
whether a backend replies on a given URL every given interval.
The above example causes Varnish to send a request to
http://server1.example.com/healthtest every 4 seconds. This probe requires that at least 3
requests succeed within a sliding window of 5 requests.
Varnish initializes backends as sick until probes prove otherwise. .initial is another
variable of .probe; it defines how many times the probe must succeed before the backend
is marked healthy. The default value of .initial is .threshold - 1.
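A backend that should be marked healthy faster at startup might be probed like this sketch; the values are illustrative assumptions:

```vcl
backend server1 {
    .host = "server1.example.com";
    .probe = {
        .url = "/healthtest";
        .interval = 4s;
        .window = 5;
        .threshold = 3;
        # Require only one successful probe at startup,
        # instead of the default .threshold - 1 (here: 2).
        .initial = 1;
    }
}
```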
When Varnish has no healthy backend available, it attempts to use a graced copy of the
cached object that a request is looking for. The next section Grace Mode explains this
concept in detail.
You can also declare standalone probes and reuse them for several backends. It is
particularly useful when you use directors with identical behaviors, or when you use the
same health check procedure across different web applications.
import directors;
probe www_probe {
.url = "/health";
}
backend www1 {
.host = "localhost";
.port = "8081";
.probe = www_probe;
}
backend www2 {
.host = "localhost";
.port = "8082";
.probe = www_probe;
}
sub vcl_init {
new www = directors.round_robin();
www.add_backend(www1);
www.add_backend(www2);
}
Note
Varnish does not send a Host header with health checks. If you need that, you can define
an entire request using .request instead of .url.
backend one {
.host = "example.com";
.probe = {
.request =
"GET / HTTP/1.1"
"Host: www.foo.bar"
"Connection: close";
}
}
Note
The healthy function is implemented as a VMOD in Varnish 4. req.backend.healthy
from Varnish 3 is replaced by std.healthy(req.backend_hint). Do not forget to
include the import line: import std;
• varnishadm backend.list:
Every health test is recorded in the shared memory log with VXID 0 (see Transactions). If
you want to see Backend_health records in varnishlog, you have to change the default
grouping by VXID to raw:
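A sketch of the command, assuming a running Varnish instance whose shared memory log varnishlog can reach:

```shell
# Group records raw (one record per line) and show only the
# Backend_health tag.
varnishlog -g raw -i Backend_health
```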
Backend_health records are led by 0, which is the VXID number. The rest of the probe
record is in the following format:
Backend_health - %s %s %s %u %u %u %f %f %s
| | | | | | | | |
| | | | | | | | +- Probe HTTP response
| | | | | | | +---- Average response time
| | | | | | +------- Response time
| | | | | +---------- Probe window size
| | | | +------------- Probe threshold level
| | | +---------------- Number of good probes in window
| | +------------------- Probe window bits
| +---------------------- Status message
+------------------------- Backend name
Most of the fields are self-descriptive, but we clarify next the Probe window bits and Status
message.
The Probe window bits field details the last probe with the following format:
%c %c %c %c %c %c %c
| | | | | | |
| | | | | | +- H -- Happy
| | | | | +---- R -- Good Received (response from the backend received)
| | | | +------- r -- Error Received (no response from the backend)
| | | +---------- X -- Good Xmit (Request to test backend sent)
| | +------------- x -- Error Xmit (Request to test backend not be sent)
| +---------------- 6 -- Good IPv6
+------------------- 4 -- Good IPv4
• Still healthy
• Back healthy
• Still sick
• Went sick
Note that Still indicates unchanged state, Back and Went indicate a change of state. The
second word, healthy or sick, indicates the present state.
Another method to analyze health probes is to call varnishadm debug.health in Varnish
4.0, or varnishadm backend.list -p in Varnish 4.1. This command first presents data from
the last Backend_health log record:
Oldest Newest
================================================================
44444444444444444444444444444444444444444444--44----444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX--XX----XXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR--RR----RRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HH----HHHHHHHHHHHH Happy
The main goal of grace mode is to prevent requests from piling up whenever a popular object
has expired in cache. To better understand grace mode, recall Figure 2, which shows the
lifetime of cached objects. When possible, Varnish delivers a fresh object; otherwise Varnish
builds a response from a stale object and triggers an asynchronous refresh request. This
procedure is also known as stale-while-revalidate.
The typical way to use grace is to keep an object for several hours after its TTL has elapsed.
In this way, Varnish always has a copy to deliver immediately, while fetching a new
object asynchronously. This asynchronous fetch ensures that graced objects do not get older
than a few seconds, unless no backends are available.
The following VCL code illustrates a typical use of grace:
sub vcl_hit {
if (obj.ttl >= 0s) {
# Normal hit
return (deliver);
} elsif (std.healthy(req.backend_hint)) {
# The backend is healthy
# Fetch the object from the backend
return (fetch);
} else {
# No fresh object and the backend is not healthy
if (obj.ttl + obj.grace > 0s) {
# Deliver graced object
# Automatically triggers a background fetch
return (deliver);
} else {
# No valid object to deliver
# No healthy backend to handle request
# Return error
return (synth(503, "Backend is down"));
}
}
}
Graced objects are those whose grace time has not yet expired. The grace time is stored
in obj.grace, whose default is 10 seconds. You can change this value in three ways:
Note
obj.ttl and obj.grace are countdown timers. Objects are valid in cache as long as
they have a positive remaining time equal to obj.ttl + obj.grace.
or set in VCL:
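A minimal sketch of setting the grace value in VCL; the two-hour value is an illustrative assumption:

```vcl
sub vcl_backend_response {
    # Keep delivering the stale object for up to two hours
    # after its TTL has expired.
    set beresp.grace = 2h;
}
```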
In this timeline example, it is assumed that the object is never refreshed. If you do not want
objects with a negative TTL to be delivered, set beresp.grace = 0s. The downside is that
this disables all grace functionality for the object, regardless of the reason.
#!/bin/sh
sleep 10
echo "Content-type: text/plain"
echo "Cache-control: max-age=10, stale-while-revalidate=20"
echo
echo "Hello world"
date
With this exercise you should see that, as long as the cached object is within its TTL, Varnish
delivers the cached object as normal. Once the TTL expires, Varnish delivers the graced
copy and asynchronously fetches an object from the backend. Therefore, about 10 seconds
after triggering the asynchronous fetch, an updated object is available in the cache.
sub vcl_backend_response {
if (beresp.status == 503) {
return (retry);
}
}
Note
In Varnish 3.0 it is possible to do return (restart) after the backend response
failed. This is now called return (retry), and jumps back up to vcl_backend_fetch.
• Backends with objects below the threshold can be selected to serve other
objects
• Backends with objects above the threshold are marked as sick for all objects
Saint mode complements regular health checks by marking backends sick for specific
objects. Saint mode is a VMOD that maintains a blacklist of objects and their related
backends. Each blacklisted object has a TTL, which denotes how long it stays in the blacklist.
If the number of blacklisted objects for a backend is below a threshold, the backend is
considered partially sick: requests for blacklisted objects may be sent to another backend.
When the number of blacklisted objects for a backend exceeds the threshold, the backend is
marked as sick for all requests.
vcl/saintmode.vcl below shows typical usage of saint mode. In this example, a request that
receives a 500-level response status is retried against another backend.
vcl 4.0;
import saintmode;
import directors;
sub vcl_init {
# create two saint mode backends with a threshold of 5 blacklisted objects
new sm1 = saintmode.saintmode(server1, 5);
new sm2 = saintmode.saintmode(server2, 5);
# reconstructed: group them behind a fallback director
new fb = directors.fallback();
fb.add_backend(sm1.backend());
fb.add_backend(sm2.backend());
}
sub vcl_backend_fetch {
# reconstructed: use the backend selected by the director
set bereq.backend = fb.backend();
}
sub vcl_backend_response {
if (beresp.status >= 500) {
# the failing backend is blacklisted 5 seconds
saintmode.blacklist(5s);
# retry request in a different backend
return (retry);
}
}
An alternative is to build the response from a stale object. For that, you would
return (abandon) in vcl_backend_response, restart the request in vcl_synth, and check
req.restarts in vcl_recv. To get a better idea of how to do this, take a look at the
stale-if-error snippet in
https://github.com/fgsch/vcl-snippets/blob/master/v4/stale-if-error.vcl.
The fine-grained checks of saint mode help spot problems in malfunctioning backends. For
example, if the request for the object foo returns a 200 OK HTTP response without content
(Content-Length: 0), you can blacklist that specific object for that specific backend. You
can also log the object with std.log and filter it in varnishlog.
Note
For more information, please refer to its own documentation in
https://github.com/varnish/varnish-modules/blob/master/docs/vmod_saintmode.rst.
Tip
Varnish only accepts hostnames for backend servers that resolve to at most
one IPv4 address and one IPv6 address. The parameter prefer_ipv6 defines which IP
address Varnish prefers.
# The 'local' ACL is assumed to be declared; a minimal declaration:
acl local {
"localhost";
}
sub vcl_recv {
if (req.method == "PURGE") {
if (client.ip ~ local) {
return (purge);
} else {
return (synth(405));
}
}
}
An Access Control List (ACL) declaration creates and initializes a named list of IP addresses
and ranges, which can later be used to match client or server IP addresses. ACLs can be
used for almost anything, but they are typically used to control which IP addresses are
allowed to send PURGE or ban requests, or to bypass the cache entirely.
You may also setup ACLs to differentiate how your Varnish servers behave. You can, for
example, have a single VCL program for different Varnish servers. In this case, the VCL
program evaluates server.ip and acts accordingly.
ACLs are fairly simple to create. A single IP address or hostname should be in quotation
marks, as "localhost". ACL uses the CIDR notation to specify IP addresses and their
associated routing prefixes. In Varnish's ACLs the slash "/" character is appended outside
the quoted IP address, for example "192.168.1.0"/24.
To exclude an IP address or range from an ACL, an exclamation mark "!" should precede
the quoted IP address, for example !"192.168.1.23". This is useful when, for example, you
want to include all the IP addresses in a range except the gateway.
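A sketch combining these notations; the addresses and the ACL name are illustrative assumptions:

```vcl
acl purgers {
    "localhost";
    # The whole /24 network...
    "192.168.1.0"/24;
    # ...except the gateway.
    !"192.168.1.1";
}
```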
Warning
If you declare ACLs, but do not use them, varnishd returns error by default. You can
avoid this situation by turning off the runtime parameter vcc_err_unref. However,
this practice is strongly discouraged. Instead, we advise to declare only what you use.
10.8 Compression
• Where to compress? backend or Varnish?
• Parameter to toggle: http_gzip_support
• VCL variables: beresp.do_gzip to compress and beresp.do_gunzip to decompress
sub vcl_backend_response {
if (beresp.http.content-type ~ "text") {
set beresp.do_gzip = true;
}
}
It is sensible to compress objects before storing them in cache. Objects can be compressed
either at the backend or at your Varnish server, so you have to decide where to do it.
Factors that you should take into consideration are:
• where to store the logic of what should be compressed and what not
• available CPU resources
Also, keep in mind that file formats such as JPEG, PNG, GIF and MP3 are already compressed,
so you should avoid compressing them again in Varnish.
By default, http_gzip_support is on, which means that Varnish follows the behavior
described in https://www.varnish-cache.org/docs/trunk/phk/gzip.html and
https://www.varnish-cache.org/docs/trunk/users-guide/compression.html. If you want to have
full control on what is compressed and when, set the http_gzip_support parameter to off,
and activate compression based on specific rules in your VCL code. Implement these rules in
vcl_backend_response and then set beresp.do_gzip or beresp.do_gunzip as the
example above.
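For example, such rules can be sketched as follows. The Content-Type patterns are illustrative assumptions; adjust them to your own content:

```
sub vcl_backend_response {
    # Compress only compressible types; JPEG, PNG, GIF, MP3, etc. are
    # already compressed and should be left alone.
    if (beresp.http.Content-Type ~ "text|javascript|json|xml") {
        set beresp.do_gzip = true;
    }
}
```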
If you compose your content using Edge Side Includes (ESI), you should know that ESI and
gzip work together. Next chapter explains how to compose your content using Varnish and
Edge Side Includes (ESI).
Note
Compression in Varnish uses and manipulates the Accept-Encoding and
Content-Encoding HTTP header fields. Etag validation might also be weakened.
Refer to https://www.varnish-cache.org/docs/trunk/phk/gzip.html and
https://www.varnish-cache.org/docs/trunk/users-guide/compression.html for all details
about compression.
11 Content Composition
This chapter is for the web-developer course only
This chapter teaches you how to glue content from independent sources into one web page.
• A front page
• Articles or sub-pages
• A login-box or "home bar"
• Static elements, like CSS, JavaScript and graphics
To truly utilize Varnish to its full potential, start by analyzing the structure of the website.
Ask yourself this:
• What makes web pages in your server different from each other?
• Do the differences apply to entire pages, or only to parts of them?
• How can I let Varnish know about those differences?
Beginning with the static elements should be easy. Previous chapters of this book cover how
to handle static elements. But how do you proceed with dynamic content?
An easy solution is to cache content only for users that are not logged in. For newspapers,
that is probably enough, but not for web-shops.
Web-shops re-use objects frequently. If you can isolate the user-specific bits, like the
shopping cart, you can cache the rest. You can even cache the shopping cart, if you tell
Varnish when to change it.
The most important lesson is to start with what you know.
11.2 Cookies
• Be careful when caching cookies!
• Cookies are frequently used to identify unique users, or user's choices.
• They can be used for anything from identifying a user-session in a web-shop to opting
for a mobile version of a web page.
• Varnish can handle cookies coming from two different sources:
• req.http.Cookie header field from clients
• beresp.http.Set-Cookie header field from servers
Note
If you need to handle cookies, consider using the cookie VMOD from
https://github.com/varnish/varnish-modules/blob/master/docs/vmod_cookie.rst. This
VMOD handles cookies with convenient parsing and formatting functions, without the
need for regular expressions.
Varnish uses a different hash value for each cached resource. Resources with several
representations, i.e. variations containing the Vary response header field, share the same
hash value in Varnish. Despite this common hash value, caching based on the
Vary: Cookie response header is not advised, because of its poor performance. For a more
detailed explanation on Vary, please refer to the Vary subsection.
Note
Consider using Edge Side Includes to let Varnish build responses that combine content
with and without cookies, i.e. combining caches and responses from the origin server.
• /common/ -- no cookies
• /user/ -- has user-cookies
• /voucher/ -- has only the voucher-cookie
• etc.
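With such a URL scheme, cookie handling can be sketched in vcl_recv. The URL prefixes are the hypothetical layout from the list above:

```
sub vcl_recv {
    if (req.url ~ "^/common/") {
        # Static, user-independent content: drop cookies so it caches well
        unset req.http.Cookie;
    }
    # Requests under /user/ and /voucher/ keep their cookies
}
```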
11.2.3 Exercise: Handle Cookies with Vary and hash_data with HTTPie
In this exercise you use two caching techniques: first Vary and then hash_data(). The
exercise uses the Cookie header field, but the same rules apply to any other field.
Prepare the testbed and test with HTTPie:
Vary: Part 1:
1. Write a VCL program to force Varnish to cache client requests with cookies.
2. Send two client requests for the same URL; one for user Alice and one for user Bob.
3. Does Varnish use different backend responses to build and deliver the response to the
client?
4. Make cookies.php send the Vary: Cookie response header field, then analyze the
response to the client.
5. Remove beresp.http.Vary in vcl_backend_response and see if Varnish still honors the
Vary header.
Vary: Part 2:
hash_data(): Part 1:
1. Write another VCL program or add conditions to differentiate requests handled by Vary
and hash_data().
2. Add hash_data(req.http.Cookie); in vcl_hash.
3. Check how multiple values of Cookie give individual cached objects.
hash_data(): Part 2:
1. Purge the cache again and check the result after using hash_data() instead of
Vary: Cookie.
This exercise is all about Vary and hash mechanisms. These mechanisms can also be tested
and learned through varnishtest. If you have time and are curious enough, please do the
Exercise: Handle Cookies with Vary and hash_data() in varnishtest. After solving these
exercises, you will understand very well how Vary and hash_data() work.
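As a starting point for the hash_data() part, the vcl_hash addition can look like the sketch below. Note that, without a return from your custom sub, the built-in vcl_hash still runs afterwards:

```
sub vcl_hash {
    # Each distinct Cookie value now yields its own cached object.
    hash_data(req.http.Cookie);
    # Without a return(lookup) here, the built-in vcl_hash still adds
    # req.url and the Host header to the hash afterwards.
}
```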
Edge Side Includes or ESI is a small markup language for dynamic web page assembly at the
reverse proxy level. The reverse proxy analyses the HTML code, parses ESI specific markup
and assembles the final result before flushing it to the client. Figure 27 depicts this process.
With ESI, Varnish can be used not only to deliver objects, but to glue them together. The
most typical use case for ESI is a news article with a most recent news box at the side. The
article itself is most likely written once and possibly never changed, and can be cached for a
long time. The box at the side with most recent news, however, changes frequently. With
ESI, the article can include a most recent news box with a different TTL.
When using ESI, Varnish fetches the news article from a web server, then parses the
<esi:include src="/url" /> ESI tag, and fetches the URL via a normal request. Either
finding it already cached or getting it from a web server and inserting it into cache.
The TTL of the ESI element can be 5 minutes while the article is cached for two days. Varnish
delivers the two different objects in one glued page. Thus, Varnish updates parts
independently, which makes it possible to combine content with different TTLs.
sub vcl_backend_response {
    set beresp.do_esi = true;
}
You can also strip off cookies per ESI element. This is done in vcl_recv.
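A sketch of stripping cookies for everything except the user-specific ESI fragment follows. The /esi-user URL prefix is a hypothetical name for this example:

```
sub vcl_recv {
    # Hypothetical layout: only the user box needs the Cookie header
    if (req.url !~ "^/esi-user") {
        unset req.http.Cookie;
    }
}
```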
Varnish only supports three ESI tags:
• <esi:include>: calls the page defined in the src attribute and replaces the ESI tag
with the content of src.
• <esi:remove>: removes any code inside this opening and closing tag.
• <!--esi (content) -->: Leaves (content) unparsed. E.g., the following does not
process the <esi:include> tag:
<!--esi
This ESI tag is not processed: <esi:include src="example">
-->
varnishtest is a useful tool to understand how ESI works. The subsection Understanding ESI
in varnishtest contains a Varnish Test Case (VTC) using ESI.
Note
Varnish outputs ESI parsing errors in varnishstat and varnishlog.
<HTML>
<BODY>
<?php header( 'Content-Type: text/html' ); ?>
The date and time is: <esi:include src="/cgi-bin/esi-date.cgi"/>
</BODY>
</HTML>

A minimal esi-date.cgi script to be included could be:

#! /bin/sh
echo 'Content-Type: text/plain'
echo ''
date
sub vcl_backend_response {
    if (bereq.url == "/esi-date.php") {
        set beresp.do_esi = true;  // Do ESI processing
        set beresp.ttl = 1m;       // Higher TTL for the main object
    } elsif (bereq.url == "/cgi-bin/esi-date.cgi") {
        set beresp.ttl = 30s;      // Lower TTL for the included object
    }
}
Then reload your VCL (see Table 6 for reload instructions) and issue the command
http http://localhost/esi-date.php. The output should show you how Varnish replaces
the ESI tag with the response from esi-date.cgi. Note the different TTLs from the glued
objects.
See the suggested solutions of Exercise: Handle Cookies with Vary and hash_data() in
varnishtest to get an idea on how to solve this exercise. Try to avoid return (hash); in
vcl_recv and return (deliver); in vcl_backend_response as much as you can. This is a
general rule to make safer Varnish setups.
During the exercise, make sure you understand all the cache mechanisms at play. You can
also try removing the Vary: Cookie header from esi-user.php.
You may also want to try PURGE. If so, you have to purge each of the objects, because
purging just /esi-top.php does not purge /esi-user.php.
During development of different web pages to be ESI-glued by Varnish, you might not need
Varnish all the time. One important reason for this, is to avoid caching during the
development phase. There is a solution based on JavaScript to interpret ESI syntax without
having to use Varnish at all. You can download the library at the following URL:
• http://www.catalystframework.org/calendar/static/2008/esi/ESI_Parser.tar.gz
Once downloaded, extract it in your code base, include esiparser.js and include the
following JavaScript code to trigger the ESI parser:
With AJAX it is by default not possible to send requests across domains. This is a
security restriction imposed by browsers. If this represents an issue for your web pages,
you can easily solve it by using Varnish and VCL.
function getMasqueraded()
{
    $("#result").load( "/masq/robots.txt" );
}
</script>
</head>
<body>
<h1>Cross-domain Ajax</h1>
<ul>
<li><a href="javascript:getNonMasqueraded();">
Test a non masqueraded cross-domain request
</a></li>
<li><a href="javascript:getMasqueraded();">
Test a masqueraded cross-domain request
</a></li>
</ul>
<h1>Result</h1>
<div id="result"></div>
</body>
</html>
Use the provided ajax.html page. Note that the function getNonMasqueraded() fails because
the origin is distinct from the google.com domain. The function getMasqueraded() can do the
job if proper VCL code handles it. Write the VCL code that masquerades the Ajax request to
http://www.google.com/robots.txt.
If you need help, see Solution: Write a VCL that masquerades XHR calls.
Page 218 Chapter 12 Varnish Plus Software Components
For more information about the complete Varnish Plus offer and their documentation, please
visit:
• https://www.varnish-software.com/what-is-varnish-plus
• https://www.varnish-software.com/resources/
• GUI
• API
• Super Fast Purger
The Varnish Administration Console (VAC) consists of a GUI and an API. VAC is most
commonly used in production environments where real-time graphs and statistics help
identify bottlenecks and issues within Varnish Cache servers. VAC is a management console
for groups of Varnish Cache servers, also known as cache groups. A cache group is a
collection of Varnish Cache servers that have identical configuration. Attributes on a cache
group include:
VAC distributes and stores VCL files for you. A parameter set is a list of Varnish cache
parameters. These parameters can be applied to one or more cache groups simultaneously,
as long as all cache groups consist of cache servers of the same version.
VAC ships with a JSON-based RESTful API to integrate your own systems with the VAC. All
actions performed via the user interface can be replicated with direct access to the API. This
includes fetching all real-time graph data.
The Super Fast Purger is a high performance cache invalidation delivery mechanism for
multiple installations of Varnish. Super Fast Purger is capable of distributing purge requests
to cache groups across data centers via the RESTful interface. Super Fast Purger uses HMAC
as a security mechanism to protect your purge requests and thus ensure data integrity.
In order to install VAC on either Debian/Ubuntu or Red Hat Enterprise, one would require
access to the Varnish Plus Software repository. As a Varnish Plus customer, you have access
to the installation guide document. This document has instructions to install, configure,
maintain, and troubleshoot your VAC installation. If you have any questions on how to set up
your repository or where to obtain the VAC installation guide, please ask the instructor or
send an email to [email protected].
Figures 26, 27 and 28 show screenshots of the GUI. You may also be
interested in trying the VAC demo at https://vacdemo.varnish-software.com. The instructor of
the course provides you with the credentials.
Varnish Custom Statistics (VCS) is our data stream management system (DSMS)
implementation for Varnish. VCS allows you to analyze the traffic from multiple Varnish
servers in near real-time to compute traffic statistics and detect critical conditions. This is
possible by continuously extracting transactions with the vcs-key tags in your VSL. Thus,
VCS does not slow down your Varnish servers.
You can add as many custom vcs-key tags as you need in your VCL code. This allows you to
define your own metrics.
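Tagging is done by logging vcs-key entries to the VSL with std.log(). The sketch below is an assumption of how such tags can be emitted; check the VCS documentation for the exact key format expected by vstatdprobe:

```
import std;

sub vcl_deliver {
    # Emit one vcs-key per dimension of interest; vstatdprobe extracts
    # these entries from the shared memory log.
    std.log("vcs-key:" + req.http.Host);
    std.log("vcs-key:" + req.http.Host + req.url);
}
```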
VCS can be used to produce statistical data or even apply complex event processing
techniques. Thus, VCS offers endless opportunities for tracking all aspects of websites'
behavior. Typical cases include:
• A/B testing
• Measuring click-through rate
• Track slow pages and cache misses
• Analyze what is "hot" right now in a news website
• Track changes in currency conversions in e-commerce
• Track changes in Stock Keeping Unit (SKU) behavior in e-commerce
• Track number of unique consumers of HLS/HDS/DASH video streams
VCS is a great tool when you want to test some functionality in your backend. For that, you
can separate your requests into different groups, handle their requests accordingly, analyze
the results and conclude whether your new functionality should be applied to all groups. This
type of test is called A/B testing. If you want to learn how to implement A/B testing in
Varnish, please refer to
https://info.varnish-software.com/blog/live-ab-testing-varnish-and-vcs.
Figure 32 and Figure 33 are screenshots of the VCS GUI. These screenshots are from the
demo on http://vcsdemo.varnish-software.com. Your instructor can provide you with credentials
to try the demo online.
Note
For further details on VCS, please look at its own documentation at
https://www.varnish-software.com/resources/.
VCS uses the time-based tumbling windows technique to segment the data stream into finite
parts. These windows are created based on the vcs-key tag that you specify in your VCL
code. Each window aggregates the data within a configurable period of time.
Table 20 shows the data model in VCS. This table is basically a representation of two
windows seen as two records in a conventional database. In this example, data shows two
windows of 30 seconds based on the example.com vcs-key. For presentation purposes, the
table is laid out as a database that grows from left to right.
The VCS data model has the following fields:
vcs-key
Common key name for the transactions making up this record
timestamp
Timestamp at the start of the window
n_req
Number of requests
n_req_uniq
Number of unique requests, if configured
n_miss
Number of backend requests (i.e. cache misses). The number of hits can be calculated as
n_hit = n_req - n_miss
avg_restarts
Average number of VCL restarts triggered per request
n_bodybytes
Total number of bytes transferred for the response bodies
ttfb_miss
Average time to first byte for requests that ended up with a backend request
ttfb_hit
Average time to first byte for requests that were served directly from the Varnish cache
resp_1xx -- resp_5xx
Counters for response status codes.
reqbytes
Number of bytes received from clients.
respbytes
Number of bytes transmitted to clients.
berespbytes
Number of bytes received from backends.
bereqbytes
Number of bytes transmitted to backends.
You can think of each window as a record of a traditional database that resides in memory.
This database is dynamic, since the engine of VCS updates it every time a new window
(record) is available. VCS provides an API to retrieve this data from the table above in JSON
format:
{
"example.com": [
{
"timestamp": "2013-09-18T09:58:30",
"n_req": 76,
"n_req_uniq": "NaN",
"n_miss": 1,
"avg_restarts": 0.000000,
"n_bodybytes": 10950,
"ttfb_miss": 0.000440,
"ttfb_hit": 0.000054,
"resp_1xx": 0,
"resp_2xx": 76,
"resp_3xx": 0,
"resp_4xx": 0,
"resp_5xx": 0,
...
},
{
"timestamp": "2013-09-18T09:58:00",
"n_req": 84,
"n_req_uniq": "NaN",
"n_miss": 0,
"avg_restarts": 0.000000,
"n_bodybytes": 12264,
"ttfb_miss": "NaN",
"ttfb_hit": 0.000048,
"resp_1xx": 0,
"resp_2xx": 84,
"resp_3xx": 0,
"resp_4xx": 0,
"resp_5xx": 0,
...
},
...
]
}
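Since the API output is plain JSON, deriving values such as n_hit is straightforward in any language. A sketch in Python, using a shortened version of the response above:

```python
import json

# Shortened version of the VCS API response shown above.
payload = json.loads("""
{
  "example.com": [
    {"timestamp": "2013-09-18T09:58:30", "n_req": 76, "n_miss": 1,
     "n_bodybytes": 10950},
    {"timestamp": "2013-09-18T09:58:00", "n_req": 84, "n_miss": 0,
     "n_bodybytes": 12264}
  ]
}
""")

for vcs_key, windows in payload.items():
    for w in windows:
        # Hits are not stored explicitly: n_hit = n_req - n_miss
        w["n_hit"] = w["n_req"] - w["n_miss"]
        print(vcs_key, w["timestamp"], "hits:", w["n_hit"])
```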
Examples:
For vcs-key names ending with .gif, retrieve a list of the top 10:
/match/(.*)%5C.gif$/top
For all vcs-keys, retrieve the 50 entries with the highest ttfb_miss:
/all/top_ttfb/50
The VCS API queries the VCS data model and the output is in JSON format. The API responds
to requests for the following URLs:
/key/<vcs-key>
Retrieves stats for a single vcs-key. <vcs-key> name must be URL encoded.
/match/<regex>
Retrieves a list of vcs-key matching the URL encoded regular-expression. Accepts the
query parameter verbose=1, which displays all stats collected for the <vcs-keys>
matched.
/all
Retrieves a list of all the <vcs-keys> currently in the data model.
For /match/<regex> and /all, VCS can produce sorted lists. For that, you can append one
of the following sorting commands.
/top
Sort based on number of requests.
/top_ttfb
Sort based on the ttfb_miss field.
/top_size
Sort based on the n_bodybytes field.
/top_miss
Sort based on the n_miss field.
/top_respbytes
Sort based on number of bytes transmitted to clients.
/top_reqbytes
Sort based on number of bytes received from clients.
/top_berespbytes
Sort based on number of bytes fetched from backends.
/top_bereqbytes
Sort based on number of bytes transmitted to backends.
/top_restarts
Sort based on the avg_restarts field.
/top_5xx, /top_4xx, ..., /top_1xx
Sort based on number of HTTP response codes returned to clients for 5xx, 4xx, 3xx, etc.
/top_uniq
Sort based on the n_req_uniq field.
Further, a /<k> suffix can be appended, which specifies the number of keys to include in
the top list. If no k value is provided, the top 10 are displayed.
Note
For installation instructions, please refer to
http://files.varnish-software.com/pdfs/installation-guide_vcs-latest.pdf. Once you have
installed all necessary components, take a look at the man pages of vstatd and
vstatdprobe for more documentation.
The Varnish High Availability agent (vha-agent) is a content replicator with the aim of
copying the cached objects from an origin Varnish server to a neighboring Varnish server.
This increases resiliency and performance, especially when backend traffic surges.
vha-agent reads the log of Varnish, and for each object insertion detected it fires a request
to the neighboring Varnish server. This server fetches the object from the origin Varnish
server. As a result, the same object is cached in both servers with only one single backend
fetch.
This solution requires vha-agent to be installed on the origin Varnish server, and some
simple VCL configuration on the replicated Varnish server. Ideally, vha-agent is installed on
both servers so they can both replicate object insertions from each other in an active/active
configuration.
Typical uses of VHA include:
The replication of cached objects may bring the need for multiple cache invalidation. For that
purpose, you can use the Varnish Administration Console (VAC). Remember: you should
define the rules on how to invalidate cached objects before caching them in production
environments.
• Hitch: network proxy that terminates SSL/TLS connections and forwards the
unencrypted traffic
• Configuration file: /etc/hitch/hitch.conf
• Configure Varnish to listen to PROXY requests in /etc/varnish/varnish.params
Backend encryption is useful for deployments with geographically distributed origin servers
such as CDNs. Varnish supports SSL/TLS encryption to secure communication on both the
backend and the frontend side without third-party solutions. SSL/TLS configuration for
connections between Varnish and the backend is described in Exercise: Configure Varnish.
Varnish Plus integrates hitch, which can handle tens of thousands of listening sockets and
hundreds of thousands of certificates. The following are the steps to configure Varnish to
accept SSL/TLS connections with hitch.
$ /etc/pki/tls/certs/make-dummy-cert your-cdn.pem
For the purposes of this book, we create a dummy key and certificate
concatenated in the .pem file. See
https://docs.varnish-software.com/tutorials/hitch-letsencrypt/ or
https://github.com/varnish/hitch/blob/master/docs/certificates.md for alternative
methods.
4. Configure hitch in /etc/hitch/hitch.conf:
frontend = "[*]:443"
backend = "[127.0.0.1]:6081"
pem-file = "/path/to/your-cdn.pem"
ciphers = "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH"
prefer-server-ciphers = off
ssl-engine = ""
workers = 1
backlog = 100
keepalive = 3600
chroot = ""
user = "hitch"
group = "hitch"
quiet = off
syslog = on
syslog-facility = "daemon"
daemon = on
write-ip = off
write-proxy-v1 = on
write-proxy-v2 = off
proxy-proxy = off
sni-nomatch-abort = off
DAEMON_OPTS="-a :6081,PROXY"
7. Start hitch:
At the time of writing, service hitch start did not report startup errors. You should
check whether hitch has started; if it has not, try the following command to debug:
$ /usr/sbin/hitch --pidfile=/run/hitch/hitch.pid \
--config=/etc/hitch/hitch.conf
Note
Hitch has its own documentation in its man pages man hitch and man hitch.conf.
Additional information at https://github.com/varnish/hitch.
13 Appendix A: Resources
Community driven:
• https://www.varnish-cache.org
• https://www.varnish-cache.org/docs
• http://repo.varnish-cache.org
• https://www.varnish-cache.org/trac/wiki/VCLExamples
• Public mailing lists: https://www.varnish-cache.org/trac/wiki/MailingLists
• Public IRC channel: #varnish at irc.linpro.no
Commercial:
• https://www.varnish-software.com/resources
• http://planet.varnish-cache.org
• https://www.varnish-software.com
• http://repo.varnish-software.com (for service agreement customers)
• [email protected] (for existing customers, with SLA)
• [email protected]
Page 236 Chapter 14 Appendix B: Varnish Programs
• varnishlog
• varnishncsa
• varnishhist
• varnishtop
Administration:
• varnishadm
Global counters:
• varnishstat
• varnishtest
varnishlog, varnishadm and varnishstat are explained in the Examining Varnish Server's
Output chapter. Next sections explain varnishtop, varnishncsa, and varnishhist.
14.1 varnishtop
$ varnishtop -i BereqURL,RespStatus
varnishtop groups tags and their content together to generate a sorted list of the most
frequently appearing tag/tag-content pairs. This tool is sometimes overlooked, because its
usefulness only becomes visible once you start filtering. The above example lists the most
frequent backend request URLs and the status codes that Varnish returns.
Two of the perhaps most useful variants of varnishtop are:
• varnishtop -i BereqURL: creates a list of URLs requested at the backend. Use this to
find out which URL is fetched most often.
• varnishtop -i RespStatus: lists what status codes Varnish returns to clients.
varnishtop uses the varnishlog API, therefore you may also combine tag lists as in the
above example. Even more, you may apply Query Language -q options. For example,
varnishtop -q 'RespStatus > 400' shows you counters for responses where clients seem
to have erred.
Some other possibly useful examples are:
• varnishtop -i ReqUrl: displays what URLs are most frequently requested from clients.
• varnishtop -i ReqHeader -C 'User-Agent:.*Linux.*': lists User-Agent header fields
that match, ignoring case, the regular expression containing Linux. This example is
useful to filter requests from Linux users, since most web browsers on Linux report
themselves as Linux.
• varnishtop -i BerespStatus: lists status codes received from backends.
• varnishtop -i VCL_call: shows what VCL functions are used.
• varnishtop -i ReqHeader -I Referer: shows the most common referrer addresses.
14.2 varnishncsa
10.10.0.1 - - [24/Aug/2008:03:46:48 +0100] "GET \
http://www.example.com/images/foo.png HTTP/1.1" 200 5330 \
"http://www.example.com/" "Mozilla/5.0"
If you already have tools in place to analyze NCSA Common log format, varnishncsa can be
used to print the VSL in this format. varnishncsa dumps everything pointing to a certain
domain and subdomains.
Filtering works similarly to varnishlog.
14.3 varnishhist
1:1, n = 71 localhost
#
#
#
#
##
###
###
###
###
###
| ###
| ###
| | ###
|||| ### #
|||| #### #
|##|##### # # # # #
+-------+-------+-------+-------+-------+-------+-------+-------+-------
|1e-6 |1e-5 |1e-4 |1e-3 |1e-2 |1e-1 |1e0 |1e1 |1e2
The varnishhist utility reads the VSL and presents a continuously updated histogram
showing the distribution of the last n requests. varnishhist is particularly useful to get an
idea about the performance of your Varnish Cache server and your backend.
The horizontal axis shows a time range from 1e-6 (1 microsecond) to 1e2 (100 seconds). This
time range shows the internal processing time of your Varnish Cache server and the time it
takes to receive a response from the backend. Thus, this axis does not show the time
perceived at the client side, because other factors such as network delay may affect the
overall response time.
Hits are marked with a pipe character ("|"), and misses are marked with a hash character
("#"). These markers are distributed according to the time taken to process the request.
Therefore, distributions with more markers on the left side represent a faster performance.
When the histogram grows vertically beyond what the terminal can display, the 1:m ratio
in the top left corner changes, where m represents the number of requests that each
marker stands for. The top right corner shows the name of the host.
14.5 varnishtest
• Script driven program used to test the configuration of Varnish, run regression tests,
and develop VMODs
• Useful for system administrators, web developers, and VMODs developers
Varnish is distributed with many utility programs. varnishtest is a script driven program
that allows you to create client mock-ups, simulate transactions, fetch content from mock-up
or real backends, interact with your actual Varnish configuration, and assert expected
behaviors.
You can use varnishtest when configuring your Varnish installation, i.e., writing VCL code,
or developing VMODs. varnishtest has its own language: the Varnish Test Case (VTC)
language. This language has a fairly simple syntax. In fact, when designing your caching
algorithm or any other functionality in Varnish, we recommend you first to write Varnish Test
Cases (VTCs) as part of your design. VTCs are also useful to reproduce bugs when filing a
bug report.
There are many .vtc files included in Varnish Cache under bin/varnishtest/tests/. Think
about those files as a learning source. Further documentation of varnishtest is found in its
man page, bin/varnishtest/tests/README and
https://www.varnish-cache.org/docs/trunk/reference/varnishtest.html.
b00001.vtc
server s1 {
    rxreq
    txresp
} -start
client c1 {
    txreq
    rxresp
    expect resp.http.via ~ "varnish"
} -run
varnishtest does not follow the unit testing framework (setup/test/assert/teardown) nor
behavior-driven development (given/when/then). Depending on your use case, there might
be test preparations, executions and assertions all over the place. VTC is not compiled but
simply interpreted on the fly.
There is a naming convention for VTC files. Files starting with b as the example above
contain basic functionality tests. The naming scheme is in
Varnish-Cache/bin/varnishtest/tests/README or https://raw.githubusercontent.com/varn
ish/Varnish-Cache/master/bin/varnishtest/tests/README.
All VTC programs start by naming the test:
server s1 {
    rxreq
    txresp
} -start
All server declarations must start with s. In the code above, s1 receives a request rxreq,
and transmits a response txresp. -start boots s1 and makes available the macros
${s1_addr} and ${s1_port} with the IP address and port of your simulated backend. You
may also start a declaration at a later point in your code, for example server s1 -start.
To declare an instance of your real Varnish server:

varnish v1 -arg "-b ${s1_addr}:${s1_port}" -start
varnish v1 declares an instance of your real Varnish server, i.e., varnishd. The names for
Varnish servers must start with v. This instance is controlled by the manager process, and
-start forks a child, which is the actual cacher process. You will learn about the manager
and cacher in The Parent Process: The Manager and The Child Process: The Cacher sections.
There are many ways to configure varnishd. One way is by passing arguments with -arg as
in -arg "-b ${s1_addr}:${s1_port}". -b is a varnishd option to define the backend. In
this case, we use the IP address and port of the simulated backend s1, but you can also use
a real backend. Therefore, varnishtest can be used as integration tool when testing your
real backend.
There are other ways to define backends. The most common one is perhaps by defining
them in your VCL code, as we shall see in the next section.
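For reference, inlining the VCL with -vcl+backend makes varnishtest generate backend definitions for the declared simulated servers automatically. A sketch (the header name is illustrative):

```
varnish v1 -vcl+backend {
    sub vcl_deliver {
        set resp.http.X-Test = "seen";
    }
} -start
```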
To simulate a client:
client c1 {
    txreq
    rxresp
    expect resp.http.via ~ "varnish"
} -run
Simulated clients in varnishtest start with c. In this example, c1 transmits one request
and receives one response.
Since Varnish is a proxy, we expect to receive the response from the backend via Varnish.
Therefore, c1 expects varnish in the via HTTP header field. We use tilde ~ as match
operator of regular-expressions because the exact text in resp.http.via depends on the
Varnish version you have installed.
Finally, you start client c1 with the -run command.
You might have noticed that we used -start for v1, but -run for c1. The difference
between these commands is that -run executes -start -wait.
Varnish is a multi-threaded program. Therefore, each instance in varnishtest, i.e., s1, v1
and c1, is executed by a different thread. Sometimes, you will need some sort of
synchronization mechanism to ensure you avoid race conditions or other non-intuitive
behaviors. For those cases, you can use the -wait command.
-wait tells the executor of varnishtest to wait for a given instance to complete before
proceeding to the next instruction in your VTC program. To illustrate this, see the difference
between:
varnishtest "Synchronized"
server s1 {
    rxreq
    txresp
}
server s1 -start
server s1 -wait
and:
varnishtest "Unsynchronized"
server s1 {
rxreq
txresp
}
client c1 -run
The second test fails, unlike the first one, because varnishtest times out while
waiting for s1 to receive a request and transmit a response. Therefore, you typically start
Varnish servers with the -start command, but start clients with the -run command.
Note
You will learn more about the Threading Model of Varnish in its own section.
Note
Note that we do not instantiate a Varnish server in the examples, but connect the
client directly to the server. For that purpose we use ${s1_sock}. This macro
translates to the IP address and port of s1.
$ varnishtest b00001.vtc
# top TEST b00001.vtc passed (1.458)
To run your test, you simply issue the command above. By default, varnishtest outputs
the summary of passed tests, and a verbose output for failed tests only. If you want to
always get a verbose output, run varnishtest with the -v option.
A passed test means that you have the most basic Varnish configuration correct in the
varnishtest testbed. In the next section we explain how to configure Varnish the way you
normally would, after your tests have passed or when the varnishtest testbed is not
enough for your needs.
There is much more to explain about varnishtest, but before that, you must learn more
about the fundamentals of Varnish. We will introduce new concepts and make a more
advanced use of varnishtest as we progress in the book.
• Use VTC to test the Server and Via HTTP header fields.
In this exercise you have to define a backend pointing to your Apache server and use
assertions with expect. If you need help, take a look at Solution: Test Apache as Backend
with varnishtest.
# Good enough when you are sure that the value is well formatted
# and in a valid range:
varnish v1 -cli "param.set default_ttl 50"
# Bad test:
varnish v1 -cli "param.set default_ttl -1"
Parameters can also be set in varnishtest. To execute commands via the CLI, you have
three options: -cli "command", -cliok "command" and -clierr "status" "command".
-cli executes a command without checking the return status. -cliok executes a command
and expects it to return OK 200 status. -clierr executes a command and checks whether
the expected return status matches.
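As an illustration (a sketch, not taken from the Varnish test suite; the CLI error status 106 for an out-of-range parameter value is an assumption and may vary between versions), the three variants could be used like this:

```
# Not checked; the test passes even if the command fails:
varnish v1 -cli "param.set default_ttl 50"

# Fails the test unless the CLI returns status 200:
varnish v1 -cliok "param.set default_ttl 50"

# Fails the test unless the CLI returns the given error status:
varnish v1 -clierr 106 "param.set default_ttl -1"
```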
Note
You have to instruct varnishtest to assert the expected behavior as much as you
can. For example, varnish v1 -cli "param.set default_ttl -1" does not fail
because -cli does not assert the return status.
Note
The macro ${bad_ip} translates to 192.0.2.255. This IP address is for test use only,
and it is used here because we do not need a backend to set parameters in Varnish.
However, we must always declare at least one backend when varnishd is to be
started.
Note
Note that we do not start v1, because in this example, we do not need to start the
cacher process.
server s1 {
rxreq
txresp
} -start
client c1 {
txreq
rxresp
delay 1
txreq
rxresp
expect resp.http.Age == 1
} -run
You can use the delay command in varnishtest. The unit of the command is seconds, and
it also accepts floating-point numbers. For more information about the Age response header
field, refer to the Age subsection.
The Age value depends on the time to live (TTL) value of the cached object. We will learn
more about it in The Initial Value of beresp.ttl section.
server s1 {
rxreq
txresp -hdr "Date: Thu, 01 Jan 2015 00:00:00 GMT" \
-hdr "Expires: Thu, 01 Jan 2015 00:00:01 GMT"
} -start
client c1 {
txreq
rxresp
} -run
delay 3
In Varnish, an expired object is an object that has exceeded the TTL + grace + keep time.
In the example above, the Expires header field sets TTL to 1, and changes default_grace
from 10 to 2. default_keep is already 0, but we show it explicitly anyway.
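The parameter changes mentioned above are not visible in the snippet; one hypothetical way to set them when declaring the Varnish instance is through varnishd parameters (a sketch, assuming the -p form):

```
varnish v1 -arg "-p default_grace=2 -p default_keep=0" \
    -vcl+backend { } -start
```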
Tip
Take a look at s00000.vtc and s00001.vtc in
Varnish-Cache/bin/varnishtest/tests/.
Tip
To get more information about n_expire, issue man varnish-counters.
$ varnishtest -v b00001.vtc
(...)
**** v1 0.3 vsl| 0 CLI - Rd vcl.load "boot" (...)
(...)
**** v1 0.4 vsl| 1000 Begin c sess 0 HTTP/1
(...)
**** v1 0.4 vsl| 1002 Begin b bereq 1001 fetch
(...)
**** v1 0.4 vsl| 1002 End b
**** v1 0.4 vsl| 1001 Begin c req 1000 rxreq
(...)
**** v1 0.4 vsl| 1001 End c
(...)
**** v1 0.4 vsl| 1000 End c
(...)
**** v1 0.5 vsl| 0 CLI - EOF on CLI connection (...)
Above is a snippet of how Varnish logs are displayed in varnishtest. varnishtest does not
group logs by default as varnishlog does. Still, varnishtest allows you to assert on
grouped transactions with the logexpect command.
varnishtest starts client transactions at VXID 1000. Note the VXID 0 for Varnish-specific records.
14.5.9 logexpect
server s1 {
rxreq
txresp
} -start
logexpect l1 -v v1
logexpect l1 {
expect * * ReqURL /favicon.ico
} -start
client c1 {
txreq -url "/favicon.ico"
rxresp
} -run
logexpect l1 -wait
logexpect is a program that uses the varnishlog API. Therefore, it is able to group and
query the VSL just as varnishlog does. In addition, logexpect allows you to assert
what you expect to appear in the VSL.
Note logexpect l1 -wait at the end of the VTC script above. Without it, the test would
finish successfully without concluding the assert in l1, because varnishtest would not wait
for it. -wait instructs the executor of varnishtest to wait until l1 is done.
Below is the synopsis of arguments and options of logexpect:
-v <varnish-instance>
-d <0|1> (head/tail mode)
-g <grouping-mode>
-q <query>
logexpect lN -v <id> [-g <grouping>] [-d 0|1] [-q query] [vsl arguments] {
expect <skip> <vxid> <tag> <regex>
}
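Mapping the synopsis to a concrete invocation (a hypothetical example, not from the book's test suite):

```
logexpect l1 -v v1 -g request {
    # Skip any number of records (*) in any transaction (*) until
    # ReqMethod GET appears, then require ReqURL in the same
    # transaction (=):
    expect * * ReqMethod GET
    expect * = ReqURL    "^/favicon.ico$"
} -start
```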
• Write a Varnish test to check the counters for cache misses, cache hits, and number of
cached objects.
• Use cache_miss, cache_hit, and n_object counters respectively.
server s1 {
# Backend VXID=1002
rxreq
expect req.url == "/same-url"
expect req.http.foobar == "1"
txresp -hdr "Vary: Foobar" -hdr "Snafu: 1" -body "1111\n"
# Backend VXID=1004
rxreq
expect req.url == "/same-url"
expect req.http.foobar == "2"
txresp -hdr "Vary: Foobar" -hdr "Snafu: 2" -body "2222\n"
} -start
client c1 {
txreq -url "/same-url" -hdr "Foobar: 1"
rxresp
expect resp.status == 200
# First client request with VXID=1001
# Request misses. Creates backend request with VXID=1002
expect resp.http.X-Varnish == "1001"
expect resp.http.snafu == "1"
expect resp.body == "1111\n"
your backend should handle the lack of that header field specifically. You can test it as the
following assertion shows:
rxreq
expect req.http.foobar == <undef>
txresp -hdr "Vary: Foobar" -hdr "Snafu: 3" -body "3333\n"
Be aware that the lack of a header field sent by a client is not the same as sending the field
with an empty value. Therefore, requests like:
rxreq
expect req.http.foobar == ""
txresp -hdr "Vary: Foobar" -hdr "Snafu: 4" -body "4444\n"
server s1 {
# Request 1
rxreq
txresp -hdr "Last-Modified: Wed, 11 Sep 2013 13:36:55 GMT" \
-body "Geoff Rules"
# Request 2
rxreq
expect req.http.if-modified-since == "Wed, 11 Sep 2013 13:36:55 GMT"
txresp -status 304
# There will be no need to handle a third request in this example.
} -start
varnish v1 -vcl+backend {
sub vcl_backend_response {
set beresp.ttl = 2s;
set beresp.grace = 5s;
# beresp.was_304 is ``true`` if the response from the backend was
# a positive result of a conditional fetch (``304 Not Modified``).
set beresp.http.was-304 = beresp.was_304;
}
} -start
client c1 {
txreq
rxresp
expect resp.status == 200
expect resp.body == "Geoff Rules"
# this was not a conditional fetch
expect resp.http.was-304 == "false"
} -run
delay 3
sub vcl_backend_response {
set beresp.ttl = 2s;
set beresp.grace = 5s;
You will learn all the details about VCL in the following sections, but for now it is enough to
understand that this code sets the time to live (TTL) and grace time of cached objects to 2 and
5 seconds respectively. Recall the object lifetime from Figure 2 to understand the expected
behavior.
The code also adds an HTTP response header field was-304 with the boolean value of
beresp.was_304. This variable is set to true if the response from the backend was a
positive result of a conditional fetch (304 Not Modified).
We hope that this exercise motivates you to use varnishtest when designing your cache
policies. As you can see, varnishtest is very precise when testing caching objects against
different time settings.
Note
beresp.was_304 is a variable available since Varnish 4.1.
server s1 {
rxreq
txresp -hdr "Cache-control: max-age=3" -body "FOO"
rxreq
txresp -body "FOOBAR"
} -start
client c1 {
txreq
rxresp
expect resp.bodylen == 3
delay 2
txreq
rxresp
expect resp.bodylen == 3
} -run
The example above shows how Cache-control: max-age=3 overrides the TTL for cached
objects. The default TTL is 120 seconds, but we set it here to 1 just to show explicitly that
the cached object has not expired after a delay of 2 seconds, because max-age=3. Therefore,
the second assert:
expect resp.bodylen == 3
also passes. Remove the Cache-control header field
from the first txresp, and you will see that the second response contains a body length of 6.
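The varnish v1 declaration is omitted from the snippet above; a hypothetical way to lower the default TTL to 1 second when declaring the instance is:

```
varnish v1 -arg "-p default_ttl=1" -vcl+backend { } -start
```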
Tip
Take a look at b00941.vtc, b00956.vtc and r01578.vtc in
Varnish-Cache/bin/varnishtest/tests/ to learn more.
varnish v1 -vcl {
sub vcl_recv {
if (req.url ~ "^/admin/"){
return (pass);
}
}
} -start
varnishtest allows you to insert VCL code with the -vcl directive when declaring a Varnish
server. This VCL code is inserted above the subroutines in built-in code in
{varnish-source-code}/bin/varnishd/builtin.vcl. Since builtin.vcl already includes
vcl 4.0;, you do not need to add it in varnishtest.
varnishtest also allows you to insert VCL code from an external file using the
include "foo.vcl"; directive, or to load VMODs using the import foo; directive. For
examples on how to use include and import, refer to the available VTC files in your Varnish
distribution under the directory varnish-cache-plus/bin/varnishtest/tests/.
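For instance, a minimal sketch that imports the std VMOD, which ships with Varnish Cache:

```
varnish v1 -vcl+backend {
    import std;

    sub vcl_recv {
        # Writes a VCL_Log record to the shared memory log.
        std.log("received " + req.url);
    }
} -start
```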
server s1 {
rxreq
txresp -hdr "foo: 1"
rxreq
txresp -hdr "foo: 2"
} -start
varnish v1 -vcl+backend {
acl purgers {
"127.0.0.1";
"192.168.0.0"/24;
}
sub vcl_recv {
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405));
}
return (purge);
}
}
} -start
client c1 {
txreq
rxresp
expect resp.http.foo == 1
txreq
rxresp
expect resp.http.foo == 1
} -run
client c1 {
txreq
rxresp
expect resp.http.foo == 2
} -run
if (!client.ip ~ purgers)
client c1 {
txreq -req BAN
rxresp
} -run
You can send PURGE, BAN and REFRESH requests in varnishtest, so that your VCL program acts
accordingly. Remember that you still need to specify the requested URL in txreq if the URL is
other than the root /. We advise you to search for purge and ban in
Varnish-Cache/bin/varnishtest/tests/ to learn more about how to invalidate caches.
server s1 {
rxreq
txresp -hdr "Cache-Control: max-age=30, stale-while-revalidate=30"
rxreq
txresp -hdr "Cache-Control: max-age=0, stale-while-revalidate=30"
rxreq
txresp -hdr "Cache-Control: max-age=30, stale-while-revalidate=30" \
-hdr "Age: 40"
rxreq
txresp -status 500 \
-hdr "Cache-Control: max-age=30, stale-while-revalidate=30"
} -start
varnish v1 -vcl+backend {
sub vcl_backend_response {
set beresp.http.grace = beresp.grace;
set beresp.http.ttl = beresp.ttl;
}
} -start
client c1 {
txreq -url /1
rxresp
expect resp.http.grace == 30.000
expect resp.http.ttl == 30.000
txreq -url /2
rxresp
expect resp.http.grace == 30.000
expect resp.http.ttl == 0.000
txreq -url /3
rxresp
expect resp.http.grace == 30.000
expect resp.http.ttl == -10.000
txreq -url /4
rxresp
expect resp.http.grace == 10.000
expect resp.http.ttl == 0.000
} -run
This example shows you how the HTTP response header field Cache-Control sets max-age
to ttl and stale-while-revalidate to grace. ttl and grace are attributes of cached
objects. The VCL code in v1 includes these attributes in the HTTP response header fields
http.ttl and http.grace that are sent to the client. c1 asserts the values of these fields.
1. Write a VTC program that forces Varnish to cache client requests with cookies.
2. Send two client requests for the same URL; one for user Alice and one for user Bob.
3. Does Varnish use different backend responses to build and deliver the response to the
client?
4. Make your simulated server send the Vary: Cookie response header field, then
analyze the response to the client.
5. Remove beresp.http.Vary in vcl_backend_response and see if Varnish still honors the
Vary header.
Vary: Part 2:
hash_data(): Part 1:
1. Write another VTC program or add conditions and asserts to differentiate requests
handled by Vary and hash_data().
2. Add hash_data(req.http.Cookie); in vcl_hash.
3. Check how multiple values of Cookie give individual cached objects.
hash_data(): Part 2:
1. Purge the cache again and check the result after using hash_data() instead of
Vary: Cookie.
This exercise is all about Vary and hash mechanisms. After this exercise, you should have a
very good idea on how Vary and hash_data() work. If you need help, see Solution: Handle
Cookies with Vary in varnishtest or Solution: Handle Cookies with hash_data() in varnishtest.
server s1 {
rxreq
txresp -body {
<html>
Before include
<!--esi <esi:include src="/body"/> -->
After include
}
rxreq
expect req.url == "/body"
txresp -body {
Included file
}
} -start
varnish v1 -vcl+backend {
sub vcl_backend_response {
set beresp.do_esi = true;
}
} -start
client c1 {
txreq
rxresp
expect resp.status == 200
expect resp.bodylen == 67
}
client c1 -run
varnish v1 -expect esi_errors == 0
In the result, you can see the HTML document after ESI has been processed.
Chapter 15 Appendix C: Extra Material Page 271
15.1 ajax.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript"
src="http://ajax.googleapis.com/ajax/libs/jquery/1.4/jquery.min.js">
</script>
<script type="text/javascript">
function getNonMasqueraded()
{
$("#result").load( "http://www.google.com/robots.txt" );
}
function getMasqueraded()
{
$("#result").load( "/masq/robots.txt" );
}
</script>
</head>
<body>
<h1>Cross-domain Ajax</h1>
<ul>
<li><a href="javascript:getNonMasqueraded();">
Test a non masqueraded cross-domain request
</a></li>
<li><a href="javascript:getMasqueraded();">
Test a masqueraded cross-domain request
</a></li>
</ul>
<h1>Result</h1>
<div id="result"></div>
</body>
</html>
15.2 article.php
<?php
header("Cache-Control: max-age=10");
$utc = new DateTimeZone("UTC");
$date = new DateTime("now", $utc);
$now = $date->format( DateTime::RFC2822 );
?>
15.3 cookies.php
<?php
header( 'Content-Type: text/plain' );
print( "The following cookies have been received from the server\n" );
15.4 esi-top.php
<?php
header('Content-Type: text/html');
header('Cache-Control: max-age=30, s-maxage=3600');
$utc = new DateTimeZone("UTC");
$date = new DateTime("now", $utc);
$now = $date->format( DateTime::RFC2822 );
$setc = "";
if( isset($_POST['k']) and $_POST['k'] !== '' and
isset($_POST['v']) and $_POST['v'] !== '') {
$k=$_POST['k'];
$v=$_POST['v'];
$setc = "Set-Cookie: $k=$v";
header("$setc");
?><meta http-equiv="refresh" content="1" />
<h1>Refreshing to set cookie <?php print $setc; ?></h1><?php
}
?>
<html><head><title>ESI top page</title></head><body><h1>ESI Test page</h1>
<p>This is content on the top-page of the ESI page.
The top page is cached for 1 hour in Varnish,
but only 30 seconds on the client.</p>
<p>The time when the top-element was created:</p><h3>
<?php
15.5 esi-user.php
<?php
header('Content-Type: text/html');
header('Cache-Control: max-age=30, s-maxage=20');
header('Vary: Cookie');
$utc = new DateTimeZone("UTC");
$date = new DateTime("now", $utc);
$now = $date->format( DateTime::RFC2822 );
?>
<p>This is content on the user-specific ESI-include. This part of
the page is cached in Varnish separately since it emits
a "Vary: Cookie"-header. We can not affect the client-cache of
this sub-page, since that is determined by the cache-control
headers on the top-element.</p>
<p>The time when the user-specific-element was created:</p><h3>
<?php
</ul>
15.6 httpheadersexample.php
<?php
date_default_timezone_set('UTC');
define( 'LAST_MODIFIED_STRING', 'Sat, 09 Sep 2000 22:00:00 GMT' );
$headers = array(
'Date' => date( 'D, d M Y H:i:s', time() ),
);
case "cache-control":
$headers['Cache-Control'] = "public, must-revalidate,
max-age=3600, s-maxage=3600";
break;
case "cache-control-override":
$headers['Expires'] = toUTCDate($expires_date);
$headers['Cache-Control'] = "public, must-revalidate,
max-age=2, s-maxage=2";
break;
case "last-modified":
$headers['Last-Modified'] = LAST_MODIFIED_STRING;
$headers['Etag'] = md5( 12345 );
break;
case "vary":
$headers['Expires'] = toUTCDate($expires_date);
$headers['Vary'] = 'User-Agent';
break;
}
sendHeaders( $headers );
}
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=last-modified">
Test Last-Modified/If-Modified-Since response header fields</a></li>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=vary">
Test Vary response header field</a></li>
</ol>
</body>
</html>
15.7 purgearticle.php
<?php
header( 'Content-Type: text/plain' );
header( 'Cache-Control: max-age=0' );
$hostname = 'localhost';
$port = 80;
$URL = '/article.php';
$debug = true;
$curlOptionList = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_CUSTOMREQUEST => 'PURGE',
CURLOPT_HEADER => true ,
CURLOPT_NOBODY => true,
CURLOPT_URL => $finalURL,
CURLOPT_CONNECTTIMEOUT_MS => 2000
);
$fd = false;
if( $debug == true ) {
print "\n---- Curl debug -----\n";
$fd = fopen("php://output", 'w+');
$curlOptionList[CURLOPT_VERBOSE] = true;
$curlOptionList[CURLOPT_STDERR] = $fd;
}
$curlHandler = curl_init();
curl_setopt_array( $curlHandler, $curlOptionList );
curl_exec( $curlHandler );
curl_close( $curlHandler );
if( $fd !== false ) {
fclose( $fd );
}
}
?>
15.8 test.php
<?php
$cc = "";
if( isset($_GET['k']) and $_GET['k'] !== '' and
isset($_GET['v']) and $_GET['v'] !== '') {
$k=$_GET['k'];
$v=$_GET['v'];
$cc = "Cache-Control: $k=$v";
header("$cc");
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>Cache-Control Header:</h1>
<?php
print "<pre>$cc</pre>\n";
?>
<hr/>
<h1>Links for testing</h1>
<form action="/test.php" method="GET">
Key: <input type="text" name="k">
Value: <input type="text" name="v">
<input type="submit">
</form>
</body>
</html>
15.9 set-cookie.php
<?php
header("Cache-Control: max-age=0");
$cc = "";
if( isset($_POST['k']) and $_POST['k'] !== '' and
isset($_POST['v']) and $_POST['v'] !== '') {
$k=$_POST['k'];
$v=$_POST['v'];
$setc = "Set-Cookie: $k=$v";
header("$setc");
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>Set-Cookie Header:</h1>
<?php
print "<pre>$setc</pre>\n";
?>
<hr/>
<h1>Links for testing</h1>
<form action="/set-cookie.php" method="POST">
Key: <input type="text" name="k">
Value: <input type="text" name="v">
<input type="submit">
</form>
</body>
</html>
The script aims to cover most of the syntactical changes in VCL code from Varnish 3 to
Varnish 4, but it is not exhaustive. That said, you should use it at your own risk.
You can download the script from https://github.com/fgsch/varnish3to4. Usage and
up-to-date details about the script are at the same web address.
Chapter 16 Appendix D: VMOD Development Page 285
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||
This appendix explains the concepts you should know to develop your own VMODs. The
appendix takes you through the simplest possible VMOD: the Hello, World VMOD.
To get the most out of this appendix, you should have understood at least the following
chapters of this book: Design Principles, Getting Started, VCL Basics, and VCL Subroutines.
A VMOD is a shared library with some C functions which can be called from VCL code. The
standard (std) VMOD, for instance, is a VMOD included in Varnish Cache. We have already
used the std VMOD in this book to check whether a backend is healthy by calling
std.healthy().
VCL is the domain specific language of Varnish. This language is very powerful and efficient
for most tasks of a cache server. However, sometimes you might need more functionalities,
e.g., look up an IP address in a database.
VCL allows you to add inline C code, but this is not the most convenient approach. If you use
the memory management API provided by Varnish, your VMODs are generally easier to
maintain, more secure, and much easier to debug in collaboration with other developers. In
addition, VMODs do not require you to reload VCL files to take effect.
When writing VMODs, you should test them with varnishtest. In fact, we recommend that
you first write your tests as part of your design, and then implement your VMOD.
Therefore, we introduce varnishtest next, before proceeding with the implementation of
the VMOD itself.
With varnishtest you can create mock-ups of clients and origin servers to interact with
your Varnish installation. This is useful to simulate transactions and provoke a specific
behavior. You can use varnishtest when writing VCL code or VMODs. varnishtest is also
useful to reproduce bugs when filing a bug report.
The best way to learn how to create Varnish tests is by running the ones included in Varnish
Cache and then write your own tests based on them. Tests of Varnish Cache are in
bin/varnishtest/tests/. You should also take a look at the README file
bin/varnishtest/tests/README to learn about the naming convention.
16.2.1 VTC
helloworldtest.vtc:
server s1 {
rxreq
txresp
} -start
varnish v1 -vcl+backend {
sub vcl_deliver {
set resp.http.hello = "Hello, World";
}
} -start
client c1 {
txreq -url "/"
rxresp
expect resp.http.hello == "Hello, World"
}
client c1 -run
client c1 -run
client c1 -run
varnishtest does not follow the unit-testing framework pattern (setup/test/assert/teardown)
nor behavior-driven development (given/when/then). Depending on your use case, there might
be test preparations, executions and assertions all over the place. VTC is not compiled but
simply interpreted on the fly. When run, the above script simulates an origin server s1, starts
a real Varnish instance v1, and simulates a client c1.
server s1 declares a simulated origin server that receives a request (rxreq), and transmits a
response (txresp). The -start directive starts an instance of that declaration.
-start can also appear at a later point in your code as server s1 -start.
varnish v1 -vcl+backend declares an instance of your real Varnish server and loads the
VCL code inside the brackets. varnishtest controls v1 through the manager process (see
The Parent Process: The Manager). The +backend directive injects the backend (origin server
s1) into the VCL code. Alternatively, you might want to add backends manually, for example:
varnish v1 -vcl {
backend default {
.host = "${s1_addr}";
.port = "${s1_port}";
}
}
Once s1 is started, the macros ${s1_addr} and ${s1_port} with the IP address and port of
your simulated backend are automatically made available. Since varnishtest launches a
real varnishd instance, it is possible to use a real backend instead of mock servers. Thus,
you can test your actual backend.
client c1 declares a simulated client that transmits a request txreq for the slash URL:
-url "/". c1 receives a response rxresp.
varnishtest supports assertions with the keyword expect. For example, c1 expects the
response header field resp.http.hello with value Hello, World. Assertions can be inside
the declaration of the origin server and client, but not inside the Varnish server. Since
Varnish is a proxy, checking requests and responses in it is irrelevant. Instead, you have
access to the counters exposed by varnishstat at any time.
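For example, counters can be asserted on the Varnish instance itself, as this sketch shows (the expected values depend on the traffic your test has generated):

```
varnish v1 -expect cache_miss == 1
varnish v1 -expect cache_hit == 0
```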
Assertions are evaluated by reading the shared memory log, which ensures that your
assertions are tested against a real Varnish server. Therefore, Varnish tests might take a bit
longer than what you are used to in other testing frameworks. Finally, client c1 -run
starts the simulated client c1.
You will learn how to extend this script to test your VMOD later in this chapter, but before,
run and analyze the output of helloworldtest.vtc to understand varnishtest better.
$ varnishtest helloworldtest.vtc
top TEST helloworldtest.vtc passed (1.554)
Note
man varnishtest shows you all the options.
VMODs require the source code of the Varnish Cache version you are running. The easiest way
to be sure you have everything in place is to build Varnish Cache from source, although
you can also develop VMODs against a Varnish installation from a package repository. The
git repository and building instructions are at
https://github.com/varnish/Varnish-Cache.git.
Once you have built Varnish Cache, build the libvmod-example from
https://github.com/varnish/libvmod-example by following the instructions in the repository.
When building libvmod-example, make sure that its branch matches the branch
version of Varnish-Cache. For example, to build libvmod-example against Varnish Cache
4.0, make sure that you check out the 4.0 branch in both libvmod-example and
Varnish-Cache.
Next, we explain the content inside libvmod-example.
vmod_example.vcc:
$Module example
$Init init_function
$Function STRING hello(STRING)
vcc_if.h:
struct VCL_conf;
struct vmod_priv;
/* Functions */
VCL_STRING vmod_hello(VRT_CTX, VCL_STRING);
int init_function(struct vmod_priv *, const struct VCL_conf *);
In vmod_example.vcc you declare the module name, initialization function and other
functions you need. Definitions are stored in files with .vcc extension. Please note the $
sign leading the definitions in the vmod_example.vcc.
$Module example
$Init init_function
The second line declares an optional initialization function, which is called when a VCL program
loads this VMOD.
We advise you to write the documentation of your VMOD in the .vcc file, but if you prefer to
have it in another location, such as the README.rst file, you can change the source for
documentation in Makefile.am. For example, change from:
vmod_example.3: src/vmod_example.man.rst
to:
vmod_example.3: README.rst
VCL_STRING
vmod_hello(const struct vrt_ctx *ctx, VCL_STRING name)
{
char *p;
unsigned u, v;
You reserve the memory you need by calling WS_Reserve(ctx->ws, 0). It is important to
release the memory you used with WS_Release(ctx->ws, v), otherwise you are introducing
a memory leak.
Every worker thread has its own workspace ws in virtual memory. This workspace is a
contiguous char array defined in cache/cache.h as:
struct ws {
unsigned magic;
#define WS_MAGIC 0x35fac554
char id[4]; /* identity */
char *s; /* (S)tart of buffer */
char *f; /* (F)ree/front pointer */
char *r; /* (R)eserved length */
char *e; /* (E)nd of buffer */
};
magic and WS_MAGIC are used for sanity checks by workspace functions. The id field is
self-descriptive. The parts you are most likely interested in are the s, f, r and e fields.
s and e point to the start and end of the char array respectively. f points to the currently
available memory; it can be seen as a head that moves forward every time memory is
allocated. f can move up to the end of the buffer pointed to by e.
r points to the reserved memory space of the workspace. This space is reserved to allow
incremental allocation. You should remember to release this space by calling
WS_Release(struct ws *ws, unsigned bytes) once your VMOD does not need it any
longer.
The cache/cache.h is automatically included when you compile your .vcc file. Next, we
describe in detail the headers that are included in vmod_example.c.
16.3.4 Headers
#include "vrt.h"
#include "cache/cache.h"
#include "vcc_if.h"
The vrt.h header provides data structures and functions needed by compiled VCL programs
and VMODs. cache.h declares the function prototypes for the workspace memory model
among others. These functions are implemented in cache_ws.c.
cache.h:
void WS_Init(struct ws *ws, const char *id, void *space, unsigned len);
unsigned WS_Reserve(struct ws *ws, unsigned bytes);
void WS_MarkOverflow(struct ws *ws);
void WS_Release(struct ws *ws, unsigned bytes);
void WS_ReleaseP(struct ws *ws, char *ptr);
void WS_Assert(const struct ws *ws);
void WS_Reset(struct ws *ws, char *p);
char *WS_Alloc(struct ws *ws, unsigned bytes);
void *WS_Copy(struct ws *ws, const void *str, int len);
char *WS_Snapshot(struct ws *ws);
int WS_Overflowed(const struct ws *ws);
void *WS_Printf(struct ws *ws, const char *fmt, ...) __printflike(2, 3);
vcc_if.h is generated out from the definitions in your .vcc file. This header contains the
declaration of your VMOD functions in C code.
$ ./autogen.sh
$ ./configure
$ make
$ make check
# make install
The source tree is based on Travis and Autotools to configure the build. More detailed
building instructions are in:
https://github.com/varnish/libvmod-example/blob/master/README.rst
Note that make check calls varnishtest with the needed options.
In this subsection you will learn how to build your own VMOD on top of libvmod-example.
rename-vmod-script prepares a basic VMOD that can be extended to your needs:
$./rename-vmod-script your_vmod_name
We have prepared a cowsay VMOD to make this subsection easier to follow, but you can also
create your own VMOD from scratch, as we explain further on. The cowsay VMOD adds the
output of the Linux cowsay program, which generates ASCII pictures of a cow or another animal
with a message. You can download the code of the cowsay VMOD from
https://github.com/aondio/libvmod-cowsay.git.
server s1 {
rxreq
txresp
} -start
varnish v1 -vcl+backend {
import cowsay from "${vmod_topbuild}/src/.libs/libvmod_cowsay.so";
sub vcl_recv {
if (req.url ~ "/cowsay") {
set req.http.x-cow = cowsay.cowsay_canonical();
}
}
sub vcl_deliver {
set resp.http.x-cow = req.http.x-cow;
}
} -start
client c1 {
txreq -url "/cowsay"
rxresp
expect resp.http.x-cow == "Cowsay: Hello World!"
} -run
Snippet of test02.vtc:
sub vcl_recv {
if (req.url ~ "/cowsay") {
return(synth(700, "OK"));
}
}
sub vcl_synth {
if (resp.status == 700) {
set resp.status = 200;
set resp.http.Content-Type = "text/plain; charset=utf-8";
synthetic(cowsay.cowsay_vsb());
return (deliver);
}
}
...
client c1 {
txreq -url "/cowsay"
rxresp
expect resp.body == {** mybody **
^__^
(oo)\_______
(__)\ )\/\
||----w |
|| ||
}
} -run
Snippet of test03.vtc:
sub vcl_recv {
if (req.url ~ "/cowsay" || req.url ~ "/bunny") {
return(synth(700, "OK"));
}
}
sub vcl_synth {
if (resp.status == 700) {
set resp.status = 200;
set resp.http.Content-Type = "text/plain; charset=utf-8";
if (req.url ~ "/cowsay"){
synthetic(cowsay.cowsay_friends("cow", "moo"));
}
if (req.url ~ "/bunny"){
synthetic(cowsay.cowsay_friends("bunny", "Varnish"));
}
return (deliver);
}
}
...
client c1 {
txreq -url "/cowsay"
rxresp
expect resp.body == {** moo **
^__^
(oo)\_______
(__)\ )\/\
||----w |
|| ||
}
} -run
client c2 {
txreq -url "/bunny"
rxresp
expect resp.body == {** Varnish **
(\/)
(..)
(")(")
}
} -run
We advise you to start by designing your tests. test01.vtc, test02.vtc and test03.vtc are
examples in https://github.com/franciscovg/libvmod-cowsay.git. test01.vtc shows how an
HTTP header field is assigned. When client c1 requests the /cowsay URL, Varnish
server v1 assigns the output of the VMOD function cowsay_canonical() to the HTTP
request header field req.http.x-cow.
test02.vtc shows how to alter the message body in the vcl_synth subroutine. In this
second test, we use cowsay.cowsay_vsb(), which returns the cowsay ASCII picture. The
important difference between cowsay_canonical() and cowsay.cowsay_vsb() is the
library used to manipulate the returned string. We discuss this difference later in this
section.
test03.vtc reuses most of test02.vtc, replacing cowsay.cowsay_vsb() with
cowsay.cowsay_friends(arg1, arg2) and adding some basic conditions.
cowsay.cowsay_friends() returns the ASCII figure named in arg1, uttering the message in
arg2. After designing your tests, you declare the functions in the .vcc file and implement
them in the .c file.
• Add more assertions using the keyword expect to check values of Varnish counters
16.4.2 vmod_cowsay.vcc
cowsay_canonical() uses the canonical string library, and cowsay_vsb() uses the Varnish
String Buffer (VSB) library. The VSB library is very useful for manipulating strings, so we
recommend using it instead of the canonical string libraries.
Note
Although VCL allows it, we do not recommend assigning multi-line values to HTTP
header fields.
16.4.3 vmod_cowsay.c
int
init_function(struct vmod_priv *priv, const struct VCL_conf *conf)
{
/* init global state valid for the whole VCL life */
cow =
"\n ^__^\n"
" (oo)\\_______\n"
" (__)\\ )\\/\\\n"
" ||----w |\n"
" || ||\n";
/* this 'cow' is now available to every other function defined
in this vmod */
return (0);
}
VCL_STRING
vmod_cowsay_friends(VRT_CTX, VCL_STRING animal, VCL_STRING talk)
{
unsigned u;
struct vsb *vsb;
u = WS_Reserve(ctx->ws, 0);
vsb = VSB_new(NULL, ctx->ws->f, u, VSB_AUTOEXTEND);
if(!strcmp(animal, "cow")) {
VSB_printf(vsb, "** %s **\n", talk);
VSB_cat(vsb, cow);
}
if(!strcmp(animal, "bunny")) {
VSB_printf(vsb, "** %s **\n", talk);
VSB_cat(vsb, baby_bunny());
}
VSB_finish(vsb);
WS_Release(ctx->ws, VSB_len(vsb) + 1);
return (vsb->s_buf);
}
We explain here the two relevant functions in this file. The first is init_function(), where
we declare a global variable holding the cow figure. The second is
vmod_cowsay_friends(), where we use the string-manipulation functions provided by
VSB. vmod_cowsay_vsb() is a simplified version of vmod_cowsay_friends(), and the
implementation of cowsay_canonical() is practically the same as vmod_hello() in
libvmod-example.
Finally, it is time to make, make check and make install your VMOD. Note that
make check calls varnishtest with the needed options.
16.5 Resources
• https://www.varnish-cache.org/vmods
The best way to learn more about VMODs is by writing them and seeing how other VMODs
work. There are many VMODs written by the Varnish community and Varnish Software;
please take a look at the list at https://www.varnish-cache.org/vmods.
In addition, you can look at the following blogs, which were used, among other sources,
to write this section:
• http://blog.zenika.com/index.php?post/2012/08/21/Creating-a-Varnish-module
• http://blog.zenika.com/index.php?post/2012/08/27/Introducing-varnishtest
• http://blog.zenika.com/index.php?post/2013/07/31/Creating-a-Varnish-4-module
Chapter 17 Appendix E: Varnish Three Letter Acronyms Page 305
curl is the tool typically used to transfer data from or to a server, but you might want to use
something else, like HTTPie, which prints nicely colored output in the terminal. To install
HTTPie in Ubuntu or Debian:
$ apt-get install httpie
Next:
1. Verify that Apache works by typing http -h localhost. You should see a 200 OK
response from Apache.
2. Change Apache's port from 80 to 8080. In Ubuntu or Debian, you do this in
/etc/apache2/ports.conf and /etc/apache2/sites-enabled/000-default.conf. In
CentOS, RHEL or Fedora, edit /etc/httpd/conf/httpd.conf.
3. Restart Apache. In Ubuntu or Debian, type service apache2 restart. In CentOS, RHEL
or Fedora, type systemctl restart httpd. Then verify that Apache answers on the new
port:
http -h localhost:8080
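For step 2, the port change on Ubuntu or Debian amounts to editing the Listen directive and the matching VirtualHost line in the files listed above. A sketch (your default site file may differ):

```
# /etc/apache2/ports.conf: change
Listen 80
# to
Listen 8080

# /etc/apache2/sites-enabled/000-default.conf: change
<VirtualHost *:80>
# to
<VirtualHost *:8080>
```

On CentOS, RHEL or Fedora, the equivalent Listen directive lives in /etc/httpd/conf/httpd.conf.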
Page 308 Chapter 19 Appendix G: Solutions
19 Appendix G: Solutions
This appendix contains the solutions to the exercises throughout the book.
Varnish is already distributed in many package repositories, but those packages might
contain an outdated Varnish version. Therefore, we recommend using the packages
provided by varnish-software.com for Varnish Cache Plus or varnish-cache.org for Varnish
Cache. Please be advised that we only provide packages for LTS releases, not for all the
intermediate releases. However, these packages might still work fine on newer releases.
All software related to Varnish Cache Plus including VMODs are available in RedHat and
Debian package repositories. These repositories are available on
http://repo.varnish-software.com/, using your customer specific username and password.
All the following commands are for Ubuntu or Debian systems running systemd and must be
executed with root permissions. First, make sure you have apt-transport-https installed:
$ apt-get install apt-transport-https
To use the varnish-software.com repository and install Varnish Cache Plus 4.0 or 4.1 on
Ubuntu 14.04 trusty:
$ curl https://<username>:<password>@repo.varnish-software.com/GPG-key.txt \
| apt-key add -
To use the varnish-cache.org repository and install Varnish Cache 4.0 or 4.1 on Ubuntu 14.04
trusty:
If you are installing Varnish Cache 4.1, replace varnish-4.0 with varnish-4.1 in the
command above.
If you are installing Varnish Cache Plus 4.0 or 4.1, add the repositories for VMODs in
/etc/apt/sources.list.d/varnish-4.0-plus.list or
/etc/apt/sources.list.d/varnish-4.1-plus.list respectively:
# Remember to replace 4.x, DISTRO and RELEASE with what applies to your system.
# 4.x=(4.0|4.1)
# DISTRO=(debian|ubuntu),
# RELEASE=(precise|trusty|wheezy|jessie)
varnish-4.x-plus
$ apt-get update
To install Varnish Cache and verify the installed version:
$ apt-get install varnish
$ varnishd -V
To use Varnish Cache Plus 4.0 or 4.1 repositories on systemd Fedora/RHEL7+/CentOS 7+,
put the following in /etc/yum.repos.d/varnish-4.0-plus.repo or
/etc/yum.repos.d/varnish-4.1-plus.repo, and change 4.x to the version you want to
install:
[varnish-4.x-plus]
name=Varnish Cache Plus
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat
/varnish-4.x-plus/el$releasever
enabled=1
gpgcheck=0
[varnish-admin-console]
name=Varnish Administration Console
baseurl=
https://<username>:<password>@repo.varnish-software.com/redhat
/vac/el$releasever
enabled=1
gpgcheck=0
$ yum update
$ yum install varnish-plus
$ yum install varnish-plus-vmods-extra
$ varnishd -V
Note
More details on Varnish Plus installation can be found at
http://files.varnish-software.com/pdfs/varnish-cache-plus-manual-latest.pdf
varnishtest "Counters"
server s1 {
rxreq
txresp
} -start
client c1 {
txreq
rxresp
}
client c1 -run
client c1 -run
client c1 -run
feature SO_RCVTIMEO_WORKS
server s1 {
rxreq
delay 1.5
txresp -body "foo"
} -start
client c1 {
txreq
rxresp
expect resp.status == 503
} -run
server s1 {
rxreq
delay 0.5
txresp -body "foo"
} -start
client c1 {
txreq
rxresp
expect resp.status == 200
} -run
In this example, we introduce -vcl+backend and feature in VTC. -vcl+backend is one way
to pass inline VCL code to v1; it also injects the declaration of the backend s1 into that
VCL. Thus, -vcl+backend {} with an empty body is equivalent to
-arg "-b ${s1_addr}:${s1_port}" in this case.
feature checks that the listed features are present in the test environment. If a feature is
not present, the test is skipped. SO_RCVTIMEO_WORKS checks for the socket option
SO_RCVTIMEO before executing the test.
b00006.vtc is copied from Varnish-Cache/bin/varnishtest/tests/b00023.vtc. We
advise you to take a look at the many tests under
Varnish-Cache/bin/varnishtest/tests/.
vcl 4.0;
backend default {
.host = "127.0.0.1";
.port = "8080";
}
sub vcl_recv {
if (req.url ~ "^/admin") {
return(pass);
}
}
In this suggested solution, the backend is configured to a local IP address and port. Since we
are not running this code, you can configure it as you want.
Note the use of the match operator ~ with a regular expression.
In the output of the compiler you should be able to find your VCL code and the built-in VCL
code appended to it.
server s1 {
rxreq
txresp
} -start
varnish v1 \
-arg "-p vsl_mask=+WorkThread" \
-arg "-p thread_pool_min=10" \
-arg "-p thread_pool_max=15" \
-arg "-p thread_pools=1" \
-vcl+backend {}
varnish v1 -start
logexpect l1 -v v1 -g raw {
expect * 0 WorkThread {^\S+ start$}
expect * 0 WorkThread {^\S+ end$}
} -start
# Herder thread might sleep up to 5 seconds. Have to wait longer than that.
delay 6
# Herder thread might sleep up to 5 seconds. Have to wait longer than that.
delay 6
The test above shows you how to set parameters in two ways: passing the argument -p to
varnishd, or calling param.set. -p vsl_mask=+WorkThread turns on WorkThread debug
logging.
The test proves that varnishd starts with the number of threads indicated in
thread_pool_min. Changes in thread_pool_min and thread_pool_max are applied by the
thread herder, which handles the thread pools and adds or removes threads if necessary. To
learn more about other maintenance threads, visit
https://www.varnish-cache.org/trac/wiki/VarnishInternals.
c00001.vtc is a simplified version of Varnish-Cache/bin/varnishtest/tests/r01490.vtc.
sub vcl_recv {
    /* Alternative 1 */
    if (req.http.host == "sport.example.com") {
        set req.http.host = "example.com";
        set req.url = "/sport" + req.url;
    }
    /* Alternative 2 */
    if (req.http.host ~ "^sport\.") {
        set req.http.host = regsub(req.http.host, "^sport\.", "");
        set req.url = regsub(req.url, "^", "/sport");
    }
}
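VCL's regsub() performs a single regular-expression substitution. The two rewrites in Alternative 2 can be sketched outside Varnish with sed, shown here only to illustrate the regular expressions (sed -E extended-regex syntax assumed):

```shell
# Strip the leading "sport." label from the Host value,
# like regsub(req.http.host, "^sport\.", "") in VCL.
host="sport.example.com"
host=$(printf '%s' "$host" | sed -E 's/^sport\.//')
echo "$host"   # example.com

# Anchor a substitution at "^" to prepend a prefix,
# like regsub(req.url, "^", "/sport") in VCL.
url="/index.html"
url=$(printf '%s' "$url" | sed -E 's|^|/sport|')
echo "$url"    # /sport/index.html
```

Note that regsub() replaces only the first match; VCL's regsuball() is the equivalent of sed's global flag.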
Then you verify your results by issuing the following command and analyzing the output:
varnishlog -i ReqHeader,ReqURL
varnishtest solution:
server s1 {
rxreq
expect req.http.Host == "example.com"
expect req.url == "/sport/index.html"
txresp
}
varnish v1 -vcl {
backend default {
.host = "93.184.216.34"; # example.com
.port = "80";
}
sub vcl_recv {
set req.http.x-host = req.http.host;
set req.http.x-url = req.url;
set req.http.host = regsub(req.http.host, "^www\.", "");
/* Alternative 1 */
if (req.http.host == "sport.example.com") {
set req.http.host = "example.com";
set req.url = "/sport" + req.url;
}
/* Alternative 2 */
# if (req.http.host ~ "^sport\.") {
# set req.http.host = regsub(req.http.host,"^sport\.", "");
# set req.url = regsub(req.url, "^", "/sport");
# }
}
} -start
client c1 {
txreq -url "/index.html" -hdr "Host: sport.example.com"
rxresp
} -run
// Suggested solution B
sub vcl_backend_response {
if (bereq.url ~ "^/index\.html" || bereq.url ~ "^/$") {
set beresp.uncacheable = true;
}
}
There are many ways to solve this exercise, and this solution is only one of them. The first
condition checks for the presence of s-maxage, and further conditions handle .jpg and
.html files to make them cacheable for 30 and 10 seconds respectively. If s-maxage is
present with a positive TTL, we consider the response cacheable and remove
beresp.http.Set-Cookie.
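The s-maxage branch the text refers to is not shown above; it might look like the following sketch. The Cache-Control pattern is our assumption, not the book's exact code:

```
sub vcl_backend_response {
    if (beresp.http.Cache-Control ~ "s-maxage=[1-9][0-9]*") {
        # A positive s-maxage marks the response cacheable;
        # drop the cookie so Varnish will store the object.
        unset beresp.http.Set-Cookie;
    }
}
```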
sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}
The suggested solution forces a 503 error by pointing the default backend at an
unreachable address. The varnishtest macro ${bad_ip} translates to 192.0.2.255, an
address from the TEST-NET-1 documentation range that never responds.
vtc/b00011.vtc
varnish v1 -vcl {
backend foo {
.host = "${bad_ip}";
.port = "9080";
}
/* Customize error responses */
sub vcl_backend_error {
if (beresp.status == 503){
set beresp.status = 200;
synthetic( {"
<html><body><!-- Here goes a more friendly error message. -->
</body></html>
"} );
return (deliver);
}
}
} -start
client c1 {
txreq -url "/"
rxresp
expect resp.status == 200
} -run
Note that in the proposed solution the client receives a 200 response code.
<?php
header( 'Content-Type: text/plain' );
header( 'Cache-Control: max-age=0' );
$hostname = 'localhost';
$port = 80;
$URL = '/article.php';
$finalURL = "http://{$hostname}:{$port}{$URL}"; // compose the URL passed to curl below
$debug = true;
$curlOptionList = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_CUSTOMREQUEST => 'PURGE',
CURLOPT_HEADER => true ,
CURLOPT_NOBODY => true,
CURLOPT_URL => $finalURL,
CURLOPT_CONNECTTIMEOUT_MS => 2000
);
$fd = false;
if( $debug == true ) {
print "\n---- Curl debug -----\n";
$fd = fopen("php://output", 'w+');
$curlOptionList[CURLOPT_VERBOSE] = true;
$curlOptionList[CURLOPT_STDERR] = $fd;
}
$curlHandler = curl_init();
curl_setopt_array( $curlHandler, $curlOptionList );
curl_exec( $curlHandler );
curl_close( $curlHandler );
if( $fd !== false ) {
fclose( $fd );
}
?>
solution-purge-from-backend.vcl
acl purgers {
"127.0.0.1";
}
sub vcl_recv {
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Not allowed."));
}
return (purge);
}
}
if (req.method == "BAN") {
ban("obj.http.x-url ~ " + req.http.x-ban-url +
" && obj.http.x-host ~ " + req.http.x-ban-host);
return (synth(200, "Ban added"));
}
if (req.method == "REFRESH") {
set req.method = "GET";
set req.hash_always_miss = true;
}
}
sub vcl_backend_response {
set beresp.http.x-url = bereq.url;
set beresp.http.x-host = bereq.http.host;
}
sub vcl_deliver {
# We remove the resp.http.x-* HTTP header fields,
# because the client does not need them
unset resp.http.x-url;
unset resp.http.x-host;
}
server s1 {
rxreq
expect req.url == "/cookie.php"
txresp -hdr "Vary: Cookie"
rxreq
expect req.url == "/cookie.php"
txresp -hdr "Vary: Cookie"
rxreq
expect req.url == "/article.html"
txresp -hdr "Vary: Cookie"
rxreq
expect req.url == "/cookie.php"
txresp -hdr "Vary: Cookie"
rxreq
expect req.url == "/article.html"
txresp -hdr "Vary: Cookie"
} -start
varnish v1 -vcl+backend {
sub vcl_recv{
if (req.method == "PURGE") {
return (purge);
}
else if (req.http.Cookie){
# Forces Varnish to cache requests with cookies
return (hash);
}
}
sub vcl_backend_response{
# Uncomment to remove effect from Vary
# unset beresp.http.Vary;
}
} -start
client c1 {
txreq -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1001"
txreq -url "/cookie.php" -hdr "Cookie: user: Bob"
rxresp
expect resp.http.X-Varnish == "1003"
txreq -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1005 1002"
txreq -url "/article.html" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1006"
txreq -url "/article.html" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1008 1007"
} -run
client c1 {
txreq -req PURGE -url "/cookie.php"
rxresp
} -run
client c1 {
txreq -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1012"
txreq -url "/article.html" -hdr "Cookie: user: Bob"
rxresp
expect resp.http.X-Varnish == "1014"
} -run
Vary and hash_data() might behave very similarly at first sight, and they might even seem
like alternatives for handling cookies. However, the cached objects are referenced in
different ways.
If Varnish is forced to store responses with cookies, Vary ensures that Varnish stores one
resource per URL and Cookie value. When Vary: Cookie is used, objects are purged per
URL, as the PURGE request in the test above shows. The next test uses hash_data()
instead of Vary:
server s1 {
rxreq
expect req.url == "/cookie.php"
txresp
rxreq
expect req.url == "/cookie.php"
txresp
rxreq
expect req.url == "/article.html"
txresp
rxreq
expect req.url == "/cookie.php"
txresp
rxreq
expect req.url == "/article.html"
txresp
} -start
varnish v1 -vcl+backend {
sub vcl_recv{
if (req.method == "PURGE") {
return (purge);
}
else if (req.http.Cookie){
# Forces Varnish to cache requests with cookies
return (hash);
}
}
sub vcl_hash{
hash_data(req.http.Cookie);
}
} -start
client c1 {
txreq -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1001"
txreq -url "/cookie.php" -hdr "Cookie: user: Bob"
rxresp
expect resp.http.X-Varnish == "1003"
client c1 {
txreq -req PURGE -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
} -run
client c1 {
txreq -url "/cookie.php" -hdr "Cookie: user: Alice"
rxresp
expect resp.http.X-Varnish == "1012"
txreq -url "/article.html" -hdr "Cookie: user: Bob"
rxresp
expect resp.http.X-Varnish == "1014"
} -run
hash_data(req.http.Cookie) adds the request header field Cookie to the hash key, so
Varnish can discern between cached objects linked to a specific request header field.
To purge a cached object in this case, you have to send the same header field that was
used in hash_data(), as the PURGE request with -hdr "Cookie: user: Alice" above does.
vcl 4.0;
backend localhost{
.host = "127.0.0.1";
.port = "8080";
}
backend google {
.host = "173.194.112.145";
.port = "80";
}
sub vcl_recv{
if (req.url ~ "^/masq") {
set req.backend_hint = google;
set req.http.host = "www.google.com";
set req.url = regsub(req.url, "^/masq", "");
return (hash);
} else {
set req.backend_hint = localhost;
}
}