
A comparison of the performance and scalability of Xen and KVM hypervisors

Stefan Gabriel Soriga
Information and Communication Technology Dept.
University POLITEHNICA of Bucharest
Bucharest, Romania

Mihai Barbulescu
Information and Communication Technology Dept.
University POLITEHNICA of Bucharest
Bucharest, Romania

Abstract: Virtualization is the fundamental technology for corporate data center consolidation and cloud computing. Companies and academic institutions consolidate their servers through virtualization in search of efficient hardware resource usage, lower energy consumption, improved fault tolerance, and increased security. In this paper, we investigate the performance and scalability of two open source virtualization platforms: Xen and the Kernel-based Virtual Machine (KVM). We analyze the influence of the number of virtual machines on the performance of several benchmarks that measure compute power, network, and I/O throughput. The virtual machines run on top of a single physical host. The results show that these two virtualization solutions have similar behavior. An important degradation in performance is noticed when the number of virtual machines equals or exceeds the number of physical cores available. Hyper-threading boosts the performance by a small margin. The effect of virtualization on I/O throughput is even more noticeable, although the virtual machines were using LVM logical volumes as disk images in our tests.

Keywords: server virtualization; hypervisor; Xen; KVM; benchmark

I. INTRODUCTION

In recent years server virtualization has become very important because it consolidates workloads for more efficient resource utilization. In the traditional server-per-workload paradigm, servers are typically underutilized. Because many of these workloads don't play well with others and each needs its own dedicated machine, most servers use only a small fraction of their overall processing capabilities (10% to 20% of their possible capacity) [1]. Virtualization has the capability of isolating the running environments, so workloads of different types can share the same hardware. Server virtualization addresses this very issue. We say that a computer server is virtualized when a single physical machine is made to appear as multiple isolated virtual machines (VMs). The server is called the host, and the virtual machines running on it are called guests. Each guest has its own virtual processor, memory, and peripheral interfaces, and is capable of running its own operating system, called the guest OS.
The virtual imitation of the hardware layer that allows the guest OS to run without modifications, unaware that it is running on emulated hardware, is called full virtualization. In full virtualization, the guest OS runs unmodified, but the virtual machine suffers serious performance penalties due to the extra layers of abstraction provided by an additional piece of software that simulates the hardware behavior, called the hypervisor. The hypervisor runs on top of the physical hardware and creates the virtual environment in which the virtual machines operate. Its functions include scheduling, memory management, and resource control for the various virtual machines.
More performance oriented, the para-virtualized machine (PVM) approach requires a modified guest OS capable of making system calls to the hypervisor, rather than executing machine I/O instructions that the hypervisor simulates. Whenever such an instruction is to be executed, a call is made to the hypervisor and the hypervisor performs the necessary task on behalf of the guest. Para-virtualization provides near-native performance and the appearance of complete and secure isolation at the hardware level. The disadvantage is that it requires a modified operating system.
In this respect, the Hardware Virtual Machine (HVM) represents a more flexible solution. HVM is a platform virtualization approach that enables efficient full virtualization using help from hardware capabilities, primarily virtualization extensions in processors. Hardware-assisted virtualization was added to x86 processors (Intel VT-x or AMD-V) in 2006. HVM does not need a modified kernel and the guest OS has direct access to the processor, so the performance of HVM is better than that of full virtualization.
In the following sections we will evaluate Xen and KVM,
two open source hypervisors that can run natively on Linux.
The rest of this paper is organized as follows. Section II briefly describes the architecture of Xen and KVM. Section III evaluates the performance and scalability of the hypervisors using a series of benchmarks. Section IV discusses related work, and Section V summarizes our findings.
II. HYPERVISORS DESCRIPTION

A. Xen hypervisor
Xen is an open source hypervisor originally developed at
the University of Cambridge and now distributed by Citrix
Systems, Inc. The first public release of Xen occurred in 2003
[6]. It is designed for various hardware platforms, especially
x86, and supports a wide range of guest operating systems, including Windows, Linux, Solaris, and versions of the BSD family.
Xen architecture consists of a hypervisor, a host OS and a
number of guests. In Xen terminology, the host OS is referred
to as the Domain 0 (dom0) and the guest OS is referred to as
Domain U (domU). Dom0 is created automatically when the
system boots. Although it runs on top of the hypervisor itself,
and has virtual CPUs (vCPUs) and virtual memory, as any
other guest, Dom0 is privileged. It provides device drivers for
the I/O devices, it runs user space services and management
tools, and is permitted to use the privileged control interface to the hypervisor. It also performs all other OS-related functions. Like a DomU, Dom0 can be any operating system from the Linux, Solaris, or BSD families. Xen thus separates hypervisor execution from Dom0, so the tasks Dom0 performs that are not related to serving the virtualized guests do not interfere with the hypervisor, ensuring maximum performance.
Xen has employed para-virtualization from the very beginning. Through para-virtualization, Xen can achieve very high performance, but this approach has the disadvantage of supporting only Linux guests; moreover, that Linux has to have a modified kernel and bootloader, and a fixed layout with two partitions, one for the hard disk and one for swap.
Xen also implements support for hardware-assisted
virtualization. In this configuration, it does not require
modifying the guest OS, which makes it possible to host
Windows guests.
B. KVM hypervisor
KVM is a hardware-assisted virtualization solution developed by Qumranet, Inc. It was merged into the upstream mainline Linux kernel in 2007, giving the Linux kernel native virtualization capabilities. KVM makes use of the virtualization extensions Intel VT-x and AMD-V. In 2008, Red Hat, Inc. acquired Qumranet.
KVM is a kernel module for the Linux kernel which provides the core virtualization infrastructure and turns a Linux host into a hypervisor. Scheduling of processes and memory management are handled by the kernel itself. Device emulation is handled by a modified version of QEMU [7]. The guest is actually executed in the user space of the host and looks like a regular process to the underlying host kernel.
A normal Linux process has two modes of execution: kernel mode and user mode. KVM adds a third one: guest mode. When a guest process is executing non-I/O guest code, it runs in guest mode. All the guests running under KVM are just regular Linux processes on the host. Each virtual CPU of a KVM guest is implemented as a Linux thread, and the Linux scheduler schedules a virtual CPU just as it does any normal thread. This brings the advantage that priorities and affinity can be set for these processes using the usual tuning commands, and all the common Linux utilities related to processes can be used. Also, control groups can be created and used to limit the resources that each guest can consume on a host. In KVM, guest physical memory is just a chunk of host virtual memory, so it can be swapped, shared, backed by large pages, backed by a disk file, and it is also NUMA aware.
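As an illustration of this process model, the short Python sketch below locates the vCPU threads of a running QEMU/KVM guest and pins each one to a host core. It is only a minimal sketch: the qemu-kvm PID and the thread naming convention (vCPU threads exposing names such as "CPU 0/KVM" in /proc/<pid>/task/<tid>/comm) are assumptions that vary between QEMU versions, and the same effect can be obtained with taskset, virsh vcpupin, or control groups.

    import os
    import re

    def pin_vcpu_threads(qemu_pid, first_host_core=1):
        """Pin each vCPU thread of a QEMU/KVM guest to its own host core.
        Assumes vCPU threads expose a name such as 'CPU 0/KVM' (hypothetical)."""
        task_dir = "/proc/%d/task" % qemu_pid
        for tid in sorted(int(t) for t in os.listdir(task_dir)):
            with open("%s/%d/comm" % (task_dir, tid)) as f:
                name = f.read().strip()
            match = re.match(r"CPU (\d+)", name)
            if match:
                vcpu = int(match.group(1))
                host_core = first_host_core + vcpu
                os.sched_setaffinity(tid, {host_core})   # same effect as taskset -pc
                print("vCPU %d (tid %d) pinned to host core %d" % (vcpu, tid, host_core))

    # Example with a hypothetical qemu-kvm PID:
    # pin_vcpu_threads(4242)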

KVM supports I/O para-virtualization through the virtio subsystem. Virtio is a virtualization standard for device (network, disk, etc.) drivers in which the guest's device driver is aware of running in a virtual environment and communicates directly with the hypervisor. This enables the guests to achieve high-performance network and disk operations. Virtio is different from, but architecturally similar to, the Xen para-virtualized device drivers.
C. Linux support for Xen and KVM hypervisors
After the acquisition of XenSource by Citrix Systems, Inc, in 2007, and the acquisition of Qumranet by Red Hat, Inc, the Linux community started to migrate from Xen to KVM. In 2009, Red Hat decided to move forward with KVM instead of Xen and integrated KVM into the Red Hat Enterprise Linux (RHEL) distribution. As a result, Xen was present in the RHEL 5.x distributions, but no longer supported in the next iteration, RHEL 6.x. Having vendor-supplied kernels increases the likelihood that sites will deploy one type of hypervisor over another, and this was a big step forward for KVM. Xen is an open source project, but it is not as tightly integrated with the Linux community and the Linux kernel as KVM. KVM is part of the Linux kernel and inherits from it everything from device management, CPU scheduling, and sophisticated memory management (NUMA, huge pages) to a scalable I/O stack, isolation, security, and hardware enablement. In contrast, the Xen hypervisor is a separate project and a completely separate code base. This means that the Xen kernel does not support all the hardware and features available in the Linux kernel.
Because it is not part of the upstream kernel, development of Xen is an entirely separate process, and you need to install the Xen kernel on bare metal and build a special guest, Dom0, to manage it and to provide device drivers. Today, however, this is no longer the complicated process of compiling the source code of a modified kernel. The Linux mainline kernel tree from version 3.0 onwards includes every component needed for Linux to run both as a Dom0 and as a guest. This means that Linux 3.0 or higher can be used unmodified as a Dom0, and that the Xen paravirt drivers used in a DomU for accelerated disk and NIC access are available in any Linux 3.x+ guest, just like the KVM virtio drivers. As a result of the inclusion of Xen in the mainline kernel, it is currently available in CentOS 6.x through the Xen4CentOS project. The Xen4CentOS project is the result of collaboration between the Xen Project, the Citrix Xen open source teams, the CentOS developers, the GoDaddy Cloud Operations team, and Rackspace Hosting. Xen4CentOS delivers the Xen stack, a Linux kernel based on the 3.4 mainline tree, and libvirt and QEMU with associated tools that target Xen support in the distribution. Although CentOS is a clone of RHEL, Red Hat will obviously continue to focus on KVM as its primary method of virtualization. Overall, KVM has a slight advantage in the Linux camp as the default mainline hypervisor: if you are running a recent Linux kernel, you already have KVM built in.
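As a quick illustration of that last point, a minimal Python check (an illustrative sketch, not part of this paper's test suite) can verify whether the in-kernel KVM support is actually usable on a given host by looking for the /dev/kvm device node, which appears once the kvm_intel or kvm_amd module is loaded:

    import os

    def kvm_available():
        """Return True if the running kernel exposes usable KVM support."""
        if not os.path.exists("/dev/kvm"):
            return False
        with open("/proc/modules") as f:
            modules = f.read()
        return "kvm_intel" in modules or "kvm_amd" in modules

    if __name__ == "__main__":
        print("KVM usable on this host:", kvm_available())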
III. PERFORMANCE AND SCALABILITY EVALUATION

When evaluating a virtualization platform, the first thing we would like to find out is how many virtual machines can run on the host. This number largely depends on the nature of the virtual machines, whether they are compute-intensive, disk-intensive, or network-intensive. In this section we show the results obtained from different benchmarks targeted at stressing exactly these aspects. First, we describe the hardware and software used in the host and in the guests.
A. Hardware and software configuration
The hardware we used as the host for our evaluation is an HP ProLiant DL180 G6 with the following characteristics:
- 2 Intel Xeon X5670 six-core CPUs @ 2.93 GHz, with support for the Intel VT hardware extensions;
- 192 GB DDR3 ECC Registered RAM @ 1066 MHz;
- 8 x HP 450 GB SAS 15K RPM disks connected to a Smart Array P212 RAID controller equipped with 256 MB of RAM cache (disks configured with the RAID 1+0 option, for a total size of 1.6 TB);
- 1 Broadcom NetXtreme II BCM5709 1 Gbps LAN adapter;
- 1 Intel 82576 1 Gbps LAN adapter, used as the bridge for the virtual machines.
The host is connected to a Cisco 2960G gigabit switch with
two network cables, one for each LAN adapter.
In terms of software, the hypervisors were running on the CentOS 6.4 Linux distribution. The guests use the same distribution. The hypervisors and the guests were installed together on the same hard disk, within separate disk partitions. In order to obtain the highest possible performance and scalability of the disk space for the guest environments, we used the Logical Volume Manager (LVM) for partitioning. The disk layout comprised a volume group dedicated to the hypervisors (named hosts), split into two logical volumes, one for each hypervisor's root partition, and a separate volume group (named vms) dedicated to the virtual machines. Each virtual machine's hard disk is a separate logical volume. The main benefit of using LVM in a system virtualization environment is that it allows dynamically enlarging, reducing, adding, and deleting disk space in the form of logical volumes at runtime, taking snapshots, and cloning virtual machines. In addition to flexibility, LVM-based virtual machine images have less overhead, because they are not accessed through a filesystem. The drawback is that storage cannot be overcommitted, but in our evaluation we aim for the best possible performance, so we are not overcommitting anything: CPU, memory, or storage.
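To make the disk layout concrete, the Python sketch below shows one way such a layout could be provisioned with the standard LVM command-line tools (pvcreate, vgcreate, lvcreate). It is a hedged sketch only: the partition name (/dev/sda5), the logical volume names, and the snapshot size are illustrative assumptions, not the exact commands used in our setup.

    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    # Assumed physical volume and volume group layout (illustrative names/sizes).
    run(["pvcreate", "/dev/sda5"])
    run(["vgcreate", "vms", "/dev/sda5"])          # volume group dedicated to guests

    # One logical volume per virtual machine: 20 GB, later split by the
    # guest installer into a 16 GB root and a 4 GB swap partition.
    for i in range(1, 12):                          # 11 guests in the HT-off tests
        run(["lvcreate", "-L", "20G", "-n", "vm%02d" % i, "vms"])

    # LVM also allows snapshots of a guest disk at runtime, e.g. for cloning:
    run(["lvcreate", "-s", "-L", "5G", "-n", "vm01-snap", "/dev/vms/vm01"])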
The hypervisor versions used were:
- qemu-kvm-0.12.1;
- xen-hypervisor-4.2.2.

Regardless of the type of hypervisor or virtualization method, all virtual machines were configured with 1 vCPU, 2 GB of RAM, and a 20 GB hard disk partitioned into a 4 GB swap and a 16 GB root partition. Depending on the test being conducted, the VMs communicated with the outside world through a bridge interface configured on the host (on one or two interfaces of the Intel NIC).
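The guest definition below illustrates, for the KVM case, how such a virtual machine (1 vCPU, 2 GB of RAM, an LVM-backed virtio disk with writethrough caching, and a bridged virtio NIC) might be described and registered through the libvirt Python bindings. The domain name, volume path, and bridge name are illustrative assumptions; an equivalent effect can be obtained with virt-install or plain XML files, and the Xen guests are defined analogously with Xen's paravirtual devices.

    import libvirt

    DOMAIN_XML = """
    <domain type='kvm'>
      <name>vm01</name>                      <!-- hypothetical guest name -->
      <memory unit='GiB'>2</memory>
      <vcpu>1</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writethrough'/>
          <source dev='/dev/vms/vm01'/>      <!-- LVM logical volume as disk -->
          <target dev='vda' bus='virtio'/>
        </disk>
        <interface type='bridge'>
          <source bridge='br0'/>             <!-- bridge on the Intel NIC -->
          <model type='virtio'/>
        </interface>
      </devices>
    </domain>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.defineXML(DOMAIN_XML)         # register the guest persistently
    dom.create()                             # boot it
    conn.close()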

B. Benchmarks
We conduct experiments to compare these two virtualization platforms with respect to one another and to evaluate the overhead they introduce in comparison to a non-virtualized environment. We run three sets of benchmarks targeted at measuring the most critical performance parameters of a machine (CPU, memory, network, and disk access), with the following tools:
- CPU: Hep-Spec06 v1.1;
- Network: iperf 2.0.5;
- Disk I/O: iozone 3.408.
Each of the benchmarks was executed concurrently on an increasing number of VMs, from 1 to 23, to determine how the performance degrades as the host's load increases.
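A minimal sketch of how such a concurrent run can be orchestrated from the host is given below. It assumes passwordless SSH to guests named vm01..vmNN and a hypothetical run_hepspec.sh wrapper script inside each guest; these names, and the use of plain ssh rather than a dedicated framework, are assumptions made for illustration only.

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def run_benchmark(guest):
        """Start the benchmark inside one guest over SSH and capture its output."""
        cmd = ["ssh", guest, "./run_hepspec.sh"]          # hypothetical wrapper script
        out = subprocess.run(cmd, capture_output=True, text=True)
        return guest, out.stdout.strip()

    def run_concurrently(n_vms):
        guests = ["vm%02d" % i for i in range(1, n_vms + 1)]
        # One thread per guest so all benchmarks start (almost) simultaneously.
        with ThreadPoolExecutor(max_workers=n_vms) as pool:
            return dict(pool.map(run_benchmark, guests))

    if __name__ == "__main__":
        for n in (1, 2, 4, 8, 11, 23):                     # increasing load on the host
            results = run_concurrently(n)
            print(n, "VMs:", results)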
C. CPU performance and scalability
The focus of these CPU tests is to assess the performance of the hypervisors as a function of the number of vCPUs in use. We tested the same parameters for KVM and Xen, and compared them with a non-virtualized machine. We intend to drive the CPU usage in the VMs to 100% and measure the performance of the guests in order to quantify the differences between these two virtualization solutions. To accomplish this task, the Hep-Spec06 benchmark is used. Hep-Spec06 is the standard processor performance benchmarking method in the High Energy Physics (HEP) community. It is based on a subset of the SPEC CPU2006 benchmark and is intended to stress the computer processor, the memory architecture, the compilers, and the chipset. Hep-Spec06 does very little I/O, either through NICs or disk controllers, and doesn't stress the operating system. This gives us the opportunity to saturate the server's processor while eliminating other system bottlenecks. Also, the benchmark has a memory footprint of around 2 GB, the same capacity available to each virtual machine. Since we used an x86_64 platform, all the experiments were conducted in 64-bit mode. All new hardware is 64-bit capable nowadays and there is little interest in running a 32-bit OS or applications.
The maximum number of guests that the hardware can host depends on the number of physical cores. In our testing we followed various best practices. It is a well-known fact that over-allocating the number of vCPUs affects not only the performance of the VM, but also the physical server and all the other VMs on the server. With the objective of analyzing the influence of hyper-threading (HT), we ran several tests with hyper-threading turned on and off. When HT was on, we counted the logical cores provided by hyper-threading together with the physical cores [1,8,9]. So, to evaluate the impact of the number of virtual machines deployed per host on performance, we considered 11 and 23 VMs when HT was disabled and enabled, respectively.
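For reference, the distinction between physical and logical cores that drives the choice of 11 or 23 guests can be read directly from the host. The Python sketch below (an illustrative helper, not part of the benchmark suite) derives both counts from /proc/cpuinfo on Linux:

    import os

    def core_counts():
        """Return (physical_cores, logical_cores) for a Linux host."""
        logical = os.cpu_count()
        physical = set()
        pkg = core = None
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("physical id"):
                    pkg = line.split(":")[1].strip()
                elif line.startswith("core id"):
                    core = line.split(":")[1].strip()
                elif not line.strip() and pkg is not None:
                    physical.add((pkg, core))
                    pkg = core = None
        if pkg is not None:                  # last block may lack a trailing blank line
            physical.add((pkg, core))
        return len(physical), logical

    phys, logi = core_counts()
    # On the dual six-core X5670 host this gives 12 physical and 24 logical cores;
    # one core is reserved for the hypervisor, leaving 11 (HT off) or 23 (HT on) guests.
    print("guests without HT:", phys - 1, "with HT:", logi - 1)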
Another well-known guideline is to dedicate one physical core exclusively to the hypervisor and only the rest to the virtual machines [10,11]. In Xen, the disk and network drivers run in Dom0. Dedicating a core solely to Dom0 makes sure that Dom0 always has free CPU time to process the I/O requests of the guests, giving better performance. Moreover, we dedicated a fixed amount of 2 GB of memory to Dom0 and disabled its memory ballooning. This optimization is especially beneficial when running I/O intensive tasks, but it does not hurt compute intensive tasks either.
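A sketch of how this tuning could be applied is shown below. The Xen boot parameters (dom0_max_vcpus, dom0_mem) and the xl vcpu-pin command are standard Xen mechanisms, but the exact pinning layout and the GRUB file path are assumptions made for illustration; on the KVM side the same goal is reached with the host scheduler plus the affinity and cgroup controls discussed earlier.

    import subprocess

    # Assumed Xen boot parameters added to the hypervisor command line in
    # /etc/grub.conf (illustrative path): a single Dom0 vCPU pinned at boot,
    # a fixed 2 GB of memory, and ballooning effectively disabled.
    XEN_CMDLINE = "dom0_max_vcpus=1 dom0_vcpus_pin dom0_mem=2048M,max:2048M"

    def pin_dom0(core=0):
        """Pin Dom0's single vCPU to one dedicated physical core at runtime."""
        subprocess.check_call(["xl", "vcpu-pin", "Domain-0", "0", str(core)])

    if __name__ == "__main__":
        print("boot parameters:", XEN_CMDLINE)
        pin_dom0(0)   # the guests then run on the remaining cores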
Fig. 1 shows the performance comparison of the hypervisors with and without hyper-threading. Initially, a single VM was run to determine a baseline level of performance. As expected, there is a performance penalty when the number of VMs per host is increased. The Hep-Spec06 performance with hyper-threaded cores is higher than with HT off when the number of VMs exceeds the number of cores. When the number of VMs is 23, the tests with hyper-threading on result in almost 16% better performance for both hypervisors. Keep in mind that hyper-threading increased the performance of the bare metal machine by 30%, from a score of 193.82 to almost 230.8. Interestingly, with 11 VMs deployed, Xen without HT performs better than KVM with HT on. Nonetheless, there is a 45% to 50% degradation in performance between the 11 or 12 VM cases and 23 VMs running at the same time.
Fig. 2 shows a comparison between the Hep-Spec06 benchmark running concurrently on 11 and 12 VMs and the same benchmark run on the physical machine, with HT off. The performance loss is acceptable; the benefits obtained by virtualization outweigh the minimal performance loss. Xen performs better again. The average score for 11 VMs running concurrently was 14.67 for KVM and 16.23 for Xen. When we increased the number of VMs to 12, the average score leveled off at 14.09 for KVM and 15.69 for Xen, but the physical machine was better utilized.

Fig. 1. Performance comparison of the hypervisors versus the number of VMs deployed when there is no I/O generated

We ran the next tests on 11 VMs, without hyper-threading, because this makes performance comparisons easier.
D. Network throughput performance
The goal of this experiment is to measure the network performance of the virtual machines with respect to throughput. The same hardware and software configuration is used, with the addition of an external ProLiant DL370 G7 machine, which also runs CentOS 6.4. No changes were made to the network parameters of the external machine, the hosts, or the guests. The external machine is connected via a Gigabit link to the same switch as the host. Iperf has been used to measure the throughput between the VMs and the external machine, in both the inbound and outbound directions. The VMs are iperf servers in the inbound tests and iperf clients in the outbound tests. A time parameter of 60 seconds is used. All other settings remain at their defaults: a single TCP stream and a 23.2 KB TCP window size. Each test is repeated 5 times in a row, to obtain a more accurate measurement.
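The outbound direction of this test can be scripted roughly as follows. The sketch assumes the iperf 2 client syntax (-c, -t, -f), an external server host named extsrv already running iperf -s, and a naive parsing of the last reported bandwidth line; it is illustrative only and simply averages the 5 repetitions described above.

    import subprocess

    def iperf_once(server="extsrv", seconds=60):
        """Run one outbound iperf test from this guest and return Mbits/sec."""
        out = subprocess.run(["iperf", "-c", server, "-t", str(seconds), "-f", "m"],
                             capture_output=True, text=True).stdout
        # The last report line ends with e.g. "... 86.8 Mbits/sec" (parsing assumption).
        last = [line for line in out.splitlines() if "Mbits/sec" in line][-1]
        return float(last.split()[-2])

    def iperf_avg(repetitions=5):
        results = [iperf_once() for _ in range(repetitions)]
        return sum(results) / len(results), results

    if __name__ == "__main__":
        avg, runs = iperf_avg()
        print("runs:", runs, "average Mbits/sec:", avg)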
We started with an initial baseline test: iperf is run between the host and the external machine along the 1 Gbps link to determine the actual line performance. The best line speed achieved in this test is around 957 Mbps. Then a single virtual machine is started on the host; running iperf results in usage of close to 98% of the network bandwidth. Then the number of virtual machines running concurrently is increased. When two VMs are started, the total throughput stays in the 95% to 98% range, but the bandwidth is divided evenly between the VMs, at 475 Mbps each. Fig. 3 shows the results when 2, 4, 8, and 11 VMs are started.

Fig. 2. Bare metal vs. 11 and 12 VMs aggregate without hyper-threading

Fig. 3. Network throughput vs. number of concurrent VMs

As the number of VMs increases, the bandwidth per VM decreases. Even the standard deviation stays in the same range. The next tables show the low, high, and average per-VM bandwidth and the total bandwidth, in Mbps, for KVM and Xen. They show only inbound throughput, as outbound is perfectly symmetric.
TABLE I. Low, high, and average per-VM bandwidth (Mbps) for KVM VMs, inbound traffic

No. VMs    Low     High    Average    Total
2          475     475     475        950
4          159     316     237.5      950
8          70      136     119.5      956.2
11         46      136     86.76      954.8
TABLE II. Low, high, and average per-VM bandwidth (Mbps) for Xen VMs, inbound traffic

No. VMs    Low     High    Average    Total
2          474     476     475        950
4          159     316     237.5      950
8          53.9    158     118.9      952.7
11         54      158     86.84      954.5

Fig. 4. I/O throughput on bare metal vs. 1 VM

E. Disk I/O throughput


We decided to use an LVM logical volume as the disk for each virtual machine. This solution does not limit the I/O performance and is flexible enough when dealing with backup, recovery, and migration. We used the virtio disk controller and the default caching mode (writethrough). The virtio devices are the paravirtual interfaces for disk I/O; both hypervisors have this feature.
The benchmark consists of running iozone ten times in a row with the following parameters (a command sketch follows the list):
- record size of 64 KB;
- file size set to 4 GB, double the amount of memory available in the VMs;
- include flush (fsync, fflush) and close() in the timing calculations;
- use direct I/O for all file operations, which tells the filesystem that all operations should bypass the buffer cache and go directly to disk (direct I/O means that the guest cache is bypassed);
- enable O_RSYNC and O_SYNC.
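A sketch of the corresponding invocation, driven from Python, is shown below. The iozone flags map onto the parameters above (-r record size, -s file size, -e and -c include flush and close() in the timing, -I direct I/O, -o synchronous writes); the exact flag used for O_RSYNC and the use of subprocess here are illustrative assumptions, not the precise command line from our runs.

    import subprocess

    IOZONE_CMD = [
        "iozone",
        "-r", "64k",      # record size of 64 KB
        "-s", "4g",       # 4 GB test file, double the guest memory
        "-e",             # include flush (fsync, fflush) in timing
        "-c",             # include close() in timing
        "-I",             # direct I/O, bypass the buffer cache
        "-o",             # synchronous writes (O_SYNC; O_RSYNC via an extra flag in some versions)
        "-i", "0",        # write / rewrite
        "-i", "1",        # read / re-read
        "-i", "2",        # random read / random write
    ]

    # Run the benchmark ten times in a row and keep the raw reports for averaging.
    reports = []
    for run in range(10):
        out = subprocess.run(IOZONE_CMD, capture_output=True, text=True).stdout
        reports.append(out)
        print("iteration", run + 1, "done")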

Fig. 5. I/O throughput comparison for 11 VMs running KVM and Xen

This test measured the maximum throughput for six core operations related to disk performance: write, rewrite, read, re-read, random write, and random read. The values given in the figures represent the average throughput of each operation (from 10 iterations). The operations mean the following:
- Write indicates the performance of writing a new file to the filesystem.
- Rewrite indicates the performance of writing to an existing file.
- Read indicates the performance of reading a file that already exists in the filesystem.
- Re-read indicates the performance of reading a file again.
- Random write indicates the performance of writing to a file in various random locations.
- Random read indicates the performance of reading a file by reading random information from the file.

Fig. 4 shows the read/write speed performance of the bare metal machine compared to a single virtual machine, either Xen or KVM. Both solutions perform badly, especially in disk writing. Fig. 5 shows the disk performance when 11 VMs concurrently access the disk. In these circumstances both solutions perform very badly compared to the real machine, for all operations.

IV. RELATED WORK

There are many papers related to the comparison and performance evaluation of Xen and KVM using different benchmarks. More recent articles are those of Deshane et al. [2], Xianghua Xu et al. [3], and Andrea Chierici et al. [4, 5]. They all performed quantitative comparisons of Xen and KVM focusing on the overall performance and scalability of the virtual machines, performance isolation, and network and I/O performance, using tools such as hep-spec06, iperf, netperf, iozone, SysBench, and bonnie++. They tested the hypervisors in configurations oriented primarily toward flexibility, such as a disk-on-a-file approach or non-virtio drivers for KVM virtual machines. On the other hand, this technology is evolving very fast. Every new version of the hypervisors offers optimizations and features that can influence the decision regarding the implementation of one over the other. In this paper, we compared the performance and scalability of virtual machines in an environment configured particularly for performance, using the latest versions of the hypervisors.
V. CONCLUSIONS

Our benchmarks showed that, in terms of scalability, both hypervisors perform very well. In terms of CPU performance, KVM and Xen are comparable, with a small advantage for Xen. The performance drops when the number of VMs exceeds the number of physical cores. Hyper-threading has a positive impact, although by a small margin of about 16%.
Network performance is practically the same for both hypervisors. The guests accessed the network through a bridge interface configured on the host. As shown by the network testing, the default behavior is to distribute the bandwidth as evenly as possible among all the VMs. Both hypervisors do this very well, with practically the same output.
Compared to bare metal, virtual disk I/O is the most problematic aspect of virtualization, particularly when multiple machines concurrently access the disk. Even for a single VM, the I/O throughput is around 30% to 50% lower than that of the real hardware. On the other hand, the performance of this version of KVM is very good compared with previous versions; the optimizations added to this version of KVM are obvious.
VI. FUTURE WORK

These tests gave us input for future investigations using the Windows operating system in the near term. In the longer term, besides server consolidation, we are moving to a storage area network environment with NetApp storage and Fibre Channel (FC). In that case, we intend to compare two open source solutions for enterprise-class consolidation: oVirt and XenServer. One is based on KVM, the other on Xen. XenServer has been available as a free open source virtualization platform since version 6.2.
REFERENCES
[1] SGI technical white paper, "Virtualization Best Practices on SGI Altix UV 1000 using Red Hat Enterprise Linux 6.0 KVM," April 2011.
[2] T. Deshane, Z. Shepherd, J. Matthews, M. Ben-Yehuda, A. Shah, and B. Rao, "Quantitative Comparison of Xen and KVM," Xen Summit, June 23-24, 2008.
[3] Xianghua Xu, Feng Zhou, Jian Wan, and Yucheng Jiang, "Quantifying Performance Properties of Virtual Machine," International Symposium on Information Science and Engineering, 2008.
[4] A. Chierici and R. Veraldi, "A quantitative comparison between xen and kvm," 17th International Conference on Computing in High Energy and Nuclear Physics (CHEP09), J. Phys.: Conf. Ser. 219 042005, Part 4, 2010.
[5] A. Chierici and D. Salomoni, "Increasing performance in KVM virtualization within a Tier-1 environment," International Conference on Computing in High Energy and Nuclear Physics (CHEP2012), J. Phys.: Conf. Ser. 396 032024, Part 3, 2012.
[6] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the Art of Virtualization," Proceedings of the ACM Symposium on Operating Systems Principles, 2003.
[7] A. Singh, "An Introduction to Virtualization," kernelthread.com, January 2004.
[8] HP technical white paper, "Performance characterization of Citrix XenServer on HP BladeSystem," February 2010.
[9] HP technical white paper, "Scalability of bare-metal and virtualized HP ProLiant servers in 32- and 64-bit HP Server Based Computing environments," 2012.
[10] IBM technical white paper, "Best practices for KVM," 2nd Ed., April 2012.
[11] IBM technical white paper, "Tuning KVM for performance," 2nd Ed., April 2012.
