Load Balancing with LVS-NAT, Keepalived, and iptables

To begin, we need a quick overview of load balancing in general and some background on Linux
Virtual Server (LVS) specifically. The basic idea behind load balancing is to take a group of requests
and divide them among multiple servers. These requests could be HTTP, FTP, SMTP, or any other
network service. This is where LVS comes in: it implements transport-layer load balancing inside the
Linux kernel (layer-4 switching through packet forwarding). LVS has three types of packet forwarding:
1. Network Address Translation (LVS-NAT)
2. IP Tunneling (LVS-TUN)
3. Direct Routing (LVS-DR)

We are going to focus on using LVS-NAT to load balance HTTP requests. We will have 2 physical
machines (load1 and load2) in an active/backup configuration that will handle the load balancing for us.

When using LVS-NAT, all inbound and outbound network traffic passes through the active load
balancer, so in addition to load balancing, the load balancer also acts as a gateway and a firewall.
That makes security a very high priority (see the security sidebar).

Security sidebar:
If the system is secured properly, the load balancer’s public IP will be the only address that can be
attacked directly. Take extra precautions as you see fit, but here are some suggestions:
• Don’t use standard ports if possible; for example, move SSH off port 22.
• Do not allow root login with a password; if you allow root login at all, only allow it with SSH keys.
• Add as few user accounts as possible and enforce very strong passwords through PAM.
• Do not install any unnecessary services on the load balancer, such as NFS, iSCSI, or CUPS.
• Install intrusion detection such as rkhunter and Tripwire.
• Install brute-force detection and blocking such as fail2ban and pam_abl.
• Monitor the server with Nagios, Zabbix, or similar.

To complete our overview, let’s take a quick look at the network set-up. The load balancers sit between
the internet and your private web servers, so each load balancer needs two logical network connections:
the internet connection and the private connection. Here is where it gets a little tricky. On each of
those networks, each load balancer gets one IP that is assigned to the physical hardware (i.e. load1’s
IPs are only for load1 and will not switch over to load2 if load1 goes down), plus there is one virtual IP
(VIP) on each network that floats to the other load balancer if the active one goes down. This means
you will need three IPs for each network: internet IPs like xxx.xxx.xxx.xx1 for load1, xxx.xxx.xxx.xx2
for load2, and xxx.xxx.xxx.xx3 as the VIP (the VIP is where you will point your HTTP DNS), as well as
three private IPs like 192.168.20.1 for load1, 192.168.20.2 for load2, and 192.168.20.254 as a VIP for
the internal gateway that your web servers will use for internet access.

The logical network looks like this:

Step 1: Planning
The first step is planning. We need to evaluate what hardware is needed, what you already have, and
how it will fit together. In this scenario everything is redundant, so there is no single point of failure
in the system. You can reduce your costs if you feel you can risk running without the redundancy.

Since the load balancing is mainly network based, you do not need a lot of CPU, RAM, or hard drive
storage resources. Two 1U servers, each with dual Intel L54XX or faster CPUs, 8GB or more RAM, a
pair of 250GB hard drives in software raid1, and two NICs should be plenty (I always recommend the
L-type CPUs; the L stands for low power, and electricity is always the most expensive ongoing cost).
You will also need an add-on dual port NIC so each load balancer will have four network connections.

You will also need four network switches. Two on the internet side (properly cross connected and
attached to redundant internet drops from your ISP or collocation facility) and two for the private
network (also properly cross connected).

Finally you will need your web server hardware, based on your specific needs. I recommend three or
more web servers. You want enough servers that if you lose one (N-1), the remaining servers are
running at only about 70% of their capacity. This prevents a cascade failure if a spike in traffic arrives
while a server is down and the remaining servers are already near capacity. If you can afford it I
recommend N+2.
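
As a rough sketch of that sizing rule, the arithmetic can be done in the shell. The helper name
servers_needed and the example figure of 250% peak load are made up for illustration; plug in your
own measurements.

```shell
#!/bin/sh
# Hypothetical sizing helper (not part of the original set-up).
#   $1 peak_pct:   peak traffic as a percentage of ONE server's capacity
#                  (250 means the peak needs 2.5 servers' worth of capacity)
#   $2 target_pct: highest utilisation you accept on the surviving servers (about 70 above)
#   $3 spares:     how many servers may be down at once (1 or 2)
servers_needed() {
    peak_pct=$1
    target_pct=$2
    spares=$3
    # Ceiling division: servers required to carry the peak at the target utilisation.
    working=$(( (peak_pct + target_pct - 1) / target_pct ))
    echo $(( working + spares ))
}

servers_needed 250 70 1   # one server may be down
servers_needed 250 70 2   # two servers may be down
```

With these example numbers the two calls print 5 and 6: four servers carry the peak at or below 70%,
and the rest are the spares you can afford to lose.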

Now we are ready to get started installing the system.

Step 2: How to connect it all


First let’s wire up the hardware. The connection from the internet to the load balancers is the first
network. This is called the red network, and it is wired with red network cable in the rack. The red
network is completely untrusted. All traffic on that network is exposed to the internet and should be
treated with the highest level of security you can apply to it. Then there is the yellow network, the
network between the load balancers and the web/application servers and data servers. It is wired with
yellow cable in the rack.

Let’s say that you have the following network connections on your servers:
• eth0 (onboard port closest to the power supply)
• eth1 (onboard port farthest from the power supply)
• eth2 (add-on card port closest to the power supply)
• eth3 (add-on card port farthest from the power supply)

To connect the servers to the network, the load balancers will use bonded network connections: two
connections to each network for redundancy. Start with your first load balancer (load1): attach eth0 to
your first red switch (redA) and eth2 to your second red switch (redB). Switches redA and redB are also
connected to the internet with some kind of failover connection provided by your ISP, like HSRP, BGP,
or STP. This is done for network redundancy; if you lose one switch you will not lose half of your
network, because all servers are cross connected to two switches.

Now that the red network is wired for load1 we will connect the yellow network. Attach eth1 to your
first yellow switch (yellowA) and eth3 to your second yellow switch (yellowB).

Remember the yellow connections are a mirror of the red; if possible use the same port numbers so you
know which is which. For example port 1 on redA, redB, yellowA, and yellowB are all connected to
load1.

Now simply do the same for your second load balancer (load2), and you will have a fully connected set
of load balancers. For example port 2 on redA, redB, yellowA, and yellowB are all connected to load2.

It will look like this for each server:


eth0 <-> redA
eth1 <-> yellowA
eth2 <-> redB
eth3 <-> yellowB

Step 3: Basic OS set-up


Now it is time to install CentOS 6.5 minimal on load1 with software raid1 across two hard drives. The
CentOS installer does not allow you to configure network bonding during install, so just configure eth0
so you have a working internet connection. You will reconfigure the network with bonding right after
the install.

We don’t want to have to configure or troubleshoot SELinux while we are setting up the server, so
simply turn it off for now. It is highly recommended to turn SELinux back on after you have a working
load balancer, but that is outside the scope of this article.

From a root login on the console:


# vi /etc/selinux/config

The original file will look like this:


# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

Change the line:


SELINUX=enforcing

To:
SELINUX=disabled

The SELinux setting only takes effect after a reboot, but don’t reboot yet. We will reboot after we
configure network bonding.

Step 4: Network bonding


To get network bonding on CentOS you have to load the bonding module. To do that we create a new
file:
# vi /etc/modprobe.d/network.conf

Put this in the file:


# bonding commands
alias red bonding
alias yellow bonding

The next step in the network set-up is the network interface config files.
First we will create the config files ifcfg-red and ifcfg-yellow.

# cd /etc/sysconfig/network-scripts
# vi ifcfg-red

Put this in the file (replace the IPADDR and NETMASK with your real values):
DEVICE=red
USERCTL=no
BOOTPROTO=static
TYPE=bond
ONBOOT=yes
IPADDR=xxx.xxx.xxx.xx1
NETMASK=255.255.255.0
PEERDNS=no
NM_CONTROLLED=no
BONDING_OPTS="miimon=100 mode=1"

The miimon=100 option specifies the MII link monitoring frequency in milliseconds. This determines
how often the link state of each slave is inspected for link failures.
The mode=1 option is the bonding policy, in this case policy type 1, which is the active-backup policy.
Only one slave in the bond is active at a time.

# vi ifcfg-yellow

Put this in the file (replace the IPADDR and NETMASK with your real values):
DEVICE=yellow
USERCTL=no
BOOTPROTO=static
TYPE=bond
ONBOOT=yes
IPADDR=192.168.20.1
NETMASK=255.255.255.0
PEERDNS=no
NM_CONTROLLED=no
BONDING_OPTS="miimon=100 mode=1"

Now edit the individual network interface config files.


# vi ifcfg-eth0

Put this in the ifcfg-eth0 file (replace the HWADDR with your real MAC value):
DEVICE=eth0
HWADDR=00:00:00:00:00:00

BOOTPROTO=none
ONBOOT=yes
PEERDNS=no

# Settings for Bond


MASTER=red
SLAVE=yes

# vi ifcfg-eth1

Put this in the ifcfg-eth1 file (replace the HWADDR with your real MAC value):
DEVICE=eth1
HWADDR=00:00:00:00:00:00

BOOTPROTO=none
ONBOOT=yes
PEERDNS=no

# Settings for Bond


MASTER=yellow
SLAVE=yes

# vi ifcfg-eth2

Put this in the ifcfg-eth2 file (replace the HWADDR with your real MAC value):
DEVICE=eth2
HWADDR=00:00:00:00:00:00

BOOTPROTO=none
ONBOOT=yes
PEERDNS=no

# Settings for Bond


MASTER=red
SLAVE=yes

# vi ifcfg-eth3

Put this in the ifcfg-eth3 file (replace the HWADDR with your real MAC value):
DEVICE=eth3
HWADDR=00:00:00:00:00:00

BOOTPROTO=none
ONBOOT=yes
PEERDNS=no

# Settings for Bond


MASTER=yellow
SLAVE=yes

Lastly make sure your internet gateway is set in /etc/sysconfig/network


# vi /etc/sysconfig/network

It should look like this (replace HOSTNAME and GATEWAY with your real values):
NETWORKING=yes
HOSTNAME=load1.example.com
GATEWAY=xxx.xxx.xxx.xxx

Now reboot the server. If we did everything correctly, load1 will have 7 network interfaces (including
the loopback, lo) when it comes back up. Test them by running:
# ifconfig

You will see output like this:


eth0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)
Interrupt:18 Memory:d8020000-d8040000

eth1 Link encap:Ethernet HWaddr 00:00:00:00:00:00


UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)
Interrupt:18 Memory:d8020000-d8040000

eth2 Link encap:Ethernet HWaddr 00:00:00:00:00:00


UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)
Interrupt:18 Memory:d8020000-d8040000

eth3 Link encap:Ethernet HWaddr 00:00:00:00:00:00


UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)
Interrupt:18 Memory:d8020000-d8040000

lo Link encap:Local Loopback


inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

red Link encap:Ethernet HWaddr 00:00:00:00:00:00


inet addr:xxx.xxx.xxx.xx1 Bcast:xxx.xxx.xxx.255 Mask:255.255.255.0
inet6 addr: 0000::000:0000:0000:000/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)

yellow Link encap:Ethernet HWaddr 00:00:00:00:00:00


inet addr:192.168.20.1 Bcast:192.168.20.255 Mask:255.255.255.0
inet6 addr: 0000::000:0000:0000:000/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:137 errors:0 dropped:0 overruns:0 frame:0
TX packets:82 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:15229 (14.8 KiB) TX bytes:9112 (8.8 KiB)
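
The ifconfig output shows that the bonds are up, but not which slave is carrying traffic. The bonding
driver reports that in /proc/net/bonding/red and /proc/net/bonding/yellow. Here is a small sketch that
pulls out the active slave; the helper name active_slave and the trimmed sample file are only for
illustration, and the sample values were not captured from a real machine.

```shell
#!/bin/sh
# Print the currently active slave of a bond.
# $1 is the bonding status file; on a live load balancer that would be
# /proc/net/bonding/red or /proc/net/bonding/yellow.
active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2 }' "$1"
}

# Demo against a trimmed sample in the format the bonding driver prints
# (illustrative values only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up

Slave Interface: eth2
MII Status: up
EOF
active_slave "$sample"
rm -f "$sample"
```

On load1 you would run something like active_slave /proc/net/bonding/red, pull the cable on the
reported port, and run it again to confirm that the bond switched to the other slave.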

Step 5: iptables
While it is possible to implement LVS-NAT load balancing without a firewall, in my opinion no
internet-connected server should be run without one. So we will set up iptables with a few rules, which
you can customize as you see fit.

Let’s edit /etc/sysconfig/iptables and add some rules.


# vi /etc/sysconfig/iptables

There will be a few default rules in the file that you can just delete and replace.
First we will add a nat section so the firewall will masquerade our outbound traffic by rewriting the
private IP to the public IP; otherwise the end user’s browser would think it sent a request to
xxx.xxx.xxx.xx3 and got a response from 192.168.20.x.

We add this to the /etc/sysconfig/iptables file for the masquerading.


Note: change 192.168.20.0/24 to match your private network and xxx.xxx.xxx.xx3 to match your public
VIP.
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# MASQUERADE for load balancing:
-A POSTROUTING -o red -s 192.168.20.0/24 -j SNAT --to xxx.xxx.xxx.xx3

COMMIT

Next we will add the chains and a rule to allow the load balancer to act as a gateway to the private
servers. You may notice the SSH-FLOOD and SSH-ALLOW chains. These are added so we can limit
the speed at which ssh connections are allowed later on when we add the ssh service rules.

Note: change 192.168.20.0/24 to match your internal network.


*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
:SSH-FLOOD - [0:0]
:SSH-ALLOW - [0:0]

# To act as a gateway:
-A FORWARD -d 192.168.20.0/24 -j ACCEPT

# Pass all inbound traffic to RH-Firewall-1-INPUT
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT

Now we will add the rules to support the services of load1 (like ssh) and the LVS-NAT services, in this
case VRRP (Virtual Router Redundancy Protocol), HTTP, and HTTPS.
Note: you will want to change xxx.xxx.xxx.xx1/32 and xxx.xxx.xxx.xx2/32 to match the public IPs of
load1 and load2, this will limit the public VRRP heartbeat traffic to load1 and load2 so no other servers
can send VRRP traffic to your load balancers.
Note: you will want to change xxx.xxx.xxx.xx1 on the SSH-FLOOD rule to match the public IP of load1
Note: if you changed the ssh port from 22 to something else be sure to change it here on both the SSH-
FLOOD and SSH-ALLOW rules.
##################################
# Start Rules:
##################################
# keepalived VRRP:
-A RH-Firewall-1-INPUT -s xxx.xxx.xxx.xx1/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s xxx.xxx.xxx.xx2/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.20.1/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.20.2/32 -d 224.0.0.0/8 -j ACCEPT

# Http and Https


-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 443 -j ACCEPT

# SSH, slow down ssh connections to reduce brute force attacks.


-A RH-Firewall-1-INPUT -d xxx.xxx.xxx.xx1 -p tcp -m tcp --dport 22 -m state --state NEW -j SSH-FLOOD
# ssh flood rules:
-A SSH-FLOOD -m limit --limit 2/minute --limit-burst 2 -j SSH-ALLOW
-A SSH-FLOOD -j DROP
# if not a flood accept:
-A SSH-ALLOW -p tcp -m tcp --dport 22 -j ACCEPT

Finally we add some standard rules back in so things like the loopback interface work.
# Default rules
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -d 224.0.0.251 -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

COMMIT

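The last rule rejects anything that was not explicitly accepted above. Note that the ICMP rule above
accepts all ICMP traffic, so the server will still answer ping requests. If you would rather allow ping
only from specific IPs, remove that ICMP rule and add rules like this in its place: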
# allow ping from yyy.yyy.yyy.yyy:
-A RH-Firewall-1-INPUT -s yyy.yyy.yyy.yyy -p icmp -j ACCEPT

Here is the complete iptables file:

*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# MASQUERADE for load balancing:
-A POSTROUTING -o red -s 192.168.20.0/24 -j SNAT --to xxx.xxx.xxx.xx3

COMMIT

##########################################
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
:SSH-FLOOD - [0:0]
:SSH-ALLOW - [0:0]

# To act as a gateway:
-A FORWARD -d 192.168.20.0/24 -j ACCEPT

# Pass all inbound traffic to RH-Firewall-1-INPUT
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT

##################################
# Start Rules:
##################################
# keepalived VRRP:
-A RH-Firewall-1-INPUT -s xxx.xxx.xxx.xx1/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s xxx.xxx.xxx.xx2/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.20.1/32 -d 224.0.0.0/8 -j ACCEPT
-A RH-Firewall-1-INPUT -s 192.168.20.2/32 -d 224.0.0.0/8 -j ACCEPT

# Http and Https


-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 443 -j ACCEPT

# SSH, slow down ssh connections to reduce brute force attacks.


-A RH-Firewall-1-INPUT -d xxx.xxx.xxx.xx1 -p tcp -m tcp --dport 22 -m state --state NEW -j SSH-FLOOD
# ssh flood rules:
-A SSH-FLOOD -m limit --limit 2/minute --limit-burst 2 -j SSH-ALLOW
-A SSH-FLOOD -j DROP
# if not a flood accept:
-A SSH-ALLOW -p tcp -m tcp --dport 22 -j ACCEPT

# Default rules
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp -m icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -d 224.0.0.251 -p udp -m udp --dport 5353 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

COMMIT
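
Because a broken rules file can lock you out of the server, it is worth a quick sanity check before
restarting iptables. The sketch below only counts table headers and COMMIT lines; the helper name
check_commits is made up for illustration, and it is not a substitute for a real syntax check.

```shell
#!/bin/sh
# Hypothetical pre-flight check: every "*table" header needs its own COMMIT line.
# $1 is the rules file, e.g. /etc/sysconfig/iptables.
check_commits() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    tables=$(grep -c '^\*' "$1" || true)
    commits=$(grep -c '^COMMIT$' "$1" || true)
    if [ "$tables" -gt 0 ] && [ "$tables" -eq "$commits" ]; then
        echo "ok: $tables tables, $commits COMMIT lines"
    else
        echo "mismatch: $tables tables, $commits COMMIT lines"
        return 1
    fi
}

# On the load balancer you would run:
#   check_commits /etc/sysconfig/iptables
# iptables-restore also has a --test option that parses a rules file without
# committing it, which catches syntax errors this simple count cannot.
```

For the file above the check should report two tables and two COMMIT lines (nat and filter).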

Restart iptables and check the rules:


# service iptables restart
iptables: Setting chains to policy ACCEPT: filter nat [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]

# service iptables status

You should see an output of:


Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 RH-Firewall-1-INPUT all -- 0.0.0.0/0 0.0.0.0/0

Chain FORWARD (policy ACCEPT)


num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 192.168.20.0/24
2 RH-Firewall-1-INPUT all -- 0.0.0.0/0 0.0.0.0/0

Chain OUTPUT (policy ACCEPT)


num target prot opt source destination

Chain RH-Firewall-1-INPUT (2 references)


num target prot opt source destination
1 ACCEPT all -- xxx.xxx.xxx.xx1 224.0.0.0/8
2 ACCEPT all -- xxx.xxx.xxx.xx2 224.0.0.0/8
3 ACCEPT all -- 192.168.20.1 224.0.0.0/8
4 ACCEPT all -- 192.168.20.2 224.0.0.0/8
5 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80
6 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:443
7 SSH-FLOOD tcp -- 0.0.0.0/0 xxx.xxx.xxx.xx1 tcp dpt:22 state NEW
8 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
9 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0 icmp type 255
10 ACCEPT udp -- 0.0.0.0/0 224.0.0.251 udp dpt:5353
11 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
12 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited

Chain SSH-ALLOW (1 references)


num target prot opt source destination
1 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:22

Chain SSH-FLOOD (1 references)


num target prot opt source destination
1 SSH-ALLOW all -- 0.0.0.0/0 0.0.0.0/0 limit: avg 2/min burst 2
2 DROP all -- 0.0.0.0/0 0.0.0.0/0

Table: nat
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination

Chain POSTROUTING (policy ACCEPT)


num target prot opt source destination
1 SNAT all -- 192.168.20.0/24 0.0.0.0/0 to:xxx.xxx.xxx.xx3

Chain OUTPUT (policy ACCEPT)


num target prot opt source destination

Step 6: Keepalived
Before we install keepalived we need to enable ip forwarding.

# vi /etc/sysctl.conf

Change the line:


net.ipv4.ip_forward = 0

To:
net.ipv4.ip_forward = 1

Now run sysctl to update the system


# sysctl -p

You will see output like this which will confirm the change:
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296

Now we need to install keepalived with yum:

# yum -y install keepalived ipvsadm

Next we need to configure keepalived, but I like to back up the original sample config file that was
installed with yum before I start making changes.
# cd /etc/keepalived
# cp -ap keepalived.conf keepalived.conf.bak

We are going to break up our keepalived config into multiple files. There are two main types of files:
server-specific files (files that differ between load1 and load2) and common files (these must be
identical on load1 and load2). I like to put the common files in a directory and include the *.conf files
from that directory. This makes it easy to rsync the common files between load1 and load2, and to
disable a file you simply move it out of the directory or rename it so it does not end with .conf.

Let’s start with the server-specific files, of which there will be only one on each server: keepalived.conf.

# vi keepalived.conf
Let’s go over the keepalived.conf file section by section.
Note: comments are lines starting with !. This is not a common comment character, so it can trip you up.

This section holds the general settings, including the e-mail notification settings. In this case it is
configured to send e-mail through localhost (you will need to configure a mail server). It sends mail to
root@ from load1@ (it helps to use load1@ and load2@ as the from addresses so you can easily tell
from the From line which server is sending you mail).
We also give the server a router_id, which is included in the message when mail is sent.
! Configuration File for keepalived

global_defs {
notification_email {
root@example.com
}
notification_email_from load1@example.com
smtp_server 127.0.0.1
smtp_connect_timeout 90
router_id load1
}

This section groups our two vrrp instances (the public VIP and the private VIP) so that if one of the IPs
switches servers, they both do. It would be bad if the public IP was on load1 and the private IP
(gateway) was on load2 or vice versa.
vrrp_sync_group VG_1 {
group {
VI_1
VI_2
}
}

This section defines the public VIP. Let’s go over it a line at a time.
This line creates a vrrp instance named “VI_1”:
vrrp_instance VI_1 {

This line tells keepalived to start in the MASTER state and send out a takeover message immediately
on startup. Load2 will have BACKUP as its state so that it does not send an immediate takeover
message on startup, but instead listens for the MASTER; if it does not find the MASTER, it then sends
out the takeover message.
state MASTER

This line specifies which network interface the VRRP messages are sent and received on.
interface red

This line sets the source IP for the VRRP multicast messages.
Note: xxx.xxx.xxx.xx1 should be replaced with the real public IP for load1.
mcast_src_ip xxx.xxx.xxx.xx1

This is the ID number of the vrrp virtual router. There are two vrrp virtual routers in our system: 1 is
the public IP and 2 is the private IP. It is very important that these are unique. If you copy and paste,
double check this ID; it will cause problems if you have two vrrp virtual routers with the same ID.
virtual_router_id 1

This is the priority of load1 vs. load2. A higher priority wins: if both load1 and load2 are up, load1 will
be the active server with its priority of 150 vs. load2’s priority of 100. You can choose other numbers
for the priority (these could be 2 for load1 and 1 for load2), but I like to use values that do not match
the virtual_router_id so I don’t accidentally change the wrong “1”.
priority 150
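
To make the priority comparison concrete, here is a toy sketch that picks the active server from a list
of name and priority pairs. The helper name elect_master is made up, and it illustrates only the
"highest priority wins" rule; it is not how keepalived itself is implemented.

```shell
#!/bin/sh
# Toy illustration only: read "name priority" pairs on stdin and print the
# name with the highest priority.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

printf 'load1 150\nload2 100\n' | elect_master
```

With the priorities used in this article the sketch prints load1. If load1 goes away, load2 is the only
entry left and becomes the active server.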

This just tells keepalived to send out email on a status change.

smtp_alert

The authentication section prevents foreign servers from sending takeover messages.
Note: I generated the auth_pass with a “mkpass -l 12 -s 0” command. Be aware that keepalived only
uses the first eight characters of auth_pass.
authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

This is what sets up the VIP for the public interface.


xxx.xxx.xxx.xx3 is the public VIP
“brd” is the broadcast keyword
xxx.xxx.xxx.255 is the network broadcast IP
“dev” is the device keyword
red is the network device to add the VIP to
“label” is the label keyword
red:0 is the label that the VIP will be given, which will show up when you run ifconfig
virtual_ipaddress {
! This is the public VIP configuration:
xxx.xxx.xxx.xx3 brd xxx.xxx.xxx.255 dev red label red:0
}

Here is the section all together:


vrrp_instance VI_1 {
state MASTER
interface red
! This is the public IP for load1:
mcast_src_ip xxx.xxx.xxx.xx1
virtual_router_id 1
priority 150
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the public VIP configuration:
xxx.xxx.xxx.xx3 brd xxx.xxx.xxx.255 dev red label red:0
}
}

This section is the same as the VI_1 except it is for the private IP VI_2.
vrrp_instance VI_2 {
state MASTER
interface yellow
! This is the private IP of load1
mcast_src_ip 192.168.20.1
virtual_router_id 2
priority 150
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the private VIP that will be used as the gateway for the web servers:
192.168.20.254 brd 192.168.20.255 dev yellow label yellow:0
}
}

Finally we have the include command that will include the common config files.

include /etc/keepalived/conf.d/*.conf

Here is the complete /etc/keepalived/keepalived.conf file


! Configuration File for keepalived

global_defs {
notification_email {
root@example.com
}
notification_email_from load1@example.com
smtp_server 127.0.0.1
smtp_connect_timeout 90
router_id load1
}

vrrp_sync_group VG_1 {
group {
VI_1
VI_2
}
}

vrrp_instance VI_1 {
state MASTER
interface red
! This is the public IP for load1:
mcast_src_ip xxx.xxx.xxx.xx1
virtual_router_id 1
priority 150
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the public VIP configuration:
xxx.xxx.xxx.xx3 brd xxx.xxx.xxx.255 dev red label red:0
}
}

vrrp_instance VI_2 {
state MASTER
interface yellow
! This is the private IP of load1
mcast_src_ip 192.168.20.1
virtual_router_id 2
priority 150
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the private VIP that will be used as the gateway for the web servers:
192.168.20.254 brd 192.168.20.255 dev yellow label yellow:0
}
}

include /etc/keepalived/conf.d/*.conf

For load2 you can simply rsync this file over and edit it with the load2 values, as shown here:

! Configuration File for keepalived

global_defs {
notification_email {
root@example.com
}
notification_email_from load2@example.com
smtp_server 127.0.0.1
smtp_connect_timeout 90
router_id load2
}

vrrp_sync_group VG_1 {
group {
VI_1
VI_2
}
}

vrrp_instance VI_1 {
state BACKUP
interface red
! This is the public IP for load2:
mcast_src_ip xxx.xxx.xxx.xx2
virtual_router_id 1
priority 100
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the public VIP configuration:
xxx.xxx.xxx.xx3 brd xxx.xxx.xxx.255 dev red label red:0
}
}

vrrp_instance VI_2 {
state BACKUP
interface yellow
! This is the private IP of load2
mcast_src_ip 192.168.20.2
virtual_router_id 2
priority 100
smtp_alert

authentication {
auth_type PASS
auth_pass i6Sbff1duBon
}

virtual_ipaddress {
! This is the private VIP that will be used as the gateway for the web servers:
192.168.20.254 brd 192.168.20.255 dev yellow label yellow:0
}
}

include /etc/keepalived/conf.d/*.conf

Next we will set-up the common files which will house the HTTP and HTTPS services that we will be
load balancing.

First let’s create the directory to hold the files:

# mkdir /etc/keepalived/conf.d

Now we will create the config file for the HTTP service.

# cd /etc/keepalived/conf.d
# vi http.conf

We will cover this file line by line as well. To start with, this line creates a virtual server on
xxx.xxx.xxx.xx3, the public VIP, that listens on port 80.

virtual_server xxx.xxx.xxx.xx3 80 {

This tells keepalived how often to test that the web servers are up. It will check each server every 10
seconds. Remember that both load1 and load2 will be doing these checks, so the lower you set this, the
more resources are used by the health checks and the fewer are available to your end users.

delay_loop 10

This setting is the LVS scheduler, which tells keepalived how to distribute the incoming requests to the
web servers. I have selected wrr (weighted round robin) because I have found it works well for
distributing web requests.
The schedulers include:
rr = Round-Robin Scheduling
wrr = Weighted Round-Robin Scheduling
lc = Least-Connection Scheduling
wlc = Weighted Least-Connection Scheduling
sh = Source Hashing Scheduling
dh = Destination Hashing Scheduling
lblc = Locality-Based Least-Connection Scheduling
lb_algo wrr

This specifies the LVS forwarding method, which in our case is NAT.
lb_kind NAT

The persistence_timeout tells keepalived to send the same end user to the same web server for a period
of time (in seconds). This is good because it keeps the scheduler from having to calculate which server
to send traffic to for every request. It is more important for something like FTP, where the end user
might hit multiple ports and all of them need to be directed to the same server behind the load balancer.

persistence_timeout 300

This just specifies that this is a TCP service rather than a UDP service.
protocol TCP

Now we begin adding the web servers (also known as real servers). This line adds web1 on port 80;
keepalived will try to connect to web1 on port 80, and if it does so successfully it will add it to the
pool of available servers.

real_server 192.168.20.101 80 {

The weight parameter tells keepalived how much traffic to send to this server, since we are using wrr.
You can use any value really, but I like to use 100 as my baseline because adjustments are closer to real
percentage points. For example, if you added a new server that was 20% faster you could set its weight
to 120. If you used a weight of, let’s say, 10 as your baseline, you could only use a weight of 12 for the
new server. What I have found is that a 20% faster server cannot always take 20% more traffic, and I
end up adjusting the weight to something more like 115, which you can’t do with a baseline of 10.
weight 100
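
To see what a set of weights means in practice, you can work out each server's nominal share of the
requests as its weight divided by the sum of all weights. This is a rough sketch; the helper name
wrr_shares is made up, and real traffic will not split this exactly, especially with persistence
turned on.

```shell
#!/bin/sh
# Hypothetical helper: print each server's nominal share of requests
# (rounded down to a whole percent) for a list of wrr weights.
wrr_shares() {
    total=0
    for w in "$@"; do
        total=$(( total + w ))
    done
    for w in "$@"; do
        echo "weight $w -> $(( w * 100 / total ))%"
    done
}

wrr_shares 100 100 120
```

For two baseline servers and one server at 120 this prints 31%, 31%, and 37%, so the faster server gets
a bit more than a third of the requests.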

This section tells keepalived how to check that the web servers are up. In this case keepalived will pull
a web page using an HTTP GET request and compare the returned page to a hash value. There are other
checks, like a simple TCP check (TCP_CHECK) and even a custom check (MISC_CHECK) that allows you to
write your own checking script.

HTTP_GET {

The url section tells the HTTP_GET check what file to pull and the hash of the page that it expects.
I like to use a CGI script for this check because then I am checking that the web server is up as well as
the CGI environment. If you have something like Nginx with a Perl Dancer/Moose/Catalyst application or a
FastCGI PHP script behind it, this check will verify that your application layer also works.
In this case I simply pulled in /index.pl, but you should create a special script or route, something
like 000_DoNotDeleteMe.pl, that does something very simple and fast. My test script just prints “ok”.

url {
path /index.pl
digest 671dfda45192a336edda728621293bbc
}
The digest is the hash that keepalived compares against the hash of the page returned by the GET.
Here is how to generate the digest:
# genhash -s 192.168.20.101 -p 80 -u /index.pl
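
If you want to see roughly what is being computed, here is a Python sketch. It assumes the digest is an MD5 hash of the response body, which the 32-character hex value suggests; genhash remains the authoritative tool, and the function names here are my own.

```python
import hashlib
from urllib.request import urlopen


def page_digest(body: bytes) -> str:
    """Return the MD5 hex digest of a page body (assumed digest format)."""
    return hashlib.md5(body).hexdigest()


def fetch_digest(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and hash the body; requires network access to the server."""
    with urlopen(url, timeout=timeout) as response:
        return page_digest(response.read())


# Hashing a known body locally; the health-check page just prints "ok".
print(page_digest(b"ok"))
```

Calling fetch_digest("http://192.168.20.101/index.pl") from the load balancer would hash the live page, and any change to the page output changes the digest.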

These 3 settings control how the tests behave. A check attempt is considered failed if the server takes
longer than connect_timeout to respond. nb_get_retry is how many times keepalived will retry the web
server before it considers it down and kicks it out of the available pool. delay_before_retry is simply
how long it waits between retries. With this configuration each attempt times out after 5 seconds,
keepalived waits 5 seconds between tries, and it tries 5 times before giving up, for a total of about
50 seconds. Adjust this to meet your needs.

connect_timeout 5
nb_get_retry 5
delay_before_retry 5
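
The 50 second figure above comes from simple arithmetic, sketched here in Python so you can plug in your own values. It mirrors the estimate in the text and is only an approximation of what keepalived does internally.

```python
def worst_case_detection(connect_timeout, nb_get_retry, delay_before_retry):
    """Approximate seconds until a dead server is removed from the pool."""
    # Each try can take up to connect_timeout, followed by the retry delay.
    return nb_get_retry * (connect_timeout + delay_before_retry)


total = worst_case_detection(connect_timeout=5, nb_get_retry=5, delay_before_retry=5)
print(f"about {total} seconds")
```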

Then we simply repeat the config for each real server in the pool.
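
Since the real_server blocks differ only by IP address, you could generate them instead of copying them by hand. This Python sketch is one way to do that. The function name and the indentation style are my own choices, and the values are the ones used in this article.

```python
def real_server_block(ip, port, check, path, digest, weight=100):
    """Render one keepalived real_server block as text."""
    return "\n".join([
        f"real_server {ip} {port} {{",
        f"    weight {weight}",
        f"    {check} {{",
        "        url {",
        f"            path {path}",
        f"            digest {digest}",
        "        }",
        "        connect_timeout 5",
        "        nb_get_retry 5",
        "        delay_before_retry 5",
        "    }",
        "}",
    ])


web_servers = ["192.168.20.101", "192.168.20.102"]
config = "\n\n".join(
    real_server_block(ip, 80, "HTTP_GET", "/index.pl",
                      "671dfda45192a336edda728621293bbc")
    for ip in web_servers
)
print(config)
```

The output still has to be pasted inside the virtual_server block, and the same function can produce the HTTPS blocks by passing port 443, SSL_GET, and the HTTPS digest.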

Here is the complete /etc/keepalived/conf.d/http.conf file:

! This is the public IP
virtual_server xxx.xxx.xxx.xx3 80 {
delay_loop 10
lb_algo wrr
lb_kind NAT
persistence_timeout 300
protocol TCP

! This is the IP of web1:


real_server 192.168.20.101 80 {
weight 100
HTTP_GET {
url {
path /index.pl
digest 671dfda45192a336edda728621293bbc
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}

! This is the IP of web2:


real_server 192.168.20.102 80 {
weight 100
HTTP_GET {
url {
path /index.pl
digest 671dfda45192a336edda728621293bbc
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}

! This is the IP of webN:


real_server 192.168.20.1xx 80 {
weight 100
HTTP_GET {
url {
path /index.pl
digest 671dfda45192a336edda728621293bbc
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}
}

Save the file and reload keepalived, then we run ipvsadm to see which servers are in the available server
pool.

# ipvsadm

You will see something like this:
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  xxx.xxx.xxx.xx3:http wrr persistent 300
  -> 192.168.20.101:http          Masq    100    0          0
  -> 192.168.20.102:http          Masq    100    0          0
  ...
  -> 192.168.20.1xx:http          Masq    100    0          0
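
If you want to check the pool from a script rather than by eye, the output is easy to parse. Here is a Python sketch that assumes the layout shown above; the function name and the sample text are illustrative only.

```python
def pool_members(ipvsadm_output):
    """Map each virtual service to the list of real servers listed under it."""
    pools = {}
    current = None
    for line in ipvsadm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] in ("TCP", "UDP"):
            # A virtual service line starts a new pool.
            current = fields[1]
            pools[current] = []
        elif fields[0] == "->" and current is not None:
            # Real server lines follow the virtual service they belong to.
            pools[current].append(fields[1])
    return pools


sample = """\
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP xxx.xxx.xxx.xx3:http wrr persistent 300
  -> 192.168.20.101:http Masq 100 0 0
  -> 192.168.20.102:http Masq 100 0 0
"""
pools = pool_members(sample)
print(pools)
```

A server that fails its health check disappears from the list, so comparing this result against the servers you expect is a quick way to spot one that has been kicked out.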

Finally we can add the HTTPS config. We change the port, the digest, and the check type (from HTTP_GET
to SSL_GET); other than that it is the same as the HTTP config file. If you have a different set of
servers that handles HTTPS you can of course use different real servers. The two config files do not
have to match at all.
! This is the public IP
virtual_server xxx.xxx.xxx.xx3 443 {
delay_loop 10
lb_algo wrr
lb_kind NAT
persistence_timeout 300
protocol TCP

! This is the IP of web1:


real_server 192.168.20.101 443 {
weight 100
SSL_GET {
url {
path /index.pl
digest f111e548bd53d14c2a5e75bcc0063d9a
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}

! This is the IP of web2:


real_server 192.168.20.102 443 {
weight 100
SSL_GET {
url {
path /index.pl
digest f111e548bd53d14c2a5e75bcc0063d9a
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}

! This is the IP of webN:


real_server 192.168.20.1xx 443 {
weight 100

SSL_GET {
url {
path /index.pl
digest f111e548bd53d14c2a5e75bcc0063d9a
}
connect_timeout 5
nb_get_retry 5
delay_before_retry 5
}
}
}

Save the HTTPS config file and reload keepalived one more time, then we run ipvsadm again to see
which servers are in the available server pool.

# ipvsadm

Now we see both HTTP and HTTPS server pools.

Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  xxx.xxx.xxx.xx3:http wrr persistent 300
  -> 192.168.20.101:http          Masq    100    0          0
  -> 192.168.20.102:http          Masq    100    0          0
  ...
  -> 192.168.20.1xx:http          Masq    100    0          0
TCP  xxx.xxx.xxx.xx3:https wrr persistent 300
  -> 192.168.20.101:https         Masq    100    0          0
  -> 192.168.20.102:https         Masq    100    0          0
  ...
  -> 192.168.20.1xx:https         Masq    100    0          0

The last thing to do is ensure that keepalived starts up on reboot.


# chkconfig keepalived on

This will enable the keepalived service on boot. While you are at it you should check what other
services are enabled on boot and see if there are any you can turn off.
# chkconfig --list | grep :on

Go through the list and evaluate what you need and turn off anything you don’t.

Congratulations, you now have a working LVS-NAT load balancer. Copy the /etc/keepalived/conf.d directory
to load2 and restart keepalived on load2, and you will have a failover pair of servers up and running.
Test them by unplugging the network or shutting down keepalived on the active load balancer. Also test
by shutting down HTTP and/or HTTPS on a single web server and make sure that the server is kicked out
of the pool by checking the output of the ipvsadm command.

Chad Columbus is an IT and business consultant with 20 years of experience. He can be reached at
[email protected] for questions.
