LVS-HOWTO (Linux Virtual Server)
(C)1999 Joseph Mack mailto:mack@ncifcrf.gov
v0.2 18 Jul 99
v0.1 12 Jun 99 

comments/feedback/fixes/contributions to Joe 

- Not for public release. 
This copy is posted to the LVS site
http://proxy.iinchina.net/~wensong/ippfvs 
for review and comments only. 
After review, the v1.0 document will be released under GPL 
to the LDP.

_____________________________________________________________________________
This HOWTO explains how to set up a Linux Virtual Server (LVS) using a patched 
2.0.36 kernel. Work on the 2.2.x kernels is pretty much done, making the code
for the 2.0.36 kernel obsolete. I'll be releasing a new HOWTO for the
2.2.x kernels soon. As a first approximation, you can use this HOWTO 
as a guide to installing the 2.2.x kernel patches. 

This HOWTO is not designed to tell you how the LVS works.

Quick description of the LVS:
a cluster of servers functioning as one server. One machine (running Linux) 
is the Director, the other machines are the servers. The servers can be 
local or remote, running Linux or other OSs, and serve services (eg ftp, 
http, dns, telnet, nntp) such as are found in /etc/services or inetd.conf. 

The Client/Server relationship is preserved since

1. The IPs of all servers are mapped to one IP. The client sees only one IP address.
2. servers at different IP addresses believe they are contacted directly by 
   the client.
_____________________________________________________________________________

Table of Contents

1. Contact Information

2. Getting Files 

3. Collect Hardware

4. Gotchas

5. Choose LVS type

6. Install/Configure/Test - General

7. VS-NAT - install, configure and test

8. VS-TUN - install, configure and test

9. VS-DR - install, configure and test

10. VS-LocalNode - install, configure and test

11. Failover

_____________________________________________________________________________

1. Contact Information

homepage             - http://proxy.iinchina.net/~wensong/ippfvs 
mailinglist/archives - click on link to "mailing list", 
                       follow instructions for joining/reading mailing list. 
technical help       - mailing list
                     - Wensong Zhang mailto:wensong@iinchina.net 
                       (please look in the archives or ask on the mailing 
                       list first, Wensong looks at all the postings to the 
                       mailing list)
HOWTO                - updates/problems, send to Wensong or to the HOWTO 
                            maintainer (currently mailto:mack@ncifcrf.gov)
_____________________________________________________________________________
2. Getting Files

2.1 HomePage
     http://proxy.iinchina.net/~wensong/ippfvs 

     Go to the "software" link. Get the latest patch tarball. This will 
contain the kernel patch, configure.lvs and source code for various 
programs.

There are 2 versions of the patch, depending on your kernel (2.0.36, 2.2.x). 
As of May 99, the 2.0.36 patch is stable and the 2.2.x patch is 
development/testing. The code for kernel-2.2.x will not be discussed in this 
HOWTO. To set up a working LVS, the director box must be running 2.0.36. 

To understand how LVS works, peruse/download anything of interest in the 
"Documents" link. 

2.2 Failover

     You do _not_ need failover to setup a LVS. 
     If you want the LVS to survive a server or director failure, you can 
add software for this after you have the LVS working. 

2.2.1 Server failure:

Server failures are protected by mon. The most likely problems in a 
distributed server are overload of a node or loss of network connection. 
Hardware failure or an OS crash (on unix) is less likely. If the server you 
are connected to fails in a mon-protected LVS, the client will lose the 
connection to the server and will have to start again, as would happen 
on a regular server. However, with LVS a new server will be made 
available to you transparently.

	Get "mon" and "fping" from 
ftp://ftp.kernel.org/pub/software/admin/mon
http://www.kernel.org/software/mon/

	Get the perl package "Period" from 
CPAN (ftp://ftp.cpan.org). 

To use the fping and telnet monitors, you'll need the tcp_scan
binary, which can be built from satan. The standard version
of satan needs patches to compile on Linux; the patched version
is at 

ftp://sunsite.unc.edu/pub/Linux/system/network/admin 

2.2.2 Director failure:

Uses fake and heartbeat. Not covered in this version of the HOWTO.
_____________________________________________________________________________


3. Collect Hardware

     You will need a minimum of 3 machines (you can do it with 2, but
this doesn't demonstrate how to scale up a server farm; read "localnode"
to see how this is done). For a test of LVS, using VS-NAT, you need

     Director: running Linux-2.0.36
     Server: any machine, any OS, running some service of interest 
          (eg httpd, ftpd, telnetd, smtp, nntp, dns, daytime ...)
     Client: any machine, any OS, with a client for the service 
          (eg netscape, fetch/xterm/...)

For VS-TUN the servers will need to run an OS that can tunnel (eg Linux). 
For VS-DR only Linux and Solaris have been shown to work. Others are likely
to work but have not been tested. 

_____________________________________________________________________________

4. Gotchas


Need outside client:

The LVS functions as one machine. You must access the LVS from 
a client that is not a member of the LVS. You cannot access the 
service (eg httpd) from any of the machines in the LVS; access from
the Director will hang, and access from a server will connect to the 
service locally, bypassing the LVS. 

Minimum 3 machines: client, director, server(s)

_____________________________________________________________________________

5. Choose LVS Type

     The default LVS type is VS-NAT with round robin unweighted scheduling. 
This is just fine for a first install. You run VS-NAT with a Director running
Linux 2.0.36 (patched for LVS), with any OS on the servers, servers and 
Director on the same _private_ network, and with the client connecting from a 
_separate_ network. 

     You can skip the rest of this section for a first install. 

     Here are the constraints for choosing the various flavors of LVS:
VS-NAT (network address translation), VS-TUN (tunnelling) and VS-DR (direct 
routing). 

                     VS-NAT          VS-TUN            VS-DR

server OS            any           must tunnel         any? 
server mods          none          tunl must not arp   lo must not arp 
port remapping       yes           no                  no
server network       private       on internet         local
                     (remote    or      local)          - 
server number        low(10)       high(100's?)        high(100's?)
client connects to   director ext  VIP                 VIP
gateway for servers  director int  own router          own router

_____________________________________________________________________________

6. Install/Configure - General

6.1 Kernel

     Apply the kernel patch using the instructions in the tarball.

     Recompile the kernel. The default will give you a LVS running VS-NAT 
with round robin unweighted scheduling. This is just fine for a first install 
if you just have some machines sitting around for a test.

     You select the other flavors of LVS from "Network Options" in the kernel 
configuration. If you are experimenting and building other flavors of LVS, 
save each kernel with a different name (eg vmlinuz-2.0.36.vs_nat), and put an 
entry for each one into lilo.conf. LVS patches for 2.2.x can be built
as modules making loading and unloading easier.
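
As a sketch, the lilo.conf entries for two LVS kernel flavors might look 
like this (the image paths and root device below are examples only; adjust 
them for your setup, and rerun lilo after editing):

```
# /etc/lilo.conf - one image stanza per LVS kernel flavor (example paths)
image = /vmlinuz-2.0.36.vs_nat
  label = vs_nat
  root = /dev/hda1
  read-only
image = /vmlinuz-2.0.36.vs_tun
  label = vs_tun
  root = /dev/hda1
  read-only
```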

6.2 ippfvsadm (2.0.36) and ipvsadm (renamed for 2.2.x kernels)

     The program is like ipfwadm and determines the services and servers
that the director directs. Compile and install using the supplied Makefile. 
(If it's called ippfvsadm in your distribution, you can make ipvsadm a 
link to it).
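
For example (assuming the binary was installed to /sbin), the link can be 
made with:

```
ln -s /sbin/ippfvsadm /sbin/ipvsadm
```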


6.3. Configure/Test

This involves adding ethernet devices to the Director and Servers, adding
entries to the routing tables, and running the programs ipfwadm (for VS-NAT)
and ipvsadm (for all flavors of LVS). The process is helped by running
configure.lvs, which generates the required rc.d files. Since I don't know
your system setup, comments in the rc.d files will make suggestions about
the steps you'll need to take yourself.

In several instances, a machine will need multiple IPs. You can put multiple 
IPs on a single NIC with IP aliasing (an option when building the kernel). 
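
For example, to put a second IP on eth0 with an alias device (the addresses 
here are illustrative):

```
/sbin/ifconfig eth0:1 10.1.1.1 netmask 255.255.255.0 broadcast 10.1.1.255 up
```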

_____________________________________________________________________________
7 VS-NAT

7.1 Brief Introduction

For VS-NAT, the servers and director must be on a private network (separate 
from the client), and their IPs are NAT'ed to the outside world with ipfwadm 
(which comes with 2.0.x kernels; ipchains for 2.2.x kernels). The network is 
standard NAT (the client connects to an interface on the outside of the 
Director, and the default gateway for the servers is the interface on the 
inside of the Director). 

Example IPs
Machine                      IP
client                       192.168.1.5
director external interface  192.168.1.1
director internal interface  10.1.1.1
server-1                     10.1.1.2
server-2                     10.1.1.3
server-3                     10.1.1.4
.
.
server-n                     10.1.1.n+1
server default gateway       10.1.1.1 (ie Director internal interface)


7.2 Recompile Kernel

Apply kernel patch
Reconfigure kernel with ipforwarding on, ip_aliasing on (if you need to 
      put 2 IPs on the same NIC)
Recompile 2.0.36 kernel (default LVS configuration, can be changed in
      "Networking Options"). 
Rename the kernel image produced to vmlinuz-2.0.36.vs_nat, put a 
      corresponding entry into lilo.conf (and rerun lilo).
On reboot pick vmlinuz-2.0.36.vs_nat kernel from lilo menu. 
After bootup, check that you have the correct kernel by running ipvsadm 
      and looking for the LVS type (NAT). 


7.3 Setup NAT and VS-NAT on Director

(The rc.d files can be generated by editing and running configure.lvs
or you can copy the code below)

Normally a group of machines on a private network (like the servers in a 
VS-NAT setup) would already be connected to the outside world by NAT. 
Here both the regular NAT and the reverse NAT used in VS-NAT are 
configured in the same file.

With VS-NAT, the ports can be re-mapped. A request to port 80 on 
the Director can be sent to port 8000 on a Server. This is possible 
because the source and destination of the packets are already being 
rewritten, so no extra overhead is required to rewrite the port
numbers. The rewriting is slow (60usec/packet) and limits the 
throughput of VS-NAT (for 536byte packets, this is 72Mbit/sec
or about 100BaseT).
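
The throughput figure follows from simple arithmetic, which you can check in 
the shell:

```shell
# at 60usec/packet, packets handled per second:
echo $((1000000 / 60))          # 16666 packets/sec
# at 536 bytes (4288 bits) per packet, throughput in kbits/sec:
echo $((536 * 8 * 1000 / 60))   # 71466 kbit/sec, ie roughly 72Mbit/sec
```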

Here's rc.ipvs_natdirector

#rc.ipvs_natdirector
#eth0 has already been configured by rc.inet1
#the ethernet devices here (eth0:1..eth0:n) are for the NAT network
echo -n "rc.ipvs_natdirector "
#
#10.1.1.0 server network for LVS
VS_NETWORK="10.1.1.0"
VS_NETMASK="255.255.255.0"
VS_BROADCAST="10.1.1.255"
VS_ADDRESS="10.1.1.1"
VS_DEVICE="eth0:1"
/sbin/ifconfig ${VS_DEVICE} ${VS_ADDRESS} netmask ${VS_NETMASK} \
           broadcast ${VS_BROADCAST} up
route add -net ${VS_NETWORK} netmask ${VS_NETMASK} dev ${VS_DEVICE}
#
#link to outside world
OUTSIDE_ADDRESS="192.168.1.1"
OUTSIDE_NETWORK="192.168.1.0"
OUTSIDE_NETMASK="255.255.255.0"
OUTSIDE_DEVICE="eth0:2"
OUTSIDE_BROADCAST="192.168.1.255"
/sbin/ifconfig ${OUTSIDE_DEVICE} ${OUTSIDE_ADDRESS} netmask \
           ${OUTSIDE_NETMASK} broadcast $OUTSIDE_BROADCAST up
route add -net ${OUTSIDE_NETWORK} dev ${OUTSIDE_DEVICE}
#
#activate NAT, 
#in a normal NAT setup this command would be in rc.ipfwadm
/sbin/ipfwadm -F -a m -S 10.1.1.0/24 -D 0.0.0.0/0
#
#the VS-NAT specific stuff, activated by ipvsadm 
#add a line for each server:port mapping 
#
#map a request to LVS:80 to server2:8080
/sbin/ipvsadm -A -t 192.168.1.1:80 -R 10.1.1.2:8080
#and to server3:80
/sbin/ipvsadm -A -t 192.168.1.1:80 -R 10.1.1.3:80
#map telnet to server1
/sbin/ipvsadm -A -t 192.168.1.1:23 -R 10.1.1.2:23
#
#display the LVS setup to console on exit
/sbin/ipvsadm -L
#
#----end rc.ipvs_natdirector-------------------

To test, check that you can connect to (or ping) the client 
(or a box outside the server network) from the servers.


7.4 Configure Server(s)

Point the default gateway to the IP of the director on the LVS network.

Here's rc.ipvs_natserver


#rc.ipvs_natserver 
echo "rc.ipvs_natserver"
#
#you should already have the IP 10.1.1.x on the servers
#and enabled routing to network 10.1.1.0
# 
#Default gateway is Director
/sbin/route add default gw 10.1.1.1
# 
# 
#----------end rc.ipvs_natserver------------------------------------


7.5 Test VS-NAT

Telnet:
On your client, telnet to 192.168.1.1 and see that you get the login 
       prompt with the name of machine 10.1.1.2
On the Director look at the output of ipvsadm, you should see a 
       connection to 10.1.1.2:23
On the server do $netstat -an | grep 23 to look for connections to 
       the telnet port.

http:
Point your browser to http://192.168.1.1. You will get the DocumentRoot 
of either 10.1.1.2 or 10.1.1.3. Open another copy of the browser and
connect again. You should get the other server (this will be easier 
to see if the webpages are different). Look at the output of ipvsadm on the 
director for connections to the httpd ports, and on the server look at the 
output from netstat -an | grep 80 (or 8080) for connections.

_____________________________________________________________________________

8. VS-TUN

Compile the 2.0.36 kernel after selecting the TUN option in LVS (see 
"Networking Options"). Rename the kernel image produced to 
vmlinuz-2.0.36.vs_tun and put a new entry into lilo.conf (and rerun lilo).  
On reboot pick the vmlinuz-2.0.36.vs_tun kernel. Check that you have the 
correct kernel by running ipvsadm and looking for the LVS type (TUN). 

The servers need to be running Linux-2.0.36 so that they have a (non-arp'ing)
tunl device. 

Here's an example set of IPs for a VS-TUN setup. Note that the servers are
not on a private network. Here for convenience the servers are on the same 
network as the client. The only restrictions are that the client must be
able to route to the Director and that the servers must be able to route
to the client (the return packets to the client come directly from the 
servers and do not go back through the Director). 

Normally for VS-TUN, the client is on a different network from the 
director/server(s), and each server has its own route to the outside 
world. In the case below, where all machines are on the 192.168.1.0
network, there would be no default route for the servers, and routing
for packets from the servers to the client would use the device on
the 192.168.1.0 network (presumably eth0). 


Machine                      IP
client                       192.168.1.5
director                     192.168.1.1
virtual IP (VIP)             192.168.1.110
server-1                     192.168.1.2
server-2                     192.168.1.3
server-3                     192.168.1.4
.
.
server-n                     192.168.1.n+1


8.1 Configure VS-TUN Director

This involves -

adding the Virtual IP (VIP) to an ethernet device
adding servers and services to the LVS with ipvsadm

With VS-TUN, the target port numbers of incoming packets cannot be remapped. 
A request to port 80 on the VIP will be forwarded to port 80 on some server, 
so no port number is given for the target (server) IP when running the 
ipvsadm command. 
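
For example, using the IPs from the table above, a web service entry gives a 
port on the VIP but only an IP for the server:

```
#port 80 requests to the VIP go to port 80 on the server - no server port given
/sbin/ipvsadm -A -t 192.168.1.110:80 -R 192.168.1.2
```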

Modify the supplied rc.ipvs_tundirector file (or configure.lvs) for 

     the VIP (the address the client connects to, we use x.x.x.110)
     the services each server is serving
     the server IPs 


#rc.ipvs_tundirector 
#for virtual file server
#
echo -n "rc.ipvs_tundirector "
# 
echo "deleting any ipfwadm forwarding rules" 
/sbin/ipfwadm -Ff 
#
#network setup 
VS_NETWORK="192.168.1.0"
VS_NETMASK="255.255.255.0"
VS_BROADCAST="192.168.1.255"
#
#standard assignment of IP to outside world on Director
#uncomment if not already part of rc.inet1
#VS_ADDRESS="192.168.1.1"
#VS_DEVICE="eth0:1"
#/sbin/ifconfig ${VS_DEVICE} ${VS_ADDRESS} broadcast $VS_BROADCAST up 
#route add -net ${VS_NETWORK} netmask ${VS_NETMASK} dev ${VS_DEVICE}
#
#VIP link to servers
VIP_ADDRESS="192.168.1.110"
VIP_DEVICE="eth0:2"
/sbin/ifconfig ${VIP_DEVICE} ${VIP_ADDRESS} netmask ${VS_NETMASK} \
                 broadcast ${VS_BROADCAST} up 
route add -host ${VIP_ADDRESS} dev ${VIP_DEVICE}
#
# 
/sbin/ipvsadm -L 
# 
#add servers and services
#
SERVER1="192.168.1.2"
SERVER2="192.168.1.3"
#
#telnet to 192.168.1.2
/sbin/ippfvsadm -A -t ${VIP_ADDRESS}:23 -R ${SERVER1}
#http to 192.168.1.2
/sbin/ippfvsadm -A -t ${VIP_ADDRESS}:80 -R ${SERVER1}
#telnet to 192.168.1.3
/sbin/ippfvsadm -A -t ${VIP_ADDRESS}:23 -R ${SERVER2}
#http to 192.168.1.3 with weight of 2
/sbin/ippfvsadm -A -t ${VIP_ADDRESS}:80 -R ${SERVER2} -w 2
#
#display current settings on exit
/sbin/ipvsadm
#
#---------rc.ipvs_tundirector----------------------

Load the parameters into the Director with the command 

$ . ./etc/rc.d/rc.ipvs_tundirector

(you can later add this to your rc.local file).

Check the output from ipvsadm -L, ifconfig -a and netstat -rn
to see that the services/IPs are correct. If not, re-edit and 
re-run the script.


8.2 Configure VS-TUN Server(s)

Servers can run kernel 2.0.36 (unmodified is fine) with tunneling 
turned on. The only role of the 2.0.x kernel is to provide a tunl 
device which doesn't reply to arp requests. 

For 2.2.x kernels, the LVS patch must be applied to the kernel. 
The only function of the patch on the server is to turn off arp'ing 
on the tunl devices.

The director must be able to route to the servers.
The servers must be able to route to the client (here on 192.168.1.0).

Configuring the server involves running the tunnel server rc script.
This script is the same for all servers. 
Edit the script changing the IP for the tunnel device to that of the 
VIP. 

#rc.ipvs_tunserver 
#
echo -n "rc.ipvs_tunserver "
# 
#network and VIP info
VIP_ADDRESS="192.168.1.110"
VS_BROADCAST="192.168.1.255"
VIP_NETMASK="255.255.255.255"
VIP_DEVICE="tunl0"
#
#install non-arping tunnel device on Server
ifconfig $VIP_DEVICE $VIP_ADDRESS netmask $VIP_NETMASK \
           broadcast $VS_BROADCAST up
route add -host ${VIP_ADDRESS} dev ${VIP_DEVICE}
# 
echo "LVS Notice: make sure you have a route to the client(s) " 
#uncomment if not already handled by rc.inet1 
#CLIENT_NETWORK="192.168.1.0"
#CLIENT_NETMASK="255.255.255.0"
#CLIENT_DEVICE="eth0:1"
#route add -net ${CLIENT_NETWORK} netmask ${CLIENT_NETMASK} dev ${CLIENT_DEVICE}
#
#----------------------rc.ipvs_tunserver-----------------

Load the file and check as was done for rc.ipvs_tundirector.


8.3 Test VS-TUN 

Check that the service(s) are running on each server at the IP of the VIP 
(use netstat -an).

Connect from the client. 
For http, connect to http://192.168.1.110/ If multiple servers are involved, 
make the file in the top directory of each server a little different and 
shift-reload to cycle through the machines in round robin fashion.

Look at the output of the ipvsadm -L command on the director to show 
that connections are being made. 

On the server look at the output of 

$netstat -an | grep 80

for connections to port 80.

If telnet (port 23) is being served, then log in to the VIP several times and 
check that you get the login prompt from a different server each time. 

_____________________________________________________________________________

9. VS-DR

Compile the 2.0.36 kernel after selecting the DR option in LVS (see 
"Networking Options"). Rename the kernel image produced to 
vmlinuz-2.0.36.vs_dr and put a new entry into lilo.conf (and rerun lilo). On 
reboot pick the vmlinuz-2.0.36.vs_dr kernel. Check that you have the correct 
kernel by running ipvsadm -L and looking for the LVS type (DR). 

Direct routing setup and testing are the same as for VS-TUN, except that all 
machines must be on the same piece of wire. Communication within the LVS is by
link layer. The VIP_DEVICE for the servers is lo:0 rather than tunl0. The
servers must be running an OS in which the lo device does not reply to arp
requests, eg standard Linux-2.0.36, Linux-2.2.x with the LVS kernel patch 
applied, or Solaris. 

Edit the rc.ipvs_drdirector and rc.ipvs_drserver scripts (they are similar to
the VS-TUN scripts, just changing tunl0->lo:0) and load them

on director
$. ./etc/rc.d/rc.ipvs_drdirector

on servers  
$. ./etc/rc.d/rc.ipvs_drserver

Test using the VS-TUN procedures.
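
For reference, a minimal rc.ipvs_drserver might look like the following 
(a sketch only - it assumes the same IPs as the VS-TUN example; generate 
the real file with configure.lvs):

```
#rc.ipvs_drserver (sketch)
echo -n "rc.ipvs_drserver "
#
VIP_ADDRESS="192.168.1.110"
VS_BROADCAST="192.168.1.255"
VIP_NETMASK="255.255.255.255"
#lo:0 (non-arp'ing on standard Linux-2.0.36) instead of tunl0
VIP_DEVICE="lo:0"
#
#install the VIP on the non-arp'ing lo:0 device
ifconfig $VIP_DEVICE $VIP_ADDRESS netmask $VIP_NETMASK \
           broadcast $VS_BROADCAST up
route add -host ${VIP_ADDRESS} dev ${VIP_DEVICE}
#----------------------rc.ipvs_drserver-----------------
```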

 
_____________________________________________________________________________
 
10. LocalNode

The Director machine can also be a server. This is convenient when only a 
small number of machines are available to serve.

To use this feature add an extra line to the list of servers controlled
by ipvsadm, in the rc.ipvs_xxxdirector file.

for VS_NAT
/sbin/ipvsadm -A -t 192.168.1.1:80 -R 127.0.0.1:8080 -w 1

for VS_TUN and VS_DR
/sbin/ipvsadm -A -t 192.168.1.110:80 -R 127.0.0.1 -w 1


The service will be running on 192.168.1.1 (for VS_NAT) or 192.168.1.110
(for VS_TUN and VS_DR). You are _not_ connecting to a service on 127.0.0.1,
despite what this instruction might look like.

Testing LocalNode

With an httpd listening on the VIP (192.168.1.110:80) of the director 
(192.168.1.1) AND with no entries in the ipvsadm table, the director 
appears as a normal non-LVS node and you can connect to this service at 
192.168.1.110:80 from an outside client. If you then add a server to the 
ipvsadm table in the normal manner with

/sbin/ipvsadm -A -t 192.168.1.110:80 -R 192.168.1.2 

then connecting to 192.168.1.110:80 will display the webpage at the server 
192.168.1.2:80 and not the director. This is easier to see if the pages 
are different (eg put the real IP of each machine at the top of the 
webpage).

Now comes the LocalNode part - 

You can now add the director back into the ipvsadm table with

/sbin/ipvsadm -A -t 192.168.1.110:80 -R 127.0.0.1 

Shift-reloading the webpage at 192.168.1.110:80 will alternately display the
webpages at the server 192.168.1.2 and the director at 192.168.1.1 (if the
scheduling is unweighted round robin). If you remove the (external) server
with 

/sbin/ipvsadm -D -t 192.168.1.110:80 -R 192.168.1.2 

you will connect to the LVS only at the director's port. The ipvsadm table
will then look like

Protocol Local Addr:Port ==> 
                        Remote Addr           Weight ActiveConns TotalConns
                        ...
TCP      192.168.1.110:80 ==>
                        127.0.0.1             2      3           3         

From the client, you cannot tell whether you are connecting directly to
the 192.168.1.110:80 socket or through the LVS code.

_____________________________________________________________________________

11. Failover

Don't even think about doing this till you've got LVS working properly :-). 

To activate failover you install mon on the Director.

11.1 Brief Explanation

Mon is a perl daemon that uses "monitors" (perl scripts) to detect whether a 
service is alive. Remote nodes can be queried for a network connection with
fping and/or for a valid reply to a request for a service (eg telnet.monitor 
looks for the string "login:" in the reply). When a failure/recovery is 
detected by a monitor an "alert" (another perl script) is run. There are 
alerts which send email, page you or write to a log. LVS supplies a 
"virtualserver.alert" which uses ipvsadm to remove or add servers/services 
to the ipvsadm table.
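
In effect (this is a sketch, not the actual script), virtualserver.alert 
runs something like the following, with the VIP:port and server taken from 
its -V and -R arguments:

```
#on an alert (server down): remove the server from the ipvsadm table
/sbin/ipvsadm -D -t 192.168.1.1:23 -R 10.1.1.2:23
#on an upalert (server back up): add it again
/sbin/ipvsadm -A -t 192.168.1.1:23 -R 10.1.1.2:23
```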

11.2 BIG CAVEAT

*Trap for the unwary*

Mon runs on the director, but... 

remember that you cannot connect to any of the LVS-controlled services from 
within the LVS (including from the director), only from the outside (eg from 
the client). If the packet originates on the director, it will not return 
to you and the connection will hang. You can't monitor the state of the 
servers from the outside either, as you cannot tell which server you have 
connected to. 

The solution to monitoring services under control of the LVS is to monitor 
proxy services whose accessibility should closely track that of the LVS 
service. Thus to monitor an LVS http service on a particular server, the 
same webpage should be made available on another IP (or on 0.0.0.0) on 
the same machine, not controlled by the LVS.

Example:

VS-TUN, VS-DR
On the server 192.168.1.2, the LVS service will be on the tunl (or lo:0) 
interface of 192.168.1.110:80 and not on 192.168.1.2:80. (The IP 192.168.1.110
on the server 192.168.1.2 is on a non-arp'ing device and cannot be accessed by 
mon.) Mon running on the Director at 192.168.1.1 can only detect services on 
192.168.1.2 (this is the reason the Director cannot also be a client). 
The best that can be done is to start a duplicate service on 192.168.1.2:80 
and hope that its functionality goes up and down with the service on 
192.168.1.110:80 (a reasonable hope).

VS-NAT
Normal IP communication is unaffected on the private Director/Server network 
of VS-NAT. If ports are not re-mapped then a monitor running on the Director 
can watch the httpd on server-1 (at 10.1.1.2:80). Nothing special here.
If the ports are re-mapped (eg the httpd server is listening on 8080), then 
you will have to either modify the http.monitor (making an http_8080.monitor)
or activate a duplicate http service on port 80 of the server.

Some services listen on 0.0.0.0:port, ie on all IPs on the machine, 
and for these you will not have to start a duplicate service. 


11.3 Mon Install 

Mon is installed on the director.

Most of mon does not need to be compiled; it is a set of perl scripts 
and is mostly ready to go. 
You do the install by hand. Copy the man files (mon.1 etc) into 
/usr/local/man/man1. (rpc.monitor needs to be compiled, but you don't 
need it for LVS.)

If you are using mon-0.38 or later, get the Mon-0.4.tar.gz file from the
BETA directory wherever you got mon. (I used mon-0.371.)

$ cd /usr/lib
$ tar -zxvof /your_dirpath/mon-x.xx.tar.gz 

this will create the directory /usr/lib/mon-x.xx/
with mon and its files already installed. 

Do either

$ln -s mon-x.xx mon

or

$mv mon-x.xx mon

Check that you have the perl packages required for mon to run

$perl -w mon

Do the same for all the alerts and monitors that you'll be
using (test.alert, mail.alert, fping.monitor, telnet.monitor,
http.monitor). The location of perl in the alerts is 
#!/usr/bin/perl; make sure this is compatible with your setup.

copy/move  virtualserver.alert to /usr/lib/mon/alert.d

copy/move/link  mon_lvs.cf to /etc/mon/mon.cf

11.4 Mon Configure

This involves editing /etc/mon/mon.cf, which contains information about 
1. the nodes monitored 
2. how to detect if a node:service is up (does the node ping, does it serve http...?)
3. what to do when the node goes down or comes back up. 

The mon.cf supplied with LVS 

assigns each node to its own group (nodes are brought down one at a time
     rather than in groups) 
detects whether a node is serving http. 
on failure sends mail to root and removes the server from the ipvsadm table
     using a call to ipvsadm in virtualserver.alert
on recovery sends mail to root, adds the server back to the pool of working
     servers in the ipvsadm table.
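
As an illustration only (check the supplied mon_lvs.cf for the real syntax 
and values), an entry for one server might look like:

```
hostgroup LVS1 10.1.1.2

watch LVS1
    service http
        interval 15s
        monitor http.monitor
        period wd {Sun-Sat}
            alert mail.alert root
            upalert mail.alert -u root
            alert virtualserver.alert -V 192.168.1.1:80 -R 10.1.1.2:80
            upalert virtualserver.alert -V 192.168.1.1:80 -R 10.1.1.2:80
```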

11.5 Testing mon

It is best to get mon working in two steps. First show that mon works
completely independently of LVS, then bring in the LVS. 

Using the mon.cf file supplied with LVS as a guide, pick one of your
servers to test mon. Enter its IP/hostname (eg 10.1.1.2) into an LVS 
group and comment out all the monitors/alerts except fping.monitor and 
test.alert (there is an alert and an upalert for each alert; leave both 
uncommented for test.alert). 

Test fping.monitor with

$ ./fping.monitor 10.1.1.2

You should get the command prompt back quickly with no other output.
As a control, test a machine that you know is not on the net

$ ./fping.monitor 10.1.1.254

fping.monitor will wait for a timeout (about 5secs) and then return
the IP of the unpingable machine before exiting.

Check that test.alert works (it writes a file in /tmp)

$ ./test.alert foo

you will get the data and "foo" in /tmp/test.alert.log

Start mon with rc.mon (or S99mon) and check that mon is in the ps table
(ps -auxw | grep perl). The fping.monitor will check that machine 10.1.1.2
is alive and will enter a string like 

Sun Jun 13 15:08:30 GMT 1999 -s fping -g LVS3 -h 10.1.1.2 -t 929286507 -u -l 0

into /tmp/test.alert.log. This is the date, the service (fping), the hostgroup
(LVS3), the host monitored (10.1.1.2), unix time in secs, up (-u)
and some other stuff I haven't figured out.

Check for the "-u" in this line, indicating that 10.1.1.2 is up.

Then pull the network cable to 10.1.1.2. In 15secs or so you should
hear the whirring of disks and the following entry will appear in
/tmp/test.alert.log

Sun Jun 13 15:11:47 GMT 1999 -s fping -g LVS3 -h 10.1.1.2 -t 929286703 -l 0

Note there is no "-u" near the end of the entry, indicating that the node is down.

Watch for a few more entries to appear in the logfile, then connect the
network again. A line with -u should appear in the log and then no
more entries should appear in the log.

If you've got this far, you are in good shape and are basically done. 

Kill mon

$ kill `cat /var/run/mon.pid`

Next activate mail.alert and telnet.monitor and comment out test.alert
and fping.monitor in /etc/mon/mon.cf.

Test mail.alert by doing

$ ./mail.alert root
hello
^D


root is the address for the mail, hello is some STDIN, and control-D
exits the mail. The alert looks for sendmail in /usr/lib/sendmail
(in case you have /usr/bin/sendmail). Root should get some mail 
with the string "ALERT" in the subject (indicating that a machine is down).

Repeat sending mail saying the machine is up (the "-u")

$ ./mail.alert -u root
hello
^D

Check that root gets mail with the string "UPALERT" in the subject 
(indicating that a machine has come up).


Check the telnet.monitor on a machine on the net

$ ./telnet.monitor 10.1.1.2

the program should exit with no output.
Test again on a machine not on the net

$ ./telnet.monitor 10.1.1.254

the program should exit outputting the IP of the machine not on the net.

Start up mon again (eg with rc.mon or S99mon) and watch for one round
of mail sending notification that telnet is up (an "UPALERT"). There 
should be no further mail while the machine remains telnet-able. 
Then pull the network cable and watch for the first ALERT mail. 
Mail should continue arriving every mon time interval (15secs in mon_lvs.cf).
Then plug the network cable back in and watch for one UPALERT mail.

If you've got here you are really in good shape. 
Kill mon (kill `cat /var/run/mon.pid`)

Now add in virtualserver.alert in the mon.cf file (by uncommenting it). 
You do not need to have LVS configured to be doing anything useful at 
this stage, but ipvsadm needs to be working so you'll need a patched 
kernel running on the Director. 

The arguments to virtualserver.alert sent by mon.cf need to match those 
needed by ipvsadm. The defaults in mon_lvs.cf are for a VS-NAT setup

alert virtualserver.alert -V 192.168.1.1:23 -R 10.1.1.2:23  
upalert virtualserver.alert -V 192.168.1.1:23 -R 10.1.1.2:23 

Prime the ipvsadm table by running this from the command line 

$ipvsadm -A -t 192.168.1.1:23 -R 10.1.1.2:23 

Check that the entry was made OK

$ipvsadm -L

Protocol Local Addr:Port ==> 
                        Remote Addr           Weight ActiveConns TotalConns
                        ...
TCP      192.168.1.1:23 ==>
                        10.1.1.2:23             1      0           0   


Start up mon; you should get an UPALERT mail and no change in the
ipvsadm output. 

Pull the network cable on 10.1.1.2 and watch for an ALERT mail and the
disappearance of the entry from ipvsadm -L. Plug the network cable back
in and watch for an UPALERT mail and the reappearance of the entry in
ipvsadm.

Kill mon again. Edit mon.cf activating http.monitor, mail.alert and 
virtualserver.alert.

Check that http.monitor works for a machine with an httpd on port 80

$ ./http.monitor 10.1.1.2

the program will exit with no output

and for a machine on the network but with no httpd

$ ./http.monitor 10.1.1.254

the program will exit giving 10.1.1.254 as output

You are almost done. Kill mon and edit mon.cf for its running configuration
(you can do this in configure.lvs), using the appropriate monitors
(http if running an LVS webserver), with a pair of mail.alert lines
(one an alert and the other an upalert) and a pair of virtualserver.alert
lines (an alert and an upalert) for each virtual service being monitored.

Start mon and start pulling network cables, watching the output of
ipvsadm -L. If all is OK, then look at the LVS from the LVS client. Make
each webpage a little different, and check that you still get service
when you pull plugs one at a time and that the expected server disappears
and returns on plugging the cables back in.

Although the ipvsadm table was primed with entries on startup here, once
mon is configured and running properly the table could start out empty,
as ipvsadm entries will be added when mon detects the expected services.
When mon finds the servers are up, it will run virtualserver.alert 
to add the entries to ipvsadm and send you mail telling you that it 
has done so.

You're done. Congratulations.


--------------------------------end LVS-HOWTO-----------------------------------
