Building an elastic-scaling application in Interoute Virtual Data Centre

Design overview
Create networks in a zone
Create a load balancer with HAProxy (The top of the sandwich)
Create a shared NFS disk for the webcluster (The bottom of the sandwich)
Create a template for the webserver virtual machines (The sandwich filling)
Running an elastic scaling program
Testing the elastic scaling program



Elastic scaling is one of the big ideas of cloud computing. The idea is that the number of virtual machines, and the amount of CPU, RAM and disk currently being consumed, are matched with the actual live requirement for the loading on your software applications. When the load is low, the number of virtual machines running will be low; as the load increases or decreases, so does the number of virtual machines expand or shrink (hence 'elastic'). 'Load' can be measured several ways: one way is to monitor the proportion of virtual machine CPU capacity that is being used, and to set an operational maximum (say 90%) which will trigger new machines to be deployed. Another way is to monitor the amount of activity (web sessions or data traffic) between the software application and its users, and this is the type of monitoring that will be used in this tutorial.

This tutorial shows how to build an elastic scaling cluster of webservers. The number of virtual machines in the cluster will change automatically based on the loading of webserver user sessions.

The design takes the form of a sandwich with two fixed slices of bread and a variable amount of filling, where the filling scales according to the hunger of the consumer, as in this Garfield cartoon. The top piece of bread is an 'HAProxy' virtual machine serving as a load balancer, and the bottom piece of bread is a networked SSD storage drive running on an NFS server virtual machine. The sandwich filling is an automatically varying number of webserver virtual machines.

This is not quite a production-ready setup, since the single load balancer (and the single VDC zone) is a point of failure which should be covered by a backup or failover server, ideally in a different zone. The single storage drive also needs some backup/failover, and the form of storage could be a lot more complex depending on the amounts of data involved in running the cluster. For many web-based applications, you would want to implement a database server as well as a shared disk (see the tutorial Building a SaaS application in multiple zones of Interoute Virtual Data Centre, for a demonstration of a master-master synchronised database cluster).

An elastic-scaled application optimises what you spend on your cloud computing, because the consumed capacity of deployed virtual machines is maximised and idle resources are minimised. VDC also offers an especially good deal for network-intensive applications because there are ZERO charges for network setups and network traffic. You only pay for CPU, RAM and storage, and your costs will be more predictable.


You need a VDC account with API access turned on, and the two 'API keys' to configure your API connection. You will also require the Cloudmonkey API tool.

You can follow this tutorial with a VDC free trial account: it only takes a couple of minutes to sign up and get your trial running.

For setup instructions for the VDC API and the Cloudmonkey API tool, see VDC API: Introduction to the API.

For using the autoscaler program, you will need to put the API access configuration into a file (called for example 'vdc_config') using this form:

 "api_url": ""}

In the following command inputs, note the different prompts: commands with 'phillip$' must be typed on your own computer; commands with '>' are typed into Cloudmonkey on your own computer; for commands with 'ubuntu@Webcluster-HAProxy-Loadbalancer:~$', etc, you need to log in to the correct virtual machine.

I will 'XXX' out the digits of the public IP addresses for reasons of privacy. In any case, you will need to replace these IP addresses with your own values.

Design overview

In one zone I am going to create the following:

  • Three networks: 2 of type 'local with Internet gateway', 1 of type 'Public Direct Connect'

  • One Ubuntu Certified 14.04 virtual machine, running HAProxy as a load balancer

  • One Ubuntu Certified 14.04 virtual machine, running an NFS server

  • One Ubuntu Certified 14.04 virtual machine, running Apache webserver (httpd) and PHP, to be the original VM for a new VM template

  • One SSD datadisk of size 50 GBytes, network-shared through the NFS server

  • An elastic scaling program (written in Python) will run at regular intervals on the HAProxy machine to perform the automatic scaling

I have selected to use the VDC Frankfurt zone, but everything will work the same in any of the 17 available VDC zones.

It's generally a good idea to use the Ubuntu Certified virtual machine templates because they will deploy more quickly than the standard Ubuntu templates, and they are optimised for the VDC platform. Plus, for a production service, Certified VMs can be covered by Canonical's enterprise support services.

Create networks in a zone

First, I will create a Public Direct Connect network to connect the HAProxy virtual machine directly to the Internet:

> createPublicDirectConnect displaytext='Frankfurt-Webcluster-DirectConnect' zonename='Frankfurt (ESX)' suffix=29

(Note that Public Direct Connect networks have no firewall, so an attached VM becomes directly connected to the Internet, and therefore for a production service you should have a firewall running on the VM, and all of the installed applications fully hardened for exposure to the Internet.)

Next, I will create a Local with Internet Gateway network to inter-connect the HAProxy load balancer with the webserver virtual machines, as follows.

> createLocalNetwork displaytext='Frankfurt-Webcluster-web-network' zonename='Frankfurt (ESX)' cidr= gateway=

Make a note of the new network UUID, because you will need to use it in a moment.

An egress rule needs to be created for the Local Gateway network, so that VMs can communicate outwardly to the Internet. For my demonstration, this is only needed while the system is being set up, and the egress rule could be deleted later. I am going to use the simplest and most permissive rule, which would not be recommended for a production system:

> create egressfirewallrule networkid=UUID protocol=all cidr=

Replace UUID here with the network ID for your Local Gateway network.

Next, the network which will be used for the sharing of the networked 'NFS disk'. I will use a 'Local Gateway' with closed firewall, but (see below) open the firewall temporarily for the setup of the shared disk.

> createLocalNetwork displaytext='Frankfurt-Webcluster-NFS-network' zonename='Frankfurt (ESX)' cidr= gateway=

It is not absolutely necessary to have a separate network for the NFS disk, but it is good practice to isolate webserver data away from the network which handles the web traffic.

Create a load balancer with HAProxy (The top of the sandwich)

HAProxy is a powerful, open source load balancer, which I am going to set running on an Ubuntu virtual machine. VDC does have a built-in load balancer as part of each Local Gateway network, however that does not allow for traffic monitoring, which is essential to this project. The HAProxy VM will be connected to two networks, one (Public Direct Connect) directly connected to the Internet, and the other (Local Gateway) network will be used to connect to the elastic cluster of Webserver VMs.

First I deploy an Ubuntu 14.04 VM in the Frankfurt zone:

> deploy virtualmachine networkids=UUID1,UUID2 displayname=Webcluster-HAProxy-Loadbalancer name=Webcluster-HAProxy-Loadbalancer zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 templateid=545abbf5-5d37-4156-8c4f-85571f1bd5b3 serviceofferingid=4cb92069-e001-4637-8848-76d74f406bb8

You need to replace UUID1 and UUID2 with your network IDs for the Public Direct Connect and Local Gateway networks; you must put the IDs in that order so that Ubuntu is configured with its Internet connection as the default network. The 'service offering' used here is for a small VM (1 CPU, 1 GB RAM). If more resources are required, just change the 'serviceoffering'.

Make a note right away of the virtual machine password (the value of the key 'password' shown in the API response).

Now I use ssh to login to the VM, using the password just recorded:

phillip$ ssh

(Note that for every first connection to a VM, the ssh program will ask you to confirm that you trust the connection.)

The first task is to install and enable the HAProxy program:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo apt-get install haproxy

Edit the file /etc/default/haproxy so that it contains this line:


And start the HAProxy service:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo service haproxy start

Also, install the Apache webserver program:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo apt-get install apache2

Test that the webserver is running and accessible by typing the 'public' IP address of the load balancer into a web browser on your own computer. You should see the default 'welcome to Apache' page.

That's the first part of the load balancer set up. I will come back to this VM at the end to set up the elastic autoscaler program.

Create a shared NFS disk for the webcluster (The bottom of the sandwich)

To store data for my webcluster, I will create a VDC 'datadisk' using SSD, then attach the disk to a VM running as an NFS server. These disks are based on EBS (elastic block storage); they can be configured with a user-specified size and can be automatically backed up ('protected', for a higher unit cost per GB).

A new disk has to be partitioned, formatted and mounted in order to be usable by the virtual machine. But once it is formatted the data stored on the disk is permanent and exists independently of the VM it is attached to.

Here is the Cloudmonkey API command to create a new datadisk of size 50 GBytes in the Frankfurt zone:

> create volume name=Webcluster-NFS-datadisk size=50 zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 diskofferingid=67ec69a7-45de-491d-84b8-321bedf30a21

You will need to choose a disk size corresponding to the amount of storage you currently require for your web-based application(s). The size can be expanded anytime later, so there is no need to be spending money on unused disk space.

I am using the 'diskofferingtype' for 'EBS SSD Protected' which uses fast SSD-based storage, and features automated snapshot backup (every 4 hours) of the datadisk. However all of that happens in the VDC orchestration layer and the VM only detects the new disk as an 'EBS' medium.

Now to deploy the virtual machine:

> deploy virtualmachine networkids=UUID displayname=Webcluster-NFS-server name=Webcluster-NFS-server zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 templateid=545abbf5-5d37-4156-8c4f-85571f1bd5b3 serviceofferingid=4cb92069-e001-4637-8848-76d74f406bb8

You need to replace UUID with your UUID for the network 'Frankfurt-Webcluster-NFS-network'. The 'serviceoffering' used here is for a small VM (1 CPU, 1 GB RAM). If more resources are required, just change the 'serviceoffering'.

Make a note right away of the virtual machine password (the value of the key 'password' shown in the API response).

Attach the datadisk to the virtual machine:

> attach volume id=3b8af552-1e36-43a4-8c70-b4d78a843a6c virtualmachineid=f953f2c8-e50c-4f8c-bd49-916fdbea89e7

You will need to substitute the UUIDs for your own volume and VM in this command. After attaching the datadisk, you will need to restart the VM so that the Ubuntu operating system detects the new disk, which will appear initially as a device /dev/sdb.

> reboot virtualmachine id=f953f2c8-e50c-4f8c-bd49-916fdbea89e7

At this point, I need to be able to log in to the VM operating system, which requires a port-forwarding rule to be added to the network. First, I will look up the UUID for the public IP address that is attached to the Local Gateway network:

> list publicipaddresses filter=id,ipaddress,zonename
|   ipaddress      |                  id                  |      zonename     |
|  | 35e87ceb-1268-4ff6-89ac-4f929c06cf83 |  Frankfurt (ESX)  |
|  | 4d200603-3539-49d0-a9e8-2b47614ca4fd |  Frankfurt (ESX)  |
.... ETC ....

And insert the required UUIDs for the IP address and the virtual machine to create the port-forwarding rule:

> create portforwardingrule protocol=tcp privateport=22 publicport=22 openfirewall=true ipaddressid=35e87ceb-1268-4ff6-89ac-4f929c06cf83 virtualmachineid=f953f2c8-e50c-4f8c-bd49-916fdbea89e7

And then I login using the 'ssh' program from my own computer:

phillip$ ssh

The steps to partition, format and mount the disk are as follows:

ubuntu@Webcluster-NFS-server:~$ sudo fdisk /dev/sdb

This command presents a series of prompts, for which I entered the values: 'n', 'p', '1', press enter for the default options for the sectors, then finally 'w' to write this configuration to the disk. At the end of that you should see a new device named /dev/sdb1. And then the following three commands to complete the setup:

ubuntu@Webcluster-NFS-server:~$ sudo mkfs -t ext4 /dev/sdb1
ubuntu@Webcluster-NFS-server:~$ sudo mkdir /media/nfsdisk
ubuntu@Webcluster-NFS-server:~$ sudo mount /dev/sdb1 /media/nfsdisk

Note that the results of the 'mount' command are not remembered after a machine restart, so I edited the contents of the file /etc/fstab to add a new line like this:

/dev/sdb1   /media/nfsdisk  ext4   defaults     0 0

Now, I will install and set up the NFS share:

ubuntu@Webcluster-NFS-server:~$ sudo apt-get install nfs-kernel-server

And edit the file /etc/exports to include this line:


And restart the NFS server to implement this configuration:

ubuntu@Webcluster-NFS-server:~$ sudo service nfs-kernel-server restart

The drive is being shared with any VM in the '114' network, which will be able to read and write data. There are many possible export configurations for NFS; this is only a simple one.

Check that the export is correct:

ubuntu@Webcluster-NFS-server:~$ sudo exportfs

When the configuration of the NFS server VM is finished, I can delete the port-forwarding rule and egress rule to isolate the VM from the Internet. (An isolated machine is not able to access OS updates, which is not recommended, so in a production system you would need to create a means for updating.)

Create a template for the webserver virtual machines (The sandwich filling)

To create the template, I will deploy a virtual machine, install the necessary programs and data that I need for each webserver, and make a 'snapshot' copy of the VM's root disk.

First, to deploy the 'template master' VM:

> deploy virtualmachine networkids=UUID1,UUID2 displayname=Webcluster-TEMPLATE-MASTER name=Webcluster-TEMPLATE-MASTER zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 templateid=545abbf5-5d37-4156-8c4f-85571f1bd5b3 serviceofferingid=4cb92069-e001-4637-8848-76d74f406bb8

You need to replace UUID1 and UUID2 with your network IDs for the Local Gateway networks; you must put the UUID of the 'Frankfurt-Webcluster-web-network' first. The 'service offering' used here is for a small VM (1 CPU, 1 GB RAM). If more VM resources are required, just change the 'serviceoffering'.

Make a note right away of the virtual machine password (the value of the key 'password' shown in the API response).

I need to create a portforwarding rule to have access to this machine:

> create portforwardingrule protocol=tcp privateport=22 publicport=22 openfirewall=true ipaddressid=4d200603-3539-49d0-a9e8-2b47614ca4fd virtualmachineid=7dc64cfe-34c7-458f-80f0-0aa741a10d26

Login to the VM:

phillip$ ssh

And install the Apache webserver software and basic PHP functions:

ubuntu@Webcluster-TEMPLATE-MASTER:~$ sudo apt-get install apache2
ubuntu@Webcluster-TEMPLATE-MASTER:~$ sudo apt-get install php5

For accessing the NFS-shared disk, install the NFS client software, and mount the NFS drive:

ubuntu@Webcluster-TEMPLATE-MASTER:~$ sudo apt-get install nfs-common
ubuntu@Webcluster-TEMPLATE-MASTER:~$ sudo mount /media/nfsshared

And create a new folder for the webserver content:

ubuntu@Webcluster-TEMPLATE-MASTER:~$ sudo mkdir /media/nfsshared/apachehtml

To make a permanent mounting, edit the file /etc/fstab and insert a new line: /media/nfsshared nfs  rsize=51474048,wsize=51474048,timeo=14,intr

(The size number is the number of 1K blocks in the disk, which can be found by typing df at the command line.)

I will create some simple website content: a one line display which states the private IP address of the webserver, so I can see that the load balancer is correctly distributing requests to different webservers. I create a file /media/nfsshared/apachehtml/index.php with the following content:

<?php
echo "<html><body><h1>This is server: ";
echo $_SERVER['SERVER_ADDR']; // displays the webserver's IP address.
echo "</h1></body></html>";
?>

That completes the minimal setup for the webserver. It's also a good idea to update the operating system for the latest versions of packages, before creating the template: see steps 3 and 4 at the Ubuntu Server Guide.

Now, I go to my local machine and Cloudmonkey, to stop the TEMPLATE-MASTER, make a snapshot, and a new template.

Stop the VM:

> stop virtualmachine id=7dc64cfe-34c7-458f-80f0-0aa741a10d26

Look up the UUID of the VM's root disk:

> list volumes filter=id,name,state virtualmachineid=7dc64cfe-34c7-458f-80f0-0aa741a10d26
count = 1
| state |                  id                  |     name    |
| Ready | 47a254cf-9e2e-47c3-ac9e-9e721d31d7ab | ROOT-367289 |

Create a snapshot from the root disk:

> create snapshot volumeid=47a254cf-9e2e-47c3-ac9e-9e721d31d7ab

And create a template from the snapshot, in the VDC Frankfurt zone:

> create template ostypeid=99 name=Webcluster-web-TEMPLATE displaytext=Webcluster-web-TEMPLATE passwordenabled=true snapshotid=21d15bcf-55c5-4065-abbe-be1b6795e63b

The 'ostypeid' should be the same as the original virtual machine (use 'listVirtualMachines' to check it).

Running an elastic scaling program

That completes the VDC infrastructure of networks, virtual machines, EBS disk storage, and a template for the deployment of 'elastic' webserver VMs.

The last thing I need is a control program to monitor loading at regular intervals and perform the elastic scaling. I am going to set this program to run on the HAProxy VM, because this is the simplest way to alter the HAProxy configuration to correspond to the cluster of currently-running webserver VMs.
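To illustrate what altering the HAProxy configuration to match the running cluster could involve, here is a minimal Python 3 sketch that appends one 'server' line per webserver to a fixed block of static configuration. It is not the code of the control program used in this tutorial; the function name, the backend name 'webservers', port 80 and the example IP addresses are all assumptions made for illustration.

```python
def build_haproxy_config(static_text, servers, backend="webservers", port=80):
    """Return the static config text followed by a backend section.

    servers is a list of (name, private_ip) pairs. The names and addresses
    used below are hypothetical, not values from the tutorial.
    """
    lines = [
        static_text.rstrip("\n"),
        "",
        "backend %s" % backend,
        "    balance roundrobin",
    ]
    for name, ip in servers:
        # 'check' asks HAProxy to health-check the server before using it.
        lines.append("    server %s %s:%d check" % (name, ip, port))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    static = "defaults\n    mode http\n"
    vms = [("web-001", "10.0.0.11"), ("web-002", "10.0.0.12")]
    print(build_haproxy_config(static, vms))
```

Rebuilding the whole file from a static part plus a generated server list keeps the logic simple: the file always reflects exactly the VMs that are currently running, and HAProxy only needs a restart to pick it up.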

I will login to the HAProxy VM and download the files for a Python-based control program:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ wget
ubuntu@Webcluster-HAProxy-Loadbalancer:~$ wget
ubuntu@Webcluster-HAProxy-Loadbalancer:~$ wget

And set the configuration in the file at the following lines:

vmName = 'Webcluster-Frankfurt-web-'
haproxyStatsURL = 'http://213.XXX.XXX.137/haproxy?stats;csv'
haproxyStatsUser = 'statsuser'
haproxyStatsPassword = 'mypassword'
zoneID = '7144b207-e97e-4e4a-b15d-64a30711e0e7'
templateID = '545abbf5-5d37-4156-8c4f-85571f1bd5b3'
serviceofferingID = '4cb92069-e001-4637-8848-76d74f406bb8'
networkIDs = 'e7d5c36d-ff6f-42c1-bbc3-8fd83a2b97f4,82d6f7cb-d654-4a5e-949b-8fecb8f66c04'
haproxyConfigFileStatic = 'haproxy_cfg_static'
haproxyConfigFile = '/etc/haproxy/haproxy.cfg'
maxVM = 10
minVM = 1
sessionPerVM = 5

You must insert your own values for the different parameters in the first eight lines; the other lines are option settings. 'vmName' should be a unique stem for each cluster of VMs (the actual VM names will have datestamps added to the stem). For 'networkIDs' the first UUID should be for the 'web-network' (that is, the Local Gateway network containing the HAProxy VM).
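As a small illustration of how a name stem and a datestamp can be combined, the following Python 3 sketch builds names of the form 'Webcluster-Frankfurt-web-20160927T131006-001'. The format is inferred from the sample program output in this tutorial; the function name and its parameters are hypothetical and are not taken from the control program itself.

```python
from datetime import datetime, timezone


def make_vm_names(stem, count, now=None):
    """Build names of the form <stem><UTC datestamp>-<3-digit counter>."""
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%S")
    return ["%s%s-%03d" % (stem, stamp, i) for i in range(1, count + 1)]


if __name__ == "__main__":
    # A fixed time is used here so that the result is reproducible.
    fixed = datetime(2016, 9, 27, 13, 10, 6)
    for name in make_vm_names("Webcluster-Frankfurt-web-", 2, now=fixed):
        print(name)
```

Because the stem is shared by every VM in a cluster, a program can find 'its' VMs simply by filtering on names that start with the stem, which is why the stem should be unique per cluster.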

The way the scaling works is that the program measures the current number of web sessions being handled by HAProxy (actually it measures twice over a 10-second interval and takes the average), and divides this number by the 'sessionPerVM' number to get a 'required' number of VMs. For example, if there are 32 web sessions, the required number of VMs is 32/5 = 6.4, which rounded upwards is 7. The required number is compared with the number of VMs currently running and, if they are not equal, the difference is the number of VMs to deploy or destroy. For example, if 4 VMs are running, the difference is 7 - 4 = 3, so 3 new VMs should be created.
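The calculation can be sketched in a few lines of Python 3. This is an illustration of the arithmetic described above, not the source of the control program; the function names are hypothetical, and clamping the result between 'minVM' and 'maxVM' is my assumption about how those two options are applied.

```python
import math


def required_vm_count(session_samples, sessions_per_vm, min_vm=1, max_vm=10):
    """Average the samples, divide by the per-VM target and round upwards."""
    average = sum(session_samples) / len(session_samples)
    required = math.ceil(average / sessions_per_vm)
    # Assumed behaviour: never go below min_vm or above max_vm.
    return max(min_vm, min(max_vm, required))


def vm_change(required, running):
    """Positive result: VMs to deploy. Negative result: VMs to destroy."""
    return required - running


if __name__ == "__main__":
    required = required_vm_count([32, 32], sessions_per_vm=5)
    print("Required VMs:", required)
    print("Change to VMs:", vm_change(required, running=4))
```

With two samples of 32 sessions and a target of 5 sessions per VM, this gives 7 required VMs and a change of +3 when 4 VMs are running, matching the worked example.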

The control program is set up to run at regular intervals; the default interval is 3 minutes, which you can change with the 'interval' option.

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo python -c vdc_config -i 180

When you run this for the first time, with all the configurations done above, the program should deploy one VM:

loadbased_autoscaler: Starting at 2016-09-27 13:09:56 UTC
Current virtual machines: 0
VM list:
Current web sessions: 1
Required VMs (for target loading of 5 sessions per VM): 1
Change to VMs: 1
Creating 1 VMs now...
  creating: Webcluster-Frankfurt-web-20160927T131006-001
VM deploy completed: Webcluster-Frankfurt-web-20160927T131006-001. 0 deploys left to complete.
Finished the deployment of virtual machines.
New configuration written to config file /etc/haproxy/haproxy.cfg
 * Restarting haproxy haproxy
Service haproxy restarted.
loadbased_autoscaler: Finished at 2016-09-27 13:13:35 UTC

Press Ctrl+C to stop the program. You can monitor the HAProxy load balancer using its web interface, which will be at the following address: ''

To keep the program running and repeating the autoscaling at regular intervals, it is best to run the program 'in the background' like this:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo python -c vdc_config > /dev/null 2>&1 &

The print and error output of the program is turned off, but all of the program activity will be stored in a logfile (the default logfile is 'autoscaler.log', in the folder where the program is running). You can disconnect from the virtual machine and the program will carry on running, either until the VM is shut down or the program fails.

Testing the elastic scaling program

As a first test, I will bypass the load monitoring and simply specify a number of required VMs. Do this by un-commenting this line in the Python program:

#requiredVmCount = 3

Now run the control program manually:

ubuntu@Webcluster-HAProxy-Loadbalancer:~$ sudo python -c vdc_config

You need to check that new VMs deploy correctly and that 'surplus' VMs are expunged correctly. The HAProxy program should detect the changes and rebuild the set of webserver VMs available to receive web requests. Change 'requiredVmCount' up and down and check that HAProxy reconfigures correctly.

Next, I will do a simple load test to check that the load monitoring is working. I will install the Apache Benchmark tool on my local computer:

phillip$ sudo apt-get install apache2-utils

(Webserver testing involves quite a lot of data movement, so you probably want to do this on a network connection where you are not being charged per volume of data.) And do a load test, for example:

phillip$ ab -n 20000 -c 100 -k http://213.XXX.XXX.137/

This means 100 simultaneous requests to the webserver, and keep going until 20,000 requests are completed. If you watch the HAProxy statistics report (http://213.XXX.XXX.137/haproxy?stats) you will see the traffic building up. Note that on fast network connections the requests will be performed more quickly, so you may need to increase the 20,000 figure.
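The same statistics are available in CSV form (the ';csv' variant of the statistics URL), which is convenient for a program to read. The following Python 3 sketch shows one way to extract the current session count from such a report. It assumes the standard HAProxy CSV layout, in which the header line starts with '# ' and the 'scur' column holds current sessions. The sample text is a made-up extract trimmed to a few columns, and the function name is hypothetical. Fetching the report over HTTP with the statistics user and password is left out.

```python
import csv
import io


def current_sessions(stats_csv, svname="FRONTEND"):
    """Sum the 'scur' (current sessions) column over rows with this svname."""
    # HAProxy prefixes the CSV header line with '# ', so strip that first.
    text = stats_csv.lstrip("# ")
    reader = csv.DictReader(io.StringIO(text))
    return sum(
        int(row["scur"])
        for row in reader
        if row["svname"] == svname and row["scur"]
    )


# Made-up sample, trimmed to the first few columns of a real report.
SAMPLE = (
    "# pxname,svname,qcur,qmax,scur,smax,slim,stot\n"
    "webcluster,FRONTEND,,,32,57,2000,4210\n"
    "webservers,web-001,0,0,15,30,,2100\n"
    "webservers,web-002,0,0,17,27,,2110\n"
    "webservers,BACKEND,0,0,32,57,200,4210\n"
)

if __name__ == "__main__":
    print("Current web sessions:", current_sessions(SAMPLE))
```

Reading only the FRONTEND row avoids counting the same session twice, since each session also appears against a backend server.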

The time taken to deploy a VM is relatively long (around 2 minutes), so if the loading on the HAProxy varies faster than the deploy time, or the time taken to remove VMs (which is usually faster, about 30 to 60 seconds) then the autoscaler won't be able to keep pace with the loading. A 'spike' of traffic lasting a couple of minutes might be undetected by the autoscaler, or it might trigger new VMs to be deployed, which are going to be redundant by the time they are deployed and running.
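A rough back-of-envelope version of this timing argument, as a Python 3 sketch. The function names are hypothetical and the model is deliberately simple: it assumes the autoscaler samples the load once per interval (plus the 10-second measuring window) and that new VMs only help once their deployment has finished.

```python
def spike_can_be_missed(spike_seconds, interval_seconds):
    """A spike shorter than the polling interval can fall between two polls."""
    return spike_seconds < interval_seconds


def worst_case_reaction_seconds(interval_seconds, deploy_seconds, sample_seconds=10):
    """Longest wait from a load increase until the new VMs are running.

    Worst case: the load rises just after a poll, so a full interval passes
    before the next measurement, followed by sampling and deployment.
    """
    return interval_seconds + sample_seconds + deploy_seconds


if __name__ == "__main__":
    # Default 3-minute interval and a deploy time of around 2 minutes.
    print(spike_can_be_missed(120, 180))
    print(worst_case_reaction_seconds(180, 120))
```

Under these assumptions a two-minute spike can pass unnoticed with the default interval, and a sustained rise in load may take about five minutes to be matched by new capacity. Shortening the interval reduces both effects, at the cost of reacting more often to short-lived spikes.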

