Building a SaaS application in multiple zones of Interoute Virtual Data Centre
- Design overview
- Create private and public networks
- Deploy virtual machines
- Connecting to the virtual machines
- Testing the private network latency and throughput
- Deploy and mount datadisks
- Install and configure a database cluster (MariaDB and Galera)
- Install the rest of the LAMP stack: Apache and PHP
- Install a SaaS application: Piwigo
This tutorial shows how to build a photo-sharing web app running in two VDC zones, VDC Frankfurt and VDC Paris, with a MariaDB-Galera clustered database running in active-active synchronisation mode. Along the way it demonstrates the real, achievable network latency and throughput between Frankfurt and Paris, and the possibility of fast synchronisation of (large) volumes of data. Because there are no charges for data transfers in Interoute VDC, it is a very cost-effective cloud platform for distributed computing applications.
You need to have a VDC account with API access turned on, and the two 'API keys' needed to configure your API connection.
For setup instructions for the VDC API and the Cloudmonkey API tool, see VDC API: Introduction to the API.
The goal of this tutorial is to create a 'Software as a Service' type application in two zones of Interoute VDC, Frankfurt and Paris. In each zone there will be a virtual machine (Ubuntu 14.04 Linux) running a MariaDB database server, an Apache web server and PHP libraries, and a 'LAMP stack' photo-sharing application, Piwigo. There are many applications designed to run on a LAMP stack base, and these should drop in exactly where Piwigo has been used in this tutorial. The last element of the design is the Galera database clustering system, which is used to create a live synchronisation between the two MariaDB database servers (sometimes called an 'active-active', or multiple master, setup).
(Some LAMP applications store configuration and/or content data in the local file system, outside of the database. Obviously some or all of this configuration and data will need to be shared across the multiple instances of the application, and the Galera clustering won't do it. So additional file synchronisation would have to be implemented. I am going to ignore such additional complexity for this tutorial.)
This design could be used as a distributed application where both virtual machines are in use simultaneously, the idea being to give users more responsiveness by locating the data and compute closer to users, or it could be used as a type of 'failover' disaster recovery setup where one virtual machine is used live (say Frankfurt), and its data is constantly copied to the database at the secondary site (Paris), where the webserver is switched off. In the event that the live machine experiences a failure (in the machine itself, or in the rare case of an outage of the whole zone, or its network connection), the secondary site could immediately take over with the least possible loss of service for users. If the two sites are connected to a proxy webserver such as HAProxy, an automated failover of DNS records would ensure that users experienced a near-continuity of service.
The first step is to create the networks that the virtual machines will use. Each VM will have a public network ('Local with Internet Gateway') connecting to the Internet, and a private network ('Private Direct Connect') through which the databases will synchronise data. The network interconnection is achieved due to the networks belonging to the same 'Direct Connect Group'. You can read more about the different network types at VDC API: How to create a network.
First the Local with Internet Gateway networks.
createLocalNetwork displaytext='Gateway-Frankfurt' zonename='Frankfurt (ESX)' cidr=192.168.113.0/24 gateway=192.168.113.254
createLocalNetwork displaytext='Gateway-Paris' zonename='Paris (ESX)' cidr=192.168.104.0/24 gateway=192.168.104.254
And now the private networks. These 'Private Direct Connect' networks require that I have a 'Direct Connect Group' (DCG) in my VDC account. So first I check that a DCG is present:
listDirectConnectGroups
count = 1
directconnectgroups:
+------+-------+---------+----------+
| sids | id    | name    | networks |
+------+-------+---------+----------+
|      | 35934 | Default |          |
+------+-------+---------+----------+
And I create the networks:
createPrivateDirectConnect zonename='Frankfurt (ESX)' cidr=10.0.101.0/24 gateway=10.0.101.254 displaytext='privateDC-dcg35934-FRA' dcgid=35934
createPrivateDirectConnect zonename='Paris (ESX)' cidr=10.0.105.0/24 gateway=10.0.105.254 displaytext='privateDC-dcg35934-PAR' dcgid=35934
Egress rules need to be created for each of the Local Gateway networks, so that the VMs can communicate outward to the Internet to download software. I am going to use the simplest and very permissive rule, which would not be recommended for a production system.
create egressfirewallrule networkid=UUID protocol=all cidr=0.0.0.0/0
Replace UUID with the network ID for your Local network.
One virtual machine will be deployed in each zone, using an Ubuntu version 14.04 Certified image template, with 2 GByte of RAM and 1 virtual CPU. Certified templates offer improved performance over the standard public Ubuntu templates, and VMs running on Interoute VDC can be covered by Canonical's paid support services for Ubuntu. (And note that Ubuntu Certified templates have the default admin user set to 'ubuntu' and not root.)
In the following API calls, you will need to replace the 'networkids' with your own UUIDs and note that you need to put the Local Gateway network first in the list so that network becomes the default network for the Ubuntu operating system.
Frankfurt virtual machine:
deploy virtualmachine networkids=6202e7b7-38a5-40a0-8ca6-511e2a16159a,e8110a4a-1c38-49d6-b03d-9b22c7cda659 displayname=ubuntu1404-SaaS-test-FRA name=ubuntu1404-SaaS-test-FRA zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 keypair=keypair-saas templateid=cd136c46-4b74-4968-b59a-00e78c577d34 serviceofferingid=7e8da6c8-586b-4226-b12d-151861aba887
Paris virtual machine:
deploy virtualmachine networkids=1399bc16-01d6-47cf-a471-308ee763d413,4b32613c-e610-40fe-a281-99f6d84218b0 displayname=ubuntu1404-SaaS-test-PAR name=ubuntu1404-SaaS-test-PAR zoneid=374b937d-2051-4440-b02c-a314dd9cb27e keypair=keypair-saas templateid=cd136c46-4b74-4968-b59a-00e78c577d34 serviceofferingid=7e8da6c8-586b-4226-b12d-151861aba887
I have included the 'keypair' option so that I will be able to connect to the VMs using an SSH keypair, which saves the effort of using passwords and is much more secure. You will need to upload an SSH public key in advance using the registerSSHKeyPair API call.
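If you don't yet have a keypair registered, one can be generated locally with OpenSSH and then uploaded. This is a sketch: the key file name 'id_rsa_SaaSkey' is just an example, and the registerSSHKeyPair call is shown as a comment because it runs inside Cloudmonkey, not the shell:

```shell
# Generate a passphrase-less RSA keypair in the current directory
# (use a passphrase for anything beyond a test setup).
ssh-keygen -t rsa -b 2048 -N '' -q -f ./id_rsa_SaaSkey

# Then, inside Cloudmonkey, register the public half under the name
# used by the deploy commands above:
#   register sshkeypair name=keypair-saas publickey="$(cat ./id_rsa_SaaSkey.pub)"
ls ./id_rsa_SaaSkey ./id_rsa_SaaSkey.pub
```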
Now that the virtual machines are deployed, I need to create some port-forwarding rules so that the VMs will be accessible from the Internet for SSH connection and web access. First I need to look up the UUIDs of the public IP addresses, which need to be included in the port-forwarding rule.
list publicipaddresses filter=id,ipaddress,zonename
count = 2
publicipaddress:
+---------------+--------------------------------------+-----------------+
| ipaddress     | id                                   | zonename        |
+---------------+--------------------------------------+-----------------+
| 213.xx.xx.139 | 76839351-3323-4791-9dc5-34845bc5b24c | Frankfurt (ESX) |
| 213.xx.xx.221 | 14d081f4-593e-42d1-95a8-16aa7e8a0e62 | Paris (ESX)     |
+---------------+--------------------------------------+-----------------+
And the four rules:
create portforwardingrule privateport=22 publicport=52200 ipaddressid=76839351-3323-4791-9dc5-34845bc5b24c virtualmachineid=52c0faf0-c15d-495f-8314-e24f1bfb6134 enablefirewall=true protocol=tcp
create portforwardingrule privateport=80 publicport=80 ipaddressid=76839351-3323-4791-9dc5-34845bc5b24c virtualmachineid=52c0faf0-c15d-495f-8314-e24f1bfb6134 enablefirewall=true protocol=tcp
create portforwardingrule privateport=22 publicport=52200 ipaddressid=14d081f4-593e-42d1-95a8-16aa7e8a0e62 virtualmachineid=3a108d7b-3d0e-4d13-b0bd-51d52b80197a enablefirewall=true protocol=tcp
create portforwardingrule privateport=80 publicport=80 ipaddressid=14d081f4-593e-42d1-95a8-16aa7e8a0e62 virtualmachineid=3a108d7b-3d0e-4d13-b0bd-51d52b80197a enablefirewall=true protocol=tcp
After all of the above I have got the following configuration of networks and virtual machines:
― 'Gateway-Frankfurt' (Zone: Frankfurt (ESX), CIDR: 192.168.113.0/24, IP: 213.xxx.xxx.139)
  └ 192.168.113.40: 'ubuntu1404-SaaS-test-FRA' (ports: 52200 -> 22, 80 -> 80)
― 'Network Private Direct Connect Interoute Tutorial 1'|'privateDC-dcg35934-FRA' (Zone: Frankfurt (ESX), CIDR: 10.0.101.0/24)
  └ 10.0.101.2: 'ubuntu1404-SaaS-test-FRA'
― 'Gateway-Paris' (Zone: Paris (ESX), CIDR: 192.168.104.0/24, IP: 213.xxx.xxx.221)
  └ 192.168.104.230: 'ubuntu1404-SaaS-test-PAR' (ports: 52200 -> 22, 80 -> 80)
― 'Network Private Direct Connect Interoute Tutorial 2'|'privateDC-dcg35934-PAR' (Zone: Paris (ESX), CIDR: 10.0.105.0/24)
  └ 10.0.105.147: 'ubuntu1404-SaaS-test-PAR'
I use the 'ssh' command in Linux and macOS, or PuTTY in Windows:
ssh -p 52200 ubuntu@213.xxx.xxx.139 -i ~/.ssh/id_rsa_SaaSkey
~/.ssh/id_rsa_SaaSkey contains the private key for my SSH keypair. I have xxx'd out the public IP address for reasons of privacy.
One final configuration step is to set a routing rule on each VM so that the networks function correctly, by routing private network traffic through the private network gateway.
I login to the Frankfurt VM and input:
sudo ip route add 10.0.0.0/8 via 10.0.101.254
And login to the Paris VM and input:
sudo ip route add 10.0.0.0/8 via 10.0.105.254
You will need to set the gateway addresses (above, 10.0.101.254) according to the actual addresses that you specified when you setup the networks.
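Note that a route added with 'ip route add' is lost at reboot. A sketch of how to persist it on Ubuntu 14.04 follows; it assumes the private NIC is eth1 (check with 'ip addr' which interface carries the 10.0.x.x address), and it writes to a temporary file here rather than the real '/etc/network/interfaces':

```shell
# Build the 'post-up' line that would go under the private interface's
# stanza in /etc/network/interfaces, so the route is re-added whenever
# the interface comes up.
cat > /tmp/private-route.cfg <<'EOF'
post-up ip route add 10.0.0.0/8 via 10.0.101.254 dev eth1
EOF
cat /tmp/private-route.cfg
```

On the VM itself you would append this line to the eth1 stanza in /etc/network/interfaces (with the Paris gateway, 10.0.105.254, on the Paris VM).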
I am going to do a latency test with the ping program, which is pre-installed in Ubuntu. You can see from the sample output below that the latency is just over 9 milliseconds, and this performance is very reliable.
ping 10.0.101.2
PING 10.0.101.2 (10.0.101.2) 56(84) bytes of data.
64 bytes from 10.0.101.2: icmp_seq=1 ttl=62 time=9.09 ms
64 bytes from 10.0.101.2: icmp_seq=2 ttl=62 time=9.09 ms
64 bytes from 10.0.101.2: icmp_seq=3 ttl=62 time=9.10 ms
64 bytes from 10.0.101.2: icmp_seq=4 ttl=62 time=9.16 ms
64 bytes from 10.0.101.2: icmp_seq=5 ttl=62 time=9.09 ms
64 bytes from 10.0.101.2: icmp_seq=6 ttl=62 time=9.15 ms
64 bytes from 10.0.101.2: icmp_seq=7 ttl=62 time=9.11 ms
64 bytes from 10.0.101.2: icmp_seq=8 ttl=62 time=9.08 ms
64 bytes from 10.0.101.2: icmp_seq=9 ttl=62 time=9.16 ms
64 bytes from 10.0.101.2: icmp_seq=10 ttl=62 time=9.08 ms
Next, a TCP throughput test with the 'iperf' program, which you need to install on both virtual machines:
sudo apt-get install iperf
Then on one of the VMs I type the following to run an iperf server process:
iperf -s
And on the other VM:
iperf -c 10.0.105.147 -P 10
------------------------------------------------------------
Client connecting to 10.0.105.147, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 11] local 10.0.101.2 port 50996 connected with 10.0.105.147 port 5001
[  3] local 10.0.101.2 port 50988 connected with 10.0.105.147 port 5001
[  4] local 10.0.101.2 port 50989 connected with 10.0.105.147 port 5001
[  5] local 10.0.101.2 port 50990 connected with 10.0.105.147 port 5001
[  6] local 10.0.101.2 port 50991 connected with 10.0.105.147 port 5001
[  8] local 10.0.101.2 port 50993 connected with 10.0.105.147 port 5001
[  9] local 10.0.101.2 port 50994 connected with 10.0.105.147 port 5001
[  7] local 10.0.101.2 port 50992 connected with 10.0.105.147 port 5001
[ 10] local 10.0.101.2 port 50995 connected with 10.0.105.147 port 5001
[ 12] local 10.0.101.2 port 50997 connected with 10.0.105.147 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3] 0.0-10.0 sec   224 MBytes   188 Mbits/sec
[  7] 0.0-10.0 sec   100 MBytes  83.8 Mbits/sec
[ 11] 0.0-10.0 sec   547 MBytes   457 Mbits/sec
[  4] 0.0-10.0 sec   202 MBytes   168 Mbits/sec
[ 10] 0.0-10.0 sec   494 MBytes   413 Mbits/sec
[ 12] 0.0-10.0 sec   549 MBytes   459 Mbits/sec
[  5] 0.0-10.1 sec   218 MBytes   182 Mbits/sec
[  6] 0.0-10.0 sec   174 MBytes   145 Mbits/sec
[  9] 0.0-10.1 sec   272 MBytes   227 Mbits/sec
[  8] 0.0-10.1 sec   229 MBytes   191 Mbits/sec
[SUM] 0.0-10.1 sec  2.94 GBytes  2.51 Gbits/sec
The specification for throughput performance on Interoute VDC private networks between zones is 3 Gbits/sec, and the measurements which I made with iperf ranged from about 2.5 to 3.5 Gbits/sec. TCP throughput declines as network latency increases because TCP communication is two-way: the recipient machine must acknowledge the arrival of every data packet back to the sender machine (and packets must be re-sent in case of data loss), a process which takes longer as the latency/distance grows. So if you measure with iperf between widely-separated zones, the throughput will be lower than measured in this demonstration.
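The effect of latency on a single TCP stream can be estimated from the bandwidth-delay product: throughput is limited to roughly the window size divided by the round-trip time. A quick back-of-envelope check with the figures above (85 KByte default window, 9 ms RTT):

```shell
# Single-stream TCP throughput ceiling ~= window_bytes * 8 / RTT.
# Dividing bits-per-RTT by the RTT in milliseconds gives kbit/s directly,
# which keeps the whole calculation in shell integer arithmetic.
WINDOW_BYTES=$((85 * 1024))   # iperf's default 85 KByte window
RTT_MS=9                      # measured Frankfurt-Paris round trip
THROUGHPUT_KBITS=$(( WINDOW_BYTES * 8 / RTT_MS ))
echo "${THROUGHPUT_KBITS} kbit/s per stream"   # about 77 Mbit/s
```

This is why the client command uses '-P 10' to run parallel streams; note too that Linux autotunes the window well above the default on a long-lived connection, so individual streams can exceed this estimate.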
In a SaaS setup, data is by far the most fragile element, and you will be doing a range of things to keep your data safe and accessible. Besides what is done at the operating system level, the main way to protect data is to create 'datadisks' in VDC. These disks are based on EBS (elastic block storage) and can be configured with a user-specified size and (for a higher unit cost) can be automatically backed up ('EBS Protected' or 'EBS Mirrored').
I need to create a datadisk in each of the zones that I am using, then attach the disk to the virtual machine. The new disk will have to be partitioned, formatted and mounted in order to be usable by the virtual machine.
Here is the Cloudmonkey API command to create a new datadisk of size 200 GBytes in the Frankfurt zone:
create volume name=datadisk-mariadb-FRA size=200 zoneid=7144b207-e97e-4e4a-b15d-64a30711e0e7 diskofferingid=e3752c57-6742-4e45-b259-c6b225f155a7
And attach it to the virtual machine:
attach volume id=14ff2aac-95d0-416a-acc1-5076167f4fbe virtualmachineid=5c8fe455-ff67-41a8-a002-015fdcfff527
You will need to substitute the UUIDs for your own volumes and VMs.
I am using the 'diskofferingtype' for 'EBS Protected' which features automated snapshot backup of the datadisk, with the snapshots stored in the same zone; 'EBS Mirrored' offers additional snapshot backup into another geographically-separated zone, but this is less necessary here as I am already using two zones for the databases.
The steps to partition, format and mount the disk are as follows.
sudo fdisk /dev/sdb
sudo mkfs -t ext4 /dev/sdb1
sudo mkdir /media/datadisk-mariadb
sudo mount /dev/sdb1 /media/datadisk-mariadb
(see Ubuntu: Installing a New Hard Drive for more details.)
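Like the routing rule earlier, this mount does not survive a reboot by itself. Here is a sketch of the '/etc/fstab' entry that would make it permanent, using the device path from the steps above (a filesystem UUID, obtained from 'sudo blkid /dev/sdb1', is more robust if you later attach further disks):

```shell
# Compose the fstab entry for the datadisk: device, mount point,
# filesystem type, mount options, dump flag, fsck pass order.
FSTAB_LINE="/dev/sdb1  /media/datadisk-mariadb  ext4  defaults  0  2"
echo "$FSTAB_LINE"

# On the VM you would then run:
#   echo "$FSTAB_LINE" | sudo tee -a /etc/fstab
#   sudo mount -a    # confirms the entry mounts without errors
```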
Below, I will need to configure MariaDB to use the new folder '/media/datadisk-mariadb' rather than the default location in the main OS filesystem.
MariaDB is an open source database project founded by the original creators of MySQL, and MariaDB programs are drop-in compatible with MySQL programs. But MariaDB has some useful extensions; the particular advantage here is that the Galera software for clustering is installed and ready to use.
To get the latest version of MariaDB (10.1.x), with Galera already integrated, it is necessary to add a software repository to Ubuntu's default list. To find the necessary parameters for each type/version of Linux, I used this webpage: https://downloads.mariadb.org/mariadb/repositories. From that webpage I found the following three commands:
sudo apt-get install software-properties-common
sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xcbcb082a1bb943db
sudo add-apt-repository 'deb [arch=amd64,i386] http://mirrors.coreix.net/mariadb/repo/10.1/ubuntu trusty main'
Then, refresh the Ubuntu repositories, and install MariaDB:
sudo apt-get update
sudo apt-get install mariadb-server
The next step is to change the 'data directory' to be located in the additional datadisk. Copy the 'mysql' folder from the default location to the new location:
sudo cp -R -p /var/lib/mysql /media/datadisk-mariadb
Next, copy the configuration file 'my.cnf' to '/etc', and move the original out of the way so that only the new copy at '/etc/my.cnf' is read:
sudo cp /etc/mysql/my.cnf /etc
sudo mv /etc/mysql/my.cnf /etc/mysql/my.cnf.old
In '/etc/my.cnf', change the value of 'datadir' to point at the copy on the datadisk:
datadir = /media/datadisk-mariadb/mysql
Now restart the database server program to update to the new configuration:
sudo /etc/init.d/mysql restart
The next part is to configure and activate the Galera clustering system, which will synchronise the two database servers automatically.
The minimum recommended cluster size for Galera is three database instances. For demonstration purposes I am only using two instances here.
On the Paris VM, I edited the MariaDB config file '/etc/my.cnf' so that the '[galera]' section looks like the following:
[galera]
wsrep_cluster_name=VDCCluster
wsrep_cluster_address="gcomm://10.0.101.2,10.0.105.147"
wsrep_node_name=VDCNode1
wsrep_node_address="10.0.105.147"
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_on=ON
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
# Allow server to accept connections on all interfaces.
bind-address=0.0.0.0
And on the Frankfurt VM:
[galera]
wsrep_cluster_name=VDCCluster
wsrep_cluster_address="gcomm://10.0.101.2,10.0.105.147"
wsrep_node_name=VDCNode2
wsrep_node_address="10.0.101.2"
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_on=ON
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
# Allow server to accept connections on all interfaces.
bind-address=0.0.0.0
Now, in the Paris VM, restart the database server with the new configuration, and initiate the cluster, based on the information given in the '/etc/my.cnf' file:
sudo /etc/init.d/mysql stop
sudo /etc/init.d/mysql start --wsrep-new-cluster
You only need to initiate the cluster one time, in one of the VMs. The other VMs just need to have the database server re-started, and the Galera process on each VM should automatically discover the other members of the cluster:
sudo /etc/init.d/mysql restart
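To check that the two nodes have actually joined, Galera reports its state through MariaDB status variables; 'wsrep_cluster_size' should be 2 once both servers are running. The mysql invocation below is left as a comment because it needs your database credentials; the helper function is just a sketch of how a monitoring script might test the reported value:

```shell
# On either VM, query Galera's view of the cluster:
#   mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
# A healthy two-node cluster reports the value 2.

# A monitoring script could wrap the check like this (the argument is
# the value extracted from the query above):
check_cluster_size() {
    if [ "$1" -ge 2 ]; then
        echo "cluster OK ($1 nodes)"
    else
        echo "cluster DEGRADED ($1 node)"
    fi
}
check_cluster_size 2
```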
One last thing to do is to copy the contents of the file '/etc/mysql/debian.cnf' from the Paris VM to the Frankfurt VM. If you don't do this the mysql program will throw up errors like this: "ERROR 1045 (28000): Access denied for user 'debian-sys-maint'@'localhost'". (What happens is that the user password for 'debian-sys-maint' is stored inside the database, and in a config file; when the databases connect and synchronise, a password mis-match is created.)
There are two more pieces of the LAMP stack to install—on both VMs, though I will only show the commands for one. First, the Apache web server:
sudo apt-get install apache2
If this installs correctly, and the port-forwarding rules on the VDC Local Gateway network are correct, then I can test the web server by pointing a web browser at 'http://213.xxx.xxx.139' and I should see the default Apache welcome screen. If this fails to work, try running this command on the VM itself:
wget -O - localhost
If you don't see a string of HTML as output, the webserver has a problem; if the HTML is good but the browser test fails, then the problem must lie in the VDC network configuration.
Second, a bunch of PHP programs and libraries need to be installed. Following are the standard set for Ubuntu; you may not need all of these for specific web applications, or you might need to install additional ones.
sudo apt-get install php5 php5-mysql php5-cli php5-imagick php5-gd php5-mcrypt php5-curl
Piwigo is an open source photo-sharing application. I had some errors using it, which seemed to be due to incompatible versions of the various PHP libraries, or possibly wrong PHP configurations (either of which is usually time-consuming to solve). Anyway, it worked well enough to demonstrate two instances of the Piwigo program synchronising via their databases.
Here is what I did:
- Download the Piwigo program sources into both VMs, in the default Apache web folder '/var/www/html'.
- Use 'mysql' in each VM to create a new, empty database with the name 'piwigo'.
- Point a web browser at 'http://213.xxx.xxx.139/piwigo', which initiates the Piwigo install process. I filled in the configuration options as in the following screenshot.
- Do the same thing for the other virtual machine: 'http://213.xxx.xxx.221/piwigo'.
- Login to both instances of Piwigo.
- Create users, albums, upload photos, etc.