
Sandbox environment

The tutorial was written specifically for an Ubuntu 12 server on Amazon EC2, but it remains valid, with some minor changes, for a CentOS server. The major differences are the login method and the name/path of the Apache configuration files.

We are going to use the command line for:

  • SSH connection to an Amazon EC2 server

  • Install a brand new WordPress site

  • Run a load test

  • Monitor server performance

  • Troubleshooting

We’ll see the basic Linux shell commands for:

  • basic file management

  • create/extract archives

  • web browsing

  • text file editing

  • mysql management

  • Apache virtual host configuration

  • load testing

  • monitoring server resources

  • shell scripting

  • process management

 

How to login via ssh

I assume you have already downloaded the server access key locally. Change the permissions on the pem file so that only the owner has read-write permission. This avoids a permission error under Linux, where ssh won’t accept a key that the group or others can access.

To change the file permission:

# cd /path/to/key/
# chmod 600 zone-sandbox.pem

Where 600 means: read-write for owner, nothing for group, nothing for others.

Remember 6 = 4 (read) + 2 (write). Adding 1 (execute) would give 7, which a key file does not need.
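
As a quick check of the octal arithmetic, the sketch below decodes a three-digit mode into the rwx notation printed by ls -l. The helper names decode_digit and decode_mode are made up for this illustration; they are not standard commands.

```shell
# decode_digit: turn one octal digit (0-7) into its rwx form,
# testing the 4 (read), 2 (write) and 1 (execute) bits in turn.
decode_digit() {
    out=""
    if [ $(( $1 & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $(( $1 & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $(( $1 & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s' "$out"
}

# decode_mode: decode a three-digit mode (owner, group, others).
decode_mode() {
    owner=$(printf '%s' "$1" | cut -c1)
    group=$(printf '%s' "$1" | cut -c2)
    others=$(printf '%s' "$1" | cut -c3)
    printf '%s%s%s\n' "$(decode_digit "$owner")" "$(decode_digit "$group")" "$(decode_digit "$others")"
}

decode_mode 700   # prints rwx------
decode_mode 600   # prints rw-------
```

Either mode leaves the group and others columns empty, which is what ssh checks for on a private key.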

To connect:

# ssh -i /path/to/key/<your key>.pem ubuntu@<your server ip>

Amazon does not allow direct login as the root user, so we log in as the ubuntu user and then become root:

# sudo su

With PuTTY you can bookmark the server connection data.

The pem key first needs to be converted to the PuTTY ppk format, using the puttygen utility.

Under Windows puttygen has a GUI, while in Linux you just need to run:

# puttygen zone-sandbox.pem -o zone-sandbox.ppk

In case putty is not installed on your machine:

# sudo apt-get install putty

For Windows, see:

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html
 

Website setup

For this demo we are going to install a vanilla WP website connected to your instance’s public domain.

To install from scratch, we have to:

  • download WP package

  • extract to a new folder

  • create a new database

  • setup WP

  • create a new Apache virtual host

For this tutorial, we use the server instance public DNS, for example:

ec2-54-246-110-32.eu-west-1.compute.amazonaws.com

 

Install WP

First, we check the available server resources:

# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/xvda1       8256952 1633084   6204440  21% /

We still have 6G free on the drive.

# free
           total       used       free         shared    buffers          cached
Mem:        604376     488232     116144          0      37028         297500
-/+ buffers/cache:   153704     450672

We have about 600M of RAM, of which about 450M is still available once buffers and cache are discounted.
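
The roughly 450M figure is the free column plus the memory held by buffers and cache, which the kernel can give back when applications need it. A minimal sketch of that sum, using the sample line above as input (the column positions assume the older free layout shown here; newer versions of free print different columns):

```shell
# Sample "Mem:" line copied from the free output above (values in KB).
mem_line='Mem:        604376     488232     116144          0      37028         297500'

# Columns: label, total, used, free, shared, buffers, cached.
available_kb=$(printf '%s\n' "$mem_line" | awk '{ print $4 + $6 + $7 }')
echo "free + buffers + cached = ${available_kb} KB"
```

The result matches the free value on the "-/+ buffers/cache" line of the same output.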

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5430  @ 2.66GHz

We have only 1 CPU.

# uname -a
70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Let’s check the OS release:

# cat /etc/*release*
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS"
…

We have Ubuntu 12.04 on a 64-bit (x86_64) architecture.

Let’s create the website folder.

The default apache document root is under /var/www.

# cd /var/www

List the documents:

# ll (2 lower case L)

We need to download the latest version of WordPress.

To download a file from the web, we use the wget command.

# wget http://wordpress.org/latest.tar.gz

 

Note

To check if a page exists, without downloading it:

# wget --spider http://wordpress.org/latest.tar.gz

To request a page without saving it:

# wget http://wordpress.org/latest.tar.gz -O /dev/null

wget offers a huge number of options:

# wget --help

To call and download web pages, an alternative is the curl command.

Curl offers even more options and protocols.
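
For comparison, here is a rough curl equivalent of the wget calls above. The function names are invented for this sketch; the curl options used are -s (silent), -f (fail on HTTP errors), -L (follow redirects), -I (headers only), -o (output file) and -w (print a value after the transfer).

```shell
# page_exists: ask for the headers only and fail on HTTP errors.
page_exists() {
    curl -s -f -L -I -o /dev/null "$1"
}

# fetch_discard: download the body but throw it away.
fetch_discard() {
    curl -s -f -L -o /dev/null "$1"
}

# http_status: print only the final HTTP status code.
http_status() {
    curl -s -L -o /dev/null -w '%{http_code}\n' "$1"
}

# Usage on the sandbox (needs network access):
#   page_exists http://wordpress.org/latest.tar.gz && echo "found"
#   http_status http://wordpress.org/latest.tar.gz
```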

 

Now we have the compressed tar archive with the WP code, which we have to extract into a new directory.

To extract a tar.gz file:

# tar -xzf latest.tar.gz

In this case the archive will automatically extract in the “wordpress” folder.
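
To see what an archive contains before extracting it, or to extract it somewhere else, tar also offers the t (list) flag and the -C option. The sketch below uses a throwaway directory and a made-up demo-site folder, so it can be tried without touching /var/www:

```shell
# Work in a throwaway directory so nothing on the server is touched.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1

# Build a tiny stand-in archive (the folder and file names are made up).
mkdir demo-site
echo "hello" > demo-site/index.html
tar -czf demo-site.tar.gz demo-site     # c = create, z = gzip, f = archive file

# List the contents without extracting (t = list).
archive_listing=$(tar -tzf demo-site.tar.gz)
echo "$archive_listing"

# Extract into a different folder with -C (the folder must already exist).
mkdir extracted
tar -xzf demo-site.tar.gz -C extracted
cat extracted/demo-site/index.html
```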

 

Mysql setup

The next step is to create a new mysql database for WP.

Mysql offers several commands/programs to interact with the central database server.

The most commonly used are mysql (administration) and mysqldump (backup).

On the sandbox server, access the database as root, with the password (empty in this example) you have set during the setup:

# mysql -u root

Once logged in, you’ll see the mysql prompt. The mysql shell will execute any SQL command you type, terminated by ; and ENTER.

Some useful commands:

Show all the databases

> show databases;

Use (select) a database

> use <dbname>;

 

Now, create a new database for the site:

> create database wordpress;

Create a new user ‘wordpress’ for the wordpress database:

> CREATE USER 'wordpress'@'localhost' IDENTIFIED BY 'password';
> GRANT ALL PRIVILEGES ON wordpress.* TO 'wordpress'@'localhost';
> FLUSH PRIVILEGES;

To leave the mysql shell:

> quit;
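
The same statements can also be sent to mysql without opening the interactive shell. The sketch below only prints the SQL, so it can be reviewed first; wp_db_sql is a hypothetical helper name, and it does no escaping, so it assumes names and passwords without quotes or special characters.

```shell
# wp_db_sql: print the SQL that creates a database and a user with full
# rights on it. Arguments: database name, user name, password.
wp_db_sql() {
    cat <<EOF
CREATE DATABASE $1;
CREATE USER '$2'@'localhost' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Preview the statements:
wp_db_sql wordpress wordpress password

# Then, on the sandbox server, pipe them to mysql:
#   wp_db_sql wordpress wordpress password | mysql -u root
```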

 

Setting wp configuration file

On a brand new WP installation, you have to copy wp-config-sample.php to wp-config.php. Run the commands below from inside the wordpress folder (/var/www/wordpress).

# cp wp-config-sample.php wp-config.php

Edit the config file:

# nano wp-config.php

Move the cursor using the arrow keys. Once you have edited the database name, user name and password, save the changes and exit:

CTRL-O

CTRL-X
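
As an alternative to editing by hand, the three values can be replaced with sed. This is only a sketch: it runs on a temporary stand-in file holding the placeholder values that wp-config-sample.php ships with (the exact spacing of the define lines may differ between WordPress versions), and it assumes values without slashes or other characters special to sed.

```shell
# Stand-in for wp-config-sample.php: only the three database lines.
sample_config=$(mktemp)
cat > "$sample_config" <<'EOF'
define( 'DB_NAME', 'database_name_here' );
define( 'DB_USER', 'username_here' );
define( 'DB_PASSWORD', 'password_here' );
EOF

# Replace each placeholder and write the result to a new file.
new_config=$(mktemp)
sed -e "s/database_name_here/wordpress/" \
    -e "s/username_here/wordpress/" \
    -e "s/password_here/password/" \
    "$sample_config" > "$new_config"

cat "$new_config"
```

On the server, the input would be wp-config-sample.php and the output wp-config.php.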

 

Apache virtual host

The virtual host configuration connects a domain (in this case your public instance domain name) to a web folder on the server (and so to a PHP application).

In Ubuntu, the virtual hosts should be kept in separate files (best practice) under the /etc/apache2/sites-available folder.

# cd /etc/apache2/sites-available

In the file list, you will find the default file, which contains the configuration used when no virtual host matches the requested domain.

Let’s create a new one for our site, using the existing one as a template:

# cp default wordpress

# nano wordpress

ServerName is the domain the vhost matches; DocumentRoot is the folder the vhost points to. Edit the file to match the new wordpress root folder and save.

<VirtualHost *:80>
      ServerName <public domain name>
      DocumentRoot /var/www/wordpress
      <Directory /var/www/wordpress>
              Options Indexes FollowSymLinks MultiViews
              AllowOverride All
              Order allow,deny
              allow from all
      </Directory>
      ErrorLog ${APACHE_LOG_DIR}/wordpress-error.log
      LogLevel warn
      CustomLog ${APACHE_LOG_DIR}/wordpress-access.log combined
</VirtualHost>

 

Now that we have a new vhost file in the available sites folder, we need to enable it:

# a2ensite wordpress

To disable a vhost file

# a2dissite <vhost file>

You can double check that everything is fine by listing the files in the sites-enabled folder:

# ls /etc/apache2/sites-enabled

There you’ll find a “symlink” (symbolic link) to the file you edited in the sites-available folder: it is not a real copy, but a simple pointer to the original file.

After any change to the Apache configuration or virtual hosts, you need to restart the Apache service so that it reads the new configuration.

The reload and graceful commands do the same thing: they restart the Apache service once the current requests are completed. The restart command stops the service abruptly and starts it again (a browser in the middle of a request will get an incomplete page or an error).

So, if the server is live, graceful is always the preferred method; otherwise use restart:

# apachectl graceful

or

# apachectl restart

or

# service apache2 restart

The last command is useful to stop/start/restart any service or process that supports these commands, e.g. mysql.
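
Before reloading a live server it is worth checking the configuration syntax with apachectl configtest, so that a typo in a vhost file does not take the site down. A minimal sketch (safe_reload is a made-up helper name; the optional argument exists only so the control command can be swapped):

```shell
# safe_reload: run the syntax check first and reload only if it passes.
# The optional argument is the control command (default: apachectl).
safe_reload() {
    ctl=${1:-apachectl}
    if "$ctl" configtest; then
        "$ctl" graceful
    else
        echo "Configuration error: Apache was not reloaded." >&2
        return 1
    fi
}

# Usage on the server:
#   safe_reload
```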

If everything is fine, opening the new domain will load the WordPress setup page:

http://<public dns>

You don’t necessarily need to use a GUI browser on your local machine to complete the setup, or browse the internet. Linux offers “Lynx”, a refreshingly “minimalistic” text-based browser. It doesn’t support javascript, and so it can be used for accessibility tests.

# lynx <public dns>

 

Website administration

 

Database backup

Once the setup is complete, let’s back up the database:

# mysqldump -u root --add-drop-table wordpress > wordpress.backup.sql

Mysqldump exports the database content in SQL format, as a sequence of sql commands.

To import the database again, we run those commands again, against a specific database:

# mysql -u root wordpress < wordpress.backup.sql

Note that the mysql command can be used to run any sequence of SQL commands read from a text file.

The backup can be quite a large file, so it is better to compress it:

# tar -czf wordpress.backup.tar.gz wordpress.backup.sql
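
The dump can also be compressed on the fly by piping it through gzip, which avoids writing the uncompressed file at all. The sketch below shows the pipe with a one-line stand-in for the mysqldump output; backup_name and the dated file name pattern are assumptions made for this example.

```shell
# backup_name: build a dated file name such as wordpress.2013-06-11.sql.gz
backup_name() {
    printf '%s.%s.sql.gz\n' "$1" "$(date +%Y-%m-%d)"
}

# Demonstrate the pipe with a stand-in for the mysqldump output.
backup_dir=$(mktemp -d)
backup_file="$backup_dir/$(backup_name wordpress)"
printf 'DROP TABLE IF EXISTS demo;\n' | gzip > "$backup_file"

# gunzip -c writes the decompressed SQL to standard output.
restored_sql=$(gunzip -c "$backup_file")
echo "$restored_sql"

# On the server the same pipes would read:
#   mysqldump -u root --add-drop-table wordpress | gzip > "$(backup_name wordpress)"
#   gunzip -c wordpress.<date>.sql.gz | mysql -u root wordpress
```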

 

Load test

For load testing, we use a simple tool that comes with the Apache package itself, a program named “ab”.

The load test should not be run from the same server the website is running on. This would alter the results: it would not take the network limits into account, and the ab process itself, while running, would slow down the other processes, including Apache.

Similarly, it should not be run from the local PC. Limits in concurrency, bandwidth and resources of the local machine will significantly alter the results. A meaningful load test should run from a different web server, on a different network.

Ideally, the load test should run from several servers in different networks, with many different IPs (cloud load test). This gives the best real-life simulation and results, but it is also the most expensive method.

# ab -n 100 -c 2 http://<public dns>/

This will send 100 requests, with a concurrency of 2 (2 simulated users browsing at the same time), to the site home page (please note the trailing slash), and download the page html only, producing some performance statistics. Note that ab will not download any of the page resources (images, videos, etc.) and will not run any javascript. It is useful only to get a basic performance baseline and profiling/debugging the code execution and the server setup.

The above command will return something like:

 

Server Software:        Apache/2.2.22
Server Hostname:        <public dns>
Server Port:            80

Document Path:          /
Document Length:        6884 bytes

Concurrency Level:      2
Time taken for tests:   47.248 seconds
Complete requests:      100
Failed requests:        0
Write errors:           0
Total transferred:      716600 bytes
HTML transferred:       688400 bytes
Requests per second:    2.12 [#/sec] (mean)
Time per request:       944.953 [ms] (mean)
Time per request:       472.477 [ms] (mean, across all concurrent requests)
Transfer rate:          14.81 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    1   0.6      1       5
Processing:   128  943 986.6    251    3659
Waiting:      127  938 983.5    250    3658
Total:        129  944 986.7    252    3660

Percentage of the requests served within a certain time (ms)
  50%    252
  66%   1170
  75%   1865
  80%   2032
  90%   2391
  95%   2820
  98%   3506
  99%   3660
 100%   3660 (longest request)


 

Requests per second: this is the best performance we can expect from the site. In a real page download, with all the images etc., this metric can only get worse, on average.

Time per request (mean, across all concurrent requests): the total test time divided by the number of requests, i.e. how often the server completes a page. The first “Time per request” figure is this value multiplied by the concurrency, and is the average time a single simulated user waits for a page.

95th percentile: 95% of the requests were completed within this time (2820 ms in this run).
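
The headline figures can be re-derived from the raw numbers in the sample report, which is a useful sanity check when comparing runs. A small sketch (ab_summary is a made-up helper; the results are rounded, so they can differ from the report in the last digit):

```shell
# Raw numbers from the sample report above.
requests=100
concurrency=2
total_seconds=47.248

# ab_summary: derive the three headline figures from the raw numbers.
ab_summary() {
    awk -v n="$1" -v c="$2" -v t="$3" 'BEGIN {
        printf "requests per second: %.2f\n", n / t
        printf "time per request, across all requests (ms): %.0f\n", t * 1000 / n
        printf "time per request, per simulated user (ms): %.0f\n", c * t * 1000 / n
    }'
}

ab_summary "$requests" "$concurrency" "$total_seconds"
```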

 

Server monitoring

Back to our test, we also need to check how the server is performing under load. The main tool to monitor the server, in real time, is the “top” command.

# top

top - 10:06:21 up 21:32,  1 user,  load average: 1.57, 0.60, 0.25
Tasks:  86 total,   3 running,  83 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.8%us,  0.4%sy,  0.0%ni, 92.9%id,  0.3%wa,  0.0%hi,  0.0%si,  4.5%st
Mem:    604376k total,   532076k used,    72300k free,    32088k buffers
Swap:        0k total,        0k used,        0k free,   238780k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
17000 www-data  20   0  359m  66m  31m R 58.5 11.3   1:53.91 apache2
17001 www-data  20   0  354m  61m  32m R 39.6 10.4   1:53.84 apache2
1 root      20   0 24456 1492  500 S  0.0  0.2   0:00.58 init
…

 

Load average

In short, it gives an indication of how much the server is “struggling” to serve the requests and run its processes. The first number is the load average over the last minute, then over the last 5 minutes and the last 15 minutes. As a rule of thumb, a load of 1 means you need all the resources of one machine like the current one to serve all the requests without delays; in other words, the server is working at full capacity. A load bigger than 1 means that the server is having trouble keeping up with the work. On a multi-core architecture, the load should be divided by the number of CPUs to give meaningful results. So, if you see that the load (on a single-core server) is constantly above 1, we have a problem. If the load is above 5-10, there is a risk that the machine will hang and the server crash, most likely because we are running out of RAM and excessive swapping will bring the server to a halt.
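
The division by the number of CPUs can be sketched as follows. load_per_cpu is a hypothetical helper, and the second call uses invented numbers for a 4-CPU server; the last part reads the live values and only runs where /proc/loadavg and nproc are available.

```shell
# load_per_cpu: divide a load average by the number of CPUs.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

load_per_cpu 1.57 1    # the 1-minute load from the sample top output, 1 CPU
load_per_cpu 3.00 4    # an invented 4-CPU server with a load of 3

# On a live Linux server the inputs can be read directly:
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load_1min rest < /proc/loadavg
    load_per_cpu "$load_1min" "$(nproc)"
fi
```

A result that stays above 1.00 means the machine has more work queued than its CPUs can handle.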

Reference:

http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages

 

CPU

The CPU is one of the most likely limiting factors for PHP based websites. Check the id (idle CPU) value to see how much “computing power” is left for the site to operate. On a virtual server, like Amazon, another important value is st (CPU stolen): it tells how much of the CPU time is taken back by the virtual machine hypervisor and given to other machines, and so is not usable by the server. A constantly high st percentage means that the instance does not have enough CPU power to sustain the load, and that either a more powerful and expensive instance is needed or the load must be reduced, e.g. by caching.
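
To keep an eye on id and st from a script rather than by watching the screen, the values can be cut out of the Cpu(s) line. This sketch parses the sample line shown earlier; cpu_field is a made-up helper, and the pattern assumes this exact layout (other versions of top format the line differently).

```shell
# Sample line copied from the top output above.
# On a server, top -b -n 1 prints one snapshot in batch mode.
cpu_line='Cpu(s):  1.8%us,  0.4%sy,  0.0%ni, 92.9%id,  0.3%wa,  0.0%hi,  0.0%si,  4.5%st'

# cpu_field: print the percentage attached to one label (us, sy, id, st, ...).
cpu_field() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/.*[ :]\([0-9.]*\)%$2\$/\1/p"
}

echo "idle:   $(cpu_field "$cpu_line" id)%"
echo "stolen: $(cpu_field "$cpu_line" st)%"
```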

Process list

In this case it is clear that apache processes are using all the available CPU, and that causes the load to increase.

Memory

In this case, there is still a little memory left (70M + 30M in buffers) and no swapping.

 

Now, we need to run the load test and at the same time check the server load.

There are many ways to run concurrent processes in Linux. One way is to run a process in the background, using the & with the command.

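# ab -n 10000 -c 2 http://<public dns>/ > ab.log 2>&1 &

This will start the load test in the background and give the prompt back immediately. The output is redirected to a file (ab.log is just an example name) so that it does not clutter the terminal, and we can run the top command: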

# top

Now we can see in real time the effect of our test and how the server responds. You’ll see that the load increases, since an Amazon micro instance is not meant to sustain a constant load, but occasional short bursts of activity.

To stop the load test ab process in the background, we need to know the process ID (pid).

# ps aux | grep ab

Read the pid and then kill the process:

# kill <pid>
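
When the background job is started from the same shell, the pid does not have to be looked up: the shell keeps it in the $! variable. A minimal sketch, with sleep standing in for the long-running ab command:

```shell
# Stand-in for the long-running ab command.
sleep 60 &
job_pid=$!            # $! holds the PID of the most recent background job
echo "started background job with PID $job_pid"

# Later: stop the job and wait until the shell has cleaned it up.
kill "$job_pid"
wait "$job_pid" 2>/dev/null || true
echo "job $job_pid stopped"
```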

If you check top now, you’ll see that load will slowly decrease and CPU is back to idle.

 

Troubleshooting

If the server is slowing down and is almost unresponsive, it is likely out of memory and swapping, so you need to log in and stop the Apache processes.

The first thing to do is run top and check that Apache is actually overloading the server. If so, the quickest fix is to restart it (not gracefully, or it might not restart in time):

# apachectl restart

This will cut all queued connections, restart Apache and release the memory. But, if the traffic is still too high for the server, we’ll soon have the same problem. Use top to monitor the server status, ready to stop Apache until the traffic has decreased.

# apachectl stop
# apachectl start

In case the apache process is unresponsive, and the apachectl command doesn’t work, you need to kill all apache processes:

# killall apache2

 

Apache log

Apache saves error and traffic logs, very useful for troubleshooting and debugging.

They are saved in the /var/log/apache2 folder:

# ll /var/log/apache2
wordpress-access.log
wordpress-error.log
...

The logs can become quite big (even gigabytes if rotation is not properly configured), and the latest entries are appended at the end of the file, one per line. We need a program that reads the log without the risk of filling the memory and crashing the server. Also, we just need the latest lines.

# tail -n 20 /var/log/apache2/wordpress-access.log

This will output the last 20 lines of the file, each describing a single request:

37.157.35.178 - - [11/Jun/2013:11:07:56 +0000] "GET /wp-includes/css/admin-bar.min.css?ver=3.5.1 HTTP/1.1" 304 209 "http://sandbox.zonedemos.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0"
37.157.35.178 - - [11/Jun/2013:11:07:56 +0000] "GET /wp-content/themes/twentyeleven/images/headers/willow.jpg HTTP/1.1" 304 187 "http://sandbox.zonedemos.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0"
...
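
Once the lines are on screen, awk can summarise them, for example by counting requests per status code or per client IP. The sketch below runs on the two sample lines above, saved to a temporary file; it assumes the combined log format, where the IP is field 1 and the status code is field 9.

```shell
# The two sample lines from above, saved to a temporary file.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
37.157.35.178 - - [11/Jun/2013:11:07:56 +0000] "GET /wp-includes/css/admin-bar.min.css?ver=3.5.1 HTTP/1.1" 304 209 "http://sandbox.zonedemos.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0"
37.157.35.178 - - [11/Jun/2013:11:07:56 +0000] "GET /wp-content/themes/twentyeleven/images/headers/willow.jpg HTTP/1.1" 304 187 "http://sandbox.zonedemos.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:21.0) Gecko/20100101 Firefox/21.0"
EOF

# Requests per HTTP status code (field 9).
status_counts=$(awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' "$sample_log")
echo "$status_counts"

# Requests per client IP (field 1).
ip_counts=$(awk '{ count[$1]++ } END { for (ip in count) print ip, count[ip] }' "$sample_log")
echo "$ip_counts"
```

On the server, the same awk commands can read from tail, so that only the most recent lines are counted.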

 

Cleanup

For the sake of the exercise, let’s assume that once we are done with our sandbox, it’s time to clean up: website, vhost and database.

We will create a shell script to automate this.

# nano cleanup.sh

Write all the commands in the file:

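#!/bin/bash
# Drop the WordPress database and its user
mysql -u root -e "DROP DATABASE wordpress; DROP USER 'wordpress'@'localhost';"
# Remove the site files
rm -r /var/www/wordpress
# Disable and delete the virtual host, then restart Apache
a2dissite wordpress
rm /etc/apache2/sites-available/wordpress
apachectl restart
echo "Cleanup done!"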

Save and exit:

CTRL-O

CTRL-X

To execute a script file, it must have execution permission:

# chmod 700 cleanup.sh

To execute a script that is in the current directory:

# ./cleanup.sh

A similar script could have been used to install wordpress in the first place. The beauty of the Linux shell.
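
As an illustration, an install script might start out like the sketch below. It is not a complete installer: it assumes the vhost file already exists in the sites-available folder, it leaves the wp-config.php values to be edited by hand, and the run helper and the DRY_RUN variable are inventions of this example. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/bash
# Sketch of an install script. DRY_RUN=1 (the default) only prints the
# commands; run with DRY_RUN=0 as root on the sandbox to execute them.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_wordpress() {
    run cd /var/www
    run wget http://wordpress.org/latest.tar.gz
    run tar -xzf latest.tar.gz
    run mysql -u root -e "CREATE DATABASE wordpress; CREATE USER 'wordpress'@'localhost' IDENTIFIED BY 'password'; GRANT ALL PRIVILEGES ON wordpress.* TO 'wordpress'@'localhost'; FLUSH PRIVILEGES;"
    run cp /var/www/wordpress/wp-config-sample.php /var/www/wordpress/wp-config.php
    run a2ensite wordpress
    run apachectl graceful
}

install_wordpress
```

Printing the commands first makes it easy to review what the script would do before letting it touch the server.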
