One of my current pet research ideas is benchmarking cloud providers. For one paper that I currently have in progress, I wanted to compare our benchmarking results with a standard Web benchmark. For this, I chose RUBiS. RUBiS is a quite well-established benchmark in the scientific community, with 300+ citations on Google Scholar. As Joseph of Arimathea would say: I chose poorly.
Very quickly after downloading the benchmark source code, it became evident that making this thing run in the year 2013 was not going to be a piece of cake:
- Hard-coded paths everywhere: check
- Very little useful documentation: check
- Quite obvious bugs in the source code: check
- Original target Java version: 1.3
- Original target PHP version: probably 4.X
- Original target MySQL version: pre-5.0
- Used build tool: make (not Maven, not even Ant. make)
Finally, the quality of the source code (both the PHP and the Java code) was worse than pretty much anything I would accept from my students as part of university assignments. All in all, you could tell that the benchmark originated from a research project and had not seen maintenance for a long, long time.
Anyway, I decided to give installing this on Amazon EC2 a try. I decided to focus on the PHP version (I did not try to make the EJB or servlet version run). I wanted to create the system-under-test in AWS and configure my local MacBook Pro running Mac OS X Mavericks to act as the client.
As a foundation, I created a Turnkey Linux 13.0 LAMP instance of type m1.small in AWS (Turnkey Linux 13.0 is based on Debian Wheezy). This gave me a Linux instance with Apache, PHP and MySQL pre-installed. I logged into this instance as root. Before doing anything else, I installed make, sysstat and Java 1.7:
apt-get install openjdk-7-jdk make sysstat
Now I downloaded the RUBiS source code to my AWS instance, from the web page linked above. I chose the most recent version of RUBiS. For the remaining steps, I discovered this tutorial. This covered the majority of issues I ran into (but not all of them).
Firstly, I populated the server database. I edited the files database/generate_categories.awk and database/generate_regions.awk. In these scripts, I updated the hashbang from /bin/awk to /usr/bin/awk. Then I could use these scripts to generate SQL dumps to populate the database. Within the database subdirectory, I ran:
Now it was time to actually create the database:
mysql -uroot -pmypass < rubis.sql
mysql -uroot -pmypass rubis < categories.sql
mysql -uroot -pmypass rubis < regions.sql
(of course, you will need to change the username / password in the above statements to whatever you have configured on your system)
Now I copied the PHP web application to my Apache server root:
cp -r PHP /var/www
I was now able to check out the general RUBiS web app by pointing my Web browser to http://<server_address>/PHP. However, this version of the PHP source has a number of problems. Firstly, the application uses the old (pre-PHP5) way of accessing POST and GET parameters, so I ran the following sed scripts to replace everything:
find . -type f -print0 | xargs -0 sed -i 's/HTTP_GET_VARS/_GET/g'
find . -type f -print0 | xargs -0 sed -i 's/HTTP_POST_VARS/_POST/g'
(executed from /var/www/PHP)
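Before running a replacement like this over the whole application, it can be reassuring to try it on a throwaway file first. The following is only a sketch: the scratch directory, the file name Sample.php and its one-line content are made up for illustration, and the search is narrowed to *.php files, which is my assumption rather than what I ran above. The -i.bak form keeps a backup of each changed file.

```shell
# Scratch directory with a made-up sample file (not part of RUBiS)
demo_dir=$(mktemp -d)
printf '%s\n' '<?php $id = $HTTP_GET_VARS["itemId"]; $n = $HTTP_POST_VARS["name"]; ?>' > "$demo_dir/Sample.php"

# Apply both substitutions in one pass; single quotes stop the shell
# from interpreting the sed expressions
find "$demo_dir" -type f -name '*.php' -print0 \
  | xargs -0 sed -i.bak -e 's/HTTP_GET_VARS/_GET/g' -e 's/HTTP_POST_VARS/_POST/g'

# Show the rewritten line
result=$(cat "$demo_dir/Sample.php")
echo "$result"
```

If the output shows $_GET and $_POST in place of the old names, the same expressions should be safe to run from /var/www/PHP.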
Secondly, there is a bug in RegisterItem.php. I opened the file with a text editor such as vim, and changed the variable $quantity to $qty.
Thirdly, I had to tell the PHP application the correct credentials of my MySQL server. This is configured rather awkwardly in the file PHPPrinter.php. I edited the fifth line in the file to read:
$link = mysql_pconnect("localhost", "root", "mypass") or die ("ERROR: Could not connect to database");
Now I could go ahead and insert some test data. For performance reasons, this should really be done on the same machine (it also works remotely, but it will take forever), so I needed to build the Java client on the web server. I went back to the root directory of my RUBiS download and ran:
Now I opened the configuration file Client/rubis.properties and changed all the obvious configuration values (host names, paths, etc.). It also seems like a good idea to reduce the amount of test data a little, e.g., I set database_number_of_users = 10000. Now the test data can be inserted:
nohup make initDB PARAM=all &
tail -f nohup.out
This command executes the make target in the background, decoupled from the current remote session. The reason to do this is that the data insertion process may take a very long time (multiple hours, or days if you did not reduce the number of users earlier), and with nohup the process will continue running even if you log out. When everything is done, the last line of nohup.out should read "Done!". Afterwards, I checked that test data had been inserted for all tables in the rubis MySQL database. The server part was now done.
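The configure-then-load sequence above can be rehearsed on scratch files before committing to a multi-hour run. The following is only a sketch: the one-line properties file is a made-up stand-in for Client/rubis.properties, the starting value is arbitrary, and the sh -c job is a placeholder for make initDB PARAM=all.

```shell
# Work in a scratch directory so nothing real is touched
cd "$(mktemp -d)"

# Stand-in for Client/rubis.properties (the real file has many more keys;
# the starting value here is made up)
printf '%s\n' 'database_number_of_users = 1000000' > rubis.properties

# Reduce the number of generated users; -i.bak keeps a backup copy
sed -i.bak 's/^database_number_of_users *=.*/database_number_of_users = 10000/' rubis.properties
users_line=$(grep '^database_number_of_users' rubis.properties)

# Placeholder for "make initDB PARAM=all", started detached from the session
nohup sh -c 'sleep 1; echo "Done!"' > nohup.out 2>&1 &
load_pid=$!

# Block until the background job exits, then inspect the last log line
wait "$load_pid"
last_line=$(tail -n 1 nohup.out)
echo "$users_line"
echo "$last_line"
```

In a real run you would not wait in the same session; running tail -n 1 nohup.out after logging back in gives the same check.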
Now I copied the entire RUBiS directory to my MacBook Pro as well and started configuring the benchmark client. As a first step, I opened a terminal and logged in as root (rather awkward on Mac OS X, but the client assumes that you are running it as root).
(will ask you for your password). I changed to my RUBiS directory and exported the necessary environment variables again:
(change to the correct directory, depending on your concrete Java version). I also had to re-compile the client code:
cd Client ; make client
I also had to make sure that the user root from my Mac could log into my server via ssh. Hence, I copied my public key to the server’s authorized keys file:
(on client) cat /root/.ssh/id_rsa.pub | pbcopy
(on server) <open file /root/.ssh/authorized_keys and append the copied line>
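The copy-and-paste step can also be scripted so that the key is appended only once, even if the step is repeated. The following is only a sketch on scratch files: the key content, the file locations and the helper name append_key are made up for illustration, and on the real server the target would be /root/.ssh/authorized_keys.

```shell
# Scratch files standing in for the real public key and authorized_keys
work_dir=$(mktemp -d)
pub_key="$work_dir/id_rsa.pub"
auth_keys="$work_dir/authorized_keys"
printf '%s\n' 'ssh-rsa AAAAexamplekey root@macbook' > "$pub_key"
: > "$auth_keys"

# Hypothetical helper: append the key only if that exact line is missing
append_key() {
    grep -qxF -f "$1" "$2" || cat "$1" >> "$2"
}

# Running it twice still leaves a single entry
append_key "$pub_key" "$auth_keys"
append_key "$pub_key" "$auth_keys"
key_count=$(grep -c 'ssh-rsa' "$auth_keys")
echo "$key_count"
```

On systems that ship OpenSSH's ssh-copy-id, that tool does a similar job in one step.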
Finally, I adapted Client/rubis.properties to reflect the correct settings for the client, and ran the benchmark on the client with
This takes a couple of minutes, and generates an HTML report in bench/<datetime>/index.html, which contains relatively comprehensive benchmarking information.