Fixing the RUBiS Benchmark for Modern Linux Environments

One of my current pet research ideas is benchmarking cloud providers. For a paper that I currently have in progress, I wanted to compare our benchmarking results with a standard Web benchmark. For this, I chose RUBiS. RUBiS is a quite well-established benchmark in the scientific community, with 300+ citations on Google Scholar. As Joseph of Arimathea would say: I chose poorly.

Very quickly after downloading the benchmark source code, it became evident that making this thing run in the year 2013 was not going to be a piece of cake:

  • Hard-coded paths everywhere: check
  • Very little useful documentation: check
  • Quite obvious bugs in the source code: check


  • Original target Java version: 1.3
  • Original target PHP version: probably 4.X
  • Original target MySQL version: pre-5.0
  • Used build tool: make (not Maven, not even Ant. make)

Finally, the quality of the source code (both the PHP and the Java code) was worse than pretty much anything I would accept from my students as part of university assignments. All in all, you could tell that the benchmark originated from a research project, and has not seen maintenance for a long, long time.

Anyway, I decided to give installing this on Amazon EC2 a try. I decided to focus on the PHP version (I did not try to make the EJB or servlet version run). I wanted to create the system-under-test in AWS and configure my local MacBook Pro running Mac OS X Mavericks to act as the client.

As a foundation, I created a Turnkey Linux 13.0 LAMP instance of type m1.small in AWS. Turnkey Linux 13.0 is based on Debian Wheezy, so this gave me a Debian Linux instance with Apache, PHP and MySQL pre-installed. I logged into this instance as root. Before doing anything else, I installed make, Java 1.7 and sysstat (which provides the sar tool that the benchmark uses for monitoring):

 apt-get install openjdk-7-jdk make sysstat

Now I downloaded the RUBiS source code to my AWS instance, from the web page linked above. I chose the most recent version of RUBiS. For the remaining steps, I discovered this tutorial. This covered the majority of issues I ran into (but not all of them).

Firstly, I populated the server database. I edited the files database/generate_categories.awk and database/generate_regions.awk. In these scripts, I updated the hashbang from /bin/awk to /usr/bin/awk. Then I could use these scripts to generate SQL dumps to populate the database. Within the database subdirectory, I ran:

./generate_categories.awk ebay_full_categories.txt

./generate_regions.awk ebay_regions.txt
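The hashbang edit mentioned above can also be scripted instead of done by hand. The following is only a sketch: it assumes GNU sed (for the -i flag), and the helper name fix_awk_shebang as well as the throwaway example.awk file are my own inventions for illustration, not part of RUBiS.

```shell
#!/bin/sh
# Rewrite an outdated awk interpreter path on the first line of a script.
# Assumes GNU sed (for -i), as found on the Linux server used here.
fix_awk_shebang() {
    # $1: script whose first line should point at /usr/bin/awk
    sed -i '1s|^#!/bin/awk|#!/usr/bin/awk|' "$1"
}

# Demonstration on a throwaway file so that nothing real is modified.
demo_dir=$(mktemp -d)
printf '#!/bin/awk -f\nBEGIN { print "hello" }\n' > "$demo_dir/example.awk"
fix_awk_shebang "$demo_dir/example.awk"
head -n 1 "$demo_dir/example.awk"   # prints: #!/usr/bin/awk -f
rm -rf "$demo_dir"
```

On the real scripts, this would be called as fix_awk_shebang generate_categories.awk and fix_awk_shebang generate_regions.awk from within the database subdirectory. Only the first line is touched, and only if it starts with #!/bin/awk.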

Now it was time to actually create the database:

mysql -uroot -pmypass < rubis.sql

mysql -uroot -pmypass rubis < categories.sql

mysql -uroot -pmypass rubis < regions.sql

(Of course, you should change the username and password in the statements above to whatever you have configured on your system.)

Now I copied the PHP web application to my Apache server root:

cp -r PHP /var/www

I was now able to check out the general RUBiS web app by pointing my Web browser to http://<server_address>/PHP. However, this version of the PHP source has a number of problems. Firstly, the application uses the old (pre-PHP5) way of accessing POST and GET parameters, so I ran the following sed scripts to replace everything:

find . -type f -print0 | xargs -0 sed -i 's/HTTP_GET_VARS/_GET/g'

find . -type f -print0 | xargs -0 sed -i 's/HTTP_POST_VARS/_POST/g'

(executed from /var/www/PHP)
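A slightly more cautious variant of this replacement is sketched below. It touches only *.php files and offers a way to list files that still contain the old names. Restricting to *.php is my own precaution rather than what I originally ran, the function names are made up for this post, and GNU find, xargs, grep and sed are assumed.

```shell
#!/bin/sh
# Replace the pre-PHP5 superglobal names in *.php files below a directory.
# Assumes GNU tools; xargs -r skips running sed when no file matches.
modernize_superglobals() {
    # $1: directory containing the PHP sources
    find "$1" -type f -name '*.php' -print0 |
        xargs -0 -r sed -i \
            -e 's/HTTP_GET_VARS/_GET/g' \
            -e 's/HTTP_POST_VARS/_POST/g'
}

# List files that still mention the old names (prints nothing when clean).
list_old_superglobals() {
    # $1: directory to search recursively
    grep -rlE 'HTTP_(GET|POST)_VARS' "$1" || true
}

# Demonstration on a throwaway directory with a made-up PHP snippet.
demo_dir=$(mktemp -d)
printf '<?php $id = $HTTP_GET_VARS["id"]; $n = $HTTP_POST_VARS["n"]; ?>\n' \
    > "$demo_dir/a.php"
list_old_superglobals "$demo_dir"    # lists a.php before the rewrite
modernize_superglobals "$demo_dir"
cat "$demo_dir/a.php"                # now uses $_GET and $_POST
rm -rf "$demo_dir"
```

Running list_old_superglobals /var/www/PHP before and after the rewrite gives a quick check that nothing was missed.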

Secondly, there is a bug in RegisterItem.php. I opened the file with a text editor such as vim, and changed the variable $quantity to $qty.

Thirdly, I had to tell the PHP application the correct credentials of my MySQL server. This is configured rather awkwardly in the file PHPPrinter.php. I edited the fifth line in the file to read:

$link = mysql_pconnect("localhost", "root", "mypass") or die ("ERROR: Could not connect to database");

Now I could go ahead and insert some test data. For performance reasons, this should really be done on the same machine (it also works remotely, but it will take forever), so you need to build the Java client on your web server. I went back to the root directory of my RUBiS download and ran:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/

export CLASSPATH=/usr/lib/jvm/java-7-openjdk-amd64/lib

make client
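If you are unsure where your JDK lives, the JAVA_HOME value can usually be derived from the javac binary instead of being typed by hand. The sketch below assumes GNU readlink (for -f) and the usual layout in which javac sits in the bin directory below the JDK home. The function name jdk_home_of is hypothetical.

```shell
#!/bin/sh
# Derive a JDK home directory from the location of a javac binary.
# Assumes GNU readlink (-f) and the usual <jdk-home>/bin/javac layout.
jdk_home_of() {
    # $1: path to javac, possibly a symlink such as /usr/bin/javac
    dirname "$(dirname "$(readlink -f "$1")")"
}

if command -v javac >/dev/null 2>&1; then
    JAVA_HOME=$(jdk_home_of "$(command -v javac)")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "javac not found on PATH; set JAVA_HOME by hand" >&2
fi
```

On a server set up as described here, I would expect this to resolve to the same OpenJDK 7 directory as the export above, but I have not verified that on every distribution.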

Now I opened the configuration file Client/ and changed all the obvious configuration values (host names, paths, etc.). It also seems like a good idea to reduce the amount of test data a little, e.g., I set database_number_of_users = 10000. Now the test data can be inserted:

nohup make initDB PARAM=all &

tail -f nohup.out

This command executes the make target in the background and decoupled from the current remote session. The reason why you would want to do that is that the data insertion process may take a very long time (multiple hours, days if you did not reduce the number of users earlier), and using nohup the process will continue running even if you log out. When everything is done, the last line of nohup.out should read Done!. Afterwards, I checked that test data has been inserted for all tables in the rubis MySQL database. The server part was now done.
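Both the configuration change and the check for completion can be scripted. The following is a rough sketch under a few assumptions: GNU sed is available, the properties file uses simple key = value lines, and the last line of the log really contains Done!. The helper names set_prop and load_finished, the demo.properties file and its starting value are invented for the demonstration.

```shell
#!/bin/sh
# Set "key = value" in a properties file by rewriting the existing line.
# Commented-out lines (starting with #) are left alone.
set_prop() {
    # $1: file, $2: key, $3: new value (must not contain '|' or '&')
    sed -i "s|^${2}[[:space:]]*=.*|${2} = ${3}|" "$1"
}

# Succeed once the last line of the given log contains "Done!".
load_finished() {
    # $1: log file written by nohup
    tail -n 1 "$1" | grep -q 'Done!'
}

# Demonstration with throwaway files; the starting value is made up.
demo_dir=$(mktemp -d)
printf 'database_number_of_users = 500000\n' > "$demo_dir/demo.properties"
set_prop "$demo_dir/demo.properties" database_number_of_users 10000
cat "$demo_dir/demo.properties"   # prints: database_number_of_users = 10000

printf 'inserting data\nDone!\n' > "$demo_dir/nohup.out"
if load_finished "$demo_dir/nohup.out"; then
    echo "data load finished"
fi
rm -rf "$demo_dir"
```

With these helpers, a loop such as until load_finished nohup.out; do sleep 60; done waits for the import without having to watch tail -f.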

Now I copied the entire RUBiS directory to my MacBook Pro as well and started configuring the benchmark client. As a first step, I opened a terminal and logged in as root (rather awkwardly for Mac OS X, but the client assumes that you are running it as root).

sudo su

(this will ask you for your password). I changed to my RUBiS directory and exported the necessary environment variables again:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_12.jdk/Contents/Home/

(change to the correct directory, depending on your concrete Java version). I also had to re-compile the client code:

cd Client ; make client

I also had to make sure that the user root from my Mac could log into my server via ssh. Hence, I copied my public key to the server’s authorized keys file:

(on client) cat /root/.ssh/ | pbcopy

(on server) <open file /root/.ssh/authorized_keys and append the copied line>
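On the server side, appending the key can be made repeatable so that running it twice does not leave duplicate entries. This is a minimal sketch: the function name add_authorized_key is my own, and the key in the demonstration is a fake placeholder, not a real key.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
add_authorized_key() {
    # $1: file holding one public key line, $2: authorized_keys file
    key=$(cat "$1")
    mkdir -p "$(dirname "$2")"
    touch "$2"
    chmod 600 "$2"
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -- "$key" "$2" || printf '%s\n' "$key" >> "$2"
}

# Demonstration with a fake key in a throwaway directory.
demo_dir=$(mktemp -d)
printf 'ssh-rsa AAAAfakekeyfordemo root@client\n' > "$demo_dir/key.pub"
add_authorized_key "$demo_dir/key.pub" "$demo_dir/ssh/authorized_keys"
add_authorized_key "$demo_dir/key.pub" "$demo_dir/ssh/authorized_keys"
cat "$demo_dir/ssh/authorized_keys"   # the key appears only once
rm -rf "$demo_dir"
```

The sketch assumes that an existing authorized_keys file ends with a newline. If it does not, the new key would be glued onto the last existing line.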

Finally, I adapted Client/ to reflect the correct settings for the client, and ran the benchmark on the client with

make emulator

This takes a couple of minutes, and generates an HTML report in bench/<datetime>/index.html, which contains relatively comprehensive benchmarking information.
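After several runs, the bench directory fills up with one subdirectory per run. A small helper can pick the most recent report. This sketch goes by modification time because I do not want to rely on the exact format of the directory names. The function name latest_report and the directory names in the demonstration are made up.

```shell
#!/bin/sh
# Print the path of index.html in the most recently modified run directory.
latest_report() {
    # $1: the bench directory that holds one subdirectory per run
    newest=$(ls -td "$1"/*/ 2>/dev/null | head -n 1)
    if [ -n "$newest" ]; then
        printf '%sindex.html\n' "$newest"
    fi
}

# Demonstration with two made-up run directories and fixed timestamps.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/bench/run-old" "$demo_dir/bench/run-new"
touch -t 201312010900 "$demo_dir/bench/run-old"
touch -t 201312011500 "$demo_dir/bench/run-new"
latest_report "$demo_dir/bench"   # path ends in bench/run-new/index.html
rm -rf "$demo_dir"
```

On the Mac, the result can be passed straight to the open command to view the report in the default browser.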

2 thoughts on “Fixing the RUBiS Benchmark for Modern Linux Environments”

  • leitner

    It’s pretty standard (replace with your values, clearly):

    # HTTP server information
    httpd_hostname =
    httpd_port = 80

    # Precise which version to use. Valid options are : PHP, Servlets, EJB
    httpd_use_version = PHP

    ejb_server = sci20
    ejb_html_path = /ejb_rubis_web
    ejb_script_path = /ejb_rubis_web/servlet

    servlets_server = sci21
    servlets_html_path = /Servlet_HTML
    servlets_script_path = /servlet

    php_html_path = /PHP
    php_script_path = /PHP

    # Workload: precise which transition table to use
    workload_remote_client_nodes =
    workload_remote_client_command = /usr/bin/java -classpath RUBiS edu.rice.rubis.client.ClientEmulator
    workload_number_of_clients_per_node = 300

    workload_transition_table = /workload/transitions.txt
    workload_number_of_columns = 27
    workload_number_of_rows = 29
    workload_maximum_number_of_transitions = 1000
    workload_number_of_items_per_page = 20
    workload_use_tpcw_think_time = yes
    workload_up_ramp_time_in_ms = 120000
    workload_up_ramp_slowdown_factor = 2
    ### workload_session_run_time_in_ms = 900000
    workload_session_run_time_in_ms = 300000
    workload_down_ramp_time_in_ms = 60000
    workload_down_ramp_slowdown_factor = 3

    #Database information
    database_server = localhost

    # Users policy
    database_number_of_users = 10000

    # Region & Category definition files
    database_regions_file = /database/ebay_regions.txt
    database_categories_file = /database/ebay_simple_categories.txt

    # Items policy
    ### database_number_of_old_items = 1000000
    database_number_of_old_items = 10000
    database_percentage_of_unique_items = 80
    database_percentage_of_items_with_reserve_price = 40
    database_percentage_of_buy_now_items = 10
    database_max_quantity_for_multiple_items = 10
    ### database_item_description_length = 8192
    database_item_description_length = 512

    # Bids policy
    database_max_bids_per_item = 20

    # Comments policy
    database_max_comments_per_user = 20
    ### database_comment_max_length = 2048
    database_comment_max_length = 256
    # Monitoring Information
    monitoring_debug_level = 1
    monitoring_program = /usr/bin/sar
    monitoring_options = -n DEV -n SOCK -rubcw
    monitoring_sampling_in_seconds = 1
    monitoring_rsh = /usr/bin/ssh
    monitoring_scp = /usr/bin/scp
    ### monitoring_gnuplot_terminal = jpeg
    monitoring_gnuplot_terminal = png
