Technology and Software

Compile your own Ruby and use it with RVM

Prompted by the news that gcc 4.9 makes Ruby 2.1 faster, I decided to compile my own Ruby 2.2.0 and pit it against the one that comes with RVM. I also wanted to keep switching between Rubies using RVM. I had to google a little to learn how to do it, so I want to share.

rvm install 2.2.0
rvm use ruby-2.2.0
# find out the compilation options
ruby -r rbconfig -e 'puts RbConfig::CONFIG["configure_args"]'
 'optflags=-O2' '--enable-load-relative' '--sysconfdir=/etc'
 '--disable-install-doc' '--enable-shared'
wget http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz
# important: always compare the checksum with the one advertised at
# https://www.ruby-lang.org/en/downloads/ (that page lists SHA256 sums)
sha256sum ruby-2.2.0.tar.gz
tar xzf ruby-2.2.0.tar.gz
cd ruby-2.2.0
mkdir -p /home/me/compiled-rubies/2.2.0p0
# configure with the same compilation options
# as the standard binary
CFLAGS=-O2 ./configure --enable-load-relative \
  --sysconfdir=/etc \
  --disable-install-doc --enable-shared \
  --prefix=/home/me/compiled-rubies/2.2.0p0
make
make test
make install
# make it available to rvm as ext-ruby-2.2.0-gcc4.9_O2
rvm mount /home/me/compiled-rubies/2.2.0p0 \
  -n ruby-2.2.0-gcc4.9_O2
rvm list
...
    ext-ruby-2.2.0-gcc4.9_O2 [ x86_64 ]
 => ruby-2.2.0 [ x86_64 ]
...
rvm use ext-ruby-2.2.0-gcc4.9_O2

The files in ~/.rvm/rubies/ext-ruby-2.2.0-gcc4.9_O2 will be symlinks to the ones in compiled-rubies/2.2.0p0 so don’t remove that directory.
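
To confirm that the Ruby selected with rvm use is really the one you compiled, you can print its build settings and compare them between the two interpreters. The following is only a minimal sketch: the file name build_info.rb and the choice of keys are assumptions for illustration, and RbConfig is the same standard library used in the one-liner above.

```ruby
require "rbconfig"

# Keys of RbConfig::CONFIG that describe how this interpreter was built.
# The selection is an assumption made for this illustration.
BUILD_KEYS = %w[prefix CC CFLAGS optflags configure_args].freeze

# Returns a hash with the build settings of the Ruby running this script.
def build_summary
  BUILD_KEYS.map { |key| [key, RbConfig::CONFIG[key].to_s.strip] }.to_h
end

if __FILE__ == $PROGRAM_NAME
  puts RUBY_DESCRIPTION
  build_summary.each { |key, value| puts format("%-15s %s", key, value) }
end
```

Run it once after rvm use ruby-2.2.0 and once after rvm use ext-ruby-2.2.0-gcc4.9_O2. The prefix line should point to the RVM directory in the first case and to the mounted Ruby in the second.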

The point of this post is already made but as a bonus here are the benchmarks of the two Rubies using Antonio Cangiano’s tests.

git clone https://github.com/acangiano/ruby-benchmark-suite.git
cd ruby-benchmark-suite
rvm use ruby-2.2.0 # for the standard one
rake
rvm use ext-ruby-2.2.0-gcc4.9_O2 # for the compiled one
rake # This might fail, see the note at the end

Here are the results: ruby-2.2.0 and ruby-2.2.0-gcc4.9_O2 (YAML), summary (CSV). TL;DR: the compiled Ruby is a little faster overall. It’s much faster in a few tests and a bit slower in some others, so the choice is not obvious and probably depends on what you do. Please note that several tests ended with errors (look at the YAML files). They could change the overall assessment of which version is faster, but I haven’t dug into that issue yet.

In case of failure

Rake could end with a weird syntax error under the compiled Ruby. There are two possible fixes. One is to replace `which rake` with the version from the 2.2.0 binary distribution. The other is to really understand what’s going on. The key is that rake is a bash script which execs a Ruby interpreter on itself using Ruby’s -x switch, which makes the interpreter skip the bash part at the beginning. But the compiled Ruby doesn’t seem to honour that. No time to investigate any further now…
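
A quick way to check whether a given interpreter honours -x is to feed it a tiny script of the same shape. This is only a sketch: the wrapper text below is a hypothetical stand-in, not the actual rake script installed by RVM, and run_with_x is a helper name invented here.

```ruby
require "open3"
require "rbconfig"
require "tempfile"

# Hypothetical wrapper: bash lines first, Ruby code after the "#!ruby" line.
WRAPPER = <<'SCRIPT'
#!/usr/bin/env bash
exec ruby -x "$0" "$@"
#!ruby
puts "Ruby part reached"
SCRIPT

# Writes the wrapper to a temporary file and runs it through the current
# interpreter with -x. Returns [combined_output, success_flag].
def run_with_x(source)
  Tempfile.create(["wrapper", ".sh"]) do |file|
    file.write(source)
    file.flush
    output, status = Open3.capture2e(RbConfig.ruby, "-x", file.path)
    [output, status.success?]
  end
end

if __FILE__ == $PROGRAM_NAME
  output, ok = run_with_x(WRAPPER)
  puts output
  puts(ok ? "-x skipped the bash preamble" : "-x was not honoured")
end
```

Running it under both Rubies should show whether the problem is in the interpreter or in the rake script itself. If -x works here for the compiled Ruby, the cause is more likely in the wrapper.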


Running Ruby on Rails tests with a ramdisk backed PostgreSQL

I’m not a fan of mocking objects in my Ruby on Rails tests, so my tests always hit the database, which is PostgreSQL whenever I can make the choice and a customer doesn’t dictate MySQL.

Hitting the DB means that the test suite slows down as the application grows and the tests pile up.
I’ve long wanted to check what happens if I run my tests on a database backed by RAM instead of a spinning disk. Are they going to be much faster?

TL;DR: No, they run at the same speed.

Creating the DB on the ramdisk

The setup is based on information provided at http://stackoverflow.com/questions/11764487/unit-testing-with-in-memory-database and http://jayant7k.blogspot.com.au/2006/08/ram-disk-in-linux.html
Thanks!

The test system is my laptop: Ubuntu 12.04, i7-4700MQ CPU @ 2.40GHz, 16 GB RAM, SSD for the OS, HDD for my home and databases. The DB is PostgreSQL 9.3.

Linux has 16 ramdisks already created as /dev/ram* at boot time. Let’s take one and mount it.

mkdir -p ~/tmp/ram
sudo mkfs.ext4 -m 0 /dev/ram0
sudo mount /dev/ram0 ~/tmp/ram/
df -h ~/tmp/ram/
Filesystem      Size  Used Avail Use% Mounted on
/dev/ram0        58M  1.3M   56M   3% /home/me/tmp/ram

It’s a tiny disk and it turned out to be barely enough to accommodate my tests but it’s OK for experimenting.
You can make it larger if you need to. http://jayant7k.blogspot.com.au/2006/08/ram-disk-in-linux.html explains how.

We create a DB there now.

cd ~/tmp/ram
sudo bash
mkdir postgresql
chown postgres:postgres postgresql/
su - postgres
# ~ now points to the postgres user's home, so use the absolute path
/usr/lib/postgresql/9.3/bin/initdb --locale=en_US.UTF-8 -D /home/me/tmp/ram/postgresql/
exit
mkdir postgresql/log
chown postgres:postgres postgresql/log/

We make it use a different port from the one used by the default PostgreSQL DB on the laptop.

vi postgresql/postgresql.conf

port = 5433

We start the DB and connect to it

sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/ \
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log start
psql -p 5433 -U postgres
 \l

Great!

Running tests

We edit config/database.yml to use the ramdisk DB

port: 5433
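
Before running the suite, it can help to confirm that something is really listening on the port written in database.yml. The sketch below assumes the ramdisk PostgreSQL accepts TCP connections on 127.0.0.1, which is the usual default but not something this setup changed or verified. port_open? is a helper name invented for the example.

```ruby
require "socket"

# Returns true if a TCP server accepts connections on host:port.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

if __FILE__ == $PROGRAM_NAME
  port = 5433 # the port set in postgresql.conf
  if port_open?("127.0.0.1", port)
    puts "a server is listening on port #{port}"
  else
    puts "nothing is listening on port #{port}"
  end
end
```

This only checks that the port is open, not that the server behind it is the ramdisk PostgreSQL. The psql session shown earlier remains the real test.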

We create the test user and the test db

psql -p 5433 -U postgres
create role testuser login password 'password';
alter user testuser with createdb;
create database myapp_test owner testuser encoding='UTF8' lc_collate='en_US.UTF-8' lc_ctype='en_US.UTF-8';
\q
exit

We create the DB schema

cd the/rails/directory
RAILS_ENV=test rake db:migrate

And finally we benchmark the tests over the two databases.

rake spec:controllers

RAMDISK

Finished in 1 minute 5.34 seconds (files took 1.76 seconds to load)
Finished in 1 minute 4.26 seconds (files took 1.75 seconds to load)
Finished in 1 minute 2.07 seconds (files took 1.75 seconds to load)

HDD

Finished in 1 minute 7.09 seconds (files took 1.76 seconds to load)
Finished in 1 minute 6.01 seconds (files took 1.72 seconds to load)
Finished in 1 minute 4.68 seconds (files took 1.74 seconds to load)

A gain of about 2 seconds out of 65 is not worth the trouble. Let’s benchmark the models.

rake spec:models

RAMDISK

Finished in 1 minute 36.8 seconds (files took 1.69 seconds to load)
Finished in 1 minute 38.08 seconds (files took 1.72 seconds to load)
Finished in 1 minute 37.9 seconds (files took 1.73 seconds to load)

HDD

Finished in 1 minute 38.64 seconds (files took 1.79 seconds to load)
Finished in 1 minute 32.73 seconds (files took 1.69 seconds to load)
Finished in 1 minute 41.89 seconds (files took 1.71 seconds to load)

No difference at all, only a bit more variance in the durations of the HDD tests.
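
To compare the runs without doing the arithmetic by hand, the reported durations can be averaged with a few lines of Ruby. This is a sketch for illustration: the hash simply copies the durations printed above, and the helper names to_seconds, mean and spread are made up here.

```ruby
# Durations reported by RSpec in the runs above, keyed by suite and storage.
RUNS = {
  "controllers ramdisk" => ["1 minute 5.34 seconds", "1 minute 4.26 seconds", "1 minute 2.07 seconds"],
  "controllers hdd"     => ["1 minute 7.09 seconds", "1 minute 6.01 seconds", "1 minute 4.68 seconds"],
  "models ramdisk"      => ["1 minute 36.8 seconds", "1 minute 38.08 seconds", "1 minute 37.9 seconds"],
  "models hdd"          => ["1 minute 38.64 seconds", "1 minute 32.73 seconds", "1 minute 41.89 seconds"]
}.freeze

# Converts "1 minute 5.34 seconds" (or "42.1 seconds") to seconds.
def to_seconds(text)
  match = text.match(/(?:(\d+) minutes? )?([\d.]+) seconds/)
  raise ArgumentError, "unrecognised duration: #{text}" unless match
  match[1].to_i * 60 + match[2].to_f
end

# Arithmetic mean of a list of numbers.
def mean(values)
  values.inject(0.0) { |sum, value| sum + value } / values.size
end

# Difference between the slowest and the fastest run.
def spread(values)
  values.max - values.min
end

if __FILE__ == $PROGRAM_NAME
  RUNS.each do |label, durations|
    seconds = durations.map { |text| to_seconds(text) }
    puts format("%-20s mean %.2f s, spread %.2f s", label, mean(seconds), spread(seconds))
  end
end
```

With the numbers above, the controller specs average about 63.9 s on the ramdisk against 65.9 s on the HDD. The model specs average about 97.6 s against 97.8 s, with a visibly larger spread on the HDD.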

This is my conjecture: the data go first into the OS file buffer and are then synced to disk. Syncing to a ramdisk is faster, but if there is enough RAM the data stay in RAM anyway, so it doesn’t matter whether we’re using a ramdisk or an HDD. Remember: this is a test DB with a handful of data, not a large production DB with high I/O loads.
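
The difference between a write that stays in the OS buffer and one that is forced to the device can be seen with a small script. This is a rough sketch, not a PostgreSQL benchmark: the block size, the number of writes and the helper name time_writes are arbitrary choices, and the temporary file lands wherever Ruby’s default temporary directory is, which may itself be RAM-backed on some systems.

```ruby
require "benchmark"
require "tempfile"

BLOCK = ("x" * 4096).freeze

# Writes `count` 4 KiB blocks to a temporary file and returns the elapsed
# seconds. With sync: true every block is forced to the storage device with
# fsync; otherwise the data are left in the OS file buffer.
def time_writes(count:, sync:)
  Tempfile.create("fsync-demo") do |file|
    Benchmark.realtime do
      count.times do
        file.write(BLOCK)
        file.flush          # hand the data from Ruby's buffer to the OS
        file.fsync if sync  # ask the OS to push the data to the device
      end
    end
  end
end

if __FILE__ == $PROGRAM_NAME
  buffered = time_writes(count: 200, sync: false)
  synced = time_writes(count: 200, sync: true)
  puts format("buffered: %.4f s, fsync after each write: %.4f s", buffered, synced)
end
```

On a spinning disk the second number is typically much larger than the first. The conjecture is that a small test suite spends too little of its time in that second kind of write for the storage medium to matter.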

Let’s stop the DB and change the configuration to do without syncing. If there is no speedup, that supports my conjecture.

sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/ \
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log stop
sudo vi ~/tmp/ram/postgresql/postgresql.conf
fsync=off
sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/ \
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log start

Run the test on the ramdisk again.

rake spec:models

RAMDISK

Finished in 1 minute 36.52 seconds (files took 1.71 seconds to load)
Finished in 1 minute 35.45 seconds (files took 1.7 seconds to load)
Finished in 1 minute 36.59 seconds (files took 1.68 seconds to load)

No difference, so my tests didn’t move enough data to make the syncing operations relevant.

Conclusions

You can keep your test DB on a spinning disk and the OS buffering will make it fast.
If you want quick tests you probably have to mock everything and do without the DB.

Alternative: run tests in parallel with the parallel_tests gem.

Questions

Could I have done something better to make the ramdisk based DB run faster?
