Technology and Software

Benchmark: RoR’s ActiveRecord vs Stored Procedures

This post is about my comparison of the performances of Ruby on Rails’ ActiveRecord class with PostgreSQL stored procedures. The timings favour the stored procedures approach, nevertheless using ActiveRecond keeps its appeal. See the conclusions for more.
Comments are welcome.

Introduction

I’ve got a Ruby on Rails application with a controller exporting a method that receives data from remote devices (obviously over HTTP). Those devices don’t have any capability to support sessions so every hit starts a new one on my server. Not only the vast majority of the files in tmp/sessions are totally useless, but I can’t reuse any resource between successive hits. One notable source of overhead is the creation of new ActiveRecords.
My method queries one table and inserts the received data into an other one. Looking into the log files I can see the queries used by ActiveRecord to create dynamically the get set and find methods. On my test server Rails reports a number of transactions per second in the 300-350 range.
That’s probably OK for my current needs but I wonder how much all those metadata queries are slowing down my app.

The test system

The test server is actually a Xen virtual server hosted at vpsland. The HW is shared with other customers and that’s why
the performances aren’t costant.
The software configuration is: Debian 3.1, Linux kernel 2.6.16,
Rails 1.1.0, ActiveRecord 1.14.0, PostgreSQL 8.1.4 connected to Rails with postgresql-8.1-405.jdbc3.jar.
The hardware configuration isn’t relevant as the system is shared.
The tests of the different access methods has been interwined, one access of a type (ActiveRecord) followed by one access of another type (stored procedures). In this way the timings of the different access types are comparable because on the average they are equally affected by the fluctuations of the CPU and disk load.

Before starting to code around ActiveRecord I got rid of the session management code which, as explained, can’t benefit that part of my application.
This article explains how to remove it for all the application, a single controller or a single method. The latter is what interested me. The syntax is:

class MyController < ApplicationController
  session :off, :only => %w(my_method)
  def my_method
    ...
  end
end

All the tests have been performed with session management turned off.

Tests and results

I had three tests at this point. The first one is a pure Ruby (should I say pure Rails?) ActiveRecord implementation. The second one is an intermediate version of my code, which I was curious to compare with the other ones. It’s a mix of stored procedures (one to replace the SELECT) and raw SQL commands (the INSERT). The third one is a pure stored procedure implementation.
The Webrick was bound to the local loop interface (127.0.0.1) of the server and called with wget. Rails was run in production mode.
All the tests shared the code to validate the input URL parameters.
The timings were taken from the rails log, from the
“Completed in … (… reqs/sec)” lines.

I run the tests several times, 200 accesses per test type each.
I varied the amount of delay between successive wgets but that doesn’t seem to affect the results. In the table I give the average requests per seconds over all the tests and the performance increases over the ActiveRecord implementation.

delay (s)
Requests per second Improvement
  ActRcrd SP+SQL StrdProc SP+SQL SP
0,2000 348,633 438,833 1475,915 1,26 4,23
0,1000 329,450 408,743 1407,410 1,24 4,27
0,0500 349,235 441,073 1484,043 1,26 4,25
0,0200 308,490 389,450 1325,197 1,26 4,30
0,0100 353,550 449,233 1530,088 1,27 4,33
0,0050 345,240 433,082 1476,523 1,25 4,28
0,0020 258,892 323,450 1104,037 1,25 4,26
0,0010 352,982 438,027 1482,148 1,24 4,20
0,0005 349,043 433,172 1475,452 1,24 4,23

On the average replacing ActiveRecord database access with a stored procedure gave a 4.26 time performance increase. I don’t say that it translates into 1 server instead of 4, nevertheless is such a big gain that can hardly be ignored. The mixed test (one stored procedure and raw SQL) gains a 25% over ActiveRecord but it might not be worth the effort.

Note that removing session didn’t yeld any measurable result
(not included in the table).

Conclusions

The ActiveRecord test has been easier to code, less prone to typos and probably easier to maintain. Does that justifies a 4 times decrease of performances?

If the application has to sustain only a light load, my take is that
it does. However the optimization of the critical paths of the system becomes necessary when the load increases, even if I’m going to leave ActiveRecords into the less used parts of the system. The rationale is that I’ll partition my system in two areas, in one performances are critical and in the other one they’re not. The tradeoffs between the costs of coding and maintenance and the costs of the hardware in those two parts are different and that justifies the choice of two different technologies.

Finally, if too much has to be optimized, probably Rails isn’t the right technology for that project.

Afterword: stored procedures in Rails

So, how do you call a stored procedure from Rails? Here’s what I discovered browsing through PostgreSQL documentation and a number of railish web sites and blogs.

This is a simple stored procedure written with PostgreSQL pl/pgsql syntax:

$ vi sp_test.sql

CREATE OR REPLACE FUNCTION sp_test()
              RETURNS varchar AS $PROC$
  BEGIN
    RETURN 'test';
  END
$PROC$ LANGUAGE plpgsql;

:wq

If the db fails to recognize the language you have to execute
the CREATE LANGUAGE plpgsq command into
psql.

Let’s connect to PostgresSQL and check that it works:

$ psql -f sp_test.sql
$ psql
psql=# select sp_test();
 sp_test
 ----------
 test
(1 row)
psql=# q

The next step is to call it from Ruby.

There are at least two alternative libraries that can be used to connect to PostgreSQL.
The first one is PGconn, a Ruby wrapper around a C library.
The other one is PostgresPR, pure Ruby.

PGConn

#!/usr/local/bin/rubyrequire "postgres"
conn = PGconn.connect("/tmp", 5432, "", "", "mydb",
                      "user", "password")
res = conn.query("SELECT * FROM sp_test()")
logger.info "Stored procedure results:"
for tuple in 0..(res.num_tuples - 1)
  s = ""
  for field in 0..(res.num_fields - 1)
    s = s + (field == 0? "": ",") +
            res.getvalue(tuple, field).to_s
  end
  logger.info s
end
conn.close

PostgresPR

require 'postgres-pr/connection'
conn = PostgresPR::Connection.new('mydb',
                     'user', 'password')
res = conn.query("SELECT * FROM sp_test()")
p res
conn.close

Things are a little different inside a Rails controller.
First of all, the framework already opened the connession to the DB.
This article explains how to get it. The code is:

conn = ActiveRecord::Base.connection.
                    instance_variable_get(:@connection)
result = conn.exec("select sp_test();")

One more glitch: the functional tests fail because the stored procedure isn’t defined into the test database (remember, it is cleared and recreated each time a test is run.) So, how to define the stored procedure in that db?
There are some hints around the Net, involving modified rake tasks
and migrations. One more thing to study later on.

Advertisements
Standard

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s