This post compares the performance of Ruby on Rails’ ActiveRecord class with that of PostgreSQL stored procedures. The timings favour the stored-procedure approach; nevertheless, ActiveRecord keeps its appeal. See the conclusions for more.
Comments are welcome.
Introduction
I’ve got a Ruby on Rails application with a controller exposing a method that receives data from remote devices (over HTTP, of course). Those devices have no session support, so every hit starts a new session on my server. Not only are the vast majority of the files in tmp/sessions totally useless, but I can’t reuse any resource between successive hits. One notable source of overhead is the creation of new ActiveRecord objects.
My method queries one table and inserts the received data into another one. Looking into the log files I can see the queries ActiveRecord runs to create the getter, setter and finder methods dynamically. On my test server Rails reports a number of transactions per second in the 300–350 range.
That’s probably OK for my current needs but I wonder how much all those metadata queries are slowing down my app.
The test system
The test server is actually a Xen virtual server hosted at vpsland. The hardware is shared with other customers, which is why performance isn’t constant.
The software configuration is: Debian 3.1, Linux kernel 2.6.16,
Rails 1.1.0, ActiveRecord 1.14.0, PostgreSQL 8.1.4 connected to Rails with postgresql-8.1-405.jdbc3.jar.
The hardware configuration isn’t relevant as the system is shared.
The tests of the different access methods were interleaved: one access of one type (ActiveRecord) followed by one access of another type (stored procedures). This way the timings of the different access types are comparable, because on average they are equally affected by fluctuations in CPU and disk load.
Before starting to code around ActiveRecord I got rid of the session management code which, as explained, can’t benefit that part of my application.
This article explains how to remove it for the whole application, a single controller or a single method. The latter is what interested me. The syntax is:
class MyController < ApplicationController
  session :off, :only => %w(my_method)

  def my_method
    ...
  end
end
All the tests have been performed with session management turned off.
Tests and results
I had three tests at this point. The first is a pure Ruby (should I say pure Rails?) ActiveRecord implementation. The second is an intermediate version of my code, which I was curious to compare with the others: a mix of stored procedures (one to replace the SELECT) and raw SQL commands (the INSERT). The third is a pure stored-procedure implementation.
WEBrick was bound to the loopback interface (127.0.0.1) of the server and called with wget. Rails was run in production mode.
All the tests shared the code to validate the input URL parameters.
The timings were taken from the Rails log, from the
“Completed in … (… reqs/sec)” lines.
I ran the tests several times, 200 accesses per test type each.
I varied the delay between successive wgets, but that doesn’t seem to affect the results. The table gives the average requests per second over all the tests and the performance increase over the ActiveRecord implementation.
| delay (s) | ActRcrd (req/s) | SP+SQL (req/s) | StrdProc (req/s) | SP+SQL gain | SP gain |
|-----------|-----------------|----------------|------------------|-------------|---------|
| 0.2000    | 348.633         | 438.833        | 1475.915         | 1.26        | 4.23    |
| 0.1000    | 329.450         | 408.743        | 1407.410         | 1.24        | 4.27    |
| 0.0500    | 349.235         | 441.073        | 1484.043         | 1.26        | 4.25    |
| 0.0200    | 308.490         | 389.450        | 1325.197         | 1.26        | 4.30    |
| 0.0100    | 353.550         | 449.233        | 1530.088         | 1.27        | 4.33    |
| 0.0050    | 345.240         | 433.082        | 1476.523         | 1.25        | 4.28    |
| 0.0020    | 258.892         | 323.450        | 1104.037         | 1.25        | 4.26    |
| 0.0010    | 352.982         | 438.027        | 1482.148         | 1.24        | 4.20    |
| 0.0005    | 349.043         | 433.172        | 1475.452         | 1.24        | 4.23    |
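For the record, the averages come from those “Completed in … (… reqs/sec)” log lines. A small helper along these lines (a sketch of my own; the post’s actual extraction script isn’t shown) pulls the figures out of the log and averages them:

```ruby
# Hypothetical helper: extract the "(N reqs/sec)" figures from Rails
# log lines such as
#   Completed in 0.00287 (348 reqs/sec) | 200 OK [http://...]
# and return their average (or nil if no line matches).
def average_reqs_per_sec(lines)
  rates = lines.map { |l| l[/\((\d+(?:\.\d+)?) reqs\/sec\)/, 1] }
  rates = rates.compact.map { |r| r.to_f }
  return nil if rates.empty?
  rates.inject(0.0) { |sum, r| sum + r } / rates.size
end

# Usage, e.g.:
#   puts average_reqs_per_sec(File.readlines("log/production.log"))
```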
On average, replacing the ActiveRecord database access with a stored procedure gave a 4.26× performance increase. I’m not saying that translates into 1 server instead of 4; nevertheless it’s such a big gain that it can hardly be ignored. The mixed test (one stored procedure plus raw SQL) gains 25% over ActiveRecord, but that might not be worth the effort.
Note that removing session management didn’t yield any measurable improvement
(not included in the table).
Conclusions
The ActiveRecord test was easier to code, less prone to typos and probably easier to maintain. Does that justify a fourfold decrease in performance?
If the application only has to sustain a light load, my take is that
it does. However, optimizing the critical paths of the system becomes necessary when the load increases, even though I’m going to leave ActiveRecord in the less used parts of the system. The rationale is that I’ll partition my system into two areas: in one, performance is critical; in the other it isn’t. The trade-offs between the costs of coding and maintenance and the cost of hardware differ between those two parts, and that justifies choosing two different technologies.
Finally, if too much has to be optimized, Rails probably isn’t the right technology for that project.
Afterword: stored procedures in Rails
So, how do you call a stored procedure from Rails? Here’s what I discovered browsing through the PostgreSQL documentation and a number of Rails-related web sites and blogs.
This is a simple stored procedure written in PostgreSQL’s PL/pgSQL syntax:
$ vi sp_test.sql
CREATE OR REPLACE FUNCTION sp_test()
RETURNS varchar AS $PROC$
BEGIN
  RETURN 'test';
END
$PROC$ LANGUAGE plpgsql;
:wq
If the db fails to recognize the language you have to execute
the CREATE LANGUAGE plpgsql command in
psql.
Let’s connect to PostgreSQL and check that it works:
$ psql -f sp_test.sql
$ psql
psql=# select sp_test();
sp_test
----------
test
(1 row)
psql=# \q
The next step is to call it from Ruby.
There are at least two libraries that can be used to connect to PostgreSQL.
The first is PGconn, a Ruby wrapper around a C library.
The other is PostgresPR, which is pure Ruby.
PGConn
#!/usr/local/bin/ruby
require "postgres"

# PGconn.connect arguments: host (or unix-socket directory), port,
# options, tty, database, user, password.
conn = PGconn.connect("/tmp", 5432, "", "", "mydb",
                      "user", "password")
res = conn.query("SELECT * FROM sp_test()")
puts "Stored procedure results:"
for tuple in 0..(res.num_tuples - 1)
  s = ""
  for field in 0..(res.num_fields - 1)
    s << (field == 0 ? "" : ",") << res.getvalue(tuple, field).to_s
  end
  puts s
end
conn.close
PostgresPR
require 'postgres-pr/connection'
conn = PostgresPR::Connection.new('mydb',
'user', 'password')
res = conn.query("SELECT * FROM sp_test()")
p res
conn.close
Things are a little different inside a Rails controller.
First of all, the framework has already opened the connection to the DB.
This article explains how to get hold of it. The code is:
conn = ActiveRecord::Base.connection.
instance_variable_get(:@connection)
result = conn.exec("select sp_test();")
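Passing arguments works the same way, except that the values have to be quoted before being interpolated into the SQL string. A tiny helper along these lines (my own sketch; the helper name and the sp_store procedure in the usage note are hypothetical) builds the call, doubling single quotes as PostgreSQL expects inside string literals:

```ruby
# Hypothetical helper: build the SELECT that invokes a stored
# procedure, quoting string arguments (single quotes are doubled,
# which is how PostgreSQL escapes them inside literals).
def sp_call_sql(name, *args)
  quoted = args.map do |a|
    a.is_a?(String) ? "'" + a.gsub("'", "''") + "'" : a.to_s
  end
  "SELECT #{name}(#{quoted.join(', ')})"
end

# Usage, e.g.:
#   result = conn.exec(sp_call_sql("sp_store", params[:serial], 42))
```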
One more glitch: the functional tests fail because the stored procedure isn’t defined in the test database (remember, it is cleared and recreated each time a test is run). So, how do we define the stored procedure in that db?
There are some hints around the Net, involving modified rake tasks
and migrations. One more thing to study later on.
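One approach I can imagine (a guess of mine, not code from any of those hints): have the test suite install the procedures itself, by feeding the .sql file to the connection Rails has already opened. The helper name and file path below are hypothetical:

```ruby
# Hypothetical helper for test/test_helper.rb: read a .sql file and
# execute it on the given connection, so the freshly rebuilt test
# database gets the stored procedures too.
def install_stored_procedures(conn, path)
  conn.execute(File.read(path))
end

# Usage, e.g., in test/test_helper.rb:
#   install_stored_procedures(ActiveRecord::Base.connection,
#                             "#{RAILS_ROOT}/db/sp_test.sql")
```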