Technology and Software

Ruby performances with PostgreSQL and MySQL

(original post in the Italian Ruby Forum)

I had to convert a database seeding script from PostgreSQL 9.4 to MariaDB 10 (customer’s choice and with little enthusiasm I had to comply). This lead to a number of interesting discoveries about the pg and mysql2 Ruby drivers. Apart a few minor issues [1] [2] [3] [4] I immediately noticed that the script with MariaDB run 20 times (twenty) more slowly than the  PostgreSQL one: 21 minutes vs 1 minute and 3 seconds. Unusable and inexplicable.

Such a big difference can not be due to the database, so I started to investigate the configuration. Even the MySQL coming with Ubuntu 12.04 was too slow and I can expect that the distributors set it up reasonably well. At this point the suspect becomes the driver. I opened this issue https://github.com/brianmario/mysql2/issues/623 and they gave me two valuable tips: use a profiler ( https://github.com/ruby-prof/ruby-prof ) and the gem-import activerecord ( https://github.com/zdennis/activerecord-import ). I knew both of them but sometimes you have to be reminded about tools you don’t use often. Ops.

The profiler show that the driver uses pg prepared statements that give obvious benefits with the number of records created by my script (a little over 32,000). The version of mysql2 I had to use (0.3.x) does not have prepared statement (but the newer version does) and that seems to make the difference. I rewrote the script to use activerecord-import, which  inserts a whole array of objects at once. The script looks a bit unnatural, because I repeatedly needed the ids of the record I created to pass them along the associations, but the execution times for mysql2 dropped from 21 minutes to 1 minute and 33″. It was worth it. There are only 1,045 calls to the db and yet is always slower than 32k calls made by the original script with pg. The script with pg and activerecord-import dropped to 47 seconds.

Despite all the enhancements introduced in the import-activerecord calls my script’s calls to PostgreSQL add up to 9.4 seconds. The calls to MariaDB are  49.8 seconds. Ruby accounts for 40 seconds, regardless of the database used.

TL;DR

1) Work on PostgreSQL has performance advantages with Ruby due to drivers.

2) mysql2 0.4.0+ has prepared statements but if you’re working with Rails you must be careful. There are issues [A] [B] and it seems you need Rails 4.2.5+ to use it. I didn’t test the combination yet.

3) For details of my profiling research (tables, times, calls) read https://github.com/brianmario/mysql2/issues/623 

 

Finally, the issues I run into:

[1] https: //mariadb.com/kb/en/mariadb/installing-maria …

[2] For MariaDB install the gem mysql2 with

bundle config \
build.mysql2 --with-mysql-config=/path/to/mariadb/bin/mysql_config

Careful: this is globals so use –with-mysql-config=/usr/bin/mysql_config when you need to connect to MySQL.

[3] My script would clear the db before seeding using TRUNCATE CASCADE, but MySQL and MariaDB don’t have it. This  is the workaround

 connection = ActiveRecord :: Base.connection
 Connection.Execute ("SET foreign_key_checks = 0;")
 [all models] .each do | model |
   Connection.Execute ("TRUNCATE model.table_name # {}")
 end
 Connection.Execute ("SET foreign_key_checks = 1;")

[4] But neither ActiveRecord has TRUNCATE, so either you use some gems that add it to AR or even for PostgreSQL you need a loop like that, but you don’t need the SET foreign_ley_checks statements.

Advertisements
Standard