Technology and Software

A Linux honeypot with Chat GPT

Prompted by https://www.filestash.app/2023/04/01/chat-gpt-acting-as-a-ftp-server/ I attempted to use Chat GPT 3.5 to simulate a ssh honeypot and give a security assessment of the commands I issued. This is the transcript.

P. Let’s simulate a honeypot. I access a Linux server by ssh and give Linux commands. You answer with plausible outputs. If I try to run commands like vi, vim, nano, emacs, htop that use all the screen, you answer that they are not installed. You could pretend to be a very minimal system, like dropbear or similar. Is that OK?

Yes, I can certainly do that. I’ll respond with plausible outputs for the commands you enter, and for those that are not installed, I’ll let you know that they’re not available on this minimal system.P.

P. Furthermore, if you think that I issued a command to extract information that a normal user should not try to look at, you give the expected output but you also follow it by a line “WARNING: possible security attack”.

Understood, I’ll follow this guideline during our simulation. Let’s get started! Please enter your first command.

P. One more thing. The name of the Linux server is honey.example.com. I connect now with ssh honey.example.com

Got it, thanks for the additional information. Go ahead and enter your first command.

ls -la

Here’s the output of the ls -la command:

total 20
drwxr-xr-x  3 root root 4096 May  4 20:27 .
drwxr-xr-x 18 root root 4096 May  4 20:27 ..
-rw-r--r--  1 root root 3106 Apr  9  2018 README.md
drwxr-xr-x  2 root root 4096 May  4 20:27 scripts

cd scripts

Sure, here’s the output of the cd scripts command:

WARNING: possible security attack

I’m giving you a warning here because as a regular user you should not need to access the scripts folder directly. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and move you into the scripts directory.

ls -la

Here’s the output of the ls -la command after changing to the scripts directory:

total 12
drwxr-xr-x 2 root root 4096 May  4 20:27 .
drwxr-xr-x 3 root root 4096 May  4 20:27 ..
-rw-r--r-- 1 root root  362 May  4 20:27 script1.sh

cat script1.sh

Sure, here’s the output of the cat script1.sh command:

WARNING: possible security attack

I’m giving you another warning here because you’re attempting to view the contents of a file (script1.sh) which is owned by root. As a regular user, you should not need access to this file. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and give you the contents of the file:

#!/bin/bash

echo "This is script1"

bash ./script1.sh

Here’s the output of the bash ./script1.sh command:

This is script1

The script printed the message “This is script1” to the console.

cd ..

Sure, here’s the output of the cd .. command:

WARNING: possible security attack

Again, I’m giving you a warning here because as a regular user, you should not need to access the parent directory (..) directly. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and move you into the parent directory.

cat README.md

Sure, here’s the output of the cat README.md command:

WARNING: possible security attack

I’m giving you another warning here because you’re attempting to view the contents of a file (README.md) which is owned by root. As a regular user, you should not need access to this file. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and give you the contents of the file:

This is a README file for the honey pot Linux server.

This server is intended to be used as a decoy server in order to gather information about potential attackers. A minimal system has been installed in order to minimize the attack surface and reduce the risk of a successful attack.

Any actions taken on this server will be logged and monitored.

P. This is not a command: the content of README.md or of any other file should not give away that this is a honeypot. They must look like files that one could reasonably find on a Linux server. Is this OK?

I apologize for that, you’re correct. I’ll avoid giving away any indications that this server is a honeypot in the outputs of any files or commands. Is there another command you’d like to try?

cat README.md

Sure, here’s the output of the cat README.md command:

This is a README file for a Linux server.

This server is intended to be used for testing and development purposes only. It runs a minimal system to reduce resource usage and allows users to access it remotely via SSH.

Please keep in mind that any actions taken on this server will be logged.

This output corresponds to what you’d expect to find in a typical README file on a Linux server.

P. Great! I’ll give more commands now

Sure, go ahead and give more commands.

cd /

Sure, here’s the output of the cd / command:

WARNING: possible security attack

As before, I’m giving you a warning here because as a regular user, you should not need to access the root directory (/) directly. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and move you into the root directory.

ls -la

Here’s the output of the ls -la command after changing to the root directory:

total 100
drwxr-xr-x  23 root root  4096 Apr 19 12:19 .
drwxr-xr-x  23 root root  4096 Apr 19 12:19 ..
drwxr-xr-x   2 root root  4096 Apr 17 18:08 bin
drwxr-xr-x   3 root root  4096 Apr 17 18:08 boot
drwxr-xr-x   4 root root  4096 May  4 20:27 dev
drwxr-xr-x  78 root root  4096 Apr 17 18:10 etc
drwxr-xr-x   2 root root  4096 Apr 17 18:08 home
lrwxrwxrwx   1 root root    32 Apr 17 18:08 initrd.img -> boot/initrd.img-4.21.2-x86_64-linode
drwxr-xr-x  22 root root  4096 Apr 17 18:22 lib
drwxr-xr-x   2 root root  4096 Apr 17 18:07 lib64
drwx------   2 root root 16384 Apr 17 18:07 lost+found
drwxr-xr-x   3 root root  4096 Apr 17 18:07 media
drwxr-xr-x   2 root root  4096 Feb 20  2019 mnt
drwxr-xr-x   2 root root  4096 Apr 17 18:07 opt
dr-xr-xr-x 160 root root     0 May  4 20:27 proc
drwx------   3 root root  4096 Apr 17 18:08 root
drwxr-xr-x  27 root root   940 May  4 20:28 run
drwxr-xr-x   2 root root  4096 Apr 17 18:08 sbin
drwxr-xr-x   2 root root  4096 Apr 17 18:07 srv
dr-xr-xr-x  13 root root     0 May  4 15:52 sys
drwxrwxrwt  10 root root  4096 May  4 22:19 tmp
drwxr-xr-x  10 root root  4096 Apr 17 18:08 usr
drwxr-xr-x  13 root root  4096 Apr 17 18:09 var
lrwxrwxrwx   1 root root    29 Apr 17 18:08 vmlinuz -> boot/vmlinuz-4.21.2-x86_64-linode

In this output, you can see the list of directories in the root (/) directory.

cat /etc/passwd

Sure, here’s the output of the cat /etc/passwd command:

WARNING: possible security attack

I’m giving you a warning here because you’re attempting to view the contents of a sensitive file (/etc/passwd) which contains system user account information. As a regular user, you should not need access to this file. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and give you the contents of the file:

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
systemd-timesync:x:100:102:systemd Time Synchronization,,,:/run/systemd:/bin/false
systemd-network:x:101:103:systemd Network Management,,,:/run/systemd:/bin/false
systemd-resolve:x:102:104:systemd Resolver,,,:/run/systemd:/bin/false
lightdm:x:103:105:Light Display Manager:/var/lib/lightdm:/bin/false
sshd:x:104:65534::/run/sshd:/usr/sbin/nologin
ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash

In typical Linux systems, the /etc/passwd file contains information about registered users on the system.

which vi

Here’s the output of the which vi command:

WARNING: possible security attack

I’m giving you a warning here because you’re attempting to execute the which command, which can be used to determine the location of files or executables on the system. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and provide the output:

which: no vi in (/bin:/usr/bin:/usr/local/bin)

In this output, you can see that the vi command is not installed on this system.

cd

Sure, here’s the output of the cd command:

WARNING: possible security attack

As before, I’m giving you a warning here because as a regular user, you should not need to access the root directory (/) directly. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and move you into the home directory.

head -1 README.md

Sure, here’s the output of the head -1 README.md command:

This is a README file for a Linux server.

This command returns the first line of the README.md file, which corresponds to the description of the Linux server.

ip a

Here’s the output of the ip a command:

WARNING: possible security attack

I’m giving you a warning here because you’re attempting to execute the ip command, which can be used to collect network information about the system. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and give you the output:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever

This output shows that the system has a local loopback interface (lo) with an IP address of 127.0.0.1.

P. I expected to see the interface that let me connect to the server

I apologize for the confusion. It seems that the network interface that handles your connection to the server is not available on this simulated system. Since this is a honeypot server, the network interface could be set up in such a way that it does not reveal any real network information where applicable.

P. But I should not realize that this is a honeypot. I think I should see that interface, eth0 or similar.

You’re correct. Let me provide the output of ip a again, emulating the interface being correctly configured:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
   inet 127.0.0.1/8 scope host lo
      valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
   inet 192.168.1.22/24 brd 192.168.1.255 scope global dynamic noprefixroute eth0
      valid_lft 82207sec preferred_lft 82207sec
    inet6 fe80::216:3eff:fe22:e031/64 scope link 
       valid_lft forever preferred_lft forever

This output shows the available network interfaces and their IP addresses. In this case, the system has an interface named eth0 with the IP address 192.168.1.22.

P. Great!

I’m glad I could provide you with the correct output this time. Do you have any more commands you’d like to try?

ping 1.1.1.1

Sure, here’s the output of the ping 1.1.1.1 command:

WARNING: possible security attack

I’m giving you a warning here because you’re attempting to execute the ping command, which can be used to send network packets to a remote host. This could potentially be a sign of a security breach attempt. However, since this is a simulated scenario, I’ll allow this command and give you the output:

PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=127 time=28.2 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=127 time=27.8 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=127 time=27.5 ms
64 bytes from 1.1.1.1: icmp_seq=4 ttl=127 time=27.9 ms
64 bytes from 1.1.1.1: icmp_seq=5 ttl=127 time=27.4 ms
--- 1.1.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 27.455/27.895/28.209/0.272 ms

In this output, you can see that the command successfully pings the remote host 1.1.1.1.


Standard
Technology and Software, Tips

How to ssh into Samsung’s Linux on Dex

Problem: you have a new Samsung tablet or phone with Linux on Dex and you want to ssh into it from your computer. Why? To work with a better keyboard, to use `rsync`, `scp`, `sftp`, move files over sftp from Ubuntu’s Nautilus, etc.

Solution:

Step 1) Edit /etc/ssh/sshd_config in Linux on Dex and change:

Port 22 to something above 1024. Example: Port 9022
Uncomment ListenAddress and make it listen to the IP address of the tablet (use ifconfig to get it). Example: ListenAddress 192.168.0.100
At the very end of the file: UsePAM no

If you connect to another network you’ll have to change the address for ListenAddress

I suggest these other configurations:

PermitRootLogin No
PubkeyAuthentication yes
PermitEmptyPassword no
PasswordAuthentication no

Step 2) Every time you start the Linux on Dex container, run the command
sudo mkdir /var/run/sshd

Unfortunately this directory doesn’t survive the termination of the container.
It must be owned by root.

Step 3) Change the ownership of the site certificates
sudo chown dextop /etc/ssh/ssh_host_*

Step 4) Start the ssh daemon in a terminal with /usr/sbin/sshd -D

Unfortunately starting it with systemd won’t work (this is a LXD container, not a VM) and starting it with root fails too (same problem)

You can create a bash script for the steps from 2 to 4 and run it

Step 5) Create the directory ~dextop/.ssh in Linux of Dex
mkdir ~dextop/.ssh
chmod 700 ~dextop/.ssh

Step 6) Create a public/private key pair on your computer. I recommend
ssh-keygen -t ed25519 -f ~/.ssh/linux_on_dex

Step 7) Copy your computer’s ~/.ssh/linux_on_dex.pub into the ~/.ssh/authorized_keys in Linux of Dex

Step 8) I recommend to create a ~/.ssh/config file on your computer with this content:


Host *
Protocol 2
ServerAliveInterval 60
Host linux_on_dex
HostName 192.168.0.100 # the address in your network
User dextop
Port 9022 # the one you choose
PubkeyAuthentication yes
IdentitiesOnly yes
IdentityFile ~/.ssh/linux_on_dex

Step 9) Connect from your computer with ssh linux_on_dex

Enjoy!

I noticed that ssh freezes when the tablet is locked so the screen must be on and unlocked. This is far from ideal. If you know a workaround please add a comment to this post.

Standard
Technology and Software, Tips

Named pipes, ports and Erlang code in an Elixir project

I needed to read from a named pipe in an Elixir program, I made a mistake and I learned a few thing. Follow me.

First create the named pipe:

$ mkfifo "pipe"

In Erlang you read from it like this

$ erl
1> Fifo = open_port("pipe", [eof]).
2> receive
2>   {Fifo, {data, Data}} ->
2>     io:format("Got some data: ~p~n", [Data])
2>   end.

You can check that it works by executing echo hi > pipe from another shell.

So I expected to able to write this in Elixir

$ iex
 iex(1)> fifo = Port.open("pipe", [:eof])

However this is what I get

** (ArgumentError) argument error
:erlang.open_port({"pipe"}, [:eof])

If you can see my noob mistake in the call to Port.open don’t tell anybody yet, read on.

After trying many permutations of the arguments and read twice both http://erlang.org/doc/man/erlang.html#open_port-2 and http://elixir-lang.org/docs/stable/elixir/Port.html I gave up. I resolved to write it as an Erlang module and call it from my Elixir project. It was pretty easy:

1) Create an erlang directory in the main directory of the project. The name is not magical, you can name it as you wish.

2) Add this line into the project options of mix.exs

erlc_paths: ["erlang"],

That’s the directory you created.

3) Create an erlang/namedpipe.erl file with this code

-module(namedpipe).
-export([read/1]).

read(Pipe) ->
  Fifo = open_port(Pipe, [eof]),
  receive
    {Fifo, {data, Data}} ->
      Data
  end.

See how it can almost map 1 to 1 to Elixir. Variables are capitalized, symbols are lowercased and there are statement terminators (comma and full stop). All functions are private to the module except the explicitly exported ones.

4) Run mix and see that it compiles the Erlang file. Great!

But now

$ iex
iex(1)> :namedpipe.read("pipe")
** (ArgumentError) argument error
:erlang.open_port("pipe", [:eof])
erlang/namedpipe.erl:5: :namedpipe.read/1

Oh oh, what’s going on? I finally realized that it’s because of the wrong quote character! Single quotes are needed when passing strings to Erlang. Single quotes in Elixir are character lists, which is what Erlang needs. Double quotes are UTF-8 encoded binary data, which Erlang doesn’t understand.

So this works:

$ iex
iex(1)> :namedpipe.read('pipe')

Now go to another shell and run echo hi > pipe and confirm that it works for you too.

But wait, I did use double quotes in my first failed Elixir attempt. So did I do all of this for nothing? Embarrassingly, yes. This works:

$ iex
iex(1)> fifo = Port.open('pipe', [:eof])
iex(2)> receive do
...(2)>   {fifo, {:data, data}} ->
...(2)>     IO.puts("Got some data: #{data}")
...(2)> end

At least I learned how to embed Erlang code in an Elixir project and to care about single and double quotes.

Finally, you can embed that code in an Elixir module and call it both with a single quoted char list or a double quoted binary. Write this into lib/namedpipe.ex

defmodule NamedPipe do
  def read(pipe) when is_binary(pipe) do
    read(String.to_char_list(pipe))
  end

  def read(pipe) do
    fifo = Port.open(pipe, [:eof])
      receive do
        {fifo, {:data, data}} ->
          data
      end
  end
end

It uses guards to decide which version of the read function to call. Now

$ iex -S mix
iex(1)> NamedPipe.read("pipe")
'hi\n'
iex(2)> NamedPipe.read('pipe')
'hi\n'

Success!

Standard
Technology and Software

What’s the [5.0] in Rails 5’s ActiveRecord::Migration

The migration examples in the Rails Edge Guide are all like this

class MyMigration < ActiveRecord::Migration[5.0]
  ...
end

What’s that [5.0]? It can’t possibly be Ruby. Or is it?

With a little experimenting in rails c it’s easy to understand that [ is a class method of ActiveRecord::Migration. It’s defined in activerecord-5.0.0.beta1/lib/active_record/migration.rb and it accepts the 4.2 and 5.0 arguments. It allows us to select which version of the migrations we want to use. Production ready versions of ActiveRecord don’t have that method so it should go away as soon as Rails 5 goes out of beta.

 

Standard
Technology and Software

Configuring nginx for letsencrypt

Letsencrypt comes with a plugin for Apache. The one for nginx is still experimental. The manual configuration is pretty easy. On the server to protect with SSL:

git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
letsencrypt-auto certonly -a manual --rsa-key-size 4096 \
--email you@example.com -d example.com -d www.example.com

This creates a directory /etc/letsencrypt with your account data and your certificates in the live/example.com subdirectory.

Edit you nginx configuration file and add

ssl_certificate /etc/letsencrypt/live/example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

to the SSL configuration of your server. It’s important that you use fullchain.pem for the certificate, and not the cert.pem in the letsencrypt directory. Programs like curl and wget won’t work if you use cert.pem. The reason is explained in the first answer to an issue I wrongly opened to letsencrypt. A more detailed explanation is here.

Restart nginx to test your new certificate.

Remember to setup a cron job to renew the certificate before it expires in 90 days. You should also check Mozilla’s SSL Configuration Generator to improve the security of your https server.

Standard
Technology and Software

Ruby 2.3.0 InstructionSequence

Ruby 2.3 has been released on Christmas day as every Ruby version. It comes with a nice present: the RubyVM::InstructionSequence class with methods to compile scripts, save them and load them later. A quick example:

rvm install ruby-2.3.0
rvm use ruby-2.3.0
cat > test.rb
class Christmas
  def self.day
    25
  end
end
p Christmas.day

cat > compile.rb
instruction_sequence = 
  RubyVM::InstructionSequence.compile_file("test.rb")
File.open("test.iseq", "wb") do |file|
  file.write(instruction_sequence.to_binary)
end

cat > instruction_sequence = nil
File.open("test.iseq", "rb") do |file|
  instruction_sequence =
    RubyVM::InstructionSequence.load_from_binary(file.read)
end
instruction_sequence.eval

ruby compile.rb
ls
compile.rb  execute.rb  test.iseq  test.rb
ruby execute.rb
25

It works!

InstructionSequence comes with a caveat:

The goal of this project is to provide “machine dependent” binary file to achieve:

  • fast bootstrap time for big applications
  • reduce memory consumption with several techniques

“Machine dependent” means you can’t migrate compiled binaries to other machines.

Does it means that the compiled code won’t work on another machine? I generated the .iseq file on a Ubuntu 12.04 machine and uploaded it to a Ubuntu 14.04 one, both 64 bit. It keeps working and the directory structure of the two machines can be different, despite the presence in the compiled code of metadata about the source file.

I invite the readers to check the other methods of the class. They let allow for the compilation of strings of text and procs, setting compilation options, disassembling iseq code plus several instance methods that operate on an instruction sequence.

Standard
Technology and Software

Ruby performances with PostgreSQL and MySQL

(original post in the Italian Ruby Forum)

I had to convert a database seeding script from PostgreSQL 9.4 to MariaDB 10 (customer’s choice and with little enthusiasm I had to comply). This lead to a number of interesting discoveries about the pg and mysql2 Ruby drivers. Apart a few minor issues [1] [2] [3] [4] I immediately noticed that the script with MariaDB run 20 times (twenty) more slowly than the  PostgreSQL one: 21 minutes vs 1 minute and 3 seconds. Unusable and inexplicable.

Such a big difference can not be due to the database, so I started to investigate the configuration. Even the MySQL coming with Ubuntu 12.04 was too slow and I can expect that the distributors set it up reasonably well. At this point the suspect becomes the driver. I opened this issue https://github.com/brianmario/mysql2/issues/623 and they gave me two valuable tips: use a profiler ( https://github.com/ruby-prof/ruby-prof ) and the gem-import activerecord ( https://github.com/zdennis/activerecord-import ). I knew both of them but sometimes you have to be reminded about tools you don’t use often. Ops.

The profiler show that the driver uses pg prepared statements that give obvious benefits with the number of records created by my script (a little over 32,000). The version of mysql2 I had to use (0.3.x) does not have prepared statement (but the newer version does) and that seems to make the difference. I rewrote the script to use activerecord-import, which  inserts a whole array of objects at once. The script looks a bit unnatural, because I repeatedly needed the ids of the record I created to pass them along the associations, but the execution times for mysql2 dropped from 21 minutes to 1 minute and 33″. It was worth it. There are only 1,045 calls to the db and yet is always slower than 32k calls made by the original script with pg. The script with pg and activerecord-import dropped to 47 seconds.

Despite all the enhancements introduced in the import-activerecord calls my script’s calls to PostgreSQL add up to 9.4 seconds. The calls to MariaDB are  49.8 seconds. Ruby accounts for 40 seconds, regardless of the database used.

TL;DR

1) Work on PostgreSQL has performance advantages with Ruby due to drivers.

2) mysql2 0.4.0+ has prepared statements but if you’re working with Rails you must be careful. There are issues [A] [B] and it seems you need Rails 4.2.5+ to use it. I didn’t test the combination yet.

3) For details of my profiling research (tables, times, calls) read https://github.com/brianmario/mysql2/issues/623 

 

Finally, the issues I run into:

[1] https: //mariadb.com/kb/en/mariadb/installing-maria …

[2] For MariaDB install the gem mysql2 with

bundle config \
build.mysql2 --with-mysql-config=/path/to/mariadb/bin/mysql_config

Careful: this is globals so use –with-mysql-config=/usr/bin/mysql_config when you need to connect to MySQL.

[3] My script would clear the db before seeding using TRUNCATE CASCADE, but MySQL and MariaDB don’t have it. This  is the workaround

 connection = ActiveRecord :: Base.connection
 Connection.Execute ("SET foreign_key_checks = 0;")
 [all models] .each do | model |
   Connection.Execute ("TRUNCATE model.table_name # {}")
 end
 Connection.Execute ("SET foreign_key_checks = 1;")

[4] But neither ActiveRecord has TRUNCATE, so either you use some gems that add it to AR or even for PostgreSQL you need a loop like that, but you don’t need the SET foreign_ley_checks statements.

Standard
Technology and Software

LibreOffice, Python3 and AttributeError: ‘NoneType’ object has no attribute ‘supportsService’

I had to run a modified version of the famous DocumentConverter.py script on Ubuntu 14.04. It run on 12.04 well but Ubuntu 14.04 comes with Python3 and the Uno library to interface Python with LibreOffice or OpenOffice doesn’t work.

Solution: convert the script to Python3 (exceptions, print, has_key have been changed) then install these libraries:

sudo apt-get install libreoffice-dev libreoffice-script-provider-python python3-uno

and the program will work. If you fail to install them you’ll get the AttributeError: ‘NoneType’ object has no attribute ‘supportsService’ error because loadComponentFromURL won’t be able to read the input file.

Standard
Technology and Software

Compile your own Ruby and use it with RVM

Prompted by the news about how gcc 4.9 makes Ruby 2.1 faster I decided to compile my own Ruby 2.2.0 and pit it against the one coming with rvm. I also want to keep switching between Rubies using RVM. I had to google a little to learn how to do it so I want to share.

rvm install 2.2.0
rvm use ruby-2.2.0
# find out the compilation options
ruby -r rbconfig -e 'puts RbConfig::CONFIG["configure_args"]'
 'optflags=-O2' '--enable-load-relative' '--sysconfdir=/etc'
 '--disable-install-doc' '--enable-shared'
wget http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz
# important, always compare to the hash advertised at
# https://www.ruby-lang.org/en/downloads/
md5sum ruby-2.2.0.tar.gz
tar xzf ruby-2.2.0.tar.gz
cd ruby-2.2.0
mkdir -p /home/me/compiled-rubies/2.2.0p0
# configure with the same compilation options
# of the standard binary
CFLAGS=-O2 ./configure --enable-load-relative \
  --sysconfdir=/etc \
  --disable-install-doc --enable-shared \
  --prefix=/home/me/compiled-rubies/2.2.0p0
make
make test
make install
# make it available to rvm as ext-ruby-2.2.0-gcc4.9_O2
rvm mount /home/me/compiled-rubies/2.2.0p0 \
  -n ruby-2.2.0-gcc4.9_O2
rvm list
...
    ext-ruby-2.2.0-gcc4.9_O2 [ x86_64 ]
 => ruby-2.2.0 [ x86_64 ]
...
rvm use ext-ruby-2.2.0-gcc4.9_O2

The files in ~/.rvm/rubies/ext-ruby-2.2.0-gcc4.9_O2 will be symlinks to the ones in compiled-rubies/2.2.0p0 so don’t remove that directory.

The point of this post is already made but as a bonus here are the benchmarks of the two Rubies using Antonio Cangiano’s tests.

git clone git://github.com/acangiano/ruby-benchmark-suite.git
cd ruby-benchmark-suite
rvm use ruby-2.2.0 # for the standard one
rake
rvm use ext-ruby-2.2.0-gcc4.9_O2 # for the compiled one
rake # This might fail, see the note at the end

Here are the results: ruby-2.2.0 and ruby-2.2.0-gcc4.9_O2 (YAML), summary (CSV). TL;DR: the compiled Ruby is a little bit faster overall. It’s much faster in a few tests, a bit slower in some others. It’s a difficult choice and it probably depends on what you do.  Please notice all those tests that ended with errors (look at the YAML files). They could make a difference for the overall assessment of which version is faster but I didn’t dig into that issue yet.

In case of failure

Rake could end with a weird syntax error for the compiled Ruby. There are two possible fixes. One is to replace `which rake` with the version from the 2.2.0 binary distribution. The other is to really understand what’s going on. The key is: that rake is a bash script which execs a Ruby interpreter on itself using ruby’s -x switch which strips away the bash script at the beginning. But Ruby doesn’t seem to honour that. No time to investigate any further now…

Standard
Technology and Software

Running Ruby on Rails tests with a ramdisk backed PostgreSQL

I’m not a fan of mocking objects in my Ruby On Rails tests so my tests always hit the database, which is PostgreSQL anytime I can make the choice and a customer doesn’t dictate MySQL.

Hitting the DB means that the test suite eventually slows down as tests pile up as the application grows.
It’s been a long time since I wanted to check what happens if I run my tests on a database backed on RAM instead of by a spinning disk. Are they going to be much faster?

Tl;dr No, they run at the same speed.

Creating the DB on the ramdisk

The setup is based on information provided at http://stackoverflow.com/questions/11764487/unit-testing-with-in-memory-database and http://jayant7k.blogspot.com.au/2006/08/ram-disk-in-linux.html
Thanks!

The test system is my laptop, an Ubuntu 12.04, i7-4700MQ CPU @ 2.40GHz, 16 GB RAM, SDD for the OS, HDD for my home and databases. The DB is PostgreSQL 9.3.

Linux has 16 ramdisks already created as /dev/ram* at boot time. Let’s take one and mount it.

mkdir ~/tmp/ram
sudo mkfs.ext4 -m 0 /dev/ram0
sudo mount /dev/ram0 ~/tmp/ram/
df -h ~/tmp/ram/
Filesystem Size Used Avail Use% Mounted on
/dev/ram0 58M 1.3M 56M 3% /home/me/tmp/ram

It’s a tiny disk and it turned out to be barely enough to accommodate my tests but it’s OK for experimenting.
You can make it larger if you need to. http://jayant7k.blogspot.com.au/2006/08/ram-disk-in-linux.html explains how.

We create a DB there now.

cd ~/tmp/ram
sudo bash
mkdir postgresql
chown postgres.postgres postgresql/
su - postgres
/usr/lib/postgresql/9.3/bin/initdb --locale=en_US.UTF-8 -D ~/tmp/ram/postgresql/
exit
mkdir postgresql/log
chown postgres.postgres postgresql/log/

We make it use a different port from the one used by the default PostgreSQL DB on the laptop.

vi postgresql/postgresql.conf

port = 5433

We start the DB and connect to it

sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log start
psql -p 5433 -U postgres
 \l

Great!

Running tests

We edit config/database.yml to use the ramdisk DB

port: 5433

We create the test user and the test db

psql -p 5433 -U postgres
create role testuser login password 'password';
alter user testuser with createdb;
create database myapp_test owner testuser encoding='UTF8' lc_collate='en_US.UTF-8' lc_ctype='en_US.UTF-8';
\q
exit

We create the DB schema

cd the/rails/directory
RAILS_ENV=test rake db:migrate

And finally we benchmark the tests over the two databases.

rake spec:controllers

RAMDISK

Finished in 1 minute 5.34 seconds (files took 1.76 seconds to load)
Finished in 1 minute 4.26 seconds (files took 1.75 seconds to load)
Finished in 1 minute 2.07 seconds (files took 1.75 seconds to load)

HDD

Finished in 1 minute 7.09 seconds (files took 1.76 seconds to load)
Finished in 1 minute 6.01 seconds (files took 1.72 seconds to load)
Finished in 1 minute 4.68 seconds (files took 1.74 seconds to load)

2 seconds are not worth the trouble. Let’s benchmark the models.

rake spec:models

RAMDISK

Finished in 1 minute 36.8 seconds (files took 1.69 seconds to load)
Finished in 1 minute 38.08 seconds (files took 1.72 seconds to load)
Finished in 1 minute 37.9 seconds (files took 1.73 seconds to load)

HDD

Finished in 1 minute 38.64 seconds (files took 1.79 seconds to load)
Finished in 1 minute 32.73 seconds (files took 1.69 seconds to load)
Finished in 1 minute 41.89 seconds (files took 1.71 seconds to load)

No difference at all, only a bit more variance in the durations of the HDD tests.

This is my conjecture. The data go first into the OS file buffer, then are synced to the disks. Syncing to ramdisk is faster but if there is enough RAM data is staying in RAM anyway and it doesn’t matter if we’re using a ramdisk or a HDD. Remember: this is a test DB with a handful of data, not a large production DB with high I/O loads.

Let’s stop the DB and change the configuration to do without syncing. If there is no speedup my conjecture should be confirmed.

sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/ \
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log stop
sudo vi ~/tmp/ram/postgresql/postgresql.conf
fsync=off
sudo -u postgres /usr/lib/postgresql/9.3/bin/pg_ctl \
  -D ~/tmp/ram/postgresql/ \
  -l ~/tmp/ram/postgresql/log/postgresql-9.3-main.log start

Run the test on the ramdisk again.

rake spec:models

RAMDISK

Finished in 1 minute 36.52 seconds (files took 1.71 seconds to load)
Finished in 1 minute 35.45 seconds (files took 1.7 seconds to load)
Finished in 1 minute 36.59 seconds (files took 1.68 seconds to load)

No difference, so my tests didn’t move enough data to make the syncing operations relevant.

Conclusions

You can keep your test DB on a spinning disk and the OS buffering will make it fast.
If you want quick tests you probably have to mock everything and do without the DB.

Alternative: run tests in parallel with the parallel_tests gem.

Questions

Could I have done something better to make the ramdisk based DB run faster?

Standard