Friday, January 2, 2009

Erlang and PostgreSQL

Despite the lack of blog posts, the last four months at heysan have been quite busy. Most of December was spent developing a new web application, entirely in Erlang, which compliments our existing service. This application makes extensive use of PostgreSQL 8.3, which eventually lead me to resurrecting and releasing my own database driver, available in this mercurial repository:

http://glozer.net/src/epgsql

Note: this is a 'static' HTTP repository, so if you're using a version of mercurial older than 1.1 the URL is static-http://glozer.net/src/epgsql.

Yes, I've inflicted yet-another PostgreSQL driver with an incompatible API (and even conflicting module names!) on the Erlang community. I'm very sorry =) But please, hear me out! I'd like to present my case and hopefully convince you to use my driver. While I must admit implementing PostgreSQL's network protocol was rather fun and entertaining, I did attempt to use the existing drivers before embarking on the project of writing a new one.

pgsql (originally from jungerl) is a small and simple driver, which was also forked by Process One for use in ejabberd. I was able to get up and running quickly with this driver, since it supports the "trusted" auth mechanism and has a simple API. On the other hand psql:squery/2 drops null columns from results and psql:pquery/3 fails during row decoding when nulls are present. All columns are returned as Erlang lists, which results in a lot of escaping overhead for bytea columns.

psql (originally from Erlang Consulting) is quite the opposite of pgsql with a very complicated API that requires you to configure connection pools using the application environment. This driver is also prone to hanging in various states, for example when trying to use an authentication method other than MD5. On the bright side it does decode column values into native Erlang types, but fails on very simple queries such as: psql:sql_query(C, "select 1, null").

Since neither driver was suitable for my needs, I spent Christmas vacation writing a new one, epgsql, which features a simple API, converts common db types to native Erlang types, and uses PostgreSQL's binary format for common types. For example the SQL statement "select 1" will return a numeric 1, "select true" will return the atom true, and "select 'hi'::bytea" will return a binary.

This type conversion is used in the extended query protocol, i.e. pgsql:equery and epgsql:parse, epgsql:bind, epgsql:execute:


{ok, C} = pgsql:connect("localhost", []),
{ok, Cols, Rows} = pgsql:equery(C, "select 1, null"),
[{column,<<"?column?">>,int4,4,-1,0}, {column,<<"?column?">>,unknown,-2,-1,0}] = Cols,
[{1,null}] = Rows.


The simple query protocol used by epgsql:squery returns values (other than null which is always the atom 'null') as Erlang binaries:


{ok, C} = pgsql:connect("localhost", []),
{ok, Cols, Rows} = pgsql:squery(C, "select 1, null"),
[{column,<<"?column?">>,int4,4,-1,0}, {column,<<"?column?">>,unknown,-2,-1,0}] = Cols,
[{<<"1">>,null}] = Rows.


Documentation for epgsql is a little sparse at the moment, but the README file covers the basics. I intend to continue developing and supporting this driver so please send me bug reports and enhancement requests!

4 comments:

Jonke said...

Could you please expand a little bit what your needs was because I didn't get it from the post. A nice little conclusion maybe?

ppolv said...

Hello,
this looks a lot like a pgsql branch I started some months ago:

https://forge.process-one.net/browse/ejabberd-modules/pgsql/branches/gen_fsm_rewrite

It also performs data encoding/decoding for common types, can execute batch operations, and works on binaries rather than lists. It has the option to work with the postgres binary or text wire protocol (but I'm going to deprecate the binary wire protocol, in my tests it didn't provide much benefit and caused some incompatibility problems).
Since this week it also support prepared statements.

I agree, having so many incompatible driver implementations is just too confusing.

Perhaps we could work to define a simple, performance-oriented API, even for different DB backends, something like:

http://github.com/archaelus/edbi/tree/master

nikolaeff said...

Experiencing some problems with mercurial repository:

$ hg clone http://glozer.net/src/epgsql
destination directory: epgsql
abort: 'http://glozer.net/src/epgsql' does not appear to be an hg repository!

Will Glozer said...

Hey nikolaeff, if you're using a version of mercurial older than 1.1, prepend 'static-' to the URL schema, as in static-http://glozer.net/src/epgsql.