Tuesday, December 23, 2014

Connecting to Denodo Virtual DataPort from Go

Perhaps I can reuse the setup of my previous post for another database connectivity experiment, this time from Go.

I have very little knowledge of Go. It seems like a pragmatic language with good tooling. And there exists a PostgreSQL driver written in pure Go that I can use.

The first step is to install Go and set up the environment. Once ready, the PostgreSQL driver can be installed in the workspace with the command go get github.com/lib/pq.

Then we create a project named github.com/danidiaz/hellopq in the workspace, with the following Go file:

The code was adapted from the one shown in this video. The connection parameters were taken from here.

Executing the program, it seems to work!

But I have a remaining doubt: how to properly handle the nullability of the last_updated column?

Monday, December 22, 2014

Connecting to Denodo Virtual DataPort from Haskell

My employer Denodo has released an "Express" version of its data virtualization platform. The Virtual DataPort component lets you do cool things like joining tables that reside in completely different databases (say, one table in Oracle and another in MySQL). And it comes with a boatload of connectors for a wide range of data sources.

I'm interested in accessing all that data goodness from Haskell. There are a couple of ways of doing it.

One would be to use Virtual DataPort's REST interface. That's a valid option and maybe the subject of another post.

Another is to connect through the ODBC interface on port 9996. As it happens, the wire protocol aims to be compatible with that of PostgreSQL, so maybe I could just use Haskell's popular postgresql-simple library. Let's try it out.

Musicbrainz


First, we need an underlying database instance that we can "wrap" with Virtual DataPort. Let's use the Musicbrainz database, a perennial favorite for testing. We can download it as a virtual machine and run it on Virtual Box. I usually set the guest networking to NAT and only forward PostgreSQL's port, which is 5432.

Virtual DataPort


Then we download and install Denodo Express, start the Virtual DataPort server, and launch the graphical administration tool.

From the administration tool's GUI, we have to import a few tables from the Musicbrainz database. First we create a JDBC data source:


(The Musicbrainz database is called musicbrainz_db, the user/password is musicbrainz/musicbrainz.)

And then import the artist_name and release tables:


Haskell


Ok, now for the Haskell part. postgresql-simple depends on native libraries. In CentOS 7 for example, we'll have to install the package postgresql-devel using yum, before invoking cabal install postgresql-simple.

Once we have the required dependencies, we can dabble with the following snippet (notice that we are connecting to Virtual DataPort, not directly to Musicbrainz):

...and it works! A sweet stream of data flows from the Musicbrainz database, passes through Virtual DataPort and arrives at our Haskell client.

And thanks to the virtualization capabilities of Virtual DataPort, that stream could potentially represent the combined flow of a number of affluents.