Suprtool

Migrating Data to PostgreSQL

By Aaron Holmes, Robelle Junior Programmer

There comes a time in every programmer's life when they are faced with the problem of how to store and retrieve data on a larger scale and a simple flat file will not suffice. Being a web developer and open source advocate, I have always turned to mySQL for my DBMS needs. However, I have been hearing more and more good things about PostgreSQL and have finally decided to give it a try to see what all the hype was about.

PostgreSQL is an Object-Relational database management system that supports almost all SQL constructs, including subselects, transactions, and user-defined types and functions. It is free to download, use, and modify provided the included copyright notice is included. Also, it is able to run under many OS types, include Unix variants and Windows machines under the Cygwin environment.

For my testing purposes I used an older AMD Duron 800mhz system and installed FreeBSD 4.5 Stable on it. I then compiled Apache 1.3.26 with PHP 4.2.1 and PostgreSQL 7.2.1 PostgreSQL itself was very easy to install - a simple CVSup updated of the linux ports directory allowing me to compile a FreeBSD compatible version of the newest PostgreSQL source files.

To obtain the source files and help documentation for PostgreSQL you can point your browser over to www.postgresql.org and select a mirror closest to you.

With the software compiled and running, I created a new database. This was handled by the included createdb command which accepts the syntax createdb databasename.

I needed some test data for my new database so I decided to use some old data I had laying around from an address book program I used to use, which was stored in a comma delimited text file (Suprtool creates such files easily from TurboIMAGE data on the HP 3000).

Now seemed like a good time to build a new table within the database. To do this I use the psql command which is also included with PostgreSQL. To run psql type psql mydbname from the command prompt. Psql is the interactive terminal program which allows you to interactively enter, edit, and execute SQL commands directly. The actual code to create a table within psql is as follows:

    CREATE TABLE addressbook (
    	firstname           	 varchar(25),
    	lastname            	 varchar(25),     
    	email         	     	 varchar(20),          
    	phone                	 int(11),    
	age		    	 int(2), 
    );

I built the table to mimic that of the data I was using, included were a couple int columns and several varchar columns of varying size. The design of the table was overall very simple and resembled many of the tables I have created in mySQL for online use.

The next step was to populate the table with the data from the text file. This was easy to do by using the COPY command, which loads large amounts of data (either character or Binary) from flat-text files. You tell COPY which file to load by specifying FROM '/directory/file'. By default, COPY uses a tab ("\t") character as a delimiter between fields, I had to change this behavior by including USING DELIMETERS ','

The COPY command, however, has a few potential pitfalls. Such as if you don't have enough columns in the file, you will get an error, but if you have too many columns you will get a warning only and the extra columns are ignored. Also remember that COPY is executed as a transaction, meaning that a single error in the data causes an undo of the entire import operation. As always it is good practice to read over the intricacies of the COPY command in the PostgreSQL help docs (http://www.postgresql.org/idocs/).

It is entirely possible to access, modify and remove the data from your database completely within the psql tool, but I often find it much more desirable to use PHP access the database via the web. PHP, when compiled with the -with pgsql directive, comes with a vast library of functions for interacting with PostgreSQL (much the same as with mySQL).

Overall, the migration to PostgreSQL went smoothly without any unwarranted surprises. Of course this would not have been possible without the help of a few resources. Kudos to the PostgreSQL team for supplying the online documentation files on their website. I also used two books to help me:

"PostgreSQL Developer's Handbook" by Evald Geschwinde and Hans-Jurgen Schonig. Published by Sams, 2002.

"Beginning Databases with PostgreSQL" by Richard stones and Neil Matthew. Published by Wrox Press, 2001


For more information on exporting to PostgreSQL, read Bob Green's HPWorld 2002 paper on "Transforming TurboIMAGE Data.

An alternative to PostgreSQL is SAPdb, read our introduction, comparision and evaluation.

To learn more about migrating data to new environments, visit the Robelle Migration Web Center.

You might also be interested in two Suprtool articles on exporting 3000 data to mySQL and exporting to Excel, and exporting to Oracle.

Or another article on exporting to XML, with an introduction to XML for newcomers.