MS SQL to Postgres ETL / Replication / Synch

  • I've recently been tasked getting my SQL Server 2008 R2 data into a Postgres 9.2.2 server on Linux.

    My source SB is roughly 100GB in size, probably only 15% of which changes daily. I need to propogate those changes to the Postgres server but I'm having trouble with the speed of the various methods I've tried.

    I've tried using the 32bit Postgres ODCB connection in SSIS, this was very slow. On the order of 100 Rows/sec inserting to Postgres. I could not get the 64bit connection to work at all.

    I've tried using CloverETL and open source ETL platform, performance was again pretty slow, On the order of a 1000 rows/sec inserting to Postgres.

    I've also tried Kettle/Spoon ETL platform, which is getting me around 5000 rows/sec into Postgres, which is still very slow for my dataset.

    I think I'm going to have to resort to a BCP out ---> COPY FROM solution. Does anyone have any guidance or insight into how best to propagate data changes into Postgres?

  • My guess is that the pulling of the data from SQL Server is not the bottleneck in this scenario it is the inserting of data into Postgres as 5000 rows/second is nowhere near the max of what even a low-powered SQL Server can support. It sounds like most scenarios save for the bcp out from SQL Server => COPY FROM into Postgres are resulting in the inserting of one row at a time.

    A quick google search turned up this thread which points to this document on how to best load data into Postgres.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • While using CloverETL - did you use its Postre native loader ? You may also use the classic DBOutputTable with BATCH mode switched on. Both options should give you much better performance.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply