Extract Speed from DB2

  • Hello again all, I am back again.

    This time with a new company, new city, and new role!

    I am now in a focused SSIS role, on a BI group, with a path toward advanced SSAS work. (yay!)

    So here is my first question: does anyone have any tricks to improve the speed of data pulls from DB2? These are data dumps from a source system to a staging table set on SQL Server 2008 R2.

    Presently it uses the Microsoft OLE DB Provider for DB2 and pulls very slowly: 5-10 minutes for 1 million rows, and overall we are talking close to 100 million rows. This is just one of three source systems; the others are either Oracle or SQL Server, and those connections extract at far greater speeds, which makes me less inclined to suspect the hardware on the destination server.

    Any thoughts?

    Thanks tons!

  • This is a hard question to answer because there are so many variables.

    Do you have the latest drivers?

    Is the pipe, and everything between you and the DB2 database, big enough to handle your traffic?

    Is your query being throttled at the source?

    Did you remember to use the fast load setting on your destination?

    What packet size setting are you using in your data connection?

  • Yes, it is a hard question to answer, that I am aware of.

    The drivers are those which come with a SQL 2008 R2 installation, so I would guess not the latest drivers.

    I believe the pipe and the components in between are good. I don't have much info on the DB2 machine itself, but other source systems feeding this particular server run at much higher rates of throughput.

    I will have to find out on the throttling at the source, that is one of the things I was wondering about myself.

    It is using the fast load settings.

    Packet size: I am sure it is set to the default.

    Would anyone know if there is a high speed DB2 driver anywhere? I know for example, that for an Oracle system Attunity is a good high speed driver which has helped transfer speed for me in the past.

    Maybe a simpler question would be: what is a reasonable throughput for a data pull from DB2? I have no experience with DB2 elsewhere, so my expectations might be too high.
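
    For reference, the figures quoted at the top of the thread (5-10 minutes per 1 million rows, close to 100 million rows in total) can be turned into a rows-per-second rate and a total duration. This is only back-of-the-envelope arithmetic in Python; it assumes a single serial stream and a constant rate, neither of which the thread confirms.

```python
# Figures quoted in the opening post of the thread.
ROWS_PER_BATCH = 1_000_000
TOTAL_ROWS = 100_000_000


def rows_per_second(minutes_per_batch: float) -> float:
    """Extract rate implied by pulling one batch in the given number of minutes."""
    return ROWS_PER_BATCH / (minutes_per_batch * 60)


def total_hours(minutes_per_batch: float) -> float:
    """Hours needed for all rows, assuming one serial stream at a constant rate."""
    batches = TOTAL_ROWS / ROWS_PER_BATCH
    return batches * minutes_per_batch / 60


for minutes in (5, 10):
    print(
        f"{minutes} min per million rows: "
        f"{rows_per_second(minutes):,.0f} rows/s, "
        f"{total_hours(minutes):.1f} hours for the full pull"
    )
```

    So the current rate is somewhere around 1,700 to 3,300 rows per second, or roughly 8 to 17 hours for the whole pull if nothing ran in parallel.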

  • We have the same problem. A SELECT directly from a linked server is absolutely horrible. For some reason, an OPENROWSET through the same linked server worked a lot better but is still disappointing.

    One thing to make sure of, and this isn't a joke... I'm dead serious, is to make sure that the battery that powers the cache system on the DB2 box (typically an AS400) hasn't run out of juice. It didn't help a huge amount with data transfers but the DB2 system nearly quadrupled in speed because... it finally could use cache.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I guess I'll ask a similar question on this thread. Is there any way to bulk export data from DB2 to a Tab delimited file? One of the "fixes" we made for previous large data/long haul problems (between like servers, though) was to do such an export and Fedex the table backup of the export. Compared to an 'over the line' transfer for the same amount of data, it was a lot faster.

    I'd like to do the same thing locally but no one here (not even the AS400 guys) knows how to export DB2 files (tables) to delimited files.

    --Jeff Moden

  • Jeff, you always manage to give me such surprising answers.

    Battery, hmm, now to find out who in the IT hierarchy of this place could check and tell me that.

    Oh and yes it is an AS400.

  • The driver we use is

    IBM DB2 for i IBMDASQL OLE DB Provider.

    If I am not mistaken there is a download, install, and configuration process to get this driver working. I did not install or configure it so I am not sure. But it seems fairly performant. We have not had substantial issues with it.

  • David.Lester (4/29/2013)


    Jeff, you always manage to give me such surprising answers.

    Battery, hmm, now to find out who in the IT hierarchy of this place could check and tell me that.

    Oh and yes it is an AS400.

    Heh... it's because I'm 3 days older than dirt and if I haven't seen it yet, it's usually because it hasn't happened, yet. 😀

    --Jeff Moden

  • Thank you Daniel, that was a driver I was looking into as well.

  • Heh, well, I am thinking the level of oddness I experience is going to increase radically. This place is 20 times larger than the last place I was at. I think it is safe to say that increases the odds of weirdness.

  • Jeff Moden (4/29/2013)


    I guess I'll ask a similar question on this thread. Is there any way to bulk export data from DB2 to a Tab delimited file? One of the "fixes" we made for previous large data/long haul problems (between like servers, though) was to do such an export and Fedex the table backup of the export. Compared to an 'over the line' transfer for the same amount of data, it was a lot faster.

    I'd like to do the same thing locally but no one here (not even the AS400 guys) knows how to export DB2 files (tables) to delimited files.

    For an AS400, take a look at the CPYTOIMPF and CPYTOSTMF commands. They can copy to the integrated file system that's accessible from a PC. It's been a while since I used an AS400, but I was copying data to text and dbase files in 1996, and it was very fast. If the DB2 is the mainframe version, I have no clue as to how to export data.
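
    As a rough illustration of the CPYTOIMPF suggestion, the Python sketch below only assembles the text of a command that would write a tab-delimited stream file; it does not run anything on the AS400. The library, table, and path (MYLIB, ORDERS, /home/export/orders.txt) are made-up placeholders, and the parameter values shown should be checked against the CPYTOIMPF documentation for the OS release in use.

```python
def build_cpytoimpf(library: str, table: str, stream_file: str) -> str:
    """Assemble a CPYTOIMPF command string for a tab-delimited export.

    Nothing is executed here; the string is meant to be reviewed and then
    run on the AS400 by whoever administers it.
    """
    return (
        f"CPYTOIMPF FROMFILE({library}/{table}) "
        f"TOSTMF('{stream_file}') "
        "MBROPT(*REPLACE) "   # overwrite the stream file if it already exists
        "RCDDLM(*CRLF) "      # end each record with carriage return + line feed
        "STRDLM(*NONE) "      # no quote characters around string fields
        "FLDDLM(*TAB)"        # separate fields with a tab
    )


# Hypothetical names, for illustration only.
print(build_cpytoimpf("MYLIB", "ORDERS", "/home/export/orders.txt"))
```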

  • Another thing to consider, too, is do you have any sort of identifier or key in the DB2 source that could allow you to effectively break your source data into multiple sets/processes in your SSIS export package? If so, you could design your SSIS package(s) so that you can have multiple data flow tasks running concurrently, each with its own specific range of data to extract from the source, likely reducing the overall export time to your stage table.
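
    A minimal sketch of that range-splitting idea, in Python rather than SSIS: it cuts an integer key range into contiguous slices and produces one source query per slice, which could then feed separate data flow tasks. The table and column names (ORDERS, ORDER_ID) are hypothetical, and it assumes a numeric key spread fairly evenly across its range; a skewed key would need a different split.

```python
def split_key_range(min_key: int, max_key: int, parts: int) -> list[tuple[int, int]]:
    """Split the inclusive range [min_key, max_key] into contiguous inclusive slices."""
    if parts < 1 or max_key < min_key:
        raise ValueError("need parts >= 1 and max_key >= min_key")
    total = max_key - min_key + 1
    parts = min(parts, total)           # never create empty slices
    base, extra = divmod(total, parts)  # the first `extra` slices get one more key
    ranges = []
    start = min_key
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def source_queries(table: str, key_column: str, ranges: list[tuple[int, int]]) -> list[str]:
    """One source query per slice, e.g. one per concurrent data flow task."""
    return [
        f"SELECT * FROM {table} WHERE {key_column} BETWEEN {low} AND {high}"
        for low, high in ranges
    ]


# Hypothetical table and key column, for illustration only.
for query in source_queries("ORDERS", "ORDER_ID", split_key_range(1, 100, 4)):
    print(query)
```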

  • Thanks dg.

    Actually, this is connected to a large warehouse staging SSIS process; we are pulling roughly 50 tables from the DB2 source, and it already uses concurrency. We still need to find a way to reduce the extract times beyond this.

  • Ross McMicken (5/1/2013)


    Jeff Moden (4/29/2013)


    I guess I'll ask a similar question on this thread. Is there any way to bulk export data from DB2 to a Tab delimited file? One of the "fixes" we made for previous large data/long haul problems (between like servers, though) was to do such an export and Fedex the table backup of the export. Compared to an 'over the line' transfer for the same amount of data, it was a lot faster.

    I'd like to do the same thing locally but no one here (not even the AS400 guys) knows how to export DB2 files (tables) to delimited files.

    For an AS400, take a look at the CPYTOIMPF and CPYTOSTMF commands. They can copy to the integrated file system that's accessible from a PC. It's been a while since I used an AS400, but I was copying data to text and dbase files in 1996, and it was very fast. If the DB2 is the mainframe version, I have no clue as to how to export data.

    Thanks, Ross. I'll check it out.

    --Jeff Moden

  • Jeff, you should also check to see if the AS400 you need to extract from has any third-party query tools on it, or Query/400. Those generally have the ability to save data to the area that's accessible from other computers.
