Data warehouse, "large" INSERTs, memory usage

  • We are developing our first data warehouse. When running a developer query (an INSERT of a modest dataset, 4 million rows) the DW server ran out of memory and had to be rebooted! The developer's query was created using a third-party ETL tool.

    The x64 DW server has 8 GB of RAM, but SQL Server 2005 takes a large chunk of it. Task Manager (Physical Memory (K), Available) shows around 200,000 K (i.e. roughly 200 MB) of free memory while the INSERT is running. That seems too little to me. I'm hoping the developer can break up the INSERT statement, but should I also trim the amount of memory allocated to SQL Server?
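
    For reference, if trimming SQL Server's memory does turn out to be part of the answer, the cap is set with sp_configure; the 6144 MB figure below is only a placeholder, not a recommendation:

        -- Cap SQL Server's buffer pool so the OS keeps some headroom.
        -- 6144 MB is a placeholder value; pick one based on what else the box runs.
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;
        EXEC sp_configure 'max server memory (MB)', 6144;
        RECONFIGURE;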

    Here's a bit more info from "winmsd":

    Total Physical Memory: 8,023.28 MB
    Available Physical Memory: 199.13 MB
    Total Virtual Memory: 19.17 GB
    Available Virtual Memory: 11.82 GB
    Page File Space: 11.75 GB

    Barkingdog

  • I'm no DBA, so I won't dwell too much on the techie stuff like expanding the temp space or moving it to a disk with more free capacity, etc. I'll leave that to a real DBA.

    However, something you can do through programming alone is to batch your INSERT. Assuming your INSERT draws data from other table(s), you could, say, just SELECT and INSERT 100k or 500k rows at a time. The simplest way would be to put an IDENTITY column on one of your SELECTed tables and first process only the rows where the IDENTITY column is in the range 1 to 100K, then after the commit, loop round and process IDENTITY 100,001 to 200K, and so on.
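
    As a rough sketch (the table and column names here are made up; adjust them to your schema), the loop might look something like this in T-SQL:

        -- Batch the INSERT in ranges of the IDENTITY column so each chunk
        -- commits on its own instead of running as one huge transaction.
        DECLARE @BatchSize INT,
                @StartId   INT,
                @MaxId     INT;

        SET @BatchSize = 100000;
        SET @StartId   = 1;

        SELECT @MaxId = MAX(RowId) FROM dbo.StagingRows;    -- hypothetical source table

        WHILE @StartId <= @MaxId
        BEGIN
            INSERT INTO dbo.FactRows (Col1, Col2, Col3)     -- hypothetical target table
            SELECT Col1, Col2, Col3
            FROM dbo.StagingRows
            WHERE RowId BETWEEN @StartId AND @StartId + @BatchSize - 1;

            -- Run this outside an outer transaction so each range commits here.
            SET @StartId = @StartId + @BatchSize;
        END;

    Each pass touches at most @BatchSize rows, so the transaction log and memory footprint stay bounded.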

    Does that help?

  • Does the table the developer is inserting into have a primary key, foreign keys, and indexes? If the table has those, the insert will take a long time and a lot of resources. To make it run faster, drop the PK, FK, and indexes before the insert, do the insert, then put the PK, FK, and indexes back on the table; it will run much faster.
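
    A quick sketch of that approach (object names are hypothetical; only disable nonclustered indexes, since disabling the clustered index takes the table offline):

        -- Disable a nonclustered index and a foreign key before the load...
        ALTER INDEX IX_FactRows_Col1 ON dbo.FactRows DISABLE;
        ALTER TABLE dbo.FactRows NOCHECK CONSTRAINT FK_FactRows_DimDate;

        -- ...run the large INSERT here...

        -- ...then rebuild the index and re-validate the constraint afterwards.
        ALTER INDEX IX_FactRows_Col1 ON dbo.FactRows REBUILD;
        ALTER TABLE dbo.FactRows WITH CHECK CHECK CONSTRAINT FK_FactRows_DimDate;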

    my 2 cents.

  • Many ETL tools (e.g. SSIS, Informatica) allow you to specify the "batch size" to be used during inserts, which causes the tool to send X rows to the target server, commit, and then send the next X rows... Have your developer check for a similar setting in your tool.

  • We can also use cursors for bulk inserts.
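
    A rough sketch of what that could look like, reusing the same hypothetical tables as above; note that a plain row-by-row cursor insert is usually much slower than a set-based batch, so here the cursor only walks the batch boundaries:

        -- Cursor over batch numbers derived from the IDENTITY column;
        -- each fetch inserts one ~100k-row chunk.
        DECLARE @BatchNo INT;

        DECLARE batch_cursor CURSOR FAST_FORWARD FOR
            SELECT DISTINCT RowId / 100000
            FROM dbo.StagingRows;

        OPEN batch_cursor;
        FETCH NEXT FROM batch_cursor INTO @BatchNo;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            INSERT INTO dbo.FactRows (Col1, Col2, Col3)
            SELECT Col1, Col2, Col3
            FROM dbo.StagingRows
            WHERE RowId / 100000 = @BatchNo;

            FETCH NEXT FROM batch_cursor INTO @BatchNo;
        END;

        CLOSE batch_cursor;
        DEALLOCATE batch_cursor;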
