Breaking Up is Easy to Do

, 2007-07-03

Breaking Up is Easy to Do
Hard

It's nice to be able to package a process into a single, tidy, elegant query, but it isn't always possible to do so. And even if it is possible, it may not be the best way to do it. Often, we can get better performance out of large or complex queries by breaking them up into smaller pieces. I encountered a great example of this today.

A developer asked me about a query that was taking a really long time to process in the test environment. The particular step that was having issues is building a long string by concatenating short strings to pass through to a remote server for processing of data on the remote server. It was a simple, recursive string building query.

Declare @RowIDs nvarchar(max)

Select @RowIDs = IsNull(@RowIDs + ',', '') + '''' + RowID + ''''

From StagingTable with(nolock)

When there were only 12,000 ID's to concatenate, it ran in about 9 seconds. When the number increased to 18K ID's, it took twice as long, 18 seconds. At 20K ID's, it was taking 2 minutes, and now at 90K records, it is taking almost 1.5 hours. Obviously, the developer was not happy with performance and asked if I knew why the process time increased exponentially.

Upon observing the query run, I determined that the query was using a really large amount of memory, CPU time, and causing a lot of disk I/O. It was simply trying to do too much at once. So, I decided to see what would happen if I broke it down into smaller, more manageable chunks.

I set up a looping process to build several smaller strings. Then it would loop through the smaller strings and concatenate them together into the large, final string. The process could now build the string of 90K+ ID's, starting with short strings of 10K ID's per string, in 26 seconds. If I lowered the number of ID's in the initial set of shorter strings, it ran even faster. 5K of ID's per short string ran in 21 seconds and 1K of ID's ran in 19 seconds.

This is the final query I ended up with:

Declare @Rows nvarchar(max),

        @Loops int,

        @CurrLoop int,

        @MaxIDsPerLoop int

Declare @RowIDs Table (RowIDKey int identity(1, 1) not null primary key,

                        RowID nvarchar(15) not null)

Declare @RowIDRows Table (RowIDRowKey int identity(1, 1) not null primary key,

                        RowIDRow nvarchar(max) not null)

Set NoCount On

Set @MaxIDsPerLoop = 1000

Set @CurrLoop = 0

Insert Into @RowIDs (RowID)

Select RowID

From StagingDatabase with(nolock)

Select @Loops = (count(RowIDKey) / @MaxIDsPerLoop>) + 1

From @RowIDs

While @CurrLoop < @Loops

    Begin

        Set @Rows = Null

        Select @Rows = IsNull(@Rows + ',', '') + '''' + RowID + ''''

        From @RowIDs

        Where RowIDKey % @Loops = @CurrLoop

        Insert Into @RowIDRows (RowIDRow)

        Select @Rows

        Set @CurrLoop = @CurrLoop + 1

    End

Set @Rows = Null

Select @Rows = IsNull(@Rows + ',', '') + RowIDRow

From @RowIDRows

Select @Rows

Set NoCount Off

Rate

Share

Share

Rate

Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

2009-02-23

1,567 reads

Networking - Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

2009-02-13

360 reads