Blog Post

How to Fix that Never-Ending Join Dragging Down the Whole DB Server Instance – Pre-Populate the Largest Bugger in Temp with Indexes

,

Now that it has been a

good five years I have been blogging away here and on SSC, the editors recently thanked

us for our work. They also provided valuable feedback that we should give real-world situations that DBAs encounter.  The following has a target of optimising performance, from an actual task that has re-occurred several times since I first wrote on the subject, in various production environments, on an instance that is bogged down by that one

massive query within a stored procedure that has to run all the time, yet is so

huge, important and/or complex everyone is

afraid or unsure how to resolve.

In this post I hope to clearly explain how the combination

of the use of data definition language for your temporary tables and

non-clustered indexes can improve the performance of stored procedures that join data from one or many

large tables by up to seventeen times (at least that was the case

previous time I saw this type of query to optimise) - as I have seen on

stored proc.s that work with tables in the tens of millions.


Temporary tables, if

used frequently or in stored procedures, in turn, end up with significant

input/output disk consumption. To start, one thing we should be aware of is that they are also

created as a heap by default.  As experience has shown, if you are cutting

up a very large table and using the temporary database, it is best to first do

your DDL (data definition language) before running through the rest of your

operation with the temporary data - as opposed Select * INTO #temp. Thus, we should avoid Select * into

#temp as much as possible, unless the number of rows is insignificant,

because being in a single statement, it will create great disk

contention within the temp database:

(N.B. the assumed

pre-requisite is that you've identified the worst query from your plan cache or

have seen the code from Recent Expensive queries listed in Activity Monitor, sorted by worst performing resource)

CREATE TABLE

#MyLot  -- you’ll see that we only need a few columns join in the

end

       (

       [LotId] [int] IDENTITY(1,1) NOT NULL,

       [LotCode] [nvarchar](10) NOT NULL

       )

INSERT into #MyLot ( LotId, LotCode )

 -- e.g. you can also avoid NULLs by using ISNULL(col,0)

Select LotId, LotCode

from MyBigLotTable

Where clause / matching variables

 -- this is where you found out what joins this massive

table with the others and slice it up
 -- horizontally and vertically before (!) making that big join,
 -- and

that is where we obtain the significant performance gains

Create NonClustered Index #idx_Lot ([LotCode] ASC )
-- create index on matching column used in the 'big' join (this case it was a 5 table join)
-- the glaring ID field
-- integrate all this preparation of

#MyLot into the main super slow query

INSERT INTO @result
([Number],[LocId],[BinId],[LotCode],[LotId],[PCode],[PId],[Stock],[StatusCode],[UnitCode])

SELECT
[BIResult].[Number], [Loc].[LocId], [BLoc].[BILocId],[BIResult].[LotCode], #MyLot.[LotId],[BIResult].[PCode],[P].[PId],[BIResult].[Stock],ISNULL([BIResult].[StatusCode],[BIResult].[UnitCode]
FROM OPENXML (@handle"WITH"

                        (

                              [Number]

SMALLINT N'@Number'

                              [LocID] NVARCHAR(10) N'@LocID'

                              [PCode]

NVARCHAR(18) N'@PCode'

                              [LocCode]

NVARCHAR(4) N'@LotCode'

                              [PCode]

NVARCHAR(10) N'@LotId'

                              [Stock] NVARCHAR(MAX) N'@Stock'

                              [StatusCode]

NVARCHAR(3) N'@StatusCode'

                              [UnitCode]

NVARCHAR(1) N'@UnitCode'

                        ) AS [BIResult]

                JOIN [Pt] ON [Pt].[Number] = [BIResult].[Number]
                LEFT JOIN #MyLot --[Lot] was here before, the huge table

                            ON #MyLot.[LotCode] = [BIResult].[LotCode]

                JOIN [P] ON [P].[PtId] = [Pt].[PtId]
                 AND [P].[PCode] = [BIResult].[PCode]

                JOIN [SLoc] ON [SLoc].[PtId] = [Pt].[PId]
                 AND [SLoc].[SLocCode] =

[BIResult].[SLocCode]

                JOIN [BLoc] ON [BLoc].[LocId] = [Loc].[LocId]
                 AND [BLoc].[BLocCode] = [BIResult].[BLocCode]

               WHERE CAST([BIResult].[Stock] AS DECIMAL(13)
              )

-- always explicitly/clearly drop the

temp at the end of the stored proc.

drop table #MyLot -- and the

respective index is dropped too with it

Happy optimising!

Shed the heavy weight of that extra slow query bogging your instance down.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating