SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


INSERT with ordered select


INSERT with ordered select

Author
Message
Rasmus Remmer Bielidt
Rasmus Remmer Bielidt
SSChasing Mays
SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)

Group: General Forum Members
Points: 639 Visits: 142
I am trying to optimise a script for performing DELETES in a DW staging table. DELETES are not logged on the source system, only an extract of all current rows is possible and then deleting on the staging table using a ..WHERE NOT EXISTS () query. I am performing the extract through a linked server.

Currently this is done by the following query.
-- Drop table if it should somehow exist at run-time (requirement for using SELECT INTO that the table does not exist)
IF EXISTS (
SELECT * from tempdb.dbo.sysobjects o
WHERE o.xtype in ('U') AND o.id = object_id(N'tempdb..#XALdboStocktransRecIDtable')
) DROP TABLE #XALdboStocktransRecIDtable;

-- Select all DATASET, ROWNUMBER into the temporary table (at tempdb)
SELECT [DATASET], [ROWNUMBER]
INTO #XALdboStocktransRecIDtable
FROM [SQLXAL].[XALDRIFT].[dbo].[STOCKTRANS]
WHERE [DATASET] IN ('FPC','MAL');

-- Delete all rows in the VALID instance where DATASET, ROWNUMBER does no longer exist on the source (SQLXAL)
DELETE A
FROM [TX_Stage].[dbo].[XAL_dbo_STOCKTRANS_V] A
WHERE NOT EXISTS (
SELECT *
FROM #XALdboStocktransRecIDtable B
WHERE [A].[DATASET]=[B].[DATASET] AND [A].[ROWNUMBER]=[B].[ROWNUMBER]
);

-- Drop the table to clean up the tempdb
IF EXISTS (
SELECT * from tempdb.dbo.sysobjects o
WHERE o.xtype in ('U') AND o.id = object_id(N'tempdb..#XALdboStocktransRecIDtable')
) DROP TABLE #XALdboStocktransRecIDtable;



Now, this seems to perform better if there is a clustered index on the temporary table #XALdboStocktransRecIDtable. However, approximately the same time is being consumed inserting into the table with the index as is saved by creating the table with index as in the below script.
-- Drop table if it should somehow exist at run-time 
IF EXISTS (
SELECT * from tempdb.dbo.sysobjects o
WHERE o.xtype in ('U') AND o.id = object_id(N'tempdb..#TestTable')
) DROP TABLE #TestTable;

CREATE TABLE #TestTable(
[DATASET] [varchar](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ROWNUMBER] [int] NOT NULL
)

CREATE CLUSTERED INDEX [PK] ON #TestTable
( [DATASET] ASC,
[ROWNUMBER] ASC
)

INSERT INTO #TestTable ([DATASET],[ROWNUMBER])
SELECT [DATASET], [ROWNUMBER]
FROM [SQLXAL].[XALDRIFT].[dbo].[STOCKTRANS]
WHERE [DATASET] IN ('FPC','MAL')
ORDER BY [DATASET] ASC ,[ROWNUMBER] ASC

-- Delete all rows in the VALID instance where DATASET, ROWNUMBER does no longer exist on the source (SQLXAL)
DELETE A
FROM TX_Stage.dbo.XAL_dbo_STOCKTRANS_V A
WHERE NOT EXISTS (
SELECT *
FROM #TestTable B
WHERE [A].[DATASET]=[B].[DATASET] AND [A].[ROWNUMBER]=[B].[ROWNUMBER]
);

-- Drop the table to clean up the tempdb
IF EXISTS (
SELECT * from tempdb.dbo.sysobjects o
WHERE o.xtype in ('U') AND o.id = object_id(N'tempdb..#TestTable')
) DROP TABLE #TestTable;



The source system table is indexed on DATASET, ROWNUMBER so no additional strain is placed on that system by the "ORDER BY" clause. The staging table has a non-clustered index on DATASET, ROWNUMBER - the clustered index on this table is DW_Id.

INSERT INTO the temporary table defined with the clustered index is roughly one third more costly (time) than the SELECT INTO which is currently used, however the DELETES performed on the staging table is lower by roughly the same amount of time.

I read about the possibility of specifying "ORDER" clause for the source data on a BULK INSERT to tell the SQL Server hosting the staging table that the input is coming in already ordered by the clustered key, but for the life of me, I cannot figure out the correct syntax for it. My hope is to mix the best of both worlds with a lowest time cost insert and lowest time cost DELETES.

I hope someone is able to help with this Smile

[edit]
I tried doing the DELETE statement with the source linked server table as test, but that turns out to take much longer than either of the two methods described above.
sturner
sturner
SSCertifiable
SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)

Group: General Forum Members
Points: 5214 Visits: 3259
try inserting into the temp table without an index (or just use select into ) and without the order by, then create the clustered index after the insert.

Also, create the index as a unique clustered index if rownumber-dataset are in fact all unique.

The probability of survival is inversely proportional to the angle of arrival.
Rasmus Remmer Bielidt
Rasmus Remmer Bielidt
SSChasing Mays
SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)SSChasing Mays (639 reputation)

Group: General Forum Members
Points: 639 Visits: 142
Thank you for the tips.

Specifying the index as unique did shave off time in the overall processing from what I had tested previously.

But creating the index after the insert adds time to the overall process (both for the "SELECT INTO" and the "CREATE TABLE then INSERT INTO" methods).
sturner
sturner
SSCertifiable
SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)

Group: General Forum Members
Points: 5214 Visits: 3259
rrn 62873 (8/15/2012)
Thank you for the tips.

Specifying the index as unique did shave off time in the overall processing from what I had tested previously.

But creating the index after the insert adds time to the overall process (both for the "SELECT INTO" and the "CREATE TABLE then INSERT INTO" methods).


Well it was worth a try, but it does depend upon how much performance your tempdb is configured for.

You might even experiment with a permanent staging table, then create the index using SORT_IN_TEMPDB option. Again, any possible advantage depends upon how your disks are set up and how the database is configured.

The probability of survival is inversely proportional to the angle of arrival.
Jason-299789
Jason-299789
SSCertifiable
SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)SSCertifiable (5.1K reputation)

Group: General Forum Members
Points: 5081 Visits: 3232
Have you tried turning the problem around and doing a select of the Dataset & RowNumbers that are missing between the Staging and source tables

As this should then produce the dataset that you want to delete

something like this (Needs to be checked that you are getting the rows that are deleted back).


SELECT [DATASET], [ROWNUMBER]
INTO #TestTable
FROM TX_Stage.dbo.XAL_dbo_STOCKTRANS A
Where NOT EXISTS (Select 1 From [SQLXAL].[XALDRIFT].[dbo].[STOCKTRANS] B
Where [A].[DATASET]=[B].[DATASET] AND [A].[ROWNUMBER]=[B].[ROWNUMBER])


DELETE B
FROM TX_Stage.dbo.XAL_dbo_STOCKTRANS_V B
JOIN #TestTable A on [A].[DATASET]=[B].[DATASET]
AND [A].[ROWNUMBER]=[B].[ROWNUMBER]



_________________________________________________________________________
SSC Guide to Posting and Best Practices
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search