Remote Query Performance Tuning

  • I have a stored procedure that runs a remote query to fill a temp table. It is by far the most IO and CPU intensive query on my server. Hopefully someone can help me understand how to optimize this query. The remote table has a bit over 3 million rows. Looking at the execution plan the remote scan is 97% of the query cost. I'm thinking that I should be able to select just a subset of that table rather than the whole thing but I am still trying to decipher the CTE (I'm still relatively new to them and find their syntax baffling). Thank you for your help.

    Nathan

    This is the remote table definition

    USE [usascada]

    GO

    /****** Object: Table [dbo].[FloatTable] Script Date: 06/09/2010 10:04:17 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[FloatTable](

    [DateAndTime] [datetime] NULL,

    [Millitm] [smallint] NULL,

    [TagIndex] [smallint] NULL,

    [Val] [float] NULL,

    [Status] [varchar](1) NULL,

    [Marker] [varchar](1) NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    USE [usascada]

    GO

    /****** Object: Index [FloatTableIndex] Script Date: 06/09/2010 10:04:52 ******/

    CREATE CLUSTERED INDEX [FloatTableIndex] ON [dbo].[FloatTable]

    (

    [DateAndTime] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 70) ON [PRIMARY]

    USE [usascada]

    GO

    /****** Object: Index [TagIndex] Script Date: 06/09/2010 10:05:15 ******/

    CREATE NONCLUSTERED INDEX [TagIndex] ON [dbo].[FloatTable]

    (

    [TagIndex] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    Here is the query

    declare @sd datetime;

    select @sd = dateadd([day], datediff([day], '19000101', getdate()) - 1, '19000101');

    -- create temp tagData table

    CREATE TABLE #tagData (

    [DateAndTime] [datetime] NULL ,

    [Millitm] [smallint] NULL ,

    [TagIndex] [smallint] NULL ,

    [Val] [float] NULL ,

    [Status] [char] (1) NULL ,

    [Marker] [char] (1) NULL );

    INSERT INTO #tagData SELECT * FROM OPENQUERY([scada-02],'SELECT * FROM usascada.dbo.FloatTable');

    WITH cteSequence AS

    ( SELECT TagIndex, DateAndTime, Millitm, Val, ROW_NUMBER() OVER(PARTITION BY TagIndex ORDER BY DateAndTime, Millitm DESC) as row

    FROM #tagData),cteSeqDups AS

    ( SELECT *FROM cteSequence s1WHERE EXISTS

    ( SELECT * FROM cteSequence s2 WHERE s2.TagIndex = s1.TagIndex AND s2.row = s1.row - 1 AND s2.Val = s1.Val)

    )

    DELETE FROM cteSeqDups WHERE DateAndTime < @sd;

    INSERT INTO scada_archive.dbo.ScadaArchive SELECT * FROM #tagData WHERE DateAndTime < @sd;

  • This is the cause of the Remote Scan, not the CTE:

    INSERT INTO #tagData SELECT * FROM OPENQUERY([scada-02],'SELECT * FROM usascada.dbo.FloatTable');

    What I'd suggest is putting some filters into that SELECT statement to limit the amount of data being moved across the system. That's going to give you the biggest hit.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • That is what I figured however in order to know what filter to apply to the select statement I have to know what the CTE is doing to make sure I get the right records.

    Nathan

  • In this line of code: INSERT INTO scada_archive.dbo.ScadaArchive SELECT * FROM #tagData WHERE DateAndTime < @sd;

    you filter on DateAndTime... just filtering your remote table on that will return fewer records.

    Is there an index on the DateAndTime column? If not, add one, it will help as well.

  • bleister88 (1/14/2011)


    In this line of code: INSERT INTO scada_archive.dbo.ScadaArchive SELECT * FROM #tagData WHERE DateAndTime < @sd;

    you filter on DateAndTime... just filtering your remote table on that will return fewer records.

    Is there an index on the DateAndTime column? If not, add one, it will help as well.

    The where will definitely, maybe help. But the index will most likely not help unless it's clustered. Assuming you're getting all the data before midnight today, then you're going to get almost all the data in the table anyways, which renders and extra index useless.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply