Understanding a CTE

  • Hi everyone, I have a CTE that I inherited and am having trouble understanding what it does. If someone could help me out or point me to some good tutorials on CTEs so I can figure it out myself I would appreciate it. The code is below. Thanks.

    CREATE TABLE #tagData (

    [DateAndTime] [datetime] NULL ,

    [Millitm] [smallint] NULL ,

    [TagIndex] [smallint] NULL ,

    [Val] [float] NULL ,

    [Status] [char] (1) NULL ,

    [Marker] [char] (1) NULL );

    INSERT INTO #tagData SELECT * FROM OPENQUERY([scada-02],'SELECT * FROM usascada.dbo.FloatTable');

    WITH cteSequence AS

    ( SELECT TagIndex, DateAndTime, Millitm, Val, ROW_NUMBER() OVER(PARTITION BY TagIndex ORDER BY DateAndTime, Millitm DESC) as row

    FROM #tagData),cteSeqDups AS

    ( SELECT *FROM cteSequence s1WHERE EXISTS

    ( SELECT * FROM cteSequence s2 WHERE s2.TagIndex = s1.TagIndex AND s2.row = s1.row - 1 AND s2.Val = s1.Val)

    )

    DELETE FROM cteSeqDups WHERE DateAndTime < @sd;

    INSERT INTO scada_archive.dbo.ScadaArchive SELECT * FROM #tagData WHERE DateAndTime < @sd;

    I would really like to know what the CTE does but my concern is weather I can change the statement that fills the temp table to select * where DateAndTime < @sd

  • At the risk of getting blasted by an expert, here is my take:

    The CTEs are used to find all the duplicate instances of data for columns TagIndex and Val. The duplicates are deleted and then placed in an archive table. The CTEs are written so that the most recent instance of a row that is known to have duplicates is excluded (and thus, excluded from the delete statement).

    Here is my best attempt to explain through the code:

    IF OBJECT_ID('tempdb..#tagData') IS NOT NULL DROP TABLE #tagData

    CREATE TABLE #tagData (

    [DateAndTime] [datetime] NULL ,

    [Millitm] [smallint] NULL ,

    [TagIndex] [smallint] NULL ,

    [Val] [float] NULL ,

    [Status] [char] (1) NULL ,

    [Marker] [char] (1) NULL );

    -- Use sample data to duplicate the query behavior

    DECLARE @sd DATETIME

    SET @sd = '1/5/2010'

    -- Load all the tag data into temp table

    INSERT INTO #tagData

    -- SELECT * FROM OPENQUERY([scada-02],'SELECT * FROM usascada.dbo.FloatTable');

    SELECT

    '1/1/2010',

    1,

    1,

    100,

    NULL,

    NULL

    UNION ALL

    SELECT

    '1/2/2010',

    5,

    1,

    100,

    NULL,

    NULL

    UNION ALL

    SELECT

    '1/3/2010',

    10,

    1,

    100,

    NULL,

    NULL

    UNION ALL

    SELECT

    '1/1/2010',

    1,

    2,

    200,

    NULL,

    NULL

    UNION ALL

    SELECT

    '1/2/2010',

    1,

    2,

    200,

    NULL,

    NULL

    -- Prepare first CTE

    ;WITH cteSequence AS

    (

    -- Select all rows from temp table

    -- Create a row number for each row

    -- The row number resets back to 1 for each change in TagIndex

    -- The row number is ordered by a combination of DateAndTime, MilliTm in descending order

    -- In other words, the data is ordered from most recent to earliest

    SELECT

    TagIndex,

    DateAndTime,

    Millitm,

    Val,

    ROW_NUMBER() OVER ( PARTITION BY TagIndex ORDER BY DateAndTime, Millitm DESC) as row

    FROM #tagData)

    -- Prepare second CTE

    -- Get all the records from the first CTE where...

    -- there are two different rows with the same value for TagIndex and Val

    --

    -- Note: this CTE excludes the most recent row that has a duplicate

    -- value in order to keep the at least one instance of the row

    ,cteSeqDups AS

    (

    SELECT *

    FROM cteSequence s1

    WHERE EXISTS

    ( SELECT * FROM cteSequence s2 WHERE s2.TagIndex = s1.TagIndex AND s2.row = s1.row - 1 AND s2.Val = s1.Val) )

    -- Delete all instances of duplicate entries of TagIndex and Val

    -- from the temp table (via the CTEs) where the DateAndTime value

    -- is less than the @sd parameter

    DELETE FROM cteSeqDups WHERE DateAndTime < @sd;

    -- Put those duplicates into an archive table

    -- INSERT INTO scada_archive.dbo.ScadaArchive

    SELECT * FROM #tagData WHERE DateAndTime < @sd;

  • It sure looks like it's going through a de-dup process to me.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you very much this really helped me understand what is going on. I really appreciate the help and the clarity of your answer.

    Nathan

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply