need help with a query

  • I want to compare column values of 2 consecutive records in a table of 425 records and capture the column values that existed in both records.

    Example:

    DDL of the table is below.

    CREATE TABLE [dbo].[Test](

    [Date] [datetime] Not NUll,

    [c1] [int] NOT NULL,

    [c2] [int] NOT NULL,

    [c3] [int] NOT NULL,

    [c4] [int] NOT NULL,

    [c5] [int] NOT NULL

    ) ON [PRIMARY]

    GO

    Sample Output:

    Date C1 C2 C3 C4 C5

    2012-11-21 15 24 39 48 51

    2012-11-20 0 15 31 24 7

    2012-11-19 7 6 5 4 3

    2012-11-18 8 9 10 11 12

    since 15, 24 exist in records 1 and I want to capture in a new table with date 2012-11-21.

    Similarly 7 exists in records 2 and 3 so I want to capture 7 with date 2012-11-20.

    Similarly no column values match in records 3 and 4 so I want capture none such non matching columns. Is this possible?

    Thanks for your help and Happy Thanksgiving.

    Thanks

    “If your actions inspire others to dream more, learn more, do more and become more, you are a leader.” -- John Quincy Adams

  • sapen (11/21/2012)


    Is this possible?

    Sure! Anything is possible in the SQLverse. Assuming of course that when you say "consecutive" the Date column controls this. Try the following:

    CREATE TABLE #Test(

    [Date] [datetime] Not NUll,

    [c1] [int] NOT NULL,

    [c2] [int] NOT NULL,

    [c3] [int] NOT NULL,

    [c4] [int] NOT NULL,

    [c5] [int] NOT NULL

    ) ON [PRIMARY]

    INSERT INTO #Test

    SELECT '2012-11-21', 15, 24, 39, 48, 51

    UNION ALL SELECT '2012-11-20', 0, 15, 31, 24, 7

    UNION ALL SELECT '2012-11-19', 7, 6, 5, 4, 3

    UNION ALL SELECT '2012-11-18', 8, 9, 10, 11, 12

    UNION ALL SELECT '2012-11-17', 15, 24, 39, 48, 51

    ;WITH Test AS (

    SELECT [Date], c1, c2, c3, c4, c5

    ,n=ROW_NUMBER() OVER (ORDER BY [Date] DESC)

    FROM #Test),

    Test2 AS (

    SELECT b.[Date], b.n, c

    FROM Test a

    CROSS APPLY (

    VALUES ([Date], n, c1),([Date], n, c2),([Date], n, c3)

    ,([Date], n, c4),([Date], n, c5)) b([Date], n, c))

    SELECT [Date]=CASE WHEN a.[Date] > b.[Date] THEN a.[Date] ELSE b.[Date] END, a.c

    FROM Test2 a

    INNER JOIN Test2 b ON a.n + 1 = b.n AND a.c = b.c

    DROP TABLE #Test

    Note that I added a row to your test data to show that these numbers (identical to the first row) are not matched because they're not consecutive.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Oh man.... I can't remember if it was Lowell or Magoo or ??? but we've been through this problem before (couple of years ago?) and somebody posted some ingenious code that absolutely smoked all other methods. There was just no competition for it that even came close.

    I posted that little tidbit of information hoping that the one responsible shows up for this one and can actually put his hands on the link.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/21/2012)


    Oh man.... I can't remember if it was Lowell or Magoo or ??? but we've been through this problem before (couple of years ago?) and somebody posted some ingenious code that absolutely smoked all other methods. There was just no competition for it that even came close.

    I posted that little tidbit of information hoping that the one responsible shows up for this one and can actually put his hands on the link.

    For the record, I wasn't expecting to win any performance races with this one. 😛


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (11/21/2012)


    Jeff Moden (11/21/2012)


    Oh man.... I can't remember if it was Lowell or Magoo or ??? but we've been through this problem before (couple of years ago?) and somebody posted some ingenious code that absolutely smoked all other methods. There was just no competition for it that even came close.

    I posted that little tidbit of information hoping that the one responsible shows up for this one and can actually put his hands on the link.

    For the record, I wasn't expecting to win any performance races with this one. 😛

    Just to be sure, I wasn't pointing my comment towards at anyone. I just wish I had the link. At the time, I never thought I'd ever run into such a problem ever again and didn't keep the link. Shame on me.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/21/2012)


    dwain.c (11/21/2012)


    Jeff Moden (11/21/2012)


    Oh man.... I can't remember if it was Lowell or Magoo or ??? but we've been through this problem before (couple of years ago?) and somebody posted some ingenious code that absolutely smoked all other methods. There was just no competition for it that even came close.

    I posted that little tidbit of information hoping that the one responsible shows up for this one and can actually put his hands on the link.

    For the record, I wasn't expecting to win any performance races with this one. 😛

    Just to be sure, I wasn't pointing my comment towards at anyone. I just wish I had the link. At the time, I never thought I'd ever run into such a problem ever again and didn't keep the link. Shame on me.

    No worries! This was just a case of get a solution out there quickly as a straw man for people to burn down. Besides, I just like using CROSS APPLY VALUES because it is cool! 😎

    I would like to see that link if someone comes up with it because, despite what I just said I'd prefer to see something faster that works so I can remember the approach for the next time.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Heres a version using the unpivot function (and Dwains temp table) with a self join on Value where the date isnt the same

    CREATE TABLE #Test(

    [Date] [datetime] Not NUll,

    [c1] [int] NOT NULL,

    [c2] [int] NOT NULL,

    [c3] [int] NOT NULL,

    [c4] [int] NOT NULL,

    [c5] [int] NOT NULL

    ) ON [PRIMARY]

    INSERT INTO #Test

    SELECT '2012-11-21', 15, 24, 39, 48, 51

    UNION ALL SELECT '2012-11-20', 0, 15, 31, 24, 7

    UNION ALL SELECT '2012-11-19', 7, 6, 5, 4, 3

    UNION ALL SELECT '2012-11-18', 8, 9, 10, 11, 12

    UNION ALL SELECT '2012-11-17', 15, 24, 39, 48, 51

    ;With CTE AS(

    Select *

    from (

    Select

    Row_NUMBER() OVER(ORDER BY Date DESC) rn

    , *

    from #Test) up

    UNPIVOT (Value

    for col in (c1,c2,c3,c4,c5)) as unpvt

    )

    Select Max( a.[DAte]),a.value

    from CTE a

    JOIN CTE b on a.Value=b.Value

    And a.rn=b.rn-1

    group by a.value

    I'm not sure of its performance on a large dataset but its an alternte Idea.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I was working on something very similar to Jason's

    ;with

    unpivoted as (

    select [date], dense_rank() over (order by [date] desc) seq, value

    from #test a

    unpivot (value for col in (c1, c2, c3, c4, c5)) u

    )

    select b.[date]

    from unpivoted a

    inner join unpivoted b on a.seq = b.seq + 1 and a.value = b.value

    group by b.[date]

    unfortunately I couldn't compare it to Dwain's as I am currently using 2005.

  • Jason and MickyT - You may want to take a look at this article: http://www.sqlservercentral.com/articles/CROSS+APPLY+VALUES+UNPIVOT/91234/. It compares UNPIVOT vs. the CROSS APPLY VALUES approach from a performance perspective. There is also more information in the discussion thread.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks Dwain, I've read that article (+1 BTW), I did say I wasnt sure about the performance on a large dataset, and it was just another Option to consider.

    Looking at the results it apears that if you have around 25% sparseness the two methods perform about the same (-3% cpu, +3% elapsed time) it would be interesting to see on a sparseness of between 20-40% if there is a sweet spot for the UnPivot over the cross apply.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (11/22/2012)


    Thanks Dwain, I've read that article (+1 BTW), I did say I wasnt sure about the performance on a large dataset, and it was just another Option to consider.

    Looking at the results it apears that if you have around 25% sparseness the two methods perform about the same (-3% cpu, +3% elapsed time) it would be interesting to see on a sparseness of between 20-40% if there is a sweet spot for the UnPivot over the cross apply.

    The sweet spot seems to be not related to sparseness rather number of records. When large, SQL parallelizes the query plan and produces much faster elapsed times for CROSS APPLY VALUES. As I said, more detail on this is in the discussion thread.

    Thanks for reading it. So you were the one. 😀


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I'm sure there was more than one. 😉

    I do take your point though about number of records being a factor. I'm pretty much a horses for courses person, know as many methods then apply the one thats most relevant.

    I hope someone posts a link to the solution Jeff mentioned as I'd love to see it.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply