Interesting SQL challenge - compare differences in cells and rows

  • I have a table... below is a limited version of what it looks like...

    ID, Received, TrailerID, Tyre1, Tyre2, Tyre3, Tyre4, Tyre5, Tyre6

    1, 10/1/2014, ABC123, 12345, 23456, 34567, 45678, 56789, 67890

    2, 15/1/2014, ABC123, 1Z345, 23A56, 34567, 45678, 56789, 67890 // Note tyre 1 and 2 are different

    3, 20/1/2014, ABC123, 23456, 34567, 45678, 12345, 56789, 67890 // Note tyre 4 has moved from tyre 1, but is same set as first record

    4, 25/1/2014, ABC123, 1234S, Z3456, 345G7, 4S678, 5G789, 67B90 // Note, ALL tyres appear to be different (got letters instead of numbers in some cases)

    What I need to find is...

    Latest tyre (record 4) and previous tyre (record 3) only, based on the trailer IDs. Also... ONLY if the tyres are different... however, transposed columns do NOT make a tyre different as the set is still the same (i.e. record 3 is technically the same as record 1, and if only those two records were in the set, neither would appear in my result set, as the tyres are identical)

    The difference between record 4 and record 3 need only be just one character across all 6 tyre columns, which will make it different, but as mentioned before, purely transposed columns are not a difference.

    If the previous record has no tyres, then keep going backwards (just back up to 1/1/2014 as that is when the recording of this data started) and if no previous, I don't need to know about the record.

    In my result set, I don't need to see the earlier tyre... just the newest tyre. I will be picking up the earlier later in the process.

    I have spent ages trying to work this out. I know it is a challenge and I would be willing to accept ideas which may point me to a solution to making this work. I don't really want this to be heavy on the server, but I fear it might be.

    Thank you for reading this far.

  • Hi

    I'm not sure that I have your requirements right, but this may help you with your problem. Basically I am unpivoting your data so I can reorder the tyres in another pivot. This can be used to compare the tyre sets to the next row. I think that there is enough here to get the results that you require?

    WITH trailertyredata AS (

    SELECT *

    FROM (VALUES

    (1, CAST('20140110' AS DATETIME), 'ABC123', '12345', '23456', '34567', '45678', '56789', '67890'),

    (2, CAST('20140115' AS DATETIME), 'ABC123', '1Z345', '23A56', '34567', '45678', '56789', '67890'),

    (3, CAST('20140120' AS DATETIME), 'ABC123', '23456', '34567', '45678', '12345', '56789', '67890'),

    (4, CAST('20140125' AS DATETIME), 'ABC123', '1234S', 'Z3456', '345G7', '4S678', '5G789', '67B90'),

    (5, CAST('20140110' AS DATETIME), 'XYZ456', '12345', '23456', '34567', '45678', '56789', '67890'),

    (6, CAST('20140115' AS DATETIME), 'XYZ456', '1Z345', '23A56', '34567', '45678', '56789', '67890'),

    (7, CAST('20140120' AS DATETIME), 'XYZ456', 'Z3456', '1234S', '345G7', '4S678', '56789', '5G789'),

    (8, CAST('20140125' AS DATETIME), 'XYZ456', '1234S', 'Z3456', '345G7', '4S678', '5G789', '56789')

    ) a (ID, Received, TrailerID, Tyre1, Tyre2, Tyre3, Tyre4, Tyre5, Tyre6)

    ),

    -- Unpivot and resort the tyres the data

    unpivottyre AS (

    SELECT ID, Received, TrailerID, a.Tyre,

    ROW_NUMBER() OVER (PARTITION BY ID, TrailerID ORDER BY a.Tyre) TyreRank,

    DENSE_RANK() OVER (PARTITION BY TrailerID ORDER BY Received dESC) ReceivedRank

    FROM trailertyredata t

    CROSS APPLY (SELECT tyre FROM (VALUES(t.Tyre1), (t.tyre2), (t.tyre3), (t.tyre4), (t.tyre5), (t.tyre6)) x(tyre)) a

    ),

    -- pivot the data again with sorted tyres

    repivottyre AS (

    SELECT ID, Received, TrailerID,

    MIN(CASE WHEN TyreRank = 1 THEN Tyre END) Tyre1,

    MIN(CASE WHEN TyreRank = 2 THEN Tyre END) Tyre2,

    MIN(CASE WHEN TyreRank = 3 THEN Tyre END) Tyre3,

    MIN(CASE WHEN TyreRank = 4 THEN Tyre END) Tyre4,

    MIN(CASE WHEN TyreRank = 5 THEN Tyre END) Tyre5,

    MIN(CASE WHEN TyreRank = 6 THEN Tyre END) Tyre6,

    MIN(ReceivedRank) ReceivedRank

    FROM unpivottyre

    GROUP BY ID, Received, TrailerID

    ),

    -- compare to the next set of tyres based on date

    comparetyres AS(

    SELECT a.ID, a.Received, a.TrailerID, a.Tyre1, a.Tyre2, a.Tyre3, a.Tyre4, a.Tyre5, a.Tyre6, a.ReceivedRank,

    CASE WHEN a.tyre1 = b.tyre1 and a.tyre2 = b.tyre2 and a.tyre3 = b.tyre3 and a.tyre4 = b.tyre4 and a.tyre5 = b.tyre5 and a.tyre6 = b.tyre6 THEN

    1

    ELSE

    0

    END SameAsNext

    FROM repivottyre a

    LEFT OUTER JOIN repivottyre b ON a.ReceivedRank = b.ReceivedRank + 1 AND a.TrailerID = b.TrailerID

    )

    select ID, Received, TrailerID, Tyre1, Tyre2, Tyre3, Tyre4, Tyre5, Tyre6

    from comparetyres

    where ReceivedRank = 1 or

    (ReceivedRank = 2 and SameAsNext = 0);Edit: Added some more data

  • Wow!!!

    That is some sql.

    I will give it a go and report back. Thank you very much for your help. 🙂

  • Thank you again for your help... though as yet, I am trying to understand what you have done so that I can better apply to my situation.

    I have plugged the query into Management Studio, changed the sample data to a select statement to select my real data and I have the following...

    (I have over 2300 records with this)

    IDReceivedTrailerIDTyre1Tyre2Tyre3Tyre4Tyre5Tyre6

    453472014-02-26 13:33:19.4633774-05

    448692014-02-11 12:36:41.98751262726L346858296L8873KO048So524K

    441342014-01-22 08:47:24.157673913NULLNULLNULLNULLNULLNULL

    435412014-01-08 10:18:07.740GE087NULLNULLNULLNULLNULLNULL

    451302014-02-19 18:00:36.233ss760038L5698Z5O6l85O78K97l65lO45S

    461492014-03-21 09:46:19.72330106-06004110091301607030120421304713

    451532014-02-20 16:21:29.36041841-135ZO7l717ZZ7l7337l7367l7Z38ZO80

    447522014-02-08 09:07:12.45741446-10NULLNULLNULLNULLNULLNULL

    441112014-01-21 14:19:55.4603304-05

    433492014-01-02 10:26:12.7636664-12NULLNULLNULLNULLNULLNULL

    As you can see, I have some null and some empty tyres. I think I will need to just validate these have an earlier record with data, so if they do, then we might have cracked this.

    I don't (yet) understand pivot / unpivot or fully what you have done... so if you have the time to expand on it a little, it would be appreciated.

    Thank you.

  • Hi,

    Sorry, I know it's bad form to keep replying to my own posts...

    to validate the data received, I have done a regular call with sql...

    select ID, TrailerID, Received, Tyre1, Tyre2, Tyre3, Tyre4, Tyre5, Tyre6

    from Trailers

    where TrailerID = '5126'

    and Received > '1 January 2014'

    order by ID desc

    I only have one record...

    IDTrailerIDReceivedTyre1Tyre2Tyre3Tyre4Tyre5Tyre6

    4486951262014-02-11 12:36:41.9872726LO048S8873K346858296Lo524K

    I should only be displaying a record if there is a previous record with something in the tyres.

    Thanks again for your help.

  • sqlservercentral 18091 (4/1/2014)


    Thank you again for your help... though as yet, I am trying to understand what you have done so that I can better apply to my situation.

    I have plugged the query into Management Studio, changed the sample data to a select statement to select my real data and I have the following...

    (I have over 2300 records with this)

    IDReceivedTrailerIDTyre1Tyre2Tyre3Tyre4Tyre5Tyre6

    453472014-02-26 13:33:19.4633774-05

    448692014-02-11 12:36:41.98751262726L346858296L8873KO048So524K

    441342014-01-22 08:47:24.157673913NULLNULLNULLNULLNULLNULL

    435412014-01-08 10:18:07.740GE087NULLNULLNULLNULLNULLNULL

    451302014-02-19 18:00:36.233ss760038L5698Z5O6l85O78K97l65lO45S

    461492014-03-21 09:46:19.72330106-06004110091301607030120421304713

    451532014-02-20 16:21:29.36041841-135ZO7l717ZZ7l7337l7367l7Z38ZO80

    447522014-02-08 09:07:12.45741446-10NULLNULLNULLNULLNULLNULL

    441112014-01-21 14:19:55.4603304-05

    433492014-01-02 10:26:12.7636664-12NULLNULLNULLNULLNULLNULL

    As you can see, I have some null and some empty tyres. I think I will need to just validate these have an earlier record with data, so if they do, then we might have cracked this.

    I don't (yet) understand pivot / unpivot or fully what you have done... so if you have the time to expand on it a little, it would be appreciated.

    Thank you.

    On the unpivot and pivot, here's a couple of articles about them: Cross Tabs and Pivots, Part 1[/url], An Alternative (Better?) Method to UNPIVOT[/url].

    The reason I unpivoted your data was to allow me to to resort the tyres consistently before pivoting them back. This allowed me to compare the rows easily.

    While unpivoting the data I also ranked the trailers based on descending date. This could have been done separately to the unpivot.

    The compare part of the CTE checks the trailer tyres against the next row and flags if they are the same or not.

    I wasn't sure exactly what result you wanted to see, so I went with show the latest set of tryes for the trailer and the previous set if they differ.

    If this wasn't what you want, have a read of this article [/url] and post a set of data with expected results that demonstrates each of the criteria.

    You could filter out the records by adding the following where clause onto the unpivot CTE.

    WHERE Tyre1 is not null and tyre1 <> ''

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply