Query Help for moving Type to Type

  • Hello Experts

    Need help with a query.  Greatly Appreciated!

    In the below example, I need to pull only the members who transitioned from One type to another type and moved back to previous type.

    (e.g) A -> B -> A

    DECLARE @table1 TABLE (
    memid varchar(5),
    mname varchar(15),
    memtype char(1),
    mdate datetime)


    insert @table1 (memid,mname,memtype,mdate)
    values('m1','john doe','1','01/05/2019')
    ,('m1','john doe','2','02/11/2019')
    ,('m1','john doe','1','03/16/2019')
    ,('m2','jane doe','1','02/06/2019')
    ,('m2','jane doe','2','03/12/2019')
    ,('m2','jane doe','2','04/11/2019')
    ,('m3','Sam doe','2','06/03/2019')
    ,('m3','Sam doe','3','06/20/2019')
    ,('m3','Sam doe','2','07/07/2019')
    ,('m4','Rick doe','1','02/03/2019')
    ,('m4','Rick doe','1','04/20/2019')
    ,('m4','Rick doe','1','04/27/2019')
    ,('m5','Nick doe','3','03/02/2019')
    ,('m5','Nick doe','4','05/20/2019')
    ,('m5','Nick doe','3','05/20/2019')

    --Expected Result

    'm1','john doe','1','01/05/2019'
    'm1','john doe','2','02/11/2019'
    'm1','john doe','1','03/16/2019'
    'm3','Sam doe','2','06/03/2019'
    'm3','Sam doe','3','06/20/2019'
    'm3','Sam doe','2','07/07/2019'
    'm5','Nick doe','3','03/02/2019'
    'm5','Nick doe','4','05/20/2019'
    'm5','Nick doe','3','05/20/2019'

     

  • This is a fairly standard Gaps and Islands problem.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I see a few things that will make this tricky.  The first being that the datetime column has no time.  So how do you know that m5 went from 3 to 4 to 3 and not 3 to 3 to 4?

    For my sanity purposes, I changed your input data so that the last value for m5 is the 21st instead of the 20th.

    Now, to make the query to work with your data set and my minor change, I believe that the following will give you what you want:

    DECLARE @table1 TABLE
    (
    [memid] VARCHAR(5)
    , [mname] VARCHAR(15)
    , [memtype] CHAR(1)
    , [mdate] DATETIME
    );


    INSERT @table1
    (
    [memid]
    , [mname]
    , [memtype]
    , [mdate]
    )
    VALUES
    (
    'm1'
    , 'john doe'
    , '1'
    , '01/05/2019'
    )
    , (
    'm5'
    , 'Nick doe'
    , '4'
    , '05/20/2019'
    )
    , (
    'm1'
    , 'john doe'
    , '2'
    , '02/11/2019'
    )
    , (
    'm1'
    , 'john doe'
    , '1'
    , '03/16/2019'
    )
    , (
    'm2'
    , 'jane doe'
    , '1'
    , '02/06/2019'
    )
    , (
    'm2'
    , 'jane doe'
    , '2'
    , '03/12/2019'
    )
    , (
    'm2'
    , 'jane doe'
    , '2'
    , '04/11/2019'
    )
    , (
    'm3'
    , 'Sam doe'
    , '2'
    , '06/03/2019'
    )
    , (
    'm3'
    , 'Sam doe'
    , '3'
    , '06/20/2019'
    )
    , (
    'm3'
    , 'Sam doe'
    , '2'
    , '07/07/2019'
    )
    , (
    'm4'
    , 'Rick doe'
    , '1'
    , '02/03/2019'
    )
    , (
    'm4'
    , 'Rick doe'
    , '1'
    , '04/20/2019'
    )
    , (
    'm4'
    , 'Rick doe'
    , '1'
    , '04/27/2019'
    )
    , (
    'm5'
    , 'Nick doe'
    , '3'
    , '03/02/2019'
    )
    , (
    'm5'
    , 'Nick doe'
    , '5'
    , '05/21/2019'
    )
    , (
    'm5'
    , 'Nick doe'
    , '3'
    , '05/22/2019'
    );
    -- Get the max date and max type
    WITH [getMaxs]
    AS
    (
    SELECT
    [memid]
    , [mname]
    , [memtype]
    , [mdate]
    , MAX([mdate]) OVER (PARTITION BY
    [memid]
    ) AS [maxdate]
    , MAX([memtype]) OVER (PARTITION BY
    [memid]
    ) AS [maxtype]
    FROM @table1
    )
    -- Check if any item was reverted. We do this by
    --comparing if the last value for that row has a type that is less than the maximum. If so, it was reverted
    , [checkForReverted]
    AS
    (
    SELECT
    [getMaxs].[memid]
    , [getMaxs].[mname]
    , [getMaxs].[memtype]
    , [getMaxs].[mdate]
    , CASE
    WHEN [getMaxs].[mdate] = [getMaxs].[maxdate]
    AND [getMaxs].[maxtype] <> [getMaxs].[memtype]
    THEN 1
    ELSE 0
    END AS [reverted]
    FROM [getMaxs]
    )
    -- Get the revert value
    , [RevertValue]
    AS
    (
    SELECT
    [checkForReverted].[memid]
    , [checkForReverted].[mname]
    , [checkForReverted].[memtype]
    , [checkForReverted].[mdate]
    , [checkForReverted].[reverted]
    , CASE
    WHEN [checkForReverted].[reverted] = 1
    THEN [checkForReverted].[memtype]
    ELSE -1
    END AS [RevertValue]
    FROM [checkForReverted]
    )
    -- Find the values that should be used
    , [FindValueToUse]
    AS
    (
    SELECT
    [RevertValue].[memid]
    , [RevertValue].[mname]
    , [RevertValue].[memtype]
    , [RevertValue].[mdate]
    , MAX([RevertValue].[RevertValue]) OVER (PARTITION BY
    [RevertValue].[memid]
    ) AS [UseMe]
    FROM [RevertValue]
    )
    SELECT
    [FindValueToUse].[memid]
    , [FindValueToUse].[mname]
    , [FindValueToUse].[memtype]
    , [FindValueToUse].[mdate]
    FROM [FindValueToUse]
    WHERE [FindValueToUse].[UseMe] >= 0
    ORDER BY [FindValueToUse].[memid]
    , [FindValueToUse].[mdate];

    May not be the most elegant solution, but for your sample data, it works.  I also set it up to handle more than 3 value reverts for example, if you go from 3 to 4 to 5 and back to 3, it will show 4 rows.  If the user reverts from 3 to 4 to 5 to 4 to 3, it will show 5 rows.

    The case that would be missed with the above code would be if it did the opposite for reverting.  That is, it went from 4 to 3 to 4.  The code above assumes you are always reverting lower.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Appreciate both of your responses.

    I can manage to bring in the time factor so that part will be covered.

    Regarding the type , i used some sample data.  But, it is character value meaning values can be A,B,C instead 1,2,3.

    So, any change like below needs to be picked up.

    A->B->A

    B->A->B

    Thanks

  • Any help is greatly appreciated!

  • The problem with trying to manage 1,2,3 as well as a,b,c is what happens when you hit 1,a,1?  One solution could be to just sign each letter a numeric value and then go from there. That should solve your problem, no?

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale wrote:

    The problem with trying to manage 1,2,3 as well as a,b,c is what happens when you hit 1,a,1?  One solution could be to just sign each letter a numeric value and then go from there. That should solve your problem, no?

    The same thing that would happen if you hit 1, 2, 1, because you're actually dealing with (single character) strings, not numbers.  All you need is a well-defined unique sort, which is already provided for strings.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi bmg002 , Thanks for your support.

    I managed to assign numeric values for the memType column.  Your query works as long as they are in ascending order.  If I had this below rows to the dummy set, the logic fails.  m6 - member should not be pulled as the type changes only once.

    Please review, thanks a lot again .

    (

    'm6'

    , 'jon snow'

    , '2'

    , '05/21/2019'

    ),

    (

    'm6'

    , 'jon snow'

    , '2'

    , '05/24/2019'

    )

    , (

    'm6'

    , 'jon snow'

    , '1'

    , '05/27/2019'

    )

  • All,

    My query is not going to perform well, but it will get accurate results for the existing data, in which I included the m6 example and where I changed the date to add 1 hour for Nick doe on the 2nd 5/21 entry.  Also, it doesn't show the data as separate rows, so I'm not sure it would work as well with data that might have multiple repetitions of A -> B -> A for the same member.   It provided the additional data as additional columns.   The only benefit here is that it was super quick to put together.   It took longer to post the query than it did to create and test it.

    CREATE TABLE #table1 (
    memid varchar(5),
    mname varchar(15),
    memtype char(1),
    mdate datetime,
    CONSTRAINT PK_T1_TEMP_allcolumns_ PRIMARY KEY CLUSTERED (
    memid ASC,
    mname ASC,
    memtype ASC,
    mdate ASC
    )
    );
    INSERT INTO #table1 (memid, mname, memtype, mdate)
    VALUES('m1','john doe','1','01/05/2019'),
    ('m1','john doe','2','02/11/2019'),
    ('m1','john doe','1','03/16/2019'),
    ('m2','jane doe','1','02/06/2019'),
    ('m2','jane doe','2','03/12/2019'),
    ('m2','jane doe','2','04/11/2019'),
    ('m3','Sam doe','2','06/03/2019'),
    ('m3','Sam doe','3','06/20/2019'),
    ('m3','Sam doe','2','07/07/2019'),
    ('m4','Rick doe','1','02/03/2019'),
    ('m4','Rick doe','1','04/20/2019'),
    ('m4','Rick doe','1','04/27/2019'),
    ('m5','Nick doe','3','03/02/2019'),
    ('m5','Nick doe','4','05/20/2019'),
    ('m5','Nick doe','3','05/20/2019 01:00:00'),
    ('m6','jon snow','2','05/21/2019'),
    ('m6','jon snow','2','05/24/2019'),
    ('m6','jon snow','1','05/27/2019');

    --Expected Result
    /*
    'm1','john doe','1','01/05/2019'
    'm1','john doe','2','02/11/2019'
    'm1','john doe','1','03/16/2019'
    'm3','Sam doe','2','06/03/2019'
    'm3','Sam doe','3','06/20/2019'
    'm3','Sam doe','2','07/07/2019'
    'm5','Nick doe','3','03/02/2019'
    'm5','Nick doe','4','05/20/2019'
    'm5','Nick doe','3','05/20/2019'
    */

    SELECT
    T1.memid,
    T1.mname,
    T1.memtype,
    T1.mdate,
    TWO.memtype AS memtype2,
    TWO.mdate AS mdate2,
    THREE.memtype AS memtype3,
    THREE.mdate AS mdate3
    FROM #table1 AS T1
    CROSS APPLY (
    SELECT T2.memtype, T2.mdate
    FROM #table1 AS T2
    WHERET2.memid = T1.memid
    AND T2.mname = T1.mname
    AND T2.memtype <> T1.memtype
    AND T2.mdate > T1.mdate
    ) AS TWO
    CROSS APPLY (
    SELECT T3.memtype, T3.mdate
    FROM #table1 AS T3
    WHERET3.memid = T1.memid
    AND T3.mname = T1.mname
    AND T3.memtype = T1.memtype
    AND T3.mdate > ISNULL((
    SELECT MIN(T4.mdate)
    FROM #table1 AS T4
    WHERET4.memid = T3.memid
    AND T4.mname = T3.mname
    AND T4.memtype <> T3.memtype
    AND T4.mdate > T1.mdate
    ), '1900-01-01')
    ) AS THREE

    DROP TABLE #table1;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Here is another option to add to your list.

    WITHBaseData
    AS(
    SELECTmemid,
    memtype,
    mdate,
    ROW_NUMBER() OVER (PARTITION BY memid ORDER BY mdate) AS RowAsc,
    ROW_NUMBER() OVER (PARTITION BY memid ORDER BY mdate DESC) AS RowDesc
    FROM#table1
    ),
    ChangedMems
    AS(
    SELECTmemid
    FROM#table1
    GROUP BY memid
    HAVINGCOUNT(DISTINCT memtype) > 1
    )
    SELECTt.memid,
    t.mname,
    t.memtype,
    t.mdate
    FROM(
    SELECTa.memid,
    MIN(a.mdate) AS startdate,
    MAX(d.mdate) AS enddate
    FROM(
    SELECTmemid, memtype, mdate
    FROMBaseData WHERE RowAsc = 1
    ) a
    JOIN(
    SELECTmemid, memtype, mdate
    FROMBaseData WHERE RowDesc = 1
    ) d
    ONd.memid = a.memid
    ANDd.memtype = a.memtype
    JOINChangedMems c
    ONc.memid = a.memid
    GROUP BY a.memid
    ) g
    JOIN#table1 t
    ONt.memid = g.memid
    ANDt.mdate BETWEEN g.startdate and g.enddate
    ORDER BY t.memid,
    t.mdate

    This should catch those that move several times and land back on the same type for "first" and "last".  I have not done performance testing because there was only a small set of data to use.

  • Thank you so much Steve and Fahey Jonathan!  Both queries works perfect.

     

     

  • https://www.red-gate.com/simple-talk/sql/t-sql-programming/state-transition-constraints/

    Please post DDL and follow ANSI/ISO standards when asking for help. 

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply