Tricky SQL Query

  • --The objective is to compare the Quals in Qual column.

    The stored procedure requires input for multiple students (i.e. @Student).

    Business logic is that if Qual changes then Rollover becomes 1, otherwise 0.

    This means that like in the sample provided:

    For 2011 the Qual was 'HBPP112'. So here Rollover must be 0 (initial Qual).

    For 2012 the Qual changed to 'BACS315'. So here Rollover must be 1.

    For 2013 and 2014 Rollover must be 0 (no Qual change).

    Solution must cater for multiple StudentUID's, in the below sample I only used 1 StudentUID over 4 years (2011 to 2014), whereas there could be over 100 to 1000 different StudentUID's over 4 to 8 years (2011 to 2018).

    Below is a sample of the data I used:

    CREATE TABLE [dbo].[AAA_Mytable](

    [TermCalendarID] [int] NULL,

    [RegYear] [nvarchar](4) NULL,

    [StudentUID] [int] NULL,

    [Qual] [nvarchar](120) NULL

    ) ON [PRIMARY]

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (2, N'2011', 8314, N'HBPP112')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (3, N'2012', 8314, N'BACS315')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (5, N'2013', 8314, N'BACS315')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (6, N'2014', 8314, N'BACS315')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (6, N'2014', 8314, N'BMNG311')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (3, N'2012', 241401, N'SPJMA018')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (5, N'2013', 241401, N'SPJMA018')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (6, N'2014', 241401, N'SBM018')

    GO

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual]) VALUES (6, N'2014', 241401, N'SOPMA018')

    GO

  • Kevin,

    This design worries me. Can you please describe the business problem you're trying to solve? I would be willing to wager that what you're trying to do would be pretty easy with a windowing function, maybe LAG/LEAD, but I don't want to start answering a question I'm not sure I understand.

    If RegYear were an INT, you could sort it properly and you could use LAG(QualYear,1) to read the previous value. Further, if you suddenly had to add more years, your solution wouldn't break.

    Sorry to come across like the Grinch, but I've worked on designs like this before and they take serious voodoo to deal with later on.

    Pieter

  • I agree Pieter - sounds like a pretty simple Windowing Function query would do the trick. I agree with your analysis of the situation too.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Do you mean something like this?

    declare @AAA_Mytable TABLE(

    [TermCalendarID] [int] NOT NULL,

    [RegYear] int NOT NULL,

    [StudentUID] [int] NOT NULL,

    [QualYear1] [nvarchar](120) NULL,

    [QualYear2] [nvarchar](120) NULL,

    [QualYear3] [nvarchar](120) NULL,

    [QualYear4] [nvarchar](120) NULL,

    [QualYear5] [nvarchar](120) NULL,

    [Rollover] [int] NULL

    ) ;

    INSERT @AAA_Mytable ([TermCalendarID], [RegYear], [StudentUID], [QualYear1], [QualYear2], [QualYear3], [QualYear4], [QualYear5]) VALUES (2, N'2011', 8314, N'HBPP112', NULL, NULL, NULL, NULL)

    INSERT @AAA_Mytable([TermCalendarID], [RegYear], [StudentUID], [QualYear1], [QualYear2], [QualYear3], [QualYear4], [QualYear5]) VALUES (3, N'2012', 8314, NULL, N'BACS315', NULL, NULL, NULL)

    INSERT @AAA_Mytable ([TermCalendarID], [RegYear], [StudentUID], [QualYear1], [QualYear2], [QualYear3], [QualYear4], [QualYear5]) VALUES (5, N'2013', 8314, NULL, NULL, N'BACS315', NULL, NULL)

    INSERT @AAA_Mytable ([TermCalendarID], [RegYear], [StudentUID], [QualYear1], [QualYear2], [QualYear3], [QualYear4], [QualYear5]) VALUES (6, N'2014', 8314, NULL, NULL, NULL, N'BACS315', NULL)

    SELECT * FROM @AAA_Mytable

    SELECT

    a.[TermCalendarID]

    ,a.[RegYear]

    ,a.[StudentUID]

    ,t.rollover

    from @AAA_Mytable a

    outer apply (

    select

    rollover = case when a.[QualYear1] is not null then case a.[QualYear1] when b.[QualYear2] then 1 end

    when a.[QualYear2] is not null then case a.[QualYear2] when b.[QualYear3] then 1 end

    when a.[QualYear3] is not null then case a.[QualYear3] when b.[QualYear4] then 1 end

    when a.[QualYear4] is not null then case a.[QualYear4] when b.[QualYear5] then 1 end

    end

    from @AAA_Mytable b

    where b.StudentUID=a.StudentUID and b.RegYear = a.RegYear+1) t

  • In SQL Server 2012 the Windowing Functionality in SQL Server was VASTLY improved, including the addition of LAG and LEAD. This makes for a VERY simple and clean solution to the OP's request. See Books Online for usage.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hi Serg-52, thanks for posting the code. Have not tried it yet, will do tomorrow morning.

    The stored procedure requires input for multiple students (i.e. @Student), and the example I posted

    only catered for one student at a time, therefore earlier today I simply used a scalar function (quick fix) to resolve the problem (but past experience has taught me that scalar functions are not good for performance, they get worse if you run them over several years of data), so anything NOT scalar function will be an improvement.

    Below is a sample of the data I used for scalar function:

    (business logic is that if Qual changes then Rollover becomes 1, otherwise 0), must cater for multiple StudentUID's, in the below sample I only used 1 StudentUID over 4 years, whereas there could be over 100 to 1000 different StudentUID's over 4 to 8 years.

    CREATE TABLE [dbo].[AAA_Mytable](

    [TermCalendarID] [int] NOT NULL,

    [RegYear] [varchar](12) NOT NULL,

    [StudentUID] [int] NOT NULL,

    [Qual] [nvarchar](120) NULL,

    [Rollover] [int] NULL

    ) ON [PRIMARY]

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual], Rollover)

    VALUES

    (2, N'2011', 8314, N'HBPP112', NULL)

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual], Rollover)

    VALUES

    (3, N'2012', 8314, N'BACS315', NULL)

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual], Rollover)

    VALUES

    (5, N'2013', 8314, N'BACS315', NULL)

    INSERT [dbo].[AAA_Mytable] ([TermCalendarID], [RegYear], [StudentUID], [Qual], Rollover)

    VALUES

    (6, N'2014', 8314, N'BACS315', NULL)

  • TheSQLGuru (12/4/2014)


    In SQL Server 2012 the Windowing Functionality in SQL Server was VASTLY improved, including the addition of LAG and LEAD. This makes for a VERY simple and clean solution to the OP's request. See Books Online for usage.

    Yes it took MS some time to adopt Windowing but finally it's here and it's really a big feature.

    But for this paticular request plus taking possible number of students into account i don't expect

    select

    [TermCalendarID]

    ,[RegYear]

    ,[StudentUID]

    ,rollover = case when [QualYear1] is not null then

    case [QualYear1] when LEAD([QualYear2]) OVER (PARTITION BY StudentUID ORDER BY RegYear) then 1 end

    when [QualYear2] is not null then

    case [QualYear2] when LEAD([QualYear3]) OVER (PARTITION BY StudentUID ORDER BY RegYear) then 1 end

    when [QualYear3] is not null then

    case [QualYear3] when LEAD([QualYear4]) OVER (PARTITION BY StudentUID ORDER BY RegYear) then 1 end

    when [QualYear4] is not null then

    case [QualYear4] when LEAD([QualYear5]) OVER (PARTITION BY StudentUID ORDER BY RegYear) then 1 end

    end

    from @AAA_Mytable a

    to run much faster.

  • Hi Kevin.

    Your second table definition greatly differs from the first one. While

    the first one allows something like (don't know if it makes any sense)

    TermCalendarIDRegYearStudentUIDQualYear1QualYear2QualYear3QualYear4QualYear5Rollover

    220118314HBPP112NULLNULLNULLNULLNULL

    320128314NULLBACS315NULLNULLNULLNULL

    420138314NULLBACS315NULLNULLNULLNULL

    520148314NULLNULLBACS315NULLNULLNULL

    620158314NULLNULLNULLBACS315NULLNULL

    , the second one has no information about the student's "QualYear".

    So the query above against the first version of data will return only 2 rows with Rollover=1. And any query against the second version of the same data i can imagine will return 3 rows with Rollover=1.

    Please clarify this point.

  • Hi Serg-52

    Based on [dbo].[AAA_Mytable], you get the following:

    TermCalendarIDRegYearStudentUID Qual Rollover

    2 2011 8314 HBPP112 NULL

    3 2012 8314 BACS315NULL

    4 2013 8314 BACS315NULL

    5 2014 8314 BACS315NULL

    6 2015 8314 BACS315NULL

  • kevin_nikolai (12/5/2014)


    Hi Serg-52

    Based on [dbo].[AAA_Mytable], you get the following:

    TermCalendarIDRegYearStudentUID Qual Rollover

    2 2011 8314 HBPP112 NULL

    3 2012 8314 BACS315NULL

    4 2013 8314 BACS315NULL

    5 2014 8314 BACS315NULL

    6 2015 8314 BACS315NULL

    I see. My point is that using this data any suitable query i can imagine will return 3 rows with Rollover=1. Because all [RegYear] = 2012,2013,2014 has the same properties when compared to the next [RegYear]. Contrary to the previous version of the table which has an information that the student was at the same [QualYear2] for the [RegYear]= 2012 and 2013 and was at different [QualYear3], [QualYear4] for the later [RegYear]'s.

    So those two data models aren't equivalent. Please choose which one is correct for your application.

  • Updated my first (original) post - Posted Yesterday @ 5:49 AM.

    You can ignore all other posts relating to Tricky SQL Query.

  • OK, then LAG() is the best choice as it was proposed by colleages.

    with [AAA_Mytable] as(

    select *

    from (

    values

    (12, N'2011', 18314, N'HBPP112', NULL)

    ,(13, N'2012', 18314, N'BACS315', NULL)

    ,(15, N'2013', 18314, N'BACS315', NULL)

    ,(16, N'2014', 18314, N'BACS315', NULL)

    ,(112, N'2011', 118314, N'HBPP112', NULL)

    ,(113, N'2012', 118314, N'BACS315', NULL)

    ,(115, N'2013', 118314, N'BACS315', NULL)

    ,(116, N'2014', 118314, N'BACS315', NULL)

    ) T ([TermCalendarID], [RegYear], [StudentUID], [Qual], Rollover)

    )

    select [TermCalendarID], [StudentUID], [RegYear], [Qual]

    , Rollover = case [Qual] when LAG([Qual],1,[Qual]) OVER (PARTITION BY StudentUID ORDER BY RegYear) then 0 else 1 end

    from [AAA_Mytable]

    order by [StudentUID], [RegYear];

  • Is this what you are looking for ?

    with MYcte as (

    select studentUID, qual, max(regyear) as throughYear

    from #AAA_Mytable

    group by studentUID, qual

    )

    SELECTt1.termCalendarID, t1.regyear, t1.studentUid, t1.qual

    ,coalesce(oa.rollover, 0) as Rollover

    from #AAA_Mytable as t1

    outer apply (

    SELECT 1 as rollover

    frommyCTE

    WHEREmycte.studentuid=t1.studentuid

    and mycte.throughYear=t1.regyear-1

    ) as oa

    This is portable to older SQL Server systems not yet running sql server 2012 and greater than or equal to 2005.

    ----------------------------------------------------

  • Hi MMartin1, your code works great, even with multiple studentuid.

    Thanks.

  • Thanks for the feedback, glad this worked out for you.

    ----------------------------------------------------

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply