SQL Query improvement.

  • I have a table("Applications") that stores leave applications. The leaves are stored in "StartDate" and "EndDate" columns. Whenever a user apply for a leave he will be passing "FromDate" and "ToDate". The requirement is to find out all the days between "FromDate" to"ToDate" from "Applications" table. Basically the query needs to count the already applied dates. For example, if a user applies for a leave from Jan 05, 2016 to Jan 10, 2016 (6 Days), his previous applied leave days(if any) must be deducted from these 6 days. I have written the following query for this:

    DECLARE @StartDate DATETIME = '2016-01-01 00:00:00.000'

    ,@EndDate DATETIME = '2016-01-10 00:00:00.000'

    IF OBJECT_ID('tempdb..#TempHolidayDaysList') IS NOT NULL

    DROP TABLE #TempHolidayDaysList

    CREATE TABLE #TempHolidayDaysList (HolidayDate DATETIME)

    IF OBJECT_ID('tempdb..#TempHolidayDays') IS NOT NULL

    DROP TABLE #TempHolidayDays

    CREATE TABLE #TempHolidayDays (

    Id INT IDENTITY(1, 1)

    ,StartDate DATETIME

    ,EndDate DATETIME

    )

    INSERT INTO #TempHolidayDays

    SELECT '2016-01-01 00:00:00.000'

    ,'2016-01-01 00:00:00.000'

    UNION

    SELECT '2016-01-08 00:00:00.000'

    ,'2016-01-10 00:00:00.000'

    UNION

    SELECT '2016-01-17 00:00:00.000'

    ,'2016-01-19 00:00:00.000'

    UNION

    SELECT '2016-01-21 00:00:00.000'

    ,'2016-01-22 00:00:00.000'

    IF OBJECT_ID('tempdb..#Applications') IS NOT NULL

    DROP TABLE #Applications

    CREATE TABLE #Applications (Holiday DATETIME);

    WITH DateRange

    AS (

    SELECT HolidayDate = DATEADD(dd, 0, @StartDate)

    WHERE DATEADD(dd, 0, @StartDate) <= @EndDate

    UNION ALL

    SELECT DATEADD(dd, 1, HolidayDate)

    FROM DateRange

    WHERE DATEADD(dd, 1, HolidayDate) <= @EndDate

    )

    INSERT INTO #Applications

    SELECT HolidayDate

    FROM DateRange;

    DECLARE @LoopCount INT

    ,@LoopCounter INT = 1

    SELECT @LoopCount = ROW_NUMBER() OVER (

    ORDER BY Id

    )

    FROM #TempHolidayDays

    WHILE @LoopCounter <= @LoopCount

    BEGIN

    SELECT @StartDate = StartDate

    ,@EndDate = EndDate

    FROM #TempHolidayDays

    WHERE Id = @LoopCounter;

    WITH Holidays

    AS (

    SELECT dt = DATEADD(dd, 0, @StartDate)

    WHERE DATEADD(dd, 0, @StartDate) <= @EndDate

    UNION ALL

    SELECT DATEADD(dd, 1, dt)

    FROM Holidays

    WHERE DATEADD(dd, 1, dt) <= @EndDate

    )

    INSERT INTO #TempHolidayDaysList

    SELECT dt

    FROM Holidays

    SET @LoopCounter = @LoopCounter + 1

    END

    --SELECT * FROM #TempHolidayDaysList

    --SELECT * FROM #Applications

    SELECT T.HolidayDate AS [OverlappingDates]

    FROM #TempHolidayDaysList T

    INNER JOIN #Applications A ON T.HolidayDate = A.Holiday

    My question is what is the correct and the best way to do what I need? Can the above query be improved? Please note that the tables provided above are only for question's purpose. The real table does have a primary key and the leaves are stored with "userId". For simplicity's sake just consider we are working for a single user 🙂

  • First and foremost I want to thank you for posting your question with all the details required to provide an answer. Posting all the temp tables ddl, sample data and desired output makes this a LOT easier.

    You have a few performance problems with your code. When you a cte like you are doing for counting is really row by row processing behind the scenes. http://www.sqlservercentral.com/articles/T-SQL/74118/[/url]

    So we want to replace those with a set based query. Also, you have using a while loop for inserts which is another major performance problem. We can instead use a tally table for this type of thing which will greatly increase the performance. You can read more about the tally table concept here. http://www.sqlservercentral.com/articles/T-SQL/62867/[/url]

    In my system I use a view for my tally table. Using the methodology results in zero reads to generate 10,000 rows. Here is the code for my view.

    create View [dbo].[cteTally] as

    WITH

    E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    )

    select N from cteTally

    GO

    Before I apply this to your code I wanted to point out another issue with your code. You have this code in your solution:

    SELECT @LoopCount = ROW_NUMBER() OVER (

    ORDER BY Id

    )

    FROM #TempHolidayDays

    This is problematic because you are using that to get a count of rows from the temp table. What is really happening here is that the value of your variable gets assigned to the last row in the result set. This is not only a potential performance issue it is also counting on a sort order for the temp table when no order by was applied. If you want to get the count of rows in a table you should use COUNT.

    However, since there really is no need to get the count of rows we can skip this entirely.

    Here is how I would do this to get rid of all the RBAR processing going on in here. This produces the same output as the original query you posted but will be much more efficient without all the looping.

    DECLARE @StartDate DATETIME = '2016-01-01 00:00:00.000'

    ,@EndDate DATETIME = '2016-01-10 00:00:00.000'

    IF OBJECT_ID('tempdb..#TempHolidayDaysList') IS NOT NULL

    DROP TABLE #TempHolidayDaysList

    CREATE TABLE #TempHolidayDaysList (HolidayDate DATETIME)

    IF OBJECT_ID('tempdb..#TempHolidayDays') IS NOT NULL

    DROP TABLE #TempHolidayDays

    CREATE TABLE #TempHolidayDays (

    Id INT IDENTITY(1, 1)

    ,StartDate DATETIME

    ,EndDate DATETIME

    )

    INSERT INTO #TempHolidayDays

    SELECT '2016-01-01 00:00:00.000'

    ,'2016-01-01 00:00:00.000'

    UNION

    SELECT '2016-01-08 00:00:00.000'

    ,'2016-01-10 00:00:00.000'

    UNION

    SELECT '2016-01-17 00:00:00.000'

    ,'2016-01-19 00:00:00.000'

    UNION

    SELECT '2016-01-21 00:00:00.000'

    ,'2016-01-22 00:00:00.000'

    IF OBJECT_ID('tempdb..#Applications') IS NOT NULL

    DROP TABLE #Applications

    CREATE TABLE #Applications (Holiday DATETIME);

    INSERT INTO #Applications

    select DATEADD(DAY, N - 1, @StartDate)

    from cteTally

    where N <= DATEDIFF(DAY, @StartDate, @EndDate) + 1

    INSERT INTO #TempHolidayDaysList

    select x.HolidayDate

    from #TempHolidayDays T

    cross apply

    (

    select DATEADD(DAY, N - 1, t.StartDate) as HolidayDate

    from cteTally

    where N <= DATEDIFF(DAY, t.StartDate, t.EndDate) + 1

    ) x

    order by HolidayDate

    SELECT T.HolidayDate AS [OverlappingDates]

    FROM #TempHolidayDaysList T

    INNER JOIN #Applications A ON T.HolidayDate = A.Holiday

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks Sean. This indeed is perfect and sorry for my late response.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply