How to speed up this query?

  • Hi everyone

    My query is taking over 5 minutes to run and still no results so I think something is wrong.  I am not sure how to fix it.  Is someone able to help me?

    Query:

    DROP TABLE IF EXISTS #TEMP1;

    SELECT DISTINCT T1.QUOTE_DATE
    INTO #TEMP1
    FROM DBO.OptionsEOD T1

    CREATE CLUSTERED INDEX #CI_TEMP1 ON #TEMP1
    (QUOTE_DATE
    );

    SELECTT1.UNDERLYING_SYMBOL,
    T1.QUOTE_DATE,
    T1.ROOT_SYMBOL,
    T1.EXPIRATION,
    T1.STRIKE,
    T1.OPTION_TYPE,
    T1.CLOSE_PRICE,
    LAG(T1.CLOSE_PRICE) OVER (PARTITION BY T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE ORDER BY T1.QUOTE_DATE) AS PREV_CLOSE_PRICE
    FROMDBO.OptionsEOD AS T1 INNER JOIN
    (
    SELECTT2.DATE_FROM,
    T2.DATE_TO
    FROM(
    SELECTLAG(T1.QUOTE_DATE,1) OVER (ORDER BY T1.QUOTE_DATE) AS DATE_FROM,
    LAG(T1.QUOTE_DATE,0) OVER (ORDER BY T1.QUOTE_DATE) AS DATE_TO,
    ROW_NUMBER() OVER (ORDER BY T1.QUOTE_DATE DESC) AS ROW_NUM
    FROM#TEMP1 AS T1
    ) AS T2
    WHERET2.ROW_NUM = 1
    ) AS T3 ON T1.QUOTE_DATE BETWEEN T3.DATE_FROM AND T3.DATE_TO

    Explanation:

    I want to find the previous close price for a security.  The composite key are the following fields:

    UNDERLYING_SYMBOL

    QUOTE_DATE

    ROOT_SYMBOL

    EXPIRATION

    STRIKE

    OPTION_TYPE

    The LAG function finds the previous value.  If I run the query as is without the sub-query then it produces way too many useless results. So the solution is to produce the results for today only.  This is financial stock data so there is only data for business days so today (Sunday) means Friday (or Thursday if Friday is a holiday).  The sub-query produces the latest day (DATE_TO) and the previous day (DATE_FROM).

    The problem is I am not sure how to feed the results of the sub-query to the main query.  I used BETWEEN but I am not really sure.  There must be a better way.

    The main query needs to restrict results so only DATE_FROM and DATE_TO are used.  Here is a pseudocode:

    Select *

    from table

    where quote_date <=DATE_TO and quote_date >= DATE_FROM

    I don't know how to do this.

    How can I fix my query?

    Thank you

  • Have you looked at the query execution plan?  There might be some clues as to what is causing the problem in there.  What kind of data volumes are you dealing with?  A query that takes 5+ minutes over thousands of rows might be a problem but it might be reasonable if you're querying millions of rows.  What is the structure of the dbo.OptionsEOD table?

    People can probably help you but we're going to need more to go on first.

     


    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
    —Charles Babbage, Passages from the Life of a Philosopher

    How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537

  • Nothing jumps out as problematic, so it is going to be down to knowing the structures and execution plan. Without that, we can't even guess.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • If you don't already have an index on the dbo.OptionsEOD table to directly support the "LAG(T1.CLOSE_PRICE) OVER PARTITION BY T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE ORDER BY T1.QUOTE_DATE)" you should try creating one.

    That is, the index keys would be ( UNDERLYING_SYMBOL, ROOT_SYMBOL, EXPIRATION, STRIKE,  OPTION_TYPE, QUOTE_DATE ) /* in that order */ and the included column would be CLOSE_PRICE.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • SELECT T1.UNDERLYING_SYMBOL,
    T1.QUOTE_DATE,
    T1.ROOT_SYMBOL,
    T1.EXPIRATION,
    T1.STRIKE,
    T1.OPTION_TYPE,
    T1.CLOSE_PRICE,
    T2.PREV_CLOSE_PRICE
    FROM DBO.OptionsEOD AS T1
    OUTER APPLY(SELECT TOP(1)
    T2.CLOSE_PRICE AS PREV_CLOSE_PRICE
    FROM DBO.OptionsEOD AS T2
    WHERE T2.UNDERLYING_SYMBOL = T1.UNDERLYING_SYMBOL
    AND T2.ROOT_SYMBOL = T1.ROOT_SYMBOL
    AND T2.EXPIRATION = T1.EXPIRATION
    AND T2.STRIKE = T1.STRIKE
    AND T2.OPTION_TYPE = T1.OPTION_TYPE
    AND T2.QUOTE_DATE < T1.QUOTE_DATE
    ORDER BY T2.QUOTE_DATE DESC) T2
    WHERE T1.QUOTE_DATE = '2023-07-03'
    ;

    You might need to add an index to make it perform well.

  • Another option:

    WITH CTE AS
    (
    SELECT T1.UNDERLYING_SYMBOL,
    T1.QUOTE_DATE,
    T1.ROOT_SYMBOL,
    T1.EXPIRATION,
    T1.STRIKE,
    T1.OPTION_TYPE,
    T1.CLOSE_PRICE,
    LAG(T1.CLOSE_PRICE) OVER (PARTITION BY T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE ORDER BY T1.QUOTE_DATE) AS PREV_CLOSE_PRICE
    FROM DBO.OptionsEOD AS T1
    )
    SELECT *
    FROM CTE
    WHERE QUOTE_DATE = '2023-07-03'
    ;
  • To increase the performance you could limit the rows the query in the CTE produces by selecting on the QUOTE_DATE  going back a given amount of days to when you are sure there will be another quote date. This code looks back 10 days.

    DECLARE @QUOTE_DATE date = '2023-07-03';

    ;WITH CTE AS
    (
    SELECT T1.UNDERLYING_SYMBOL,
    T1.QUOTE_DATE,
    T1.ROOT_SYMBOL,
    T1.EXPIRATION,
    T1.STRIKE,
    T1.OPTION_TYPE,
    T1.CLOSE_PRICE,
    LAG(T1.CLOSE_PRICE) OVER (PARTITION BY T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE ORDER BY T1.QUOTE_DATE) AS PREV_CLOSE_PRICE
    FROM DBO.OptionsEOD AS T1
    WHERE T1.QUOTE_DATE BETWEEN DATEADD(dd, -10, @QUOTE_DATE) AND @QUOTE_DATE
    )
    SELECT *
    FROM CTE
    WHERE QUOTE_DATE = @QUOTE_DATE
    ;

     

  • try the following code.

    and then give us the actual explain plan for both this code and your code.

    drop table if exists #TEMP1;
    drop table if exists #TEMP2;

    select distinct T1.QUOTE_DATE
    into #TEMP1
    from dbo.OptionsEOD T1

    CREATE CLUSTERED INDEX #CI_TEMP1 ON #TEMP1
    (QUOTE_DATE
    );

    select T2.DATE_FROM
    , T2.DATE_TO
    into #temp2
    from (
    select lag(T1.QUOTE_DATE, 1) over (order by T1.QUOTE_DATE) as DATE_FROM
    , lag(T1.QUOTE_DATE, 0) over (order by T1.QUOTE_DATE) as DATE_TO
    , row_number() over (order by T1.QUOTE_DATE desc) as ROW_NUM
    from #TEMP1 as T1) as T2
    where T2.ROW_NUM = 1



    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    , lag(T1.CLOSE_PRICE) over (partition by T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE order by T1.QUOTE_DATE) as PREV_CLOSE_PRICE
    from dbo.OptionsEOD as T1
    inner join #temp2 as T3
    on T1.QUOTE_DATE between T3.DATE_FROM and T3.DATE_TO
  • I think the following index should help performance:

     CREATE INDEX IX_OptionsEOD_1 
    ON dbo.OptionsEOD (UNDERLYING_SYMBOL, ROOT_SYMBOL, EXPIRATION, STRIKE, OPTION_TYPE, QUOTE_DATE)
    INCLUDE (CLOSE_PRICE)
    ;
  • frederico_fonseca wrote:

    try the following code.

    and then give us the actual explain plan for both this code and your code.

    drop table if exists #TEMP1;
    drop table if exists #TEMP2;

    select distinct T1.QUOTE_DATE
    into #TEMP1
    from dbo.OptionsEOD T1

    CREATE CLUSTERED INDEX #CI_TEMP1 ON #TEMP1
    (QUOTE_DATE
    );

    select T2.DATE_FROM
    , T2.DATE_TO
    into #temp2
    from (
    select lag(T1.QUOTE_DATE, 1) over (order by T1.QUOTE_DATE) as DATE_FROM
    , lag(T1.QUOTE_DATE, 0) over (order by T1.QUOTE_DATE) as DATE_TO
    , row_number() over (order by T1.QUOTE_DATE desc) as ROW_NUM
    from #TEMP1 as T1) as T2
    where T2.ROW_NUM = 1



    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    , lag(T1.CLOSE_PRICE) over (partition by T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE order by T1.QUOTE_DATE) as PREV_CLOSE_PRICE
    from dbo.OptionsEOD as T1
    inner join #temp2 as T3
    on T1.QUOTE_DATE between T3.DATE_FROM and T3.DATE_TO

    thanks for the reply.

    your code:

    run time is about 8 seconds

    https://www.brentozar.com/pastetheplan/?id=BJ3LCKgY3

    my code:

    run time is about 7 m 6 s

    https://www.brentozar.com/pastetheplan/?id=HyTD15eY3

    there is big improvement!  is it possible to save even more time?

  • maybe this but not guaranteed.

    drop table if exists #TEMP1;
    drop table if exists #TEMP2;
    drop table if exists #TEMP3;

    select distinct T1.QUOTE_DATE
    into #TEMP1
    from dbo.OptionsEOD T1

    CREATE CLUSTERED INDEX #CI_TEMP1 ON #TEMP1
    (QUOTE_DATE
    );

    select T2.DATE_FROM
    , T2.DATE_TO
    into #temp2
    from (
    select lag(T1.QUOTE_DATE, 1) over (order by T1.QUOTE_DATE) as DATE_FROM
    , lag(T1.QUOTE_DATE, 0) over (order by T1.QUOTE_DATE) as DATE_TO
    , row_number() over (order by T1.QUOTE_DATE desc) as ROW_NUM
    from #TEMP1 as T1) as T2
    where T2.ROW_NUM = 1

    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    into #temp3
    from dbo.OptionsEOD as T1
    where 0 = 1;

    create clustered index #ci_temp3 on #temp3
    (UNDERLYING_SYMBOL
    , ROOT_SYMBOL
    , EXPIRATION
    , STRIKE
    , OPTION_TYPE
    , QUOTE_DATE
    )

    insert into #temp3 with (tablock)
    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    from dbo.OptionsEOD as T1
    inner join #temp2 as T3
    on T1.QUOTE_DATE between T3.DATE_FROM and T3.DATE_TO

    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    , lag(T1.CLOSE_PRICE) over (partition by T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE order by T1.QUOTE_DATE) as PREV_CLOSE_PRICE
    from #temp3 as T1
  • frederico_fonseca wrote:

    maybe this but not guaranteed.

    drop table if exists #TEMP1;
    drop table if exists #TEMP2;
    drop table if exists #TEMP3;

    select distinct T1.QUOTE_DATE
    into #TEMP1
    from dbo.OptionsEOD T1

    CREATE CLUSTERED INDEX #CI_TEMP1 ON #TEMP1
    (QUOTE_DATE
    );

    select T2.DATE_FROM
    , T2.DATE_TO
    into #temp2
    from (
    select lag(T1.QUOTE_DATE, 1) over (order by T1.QUOTE_DATE) as DATE_FROM
    , lag(T1.QUOTE_DATE, 0) over (order by T1.QUOTE_DATE) as DATE_TO
    , row_number() over (order by T1.QUOTE_DATE desc) as ROW_NUM
    from #TEMP1 as T1) as T2
    where T2.ROW_NUM = 1

    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    into #temp3
    from dbo.OptionsEOD as T1
    where 0 = 1;

    create clustered index #ci_temp3 on #temp3
    (UNDERLYING_SYMBOL
    , ROOT_SYMBOL
    , EXPIRATION
    , STRIKE
    , OPTION_TYPE
    , QUOTE_DATE
    )

    insert into #temp3 with (tablock)
    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    from dbo.OptionsEOD as T1
    inner join #temp2 as T3
    on T1.QUOTE_DATE between T3.DATE_FROM and T3.DATE_TO

    select T1.UNDERLYING_SYMBOL
    , T1.QUOTE_DATE
    , T1.ROOT_SYMBOL
    , T1.EXPIRATION
    , T1.STRIKE
    , T1.OPTION_TYPE
    , T1.CLOSE_PRICE
    , lag(T1.CLOSE_PRICE) over (partition by T1.UNDERLYING_SYMBOL, T1.ROOT_SYMBOL, T1.EXPIRATION, T1.STRIKE, T1.OPTION_TYPE order by T1.QUOTE_DATE) as PREV_CLOSE_PRICE
    from #temp3 as T1

    thank you.

    it takes the same time as before...8s

    https://www.brentozar.com/pastetheplan/?id=SkCHR0lK2

  • From the plan, it appears that you are returning 5.3mil rows.

    Do you REALLY need that much data in the results?

  • 5.3 million rows is the estimated - the actual is 37k rows

    Statistics on columnstore can be a bit tricky as they don't quite work as other tables - but I would not be surprised that this table is subject to update/deletes (very  bad on columnstore) and/or stats not updated frequently.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply