Slow Script

  • Hello. I ran the following script that too 3 hours 25 minutes to insert about 40,000 rows. Is there any logic that I can put into this to speed it up? Thank you!

    insert into T3

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3

    from T1

    join T2 on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    and CAST (T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) not in

    (select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)

  • there are many causes for performance. but it's good to post little bit more details about this.

    Like, do you have created index on some column if yes which type, do you use somewhere have transaction where this tables is used,

    i don't know, try this

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3,

    CAST(T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) tmpColumn into #temp

    from T1 (nolock) join T2 (nolock) on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    insert into T3

    select * from #temp where tmpColumn not in

    (select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)

  • There isn't a lot to go on here. If you could share some more information like: How large are the tables in question (both size and number of rows)? Are there indexes that cover this query? What does the Execution Plan show? Did this used to run fast and now it is slow? Is this an ad-hoc query or part of a stored procedure? etc.



    Microsoft Certified Master - SQL Server 2008
    Follow me on twitter: @keith_tate

    Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • And some examples of actual data in the tables, along with the CREATE TABLE code..

  • easy_goer (10/1/2013)


    Hello. I ran the following script that too 3 hours 25 minutes to insert about 40,000 rows. Is there any logic that I can put into this to speed it up? Thank you!

    insert into T3

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3

    from T1

    join T2 on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    and CAST (T2.COL1 as VARCHAR)+'-'+CAST(T2.COL2 as VARCHAR)+'-'+CAST(T1.COL1 as VARCHAR)+'-'+CAST(T1.COL2 as VARCHAR) not in

    (select CAST(COL1 as VARCHAR)+'-'+CAST(COL2 as VARCHAR)+'-'+CAST(COL3 as VARCHAR)+'-'+CAST(COL4 as VARCHAR) from T3)

    We can't really help performance issues on theoretical tables. I can say that your performance issues are all based in your where clause. You have lots of nonSARGable predicates here.

    != will force a full scan.

    Also you cast all these columns to varchar. You didn't specify a length so it will use the default length. Do you know the default length for varchar? Me neither, that is why you should always specify the length.

    If you want some real assistance with the performance issues you should take a look at this article.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • This I reckon is the correct way to write your query:

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3

    from T1

    join T2

    on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    AND NOT EXISTS (SELECT 1 FROM T3

    WHERE T3.COL1 = T2.COL1

    AND T3.COL2 = T2.COL2

    AND T3.COL3 = T1.COL1

    AND T3.COL4 = T1.COL2)

    Check the results against the SELECT part of your original.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Worked like a charm. Thanks for the assistance!

  • Chris, is this the equivalent of a LEFT JOIN based on 4 columns ?

    AND NOT EXISTS (SELECT 1 FROM T3

    WHERE T3.COL1 = T2.COL1

    AND T3.COL2 = T2.COL2

    AND T3.COL3 = T1.COL1

    AND T3.COL4 = T1.COL2)

  • homebrew01 (10/5/2013)


    Chris, is this the equivalent of a LEFT JOIN based on 4 columns ?

    AND NOT EXISTS (SELECT 1 FROM T3

    WHERE T3.COL1 = T2.COL1

    AND T3.COL2 = T2.COL2

    AND T3.COL3 = T1.COL1

    AND T3.COL4 = T1.COL2)

    Yeh... but faster.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ChrisM@Work (10/1/2013)


    This I reckon is the correct way to write your query:

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3

    from T1

    join T2

    on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    AND NOT EXISTS (SELECT 1 FROM T3

    WHERE T3.COL1 = T2.COL1

    AND T3.COL2 = T2.COL2

    AND T3.COL3 = T1.COL1

    AND T3.COL4 = T1.COL2)

    Check the results against the SELECT part of your original.

    So in this case, it will select from T1 and T2, only if not in T3 ?

    What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?

    and T3.COL1 is NULL

    and T3.COL2 is NULL

    and T3.COL3 is NULL

    and T3.COL4 is NULL

  • homebrew01 (10/6/2013)


    What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?

    and T3.COL1 is NULL

    and T3.COL2 is NULL

    and T3.COL3 is NULL

    and T3.COL4 is NULL

    I guess it depends on what you are actually trying to do but if any of the t3 columns are NULL, they won't match any of the t2 columns even if they are also NULL so the result is that the rows won't exist and the WHERE NOT EXISTS will still work.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • homebrew01 (10/6/2013)


    ChrisM@Work (10/1/2013)


    This I reckon is the correct way to write your query:

    select T2.COL1, T2.COL2, T1.COL1, T1.COL2, T1.COL3

    from T1

    join T2

    on T1.COL4 = T2.COL3

    where T2.COL4 != 'data1'

    and T1.COL1 is not NULL

    and T1.COL5 is not NULL

    and T1.COL6 = 'data2'

    AND NOT EXISTS (SELECT 1 FROM T3

    WHERE T3.COL1 = T2.COL1

    AND T3.COL2 = T2.COL2

    AND T3.COL3 = T1.COL1

    AND T3.COL4 = T1.COL2)

    Check the results against the SELECT part of your original.

    So in this case, it will select from T1 and T2, only if not in T3 ?

    Yes, that's correct.

    What if you want to include non-matches against T3 ? Would you add to the WHERE clause ?

    Isn't this the same as "not in T3"?

    and T3.COL1 is NULL

    and T3.COL2 is NULL

    and T3.COL3 is NULL

    and T3.COL4 is NULL

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply