Find the brackets

  • funbi (11/17/2016)


    Luis Cazares (11/17/2016)


    funbi (11/17/2016)


    Laurie Dunn (11/17/2016)


    This would also work and keep the code simple...

    SELECT *
    FROM dbo.MyData
    WHERE Left(mychar,1)='[';

    Yeah I got

    select * from MyData where CHARINDEX('[',mychar,0) = 1

    which works fine, then looked at the options and realised today's lesson was about ESCAPE :hehe:

    Both of these queries are non-SARGable, which would result in an index scan instead of an index seek.

    The table has 7 rows.

    Real world tables don't have 7 rows. A small row count is not a valid excuse for bad code.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
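
    To illustrate the SARGability point above, here is a minimal sketch (the table and index definitions are assumed, since the QotD didn't show them). The function on the column in the first query hides it from the index; the literal prefix in the second can be converted into a range seek:

    ```sql
    -- Assumed table and index, for illustration only.
    CREATE TABLE dbo.MyData (mychar VARCHAR(50));
    CREATE INDEX IX_MyData_mychar ON dbo.MyData (mychar);

    -- Non-SARGable: LEFT() is applied to the column, so every row must be scanned.
    SELECT mychar FROM dbo.MyData WHERE LEFT(mychar,1) = '[';

    -- SARGable: a LIKE pattern with a literal prefix ('[' escaped inside []) can seek.
    SELECT mychar FROM dbo.MyData WHERE mychar LIKE '[[]%';
    ```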
  • Luis Cazares (11/17/2016)


    funbi (11/17/2016)


    The table has 7 rows.

    Real world tables don't have 7 rows. A small row count is not a valid excuse for bad code.

    It's only bad code if there are a lot more than 7 rows 😛

  • funbi (11/17/2016)


    Luis Cazares (11/17/2016)


    funbi (11/17/2016)


    The table has 7 rows.

    Real world tables don't have 7 rows. A small row count is not a valid excuse for bad code.

    It's only bad code if there are a lot more than 7 rows 😛

    Totally incorrect. The number of rows in a table doesn't dictate how often your code will be used nor who will copy it for something else. Besides, regardless of row count, why would you write any bad code when you know (or should know) the correct way to write it? Take the opportunity to lead by good example instead of making excuses for bad code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Stewart "Arturius" Campbell (11/17/2016)


    Laurie Dunn (11/17/2016)


    This would also work and keep the code simple...

    SELECT *
    FROM dbo.MyData
    WHERE Left(mychar,1)='[';

    To further illustrate Luis' point, try this out:

    -- The original post assumes #MyData already exists; a minimal definition:
    CREATE TABLE #MyData (mychar VARCHAR(50));
    GO
    set nocount on;
    GO
    INSERT #MyData (mychar)
    VALUES ('This is a string'),
           ('"A Quoted String"'),
           ('''Single quoted string'''),
           ('''more single quotes'''),
           ('[My bracketed string]'),
           (''),
           ('Can I find [this string]');
    GO 1000000
    CREATE INDEX #IX_MyData_myChar ON #MyData(myChar);
    GO
    SELECT mychar FROM #MyData WHERE mychar LIKE '[[%' ESCAPE '[';
    SELECT mychar FROM #MyData WHERE mychar LIKE '[[]%';
    SELECT mychar FROM #MyData WHERE mychar LIKE '%[[%' ESCAPE '[';
    SELECT mychar FROM #MyData WHERE Left(mychar,1)='[';

    The execution plans reflect:

    Query 1 uses 12%
    Query 2 uses 12%
    Query 3 uses 37%
    Query 4 uses 38%

    which one would you rather use?

    I appreciate what you're doing but, to answer your question, we don't know yet because % of Batch is only an estimate even for so-called "Actual Execution Plans", which can be totally backwards. In order to select the best method for performance, you need a substantial quantity of non-grooved data to test against and then measure the actual performance. It may be accurate in this case (would have to do the test to actually find out) but execution plans should only be used for doing research because they frequently don't represent anything having to do with actual performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/17/2016)


    funbi (11/17/2016)


    Luis Cazares (11/17/2016)


    funbi (11/17/2016)


    The table has 7 rows.

    Real world tables don't have 7 rows. A small row count is not a valid excuse for bad code.

    It's only bad code if there are a lot more than 7 rows 😛

    Totally incorrect. The number of rows in a table doesn't dictate how often your code will be used nor who will copy it for something else. Besides, regardless of row count, why would you write any bad code when you know (or should know) the correct way to write it? Take the opportunity to lead by good example instead of making excuses for bad code.

    It's a QotD on a forum. The table has 7 rows. Efficiency is not a factor in this instance. My SQL was not incorrect - it brought back the right results. Maybe it is not the *preferred* way, but it is certainly correct.

  • Here's an example of what I'm talking about with respect to the execution plan. See the attached "Actual Execution Plan" and look at the % of Batch listings for both queries. If you believed the % of Batch as the only indication of performance, you'd pick the first query.

    This is a classic problem, which has actually misled many people into believing that recursive CTEs (rCTE) are a good thing. Here's the code that produced the "Actual Execution Plan". Go ahead and run it.

    DECLARE @Bitbucket INT;

    RAISERROR('--===== Recursive CTE =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
    WITH cte AS
    (
     SELECT N = 1
      UNION ALL
     SELECT N = N + 1
       FROM cte
      WHERE N < 1000000
    )
    SELECT @Bitbucket = N
      FROM cte
    OPTION (MAXRECURSION 0);
    SET STATISTICS TIME,IO OFF;
    PRINT REPLICATE('-',119);

    RAISERROR('--===== Pseudo-Cursor =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
    SELECT TOP 1000000
           @Bitbucket = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2;
    SET STATISTICS TIME,IO OFF;
    PRINT REPLICATE('-',119);
    GO

    Here's what the results of that code look like on my little i5/6GB RAM laptop.

    --===== Recursive CTE =====
    Table 'Worktable'. Scan count 2, logical reads 6000001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 8190 ms, elapsed time = 8839 ms.
    -----------------------------------------------------------------------------------------------------------------------
    --===== Pseudo-Cursor =====
    Table 'syscolrdb'. Scan count 1, logical reads 110, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'syscolpars'. Scan count 2, logical reads 15, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 125 ms, elapsed time = 150 ms.
    -----------------------------------------------------------------------------------------------------------------------

    Even if we output to a table, the actual performance is an order of magnitude different from what the % of Batch indicates.

    RAISERROR('--===== Recursive CTE =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
    WITH cte AS
    (
     SELECT N = 1
      UNION ALL
     SELECT N = N + 1
       FROM cte
      WHERE N < 1000000
    )
    SELECT *
      INTO #MyHead1
      FROM cte
    OPTION (MAXRECURSION 0);
    SET STATISTICS TIME,IO OFF;
    PRINT REPLICATE('-',119);

    RAISERROR('--===== Pseudo-Cursor =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
    SELECT TOP 1000000
           N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      INTO #MyHead2
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2;
    SET STATISTICS TIME,IO OFF;
    PRINT REPLICATE('-',119);
    GO
    DROP TABLE #MyHead1,#MyHead2;
    GO

    Here's what the results of that code look like on my little i5/6GB RAM laptop.

    --===== Recursive CTE =====
    Table 'Worktable'. Scan count 2, logical reads 6000001, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 8518 ms, elapsed time = 9184 ms.

    (1000000 row(s) affected)
    -----------------------------------------------------------------------------------------------------------------------
    --===== Pseudo-Cursor =====
    Table 'syscolrdb'. Scan count 1, logical reads 110, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'syscolpars'. Scan count 2, logical reads 15, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    SQL Server Execution Times:
       CPU time = 249 ms, elapsed time = 733 ms.

    (1000000 row(s) affected)
    -----------------------------------------------------------------------------------------------------------------------

    Now, I'll admit that % of Batch is normally a better indication of which code will perform better than in this example, but it does show how far off it can be compared to actual performance. You MUST test and properly measure code for performance and resource usage to decide which is best. You CANNOT rely on the execution plan to tell you.

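
    A common set-based alternative to both approaches above is an inline "tally" of numbers built from cross-joined constants, which needs neither the recursive CTE's worktable reads nor a dependency on sys.all_columns. This is only a sketch of the general pattern; the CTE names are illustrative:

    ```sql
    WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),
                                         (1),(1),(1),(1),(1)) t(N)), -- 10 rows
         E3(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 c), -- 1,000 rows
         E6(N) AS (SELECT 1 FROM E3 a CROSS JOIN E3 b)                  -- 1,000,000 rows
    SELECT TOP 1000000
           N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM E6;
    ```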

  • Jeff Moden (11/16/2016)

    IMHO, cleaner code if you take the shortcut.

    SELECT mychar FROM mydata WHERE mychar LIKE '[[]%'

    That's how I went about doing it and then had to take a closer look at the possible answers 🙂

    Cheers

    That is what I came up with on my own, and when I looked at the answers I quickly realised it was not one of the options. Through process of elimination I decided to take a chance that the ESCAPE keyword exists, since I had not seen it before. I got lucky here.

    ----------------------------------------------------

  • MMartin1 (11/17/2016)


    Jeff Moden (11/16/2016)

    IMHO, cleaner code if you take the shortcut.

    SELECT mychar FROM mydata WHERE mychar LIKE '[[]%'

    That's how I went about doing it and then had to take a closer look at the possible answers 🙂

    Cheers

    That is what I came up with on my own, and when I looked at the answers I quickly realised it was not one of the options. Through process of elimination I decided to take a chance that the ESCAPE keyword exists, since I had not seen it before. I got lucky here.

    You would have, if you had read the article that is being republished tomorrow. 😀

  • This was removed by the editor as SPAM

  • Stewart "Arturius" Campbell (11/17/2016)


    Another cause of disparity is scalar functions, which the execution plan appears to ignore completely.

    Agreed. In fact, those little beasties can be downright troublesome to measure for performance. Although I agree that they should generally be avoided, it shouldn't be because of SET STATISTICS measurements against them, because SET STATISTICS can make them look hundreds of times worse than they actually are.

    For more on that nuance, please see the following article.

    How to Make Scalar UDFs Run Faster (SQL Spackle)

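
    For readers unfamiliar with the rewrite the article describes, here is a minimal sketch of the two shapes being compared (the function names are made up for illustration): a scalar UDF, which SQL Server invokes once per row, versus an inline table-valued function, which the optimizer can expand into the calling query's plan:

    ```sql
    -- A trivial scalar UDF, called row by row:
    CREATE FUNCTION dbo.DoubleIt (@n INT)
    RETURNS INT
    AS
    BEGIN
        RETURN @n * 2;
    END;
    GO
    -- The inline table-valued equivalent, expanded in place by the optimizer:
    CREATE FUNCTION dbo.DoubleIt_iTVF (@n INT)
    RETURNS TABLE
    AS
    RETURN SELECT Doubled = @n * 2;
    GO
    -- Usage: per-row scalar call vs CROSS APPLY of the iTVF.
    SELECT dbo.DoubleIt(ac.column_id) FROM sys.all_columns ac;
    SELECT d.Doubled
      FROM sys.all_columns ac
     CROSS APPLY dbo.DoubleIt_iTVF(ac.column_id) d;
    ```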

  • In the problem posed all the suggested choices involved "like"

    Is there a reason I should use "LIKE" instead of the following?

    select *
    from MyData
    where LEFT(mychar,1) = '['

    thanks,

    @adma3613 Another user has already explained why it's better to avoid this solution. To copy and paste their answer:

    "these queries are non-SARGable, which would result in an index scan instead of an index seek".

    If you don't know the meaning of SARGable, please have a look at this https://ask.sqlservercentral.com/questions/1178/definition-of-sargable.html
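
    The same principle shows up well beyond string comparisons. As a hedged illustration (the Orders table and its index on OrderDate are hypothetical), wrapping a column in any function defeats a seek, while an equivalent range predicate does not:

    ```sql
    -- Non-SARGable: YEAR() on the column forces a scan of every row.
    SELECT * FROM dbo.Orders WHERE YEAR(OrderDate) = 2016;

    -- SARGable rewrite: a half-open date range lets the optimizer seek on OrderDate.
    SELECT * FROM dbo.Orders
     WHERE OrderDate >= '20160101'
       AND OrderDate <  '20170101';
    ```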

  • funbi (11/17/2016)


    It's a QotD on a forum. The table has 7 rows. Efficiency is not a factor in this instance. My sql was not incorrect - it brought back the right results. Maybe it is not the *preferred* way, but it is certainly correct.

    That's true. But what happens tomorrow when someone inserts another 10 million rows into the table? Your previously "correct" SQL grinds to a halt. Much better to get it right today than have to rewrite it tomorrow.

  • Very interesting discussion.

  • Jeff Moden (11/17/2016)


    Stewart "Arturius" Campbell (11/17/2016)


    Another cause of disparity is scalar functions, which the execution plan appears to ignore completely.

    Agreed. In fact, those little beasties can be downright troublesome to measure for performance. Although I agree that they should generally be avoided, it shouldn't be because of SET STATISTICS measurements against them, because SET STATISTICS can make them look hundreds of times worse than they actually are.

    For more on that nuance, please see the following article.

    How to Make Scalar UDFs Run Faster (SQL Spackle)

    Your reminder prompted me to take another look at this, and the results I got suggest both that SET STATISTICS doesn't behave as badly in SQL Server 2016 as it did in SQL Server 2005, and that scalar UDFs are more of a performance hit (at least on simple stuff like doubling) on SQL Server 2016 than on SQL Server 2005. But probably my measurements on an old laptop with Windows 10 Home aren't very relevant to serious SQL Server usage.

    Tom

Viewing 15 posts - 16 through 30 (of 40 total)
