An exercise in query optimisation.....

  • Hi all,

    I am starting to work in an environment where database DevOps plays a very important role and I am boning up on a few things to prepare myself.

    Does anyone know of any sites with prepared "bad" queries (intentionally bad or otherwise) that can be used as exercises?

    Thanks in advance!

    Regards,
    Kev

  • kevaburg - Thursday, September 28, 2017 4:48 AM

    Hi all,

    I am starting to work in an environment where database DevOps plays a very important role and I am boning up on a few things to prepare myself.

    Does anyone know of any sites with prepared "bad" queries (intentionally bad or otherwise) that can be used as exercises?

    Thanks in advance!

    Regards,
    Kev

    You mean something other than solving problems here in the forums? You can find nice challenges here.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I try and do that but most of the time the questions are already answered or the solution is a few jumps away from where I am at the moment.

     The problem is that I need a dataset against which to test a solution otherwise I will be posting code that even I haven't tested and that really isn't the point... 🙂

  • kevaburg - Thursday, September 28, 2017 8:17 AM

    I try and do that but most of the time the questions are already answered or the solution is a few jumps away from where I am at the moment.

     The problem is that I need a dataset against which to test a solution otherwise I will be posting code that even I haven't tested and that really isn't the point... 🙂

    I hear ya, bro.   Problem is, who's going to want all their problem's "exposed to the world", so to speak?   Often times, code can be at the minimum confidential, so there's that issue as well.   I've never found a site dedicated to that specific purpose, but if you do find one, I'd sure love to hear about it.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Then, the first thing you need to practice is how to generate sample data that is large enough to test for performance.
    Here are 2 articles that can help you start:
     http://www.sqlservercentral.com/articles/Data+Generation/87901/
     http://www.sqlservercentral.com/articles/Test+Data/88964/
    And I got to the following article about generating  company names using sys.messages:
    https://sqlsunday.com/2017/09/18/fun-with-random-names/
    However, I modified the query to remove the recursive cte and replace it with a nice tested function. Splitting Strings Based on Patterns - SQLServerCentral

    DECLARE @words TABLE (
      word   nvarchar(100) NOT NULL,
      PRIMARY KEY CLUSTERED (word)
    );

    WITH cteStrings AS (
      SELECT TOP (500) message_id, [text]
      FROM sys.messages
      WHERE language_id=1033
      ORDER BY NEWID()
    )
    INSERT INTO @words
    SELECT DISTINCT LOWER(s.Item)
    FROM cteStrings w
    CROSS APPLY dbo.PatternSplitCM( w.[text], N'%[A-Za-z]%') s
    WHERE LEN(s.Item)>=3
    AND s.Matched = 1;

    SELECT TOP (1000)
      UPPER(LEFT(a.word, 1))+SUBSTRING(a.word, 2, LEN(a.word))+
      ISNULL(N' '+b.word, '')+
      ISNULL(N' '+c.word, '') AS CoolStartupName

    --- First word:
    FROM @words AS a

    --- Second word:
    OUTER APPLY (
      SELECT TOP (1)
        UPPER(LEFT(x.word, 1))+
        SUBSTRING(x.word, 2, LEN(x.word)) AS word
      FROM @words AS x
      WHERE a.word <> x.word
      ORDER BY NEWID()
      ) AS b

    --- Third word (if there's space):
    OUTER APPLY (
      SELECT TOP (1)
        UPPER(LEFT(x.word, 1))+
        SUBSTRING(x.word, 2, LEN(x.word)) AS word
      FROM @words AS x
      WHERE a.word <> x.word
      AND b.word <> x.word
      AND LEN(a.word+b.word)<=12
      ORDER BY NEWID()
      ) AS c

    ORDER BY NEWID();

    And don't worry if a question has already been answered. Try to get your own conclusions, compare them to the ones posted and then find out which one is better and why.
    The first step to tune code is to find out if the code actually needs to be or can be tuned.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • I think it is time to build such a site..... 🙂

  • sgmunson - Thursday, September 28, 2017 8:27 AM

    kevaburg - Thursday, September 28, 2017 8:17 AM

    I try and do that but most of the time the questions are already answered or the solution is a few jumps away from where I am at the moment.

     The problem is that I need a dataset against which to test a solution otherwise I will be posting code that even I haven't tested and that really isn't the point... 🙂

    I hear ya, bro.   Problem is, who's going to want all their problem's "exposed to the world", so to speak?   Often times, code can be at the minimum confidential, so there's that issue as well.   I've never found a site dedicated to that specific purpose, but if you do find one, I'd sure love to hear about it.

    Stackoverflow has the database available for download for demo: https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Thursday, September 28, 2017 8:40 AM

    Then, the first thing you need to practice is how to generate sample data that is large enough to test for performance.
    Here are 2 articles that can help you start:
     http://www.sqlservercentral.com/articles/Data+Generation/87901/
     http://www.sqlservercentral.com/articles/Test+Data/88964/
    And I got to the following article about generating  company names using sys.messages:
    https://sqlsunday.com/2017/09/18/fun-with-random-names/
    However, I modified the query to remove the recursive cte and replace it with a nice tested function. Splitting Strings Based on Patterns - SQLServerCentral

    DECLARE @words TABLE (
      word   nvarchar(100) NOT NULL,
      PRIMARY KEY CLUSTERED (word)
    );

    WITH cteStrings AS (
      SELECT TOP (500) message_id, [text]
      FROM sys.messages
      WHERE language_id=1033
      ORDER BY NEWID()
    )
    INSERT INTO @words
    SELECT DISTINCT LOWER(s.Item)
    FROM cteStrings w
    CROSS APPLY dbo.PatternSplitCM( w.[text], N'%[A-Za-z]%') s
    WHERE LEN(s.Item)>=3
    AND s.Matched = 1;

    SELECT TOP (1000)
      UPPER(LEFT(a.word, 1))+SUBSTRING(a.word, 2, LEN(a.word))+
      ISNULL(N' '+b.word, '')+
      ISNULL(N' '+c.word, '') AS CoolStartupName

    --- First word:
    FROM @words AS a

    --- Second word:
    OUTER APPLY (
      SELECT TOP (1)
        UPPER(LEFT(x.word, 1))+
        SUBSTRING(x.word, 2, LEN(x.word)) AS word
      FROM @words AS x
      WHERE a.word <> x.word
      ORDER BY NEWID()
      ) AS b

    --- Third word (if there's space):
    OUTER APPLY (
      SELECT TOP (1)
        UPPER(LEFT(x.word, 1))+
        SUBSTRING(x.word, 2, LEN(x.word)) AS word
      FROM @words AS x
      WHERE a.word <> x.word
      AND b.word <> x.word
      AND LEN(a.word+b.word)<=12
      ORDER BY NEWID()
      ) AS c

    ORDER BY NEWID();

    And don't worry if a question has already been answered. Try to get your own conclusions, compare them to the ones posted and then find out which one is better and why.
    The first step to tune code is to find out if the code actually needs to be or can be tuned.

    Sir, you are a gentleman and a scholar!  Time to go back to TSQL school......

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply