ROW_NUMBER(): An efficient alternative to subqueries – Scenario 2

  • Comments posted to this topic are about the item ROW_NUMBER(): An efficient alternative to subqueries – Scenario 2

  • How does the ROW_NUMBER() approach compare with using self-joins as follows?

    SELECT Previous.Category, Previous.Date, Next.Date
    FROM Dates.CategoryDate Previous
    LEFT JOIN Dates.CategoryDate Next
        ON Previous.Category = Next.Category AND Previous.Date < Next.Date
    LEFT JOIN Dates.CategoryDate Middle
        ON Previous.Category = Middle.Category AND Previous.Date < Middle.Date AND Middle.Date < Next.Date
    WHERE Middle.Category IS NULL;

  • Good article. There is, however, no need to repeat a field in the ORDER BY clause when it is already in the PARTITION BY clause: every row in a partition has the same value for that field, so ordering by it changes nothing (e.g. ORDER BY 123, 123, 123 makes no difference).
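
    A minimal sketch of that point, assuming the Dates.CategoryDate table with Category and Date columns used in the self-join post above (the column aliases are hypothetical, and this is not the article's exact query). As long as (Category, Date) is unique, both calls should produce the same numbering:

    ```sql
    -- Compare numbering with and without the partition column in ORDER BY.
    -- Assumes Dates.CategoryDate(Category, Date) as in the post above.
    SELECT Category,
           Date,
           -- Category is constant inside each partition, so it adds nothing here
           ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Category, Date) AS WithRedundantKey,
           ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Date) AS WithoutRedundantKey
    FROM Dates.CategoryDate;
    ```

    If two rows share the same Category and Date, ROW_NUMBER() may break the tie arbitrarily in either call, so the two columns are only guaranteed to match when dates are unique within a category.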

  • mbergstrom (5/29/2009)


    How does the ROW_NUMBER() approach compare with using self-joins as follows?

    SELECT Previous.Category,
           Previous.Date,
           Next.Date
    FROM Dates.CategoryDate Previous
    LEFT JOIN Dates.CategoryDate Next
        ON Previous.Category = Next.Category
        AND Previous.Date < Next.Date
    LEFT JOIN Dates.CategoryDate Middle
        ON Previous.Category = Middle.Category
        AND Previous.Date < Middle.Date
        AND Middle.Date < Next.Date
    WHERE Middle.Category IS NULL;

    Take a look at the execution plans. The self-joins result in one hefty Hash Match join. This join is helpful if one of the tables is much smaller than the other. In this particular case, the tables are not so small. I don't understand the purpose of "Middle." Overall, the self-joins take a lot longer than ROW_NUMBER() for both indexed and non-indexed versions of the table.
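
    For anyone who wants to compare the plans side by side, here is a rough sketch of a ROW_NUMBER() query that pairs each date with the next one in its category. It assumes the same Dates.CategoryDate table; the CTE name, aliases and output column names are made up for illustration, and it may differ from the query in the article:

    ```sql
    -- Number the dates within each category, then join each row to the row
    -- numbered one higher in the same category.
    WITH Numbered AS
    (
        SELECT Category,
               Date,
               ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Date) AS RowNum
        FROM Dates.CategoryDate
    )
    SELECT Cur.Category,
           Cur.Date AS CurrentDate,
           Nxt.Date AS NextDate   -- NULL for the last date in a category
    FROM Numbered AS Cur
    LEFT JOIN Numbered AS Nxt
           ON Nxt.Category = Cur.Category
          AND Nxt.RowNum = Cur.RowNum + 1;
    ```

    This should return the same pairs as the self-join version only when dates are unique within a category; with duplicate dates the two queries treat the ties differently.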

  • Very educational. Thank you! It's interesting that the subquery version sometimes outperforms the ROW_NUMBER() version in your example when proper indexes are in place. I guess the moral is to test various solutions.

  • Very educational article. Lately I have had to deal with performance issues caused by queries similar to those in this article. I would like to read more about subqueries and their alternatives. For instance, could ROW_NUMBER() be used to search a string column of a table? I now have something like:

    DECLARE @var1 AS String, @var2 AS String
    SELECT c1, c2, c3 FROM Table
    WHERE StringColumn LIKE '%' + @var1 + '%' AND StringColumn LIKE '%' + @var2 + '%'

    I'm trying several approaches and I'm sure I'll either figure it out or conclude it can't be made pretty.

    Tnx again.

    Greetz,
    Hans Brouwer

  • FreeHansje (6/2/2009)


    Very educational article. Lately I have had to deal with performance issues caused by queries similar to those in this article. I would like to read more about subqueries and their alternatives. For instance, could ROW_NUMBER() be used to search a string column of a table? I now have something like:

    DECLARE @var1 AS String, @var2 AS String
    SELECT c1, c2, c3 FROM Table
    WHERE StringColumn LIKE '%' + @var1 + '%' AND StringColumn LIKE '%' + @var2 + '%'

    I'm trying several approaches and I'm sure I'll either figure it out or conclude it can't be made pretty.

    Tnx again.

    Very good question. I am working on a code set that might be able to help. The question gives me an idea for another article. Will try and post the code set soon.

  • FreeHansje (6/2/2009)


    Very educational article. Lately I have had to deal with performance issues caused by queries similar to those in this article. I would like to read more about subqueries and their alternatives. For instance, could ROW_NUMBER() be used to search a string column of a table? I now have something like:

    DECLARE @var1 AS String, @var2 AS String
    SELECT c1, c2, c3 FROM Table
    WHERE StringColumn LIKE '%' + @var1 + '%' AND StringColumn LIKE '%' + @var2 + '%'

    I'm trying several approaches and I'm sure I'll either figure it out or conclude it can't be made pretty.

    Tnx again.

    You might also want to look into this article by Michael Coles. http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/

  • FreeHansje (6/2/2009)


    Very educational article. Lately I have had to deal with performance issues caused by queries similar to those in this article. I would like to read more about subqueries and their alternatives. For instance, could ROW_NUMBER() be used to search a string column of a table? I now have something like:

    DECLARE @var1 AS String, @var2 AS String
    SELECT c1, c2, c3 FROM Table
    WHERE StringColumn LIKE '%' + @var1 + '%' AND StringColumn LIKE '%' + @var2 + '%'

    I'm trying several approaches and I'm sure I'll either figure it out or conclude it can't be made pretty.

    Tnx again.

    For this scenario I'd say your best bet would be to use a method like n-gram search. Using LIKE '%' + @var1 + '%' will result in inefficient scans. Basically you lose the efficiencies associated with indexing. If you use a method like n-grams, you can index the resultant fixed-length character strings (3 characters = trigram, etc.) and retrieve results similar to the ones you would get with LIKE '%' + @var1 + '%'.

    When I do n-gram style searches in T-SQL I usually use ROW_NUMBER() to number each of the n-grams when I split words up.
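
    To give a rough idea of what that numbering can look like, here is a small sketch that splits one word into trigrams and numbers them with ROW_NUMBER(). The variable name, the sample word, and the use of sys.all_objects as a stand-in row source for a numbers table are all assumptions made for illustration, not a finished n-gram implementation:

    ```sql
    -- Split a single word into numbered trigrams (3-character n-grams).
    DECLARE @word VARCHAR(100);
    SET @word = 'subquery';

    WITH Numbers AS
    (
        -- Generate 1, 2, 3, ... from any view with enough rows;
        -- a real numbers/tally table would normally replace this.
        SELECT ROW_NUMBER() OVER (ORDER BY object_id) AS n
        FROM sys.all_objects
    )
    SELECT n AS GramNumber,                   -- position of the trigram in the word
           SUBSTRING(@word, n, 3) AS Trigram  -- the 3 characters starting at n
    FROM Numbers
    WHERE n <= LEN(@word) - 2                 -- last start position that still yields 3 characters
    ORDER BY n;
    ```

    The numbered trigrams could then be stored in an indexed table and joined against the trigrams of the search term, which avoids the scan that a leading-wildcard LIKE forces.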

  • Also refer to:

    http://sqlblogcasts.com/blogs/madhivanan/archive/2007/08/27/multipurpose-row-number-function.aspx


    Madhivanan

    Failing to plan is Planning to fail
