Row_number() over (partition by ....order by)

  • Hi guys,

    can anyone please explain to me in plain english what this update statement is doing (mainly the join select)

    Many thanks

    update tblEMAIL

    set EMAIL_No = EMAILNUMBER

    from tblEMAIL a

    JOIN (SELECT EMAIL_ID,

    row_number() over (partition by PersonID, Department order by EMAIL_ID) EMAILNUMBER

    from tblEMAIL) b

    on a.EMAIL_ID = b.EMAIL_ID

  • Well for 1, this is a 2005+ query and you posted in sql 2000 forum.

    #2, you can find all you need to know in books online 2k5 about this.

    The short answer is that this query is manually building a identity for each person / departement combinaisons.

    I will suggest you copy the tables to a dev environement and run the query yourself. You'll see exactly what it's doing.

  • To get you started - the row_number() function will return a sequence of numbers starting at 1 effectively allocating a number to each EMAIL_ID in EMAIL_ID order. The sequence restarts at 1 for each (PersonID, Department) combination.

    The join allows the number generated by the row_number() function to be set up as the EMAIL_NO in tblEMAIL.

    --------------------------------------------------------------

    “Doubt is not a pleasant condition, but certainty is absurd.” Voltaire

  • You might want to read my article that covers this at SQL Server Ranking Functions[/url]

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • askquestions (11/1/2010)


    Hi guys,

    can anyone please explain to me in plain english what this update statement is doing (mainly the join select)

    Many thanks

    update tblEMAIL

    set EMAIL_No = EMAILNUMBER

    from tblEMAIL a

    JOIN (SELECT EMAIL_ID,

    row_number() over (partition by PersonID, Department order by EMAIL_ID) EMAILNUMBER

    from tblEMAIL) b

    on a.EMAIL_ID = b.EMAIL_ID

    Rewritten to make it look a little more conventional (to me):

    UPDATE a

    SET EMAIL_No = b.EMAIL_No

    FROM tblEMAIL a

    JOIN (

    SELECT EMAIL_ID,

    EMAIL_No = ROW_NUMBER() OVER (PARTITION BY PersonID, Department ORDER BY EMAIL_ID)

    FROM tblEMAIL

    ) b

    ON a.EMAIL_ID = b.EMAIL_ID

    I reckon this will set the whole EMAIL_No column to 1, like this:

    DROP TABLE #Email

    CREATE TABLE #Email (EMAIL_ID INT IDENTITY, EMAIL_No INT, PersonID INT, Department VARCHAR(25))

    INSERT INTO #Email (EMAIL_No, PersonID, Department)

    SELECT 1, 1, 'Accounts' UNION ALL

    SELECT 1, 2, 'Accounts' UNION ALL

    SELECT 1, 3, 'Accounts' UNION ALL

    SELECT 1, 4, 'Accounts' UNION ALL

    SELECT 2, 5, 'Sales' UNION ALL

    SELECT 2, 6, 'Sales' UNION ALL

    SELECT 2, 7, 'Sales' UNION ALL

    SELECT 2, 8, 'Sales' UNION ALL

    SELECT 2, 9, 'Sales' UNION ALL

    SELECT 2, 10, 'Sales' UNION ALL

    SELECT 1, 4, 'Sales'

    SELECT * FROM #Email

    UPDATE a

    SET EMAIL_No = b.EMAIL_No

    FROM #Email a

    JOIN (

    SELECT EMAIL_ID,

    EMAIL_No = ROW_NUMBER() OVER (PARTITION BY PersonID, Department ORDER BY EMAIL_ID)

    FROM #Email

    ) b

    ON a.EMAIL_ID = b.EMAIL_ID

    SELECT * FROM #Email

    The first row of however many returned by the derived table will be used by the UPDATE. Only one UPDATE occurs regardless of how many rows match. The first row will have a value of 1 for EMAIL_NO.

    There's no restriction in the derived table.

    The self join ensures every row is touched.

    Next question - what is this query supposed to do?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (11/5/2010)


    I reckon this will set the whole EMAIL_No column to 1

    I agree that with the data you have specified the EMAIL_NO column will always be 1 because there aren’t any duplicate "PersonID, Department" entries. I guess the code only makes sense if individuals have multiple Email entries so PersonID values occur more than once in a department

    --------------------------------------------------------------

    “Doubt is not a pleasant condition, but certainty is absurd.” Voltaire

  • andy.roberts (11/5/2010)


    Chris Morris-439714 (11/5/2010)


    I reckon this will set the whole EMAIL_No column to 1

    I agree that with the data you have specified the EMAIL_NO column will always be 1 because there aren’t any duplicate "PersonID, Department" entries. I guess the code only makes sense if individuals have multiple Email entries so PersonID values occur more than once in a department

    Yep. I guess it's a bit easier to see with this:

    ;WITH CTEEmail AS (

    SELECT EMAIL_ID,

    EMAIL_No,

    NewEMAIL_No = ROW_NUMBER() OVER (PARTITION BY PersonID, Department ORDER BY EMAIL_ID)

    FROM #Email

    ) UPDATE CTEEmail SET EMAIL_No = NewEMAIL_No

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi

    I've only just realised that i never said thank you for your help with this query! So, thanks very much guys - much appreciated.

  • In the Given Query a new column will create at run time with name EmailNumber and that number will update the variable where the SET statement have been used.

  • shashikantnist (7/31/2014)


    In the Given Query a new column will create at run time with name EmailNumber and that number will update the variable where the SET statement have been used.

    What variable? No variables are referenced in the last query.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply