Group by Date, Then take 1st row of each id

  • Hi All,

    I think this might be complicated. New to this forum, but okay at SQL. I'm starting to delve into the more complicated statements. I have a bunch of widgets, that are created. And, I have a DB that records when each widget is made (WidgetID, WidgetName, DateCreated). I'm looking for a query that will return the first widget of it's type created for each day. So, if I have 3 types of different widgets; I will get a query grouped by a 24 hour period with only 3 widgets within each group. And, the number of groups in the query would be from the first day the widgets were created (basically every day of the year that is saved in the widget db). 

    I used the following format, but that returned the first day the widget was made. 


    SELECT * FROM
         (SELECT *,
              ROW_NUMBER()
                   OVER (PARTITION BY LOWER(WidgetID) ORDER BY WidgetName) AS rn
          FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks In Advance!

  • You have to add the day number, or however you define your 24-hour periods, to the PARTITION BY clause. Then you will get a row 1 for every widget ID for every day.

  • Would help if you posted the DDL (CREATE TABLE statement) for the table(s), some sample data that represents your problem domain (as INSERT INTO statements) for the table(s), and the expected results based on the sample data.

  • CoffeeHeaven - Tuesday, December 12, 2017 3:08 PM

    Hi All,

    I think this might be complicated. New to this forum, but okay at SQL. I'm starting to delve into the more complicated statements. I have a bunch of widgets, that are created. And, I have a DB that records when each widget is made (WidgetID, WidgetName, DateCreated). I'm looking for a query that will return the first widget of it's type created for each day. So, if I have 3 types of different widgets; I will get a query grouped by a 24 hour period with only 3 widgets within each group. And, the number of groups in the query would be from the first day the widgets were created (basically every day of the year that is saved in the widget db). 

    I used the following format, but that returned the first day the widget was made. 


    SELECT * FROM
         (SELECT *,
              ROW_NUMBER()
                   OVER (PARTITION BY LOWER(WidgetID) ORDER BY WidgetName) AS rn
          FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks In Advance!

    Assuming that LOWER(WidgetID) uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), LOWER(WidgetID) ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

    Assuming that WidgetName uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

  • FWIW I prefer to write these sorts of problems as Common Table Expressions. 

    WITH CTE_WidgetRank AS
    (
            SELECT
                        *,
                        ROW_NUMBER() OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS 'RN'
            FROM
                        WidgetTable
    )
    SELECT * FROM CTE_WidgetRank WHERE RN=1

    The separation of the sub-query makes it easier to test in isolation
    Giving it  a name makes it clearer when you use it what it is trying to do
    The CTE can be re-used: If you had a more complex query that needs the same calculation in two places, you only need to write and edit the sub-query once
    As far as I am aware query plans and performance will be identical

  • DesNorton - Tuesday, December 12, 2017 11:16 PM

    CoffeeHeaven - Tuesday, December 12, 2017 3:08 PM

    Hi All,

    I think this might be complicated. New to this forum, but okay at SQL. I'm starting to delve into the more complicated statements. I have a bunch of widgets, that are created. And, I have a DB that records when each widget is made (WidgetID, WidgetName, DateCreated). I'm looking for a query that will return the first widget of it's type created for each day. So, if I have 3 types of different widgets; I will get a query grouped by a 24 hour period with only 3 widgets within each group. And, the number of groups in the query would be from the first day the widgets were created (basically every day of the year that is saved in the widget db). 

    I used the following format, but that returned the first day the widget was made. 


    SELECT * FROM
         (SELECT *,
              ROW_NUMBER()
                   OVER (PARTITION BY LOWER(WidgetID) ORDER BY WidgetName) AS rn
          FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks In Advance!

    Assuming that LOWER(WidgetID) uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), LOWER(WidgetID) ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

    Assuming that WidgetName uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks for looping in. I tried this, but it's returning all widgets grouped by name. I left out the CAST, because I need the date field to remain the same. I'm not sure what the CAST will do to the milliseconds. I'm including some examples of what I have and what the query should return. 

    The query would return the first widget created in the day, it's random, and sometimes a particular widget is not created in a 24 hour period. The 24 hour period would be defined by 00:00:00.000 to 23:59:59.000.

  • CoffeeHeaven - Wednesday, December 13, 2017 8:11 AM

    DesNorton - Tuesday, December 12, 2017 11:16 PM

    CoffeeHeaven - Tuesday, December 12, 2017 3:08 PM

    Hi All,

    I think this might be complicated. New to this forum, but okay at SQL. I'm starting to delve into the more complicated statements. I have a bunch of widgets, that are created. And, I have a DB that records when each widget is made (WidgetID, WidgetName, DateCreated). I'm looking for a query that will return the first widget of it's type created for each day. So, if I have 3 types of different widgets; I will get a query grouped by a 24 hour period with only 3 widgets within each group. And, the number of groups in the query would be from the first day the widgets were created (basically every day of the year that is saved in the widget db). 

    I used the following format, but that returned the first day the widget was made. 


    SELECT * FROM
         (SELECT *,
              ROW_NUMBER()
                   OVER (PARTITION BY LOWER(WidgetID) ORDER BY WidgetName) AS rn
          FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks In Advance!

    Assuming that LOWER(WidgetID) uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), LOWER(WidgetID) ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

    Assuming that WidgetName uniquely identifies a widget type
    SELECT * FROM
      (SELECT *,
        ROW_NUMBER()
         OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS rn
      FROM WidgetTable) CTE
    WHERE rn=1;

    Thanks for looping in. I tried this, but it's returning all widgets grouped by name. I left out the CAST, because I need the date field to remain the same. I'm not sure what the CAST will do to the milliseconds. I'm including some examples of what I have and what the query should return. 

    The query would return the first widget created in the day, it's random, and sometimes a particular widget is not created in a 24 hour period. The 24 hour period would be defined by 00:00:00.000 to 23:59:59.000.

    The reason for the CAST in the PARTITION BY is to group by DATE.  The ORDER BY does not do the cast, so that it gets the first entry for the day

  • aaron.reese - Wednesday, December 13, 2017 8:02 AM

    FWIW I prefer to write these sorts of problems as Common Table Expressions. 

    WITH CTE_WidgetRank AS
    (
            SELECT
                        *,
                        ROW_NUMBER() OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS 'RN'
            FROM
                        WidgetTable
    )
    SELECT * FROM CTE_WidgetRank WHERE RN=1

    The separation of the sub-query makes it easier to test in isolation
    Giving it  a name makes it clearer when you use it what it is trying to do
    The CTE can be re-used: If you had a more complex query that needs the same calculation in two places, you only need to write and edit the sub-query once
    As far as I am aware query plans and performance will be identical

    I'm getting an error message on the first WITH. 
    'Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespace clause or change tracking context clause...'

  • CoffeeHeaven - Wednesday, December 13, 2017 8:11 AM

    Thanks for looping in. I tried this, but it's returning all widgets grouped by name. I left out the CAST, because I need the date field to remain the same. I'm not sure what the CAST will do to the milliseconds. I'm including some examples of what I have and what the query should return. 

    The query would return the first widget created in the day, it's random, and sometimes a particular widget is not created in a 24 hour period. The 24 hour period would be defined by 00:00:00.000 to 23:59:59.000.

    Taking out the CAST causes the the query to return the first widget for every 0.003 milliseconds of the day.  You do not want to take out the CAST.

    Also, your original query used LOWER(WidgetID) on what appears to be a numeric column.  There is generally no reason to use LOWER/UPPER for anything other than final presentation.  If you find that you need it, there is probably an issue with your database design.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • please read and respond accordingly......saves us all time and you get tried and tested answers.

    https://spaghettidba.com/2015/04/24/how-to-post-a-t-sql-question-on-a-public-forum/

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • CoffeeHeaven - Wednesday, December 13, 2017 8:18 AM

    aaron.reese - Wednesday, December 13, 2017 8:02 AM

    FWIW I prefer to write these sorts of problems as Common Table Expressions. 

    WITH CTE_WidgetRank AS
    (
            SELECT
                        *,
                        ROW_NUMBER() OVER (PARTITION BY CAST(DateCreated AS DATE), WidgetName ORDER BY DateCreated) AS 'RN'
            FROM
                        WidgetTable
    )
    SELECT * FROM CTE_WidgetRank WHERE RN=1

    The separation of the sub-query makes it easier to test in isolation
    Giving it  a name makes it clearer when you use it what it is trying to do
    The CTE can be re-used: If you had a more complex query that needs the same calculation in two places, you only need to write and edit the sub-query once
    As far as I am aware query plans and performance will be identical

    I'm getting an error message on the first WITH. 
    'Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespace clause or change tracking context clause...'

    The error message tells you exactly what the issue is and how to fix it.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi All,

    It's working. Thank you for all of the help. There was a DISTINCT clause that wasn't removed; once I removed that the query worked. I've encountered another issue with the query, but that is stemming from some complex joins involved. 

    Thanks!
    CoffeeHeaven

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply