SQL help with partition

  • With the following data, I need to create a column that will display a counter for each VID, Acc where PrgFld is not 'XXX'

    I am trying to do this with partition, but it's increasing the counter when PrgFld is XXX. I would prefer to not increase the counter or show 0 for XXX.

    Is this possible?

    CREATE TABLE #t1([VID] [varchar](100) NULL, [Acc] [varchar](100) NULL,[OrdDate] [datetime] NOT NULL,
    [PrgFld] [varchar](10) NULL,[PrgVal] [varchar](10) NULL
    )
    ON [PRIMARY]


    insert into #t1 values('1111','A100','2022-01-10 13:37:06.000','RNA','20221001')
    insert into #t1 values('1111','A100','2022-01-11 13:37:06.000','RNA','20221101')
    insert into #t1 values('1111','A100','2022-01-12 13:37:06.000','RNA','20221001')
    insert into #t1 values('1111','A100','2022-01-13 13:37:06.000','RNA','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-13 13:47:06.000','RNA','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-14 13:37:06.000','XXX','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-15 13:37:06.000','XXX','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-16 13:37:06.000','XXX','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-16 13:37:06.000','XXX','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-16 13:37:06.000','XXX','VSDVDSV')

    insert into #t1 values('1111','A100','2022-01-17 13:37:06.000','FFNP','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-18 13:37:06.000','FFNP','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-19 13:37:06.000','FFNP','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-20 13:37:06.000','FFNP','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-21 13:37:06.000','FFNP','VSDVDSV')

    insert into #t1 values('1111','A100','2022-01-22 13:37:06.000','NEFE','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-23 13:37:06.000','NEFE','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-24 13:37:06.000','NEFE','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-25 13:37:06.000','NEFE','VSDVDSV')
    insert into #t1 values('1111','A100','2022-01-26 13:37:06.000','NEFE','VSDVDSV')

    insert into #t1 values('2222','A300','2022-01-10 13:37:06.000','BAP4','SDVDS')
    insert into #t1 values('2222','A300','2022-01-10 14:37:06.000','BAS5','20221223')
    insert into #t1 values('2222','A300','2022-01-10 15:37:06.000','XXX','20221223')
    insert into #t1 values('3333','A400','2022-01-11 13:37:06.000','BAT2','FDBDSFB')
    insert into #t1 values('3333','A400','2022-01-12 13:37:06.000','BAS3','20221015')
    insert into #t1 values('3333','A400','2022-01-13 13:37:06.000','BAO3','DVBSDVDS')
    insert into #t1 values('3333','A400','2022-01-14 13:37:06.000','BAS1','20221021')
    insert into #t1 values('3333','A400','2022-01-15 13:37:06.000','XXX','20221021')
    insert into #t1 values('3333','A500','2022-01-16 13:37:06.000','BOS3','AFFEWF')
    insert into #t1 values('3333','A600','2022-01-17 13:37:06.000','BAS1','20220915')
    insert into #t1 values('3333','A600','2022-01-18 13:37:06.000','BAS2','20221126')

    select *,
    ROW_NUMBER() OVER(PARTITION BY Case When PrgFld not in ('XXX') THEN 1 ELSE 0 end,VID, Acc ORDER BY VID,Acc) [test2]

    from #t1 order by VID, Acc, OrdDate
    drop table #t1
  • I think all you need is this:

    SELECT *
    , [test2] = CASE WHEN PrgFld NOT IN ('XXX') THEN ROW_NUMBER() OVER(PARTITION BY VID, Acc ORDER BY OrdDate) ELSE 0 END
    FROM #t1
    ORDER BY
    VID
    , Acc
    , OrdDate

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks a lot for the response. Works great.

  • For this particular scenario the following would also work with no need for the Partition By clause.  Why get fancy when simple will do?

    CREATE TABLE #tbData
    ( [VID] VARCHAR(100) NULL
    ,[Acc] VARCHAR(100) NULL
    ,[OrdDate] DATETIME2(3) NOT NULL
    ,[PrgFld] VARCHAR(10) NULL
    ,[PrgVal] VARCHAR(10) NULL
    ) ON [PRIMARY];

    -- All your Inserts

    CREATE TABLE #tbRowNbrd
    ( [RowNbr] INT IDENTITY(1,1) NOT NULL
    ,[VID] VARCHAR(100) NULL
    ,[Acc] VARCHAR(100) NULL
    ,[OrdDate] DATETIME2(3) NOT NULL
    ,[PrgFld] VARCHAR(10) NULL
    ,[PrgVal] VARCHAR(10) NULL
    ) ON [PRIMARY];

    SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    INTO #tbRowNbrd
    FROM ( SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    FROM #tbData
    WHERE [PrgFld] IN ('XXX')
    ORDER BY [VID], [Acc], [OrdDate]
    ) AS [rslt];

    -- Or the following might be able to be down but could not test it -- so gave the above just in case.

    SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    INTO #tbRowNbrd
    FROM #tbData
    WHERE [PrgFld] IN ('XXX')
    ORDER BY [VID], [Acc], [OrdDate];

    Also I would point out that DATETIME is no longer supported by either of the SQL Standards and you should be using DATETIME2(3) if you want to mimic a DATETIME value and you even save a byte of data for every record.

    • This reply was modified 1 year, 4 months ago by  Dennis Jensen. Reason: Changed NOT IN to IN as the former were the opposite results desired
  • Neither of these solve the OP's question:

    Dennis Jensen wrote:

    For this particular scenario the following would also work with no need for the Partition By clause.  Why get fancy when simple will do?

    CREATE TABLE #tbData
    ( [VID] VARCHAR(100) NULL
    ,[Acc] VARCHAR(100) NULL
    ,[OrdDate] DATETIME2(3) NOT NULL
    ,[PrgFld] VARCHAR(10) NULL
    ,[PrgVal] VARCHAR(10) NULL
    ) ON [PRIMARY];

    -- All your Inserts

    CREATE TABLE #tbRowNbrd
    ( [RowNbr] INT IDENTITY(1,1) NOT NULL
    ,[VID] VARCHAR(100) NULL
    ,[Acc] VARCHAR(100) NULL
    ,[OrdDate] DATETIME2(3) NOT NULL
    ,[PrgFld] VARCHAR(10) NULL
    ,[PrgVal] VARCHAR(10) NULL
    ) ON [PRIMARY];

    SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    INTO #tbRowNbrd
    FROM ( SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    FROM #tbData
    WHERE [PrgFld] NOT IN ('XXX')
    ORDER BY [VID], [Acc], [OrdDate]
    ) AS [rslt];

    -- Or the following might be able to be down but could not test it -- so gave the above just in case.

    SELECT [VID]
    ,[Acc]
    ,[OrdDate]
    ,[PrgFld]
    ,[PrgVal]
    INTO #tbRowNbrd
    FROM #tbData
    WHERE [PrgFld] NOT IN ('XXX')
    ORDER BY [VID], [Acc], [OrdDate];

    These both exclude the rows with XXX - and the OP doesn't want those rows excluded.  Your 'simpler' solution uses a derived table (not simpler) and both solve the 'row count' by using a more complicated SELECT INTO a pre-defined temp table to assign a row number.

    Also I would point out that DATETIME is no longer supported by either of the SQL Standards and you should be using DATETIME2(3) if you want to mimic a DATETIME value and you even save a byte of data for every record.

    There are many reasons to continue using DATETIME in SQL Server, not the least of which is performance related.  If you don't need the higher precision then there is no reason to suffer the penalties associated with DATETIME2 of any precision.  That isn't to say you shouldn't use DATE, DATETIME2 or DATETIMEOFFSET - because there are very good reasons to use those data types.

    Now, if you have a reference that says DATETIME2 is the official ANSI/ISO standard date/time data type to be used in all SQL implementations - I would be more than happy to review that documentation.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    Neither of these solve the OP's question:

    These both exclude the rows with XXX - and the OP doesn't want those rows excluded.  Your 'simpler' solution uses a derived table (not simpler) and both solve the 'row count' by using a more complicated SELECT INTO a pre-defined temp table to assign a row number.

    Ah yes a minor mistake in reading what they wanted a simple fix of changing NOT IN to IN which I will do.

    Jeffrey Williams wrote:

    There are many reasons to continue using DATETIME in SQL Server, not the least of which is performance related.  If you don't need the higher precision then there is no reason to suffer the penalties associated with DATETIME2 of any precision.  That isn't to say you shouldn't use DATE, DATETIME2 or DATETIMEOFFSET - because there are very good reasons to use those data types.

    Well it is pretty simple to address this fallacy as DATETIME which has a rounded 3 percision value has no more or less precision than at DATETIME2(3) and the former uses up an extra byte of data while the latter is more accurate. A win-win for DATETIME2(3).  Further DATETIME will always have a rounded precision of 3 where DATETIME2(0) can be reduced to no precision and save 2 Bytes of data. Another Win-Win for DATETIME2.  So yeah, I would be very interested in any rabbits that you can pull out of your hat that speaks truth to why someone would want to use the ANSI/ISO Standards deprecated DATETIME over its replacement DATETIME2 which has variable percision.

    Jeffrey Williams wrote:

    Now, if you have a reference that says DATETIME2 is the official ANSI/ISO standard date/time data type to be used in all SQL implementations - I would be more than happy to review that documentation.

    As for the Documentation on what is and is not in the ANSI/ISO Standards about the DATETIME and DATETIME2 you can google that one yourself. I simply have better things to do than LMGTFY when I am sure you are more than capable of finding that factual information on your own, and it seems you need to find it.

    On the flip-side though, how about we do this, instead of me going out again and getting that information which I do not need. You go out and find proof that DATETIME2 is not the ANSI/ISO Standard and that it did not replace DATETIME. If you can provide links to that then I will gladly go out and find the links that prove that wrong.

    Lastly one of the things we discovered fine tuning Stored Procedure is that the PARTITION BY statement can be dog when used in a query and that finding a simplier approach often sped the full results up significantly especially when dealing with big data where these kind of lags become more pronounced. So if what I presented was actually slower than I can see using the PARTITION BY but then mine was just a suggestion which I stated I was not able to test and I would assume any good developer would be able to use that and test it against their data to see which one implements the solution more efficiently. Also, I am sure there could be an even better methods than the one I proposed -- I was mainly trying to supply an alternate view that might open the door to other potential solutions or at least show that there is more than one way to solve a puzzle.

  • Forget about precision - I already stated that if you need that precision then use that data type.  That isn't the issue - the issue is that there are known performance issues with the newer data types and unless you actually need that additional precision then there is no reason to use the newer data types.

    And as far as ANSI/ISO standards - the standards do not specify a data type.  Not a single one of them - they specify the format to be:

    YYYYMMDD (basic) or YYYY-MM-DD (extended)

    And time is defined with or without time zone as hh:mm:ss.fff... where fff... is 1 to 9 fractions of a second.  Doesn't define what those fractions represent either - just fractions of a second.

    So no, I don't think you can pull up a reference that states DATETIME2 is the official recognized SQL Standard.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams wrote:

    Forget about precision - I already stated that if you need that precision then use that data type.  That isn't the issue - the issue is that there are known performance issues with the newer data types and unless you actually need that additional precision then there is no reason to use the newer data types.

    I was not the one that brought up precision, you did.  I just explained there there is no precision issue with DATETIME2 as you so claim.  DATETIME as a  precision of 3 always and its rounded so inaccurate if you need that precision but if you do not need a precision of 3 you cannot turn it off (I do not think).  While DATETIME2 has, as a stated, a variable precision of 0 to 7.  The bytes that a DATETIME2 takes up is equal to the bytes that a DATETIME  takes up. While a DATETIME(3) which has the same precision as a DATETIME uses one less Byte and does not round so is more accurate. Finally if you do not need a precision finer than just Seconds then you can use DATETIME2(0) which has absolutley no precision and takes up two less Bytes of data.

    Jeffrey Williams wrote:

    And as far as ANSI/ISO standards - the standards do not specify a data type.  Not a single one of them - they specify the format to be: YYYYMMDD (basic) or YYYY-MM-DD (extended)

    And time is defined with or without time zone as hh:mm:ss.fff... where fff... is 1 to 9 fractions of a second.  Doesn't define what those fractions represent either - just fractions of a second.

    So no, I don't think you can pull up a reference that states DATETIME2 is the official recognized SQL Standard.

    Hey look they got one on this site even -- I think you should go set that guy straight.

    https://www.sqlservercentral.com/blogs/sql-server-datetime-vs-datetime2#:~:text=DATETIME2%20is%20compliant%20with%20both%20the%20ANSI%20and,that%20you%20require%2C%20DATETIME2%20is%20a%20better%20choice.

  • Dennis Jensen wrote:

    I was not the one that brought up precision, you did.  I just explained there there is no precision issue with DATETIME2 as you so claim. 

    Jeffery never said anything of the sort.  Simply, if you do not need the extra precision of a datetime2, as well as the extra byte of storage required, continue to use datetime.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John wrote:

    Jeffery never said anything of the sort.  Simply, if you do not need the extra precision of a datetime2, as well as the extra byte of storage required, continue to use datetime.

    Really Michael then let me prove you wrong with the bolded text from the message where he brings up precision.

    Jeffrey Williams wrote:

    There are many reasons to continue using DATETIME in SQL Server, not the least of which is performance related.  If you don't need the higher precision then there is no reason to suffer the penalties associated with DATETIME2 of any precision.  That isn't to say you shouldn't use DATE, DATETIME2 or DATETIMEOFFSET - because there are very good reasons to use those data types.

    Now, if you have a reference that says DATETIME2 is the official ANSI/ISO standard date/time data type to be used in all SQL implementations - I would be more than happy to review that documentation.

    Also as denoted in a previous post, I supplied a reference that proved Jeffery wrong on the ANSI/ISO Standards comment that he made and if I needed to find more reference (which I do not), I know that I can because I never take a single reference from the internet. I always research them looking for contradictionary posts or references. I did not find any but did find other posts supporting the one link that I did provide.

    Further sure you can use Non-Standard SQL Code but why would you want to? Sooner or later, Microsoft is going to deprecate DATETIME (I think I even read references to this somewhere) since there is absolutely no reason why anyone should be using it.

    I mean DATETIME, takes up more space, has less accuracy, and inflexible precision thus it is all CONS no PROS. While DATETIME2 uses the same space with a much higher precision or less space with the same precision or less precision, is more accurate with the same precision, and has flexible precision thus all PROS and no CONS.

    Lastly the change from DATETIME to DATETIME2(3) within the database is practically invisible. I know as I have replaced numerous instances of DATETIME with a DATETIME2(3) within our fairly large system with absolutely no ill side-effects and we freed up a lot of bytes as well.

  • Dennis Jensen wrote:

    Michael L John wrote:

    Jeffery never said anything of the sort.  Simply, if you do not need the extra precision of a datetime2, as well as the extra byte of storage required, continue to use datetime.

    Really Michael then let me prove you wrong with the bolded text from the message where he brings up precision.

    Jeffrey Williams wrote:

    There are many reasons to continue using DATETIME in SQL Server, not the least of which is performance related.  If you don't need the higher precision then there is no reason to suffer the penalties associated with DATETIME2 of any precision.  That isn't to say you shouldn't use DATE, DATETIME2 or DATETIMEOFFSET - because there are very good reasons to use those data types.

    Now, if you have a reference that says DATETIME2 is the official ANSI/ISO standard date/time data type to be used in all SQL implementations - I would be more than happy to review that documentation.

    Also as denoted in a previous post, I supplied a reference that proved Jeffery wrong on the ANSI/ISO Standards comment that he made and if I needed to find more reference (which I do not), I know that I can because I never take a single reference from the internet. I always research them looking for contradictionary posts or references. I did not find any but did find other posts supporting the one link that I did provide.

    Further sure you can use Non-Standard SQL Code but why would you want to? Sooner or later, Microsoft is going to deprecate DATETIME (I think I even read references to this somewhere) since there is absolutely no reason why anyone should be using it.

    I mean DATETIME, takes up more space, has less accuracy, and inflexible precision thus it is all CONS no PROS. While DATETIME2 uses the same space with a much higher precision or less space with the same precision or less precision, is more accurate with the same precision, and has flexible precision thus all PROS and no CONS.

    Lastly the change from DATETIME to DATETIME2(3) within the database is practically invisible. I know as I have replaced numerous instances of DATETIME with a DATETIME2(3) within our fairly large system with absolutely no ill side-effects and we freed up a lot of bytes as well.

    Your highlight says exactly what Michael was referring to: If you don't need the higher precision then there is no reason to use DATETIME2.  I will repeat what I stated before, there is no reason to incur the performance penalties using the newer data types if there is no need for the higher precision or time zone.

    As to your statement that you debunked my claim - no you did not.  You referenced an article on this site that only shows the differences between DATETIME and DATETIME2.  The additional comments at the end claim that DATETIME2 is ANSI/ISO compliant - but also this:

    In conclusion, if it's range, precision, accuracy, storage space optimization or compliance with standards that you require, DATETIME2 is a better choice.

    All this states is that you should use DATETIME2 *if it meets your standards*.

    As far as being ANSI/ISO compliant - all date data types in SQL Server are compliant because the ANSI/ISO standards do not denote a specific data type.  They specify the external representation for a date and the fully compliant format is: YYYY-MM-DDTHH:MM:SS.FFF.  A simple CONVERT using style 126 or 127 and you have an ISO8601 compliant date/time.

    Now, to address this idea that converting from DATETIME to a DATETIME2(3) is virtually invisible.  Incorrect - that conversion requires a full rewrite of every row in the table.  It is not a meta-data change only as the storage is completely different - so making that change is absolutely not invisible.  Additionally, if you have any code using GETDATE() it has to be modified to use SYSDATETIME().  If you have any code using GETDATE()+1, then it also has to be modified because you cannot directly add or subtract from a DATETIME2.

    As for Microsoft deprecating DATETIME - that will almost certainly never happen.

    To go further, all of the date data types in SQL Server are based on an incremental number from a base date.  The base date for DATETIME2 is '0001-01-01' as the zero date and for the DATETIME data type it is '1900-01-01'.  Internally, that means the date is stored as a numeric value - which has no bearing on ANSI/ISO standards.  Note: DATE, DATETIME2 and DATETIMEOFFSET store the numeric value reversed which is why there is a performance penalty when using them.  To get the actual value it has to be reversed and parsed where DATETIME does not have that issue.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffery, you are wasting your time.  Dennis is always right, and he knows everything.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Is it possible to make the count start only for non XXX. With the following sample data, column gives me 3 and 4. But I want it to give 1 and 2. Meaning the increment should only work for non XXX. If XXX don't do anything.

    CREATE TABLE #t1([VID] [varchar](100) NULL, [Acc] [varchar](100) NULL,[OrdDate] [datetime] NOT NULL,
    [PrgFld] [varchar](10) NULL,[PrgVal] [varchar](10) NULL
    )
    ON [PRIMARY]

    insert into #t1 values('4444','A400','2022-06-22','XXX','20221021')
    insert into #t1 values('4444','A400','2022-06-22','XXX','20221021')
    insert into #t1 values('4444','A400','2023-05-29','AMB1','20220915')
    insert into #t1 values('4444','A400','2023-05-29','AMB1','20220915')

    select *,
    [test33] = CASE WHEN PrgFld NOT IN ('XXX') THEN ROW_NUMBER() OVER(PARTITION BY VID, Acc ORDER BY OrdDate) ELSE 0 END

    from #t1 order by VID, Acc, OrdDate
    drop table #t1

    Above gives me 0, 0, 3, 4 for the test2 col. I would like to have 0, 0, 1, 2

  • This should work:

    select *,
    [test33] = CASE WHEN PrgFld NOT IN ('XXX') THEN ROW_NUMBER() OVER(PARTITION BY VID, Acc, nullif(PrgFld, 'XXX') ORDER BY OrdDate) ELSE 0 END

     

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Michael L John wrote:

    Jeffery, you are wasting your time.  Dennis is always right, and he knows everything.

    Yeah - I am getting that impression.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply