Problem with NEWID (point of evaluation)

  • OK How do I get this code to work? The final 'select' results in different GUID values for [group]. If I just 'select * from g' (commented out) I get what is expected.

    pcd


    create table #test1( id int identity(1,1) not null, [value] varchar(100), [group] char(1) )
    go

    insert into #test1( [Value], [group] )
    values ( 'AA', 'A' )
       , ( 'Ab', 'A' )
       , ( 'Ba', 'B' )
       , ( 'Bb', 'B' )
    go

    ;with a
    as
    (
    select
       *
    from  #test1
    )
    , g
    as
    (
      select
        [group]
        , cast( NEWID() as varchar( 50 ) ) as [GUID]
      from  ( select distinct [group] from #test1 ) x
    )
    --select * from g

    select
       a.*
       , g.*
    from  a
    join  g
    on  a.[group] = g.[group]
    go

    drop table #test1
    go

  • Is that an option to use MD5 hash of the group?

    It should be identical in terms of the data type

    SELECT CAST(HASHBYTES('MD5','1') AS UNIQUEIDENTIFIER

  • What are you trying to do?
    NewID returns unique guids, different one every time it's called.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Friday, August 18, 2017 1:16 AM

    What are you trying to do?

    I believe the OP wants the same GUID for everything in Group A, and the same GUID for everything in Group B.

    Like Gail said though, NEWID() generates a different value every time, and it's not executed until you get to the last statement (which returns 4 rows, so 4 different IDs).

    You could use a temporary table to insert your distinct group data into, and then return from that (see below). Are you planning to keep these GUIDs and store them though? If so, personally, I'd suggest you normalise your data (which means splitting your Groups and Values into different tables).
    CREATE TABLE #Sample
      (id int IDENTITY(1,1),
      [Value] varchar(100),
      [Group] char(1));
    GO

    INSERT INTO #Sample ([Value], [Group])
    VALUES
      ( 'AA', 'A' ),
      ( 'Ab', 'A' ),
      ( 'Ba', 'B' ),
      ( 'Bb', 'B' );
    GO

    WITH Groups AS (
      SELECT DISTINCT [Group]
      FROM #Sample)
    SELECT [Group], NEWID() AS [GUID]
    INTO #Groups
    FROM Groups;

    SELECT *
    FROM #Groups G
      JOIN #Sample S ON G.[Group] = S.[Group];

    GO

    DROP TABLE #Sample;
    DROP TABLE #Groups;
    GO

    Edit: Clearly the coffee hadn't kicked in yet. I have no idea where that first sentence came from. It's fine, no one saw it. :hehe:

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • evgeniy.perfilyev - Thursday, August 17, 2017 9:46 PM

    Is that an option to use MD5 hash of the group?

    It should be identical in terms of the data type

    SELECT CAST(HASHBYTES('MD5','1') AS UNIQUEIDENTIFIER

    The problem is that the 'Group' might reappear on another day and needs to have a different identifier.

  • Thom A - Friday, August 18, 2017 1:58 AM

    GilaMonster - Friday, August 18, 2017 1:16 AM

    What are you trying to do?

    I believe the OP wants the same GUID for everything in Group A, and the same GUID for everything in Group B.

    Like Gail said though, NEWID() generates a different value every time, and it's not executed until you get to the last statement (which returns 4 rows, so 4 different IDs).

    You could use a temporary table to insert your distinct group data into, and then return from that (see below). Are you planning to keep these GUIDs and store them though? If so, personally, I'd suggest you normalise your data (which means splitting your Groups and Values into different tables).
    CREATE TABLE #Sample
      (id int IDENTITY(1,1),
      [Value] varchar(100),
      [Group] char(1));
    GO

    INSERT INTO #Sample ([Value], [Group])
    VALUES
      ( 'AA', 'A' ),
      ( 'Ab', 'A' ),
      ( 'Ba', 'B' ),
      ( 'Bb', 'B' );
    GO

    WITH Groups AS (
      SELECT DISTINCT [Group]
      FROM #Sample)
    SELECT [Group], NEWID() AS [GUID]
    INTO #Groups
    FROM Groups;

    SELECT *
    FROM #Groups G
      JOIN #Sample S ON G.[Group] = S.[Group];

    GO

    DROP TABLE #Sample;
    DROP TABLE #Groups;
    GO

    Edit: Clearly the coffee hadn't kicked in yet. I have no idea where that first sentence came from. It's fine, no one saw it. :hehe:

    Thom A, I think you hit the nail on the head, despite the explicit grouping in the second CTE, the NEWID() is not being evaluated until output (even if cast to a varchar).

    OK So what are we trying to do? We process multiple 'groups' of records, and when a 'group' has completed it is moved to a 'completed' table, this keeps the 'active' table relatively small and makes performance MUCH better. However a 'Group' that was processed yesterday, might get processed today, and hence needs to be given a new identifier, otherwise if we were calculating duration the duration would come out as the start of the first to the end of the second, not what is required. I'm thinking that using the date/time and rank() it should be possible to generate a uniqueifier.

    The challenge was to try and explain whats happening with the NEWID() and to see if there was a way around the problem - to my mind the 2nd CTE should have resolved the problem, but it doesn't. Maybe this should have been a QOTD?

    pcd

  • pcd_au - Friday, August 18, 2017 6:14 AM

     to my mind the 2nd CTE should have resolved the problem, but it doesn't.

    No, it wouldn't, because CTEs are 'temporary views'. they do not force execution of the first and then execution of the second. The entire query is simplified and executed, and the NewID resolves when the SELECT is executed, as the second-last clause.

    Thom's works, because the insert into the temp table is a separate statement, and hence the NewID is assigned there and then, and the fixed values evaluated later.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply