Issue: 'Select Into' with identity column

  • Hi everybody. I noticed a strange behaviour using a 'select into' statement with an identity column in source table.

    Usually I put the identity field in a cast statement in order to avoid its propagation in destination table, as follow:

    SELECT MyIdentity = convert(int, b.MyIdentity) INTO #t

    FROM MyTable1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    Statement returns a one col / one row result set with a null value.

    In sample MyTable1 is a physical table and MyView2 is a view (a simple 'select * from MyTable2'). Everything works fine.

    I need to replace MyTable1 with a view with fixed values that returns the same result set (for performance purpose), and change my statement as follow

    SELECT MyIdentity = convert(int, b.MyIdentity) INTO #t

    FROM MyView1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    When I execute this statement an error is raised: "Attempting to set a non-NULL-able column's value to NULL."

    It seems that the convert function does not work anymore by view usage. I noticed that qep are different for the two statements: by using the view I have a Compute Scalar operator ([Expr1007] = Scalar Operator(setidentity([test].[dbo].[MyTable2].[MyIdentity],(-7),(0),N'#t'))) that is missing by using the table, and this operator is the cause of my error.

    QEP using table:

    QEP using view:

    Briefly, by second statement 'Select * into' creates a table that inherit identity column, regardless to 'convert' operation. So the insert phase fails, because the join returns a null value. By the first statement no 'Scalar Operator(setidentity...' is used in qep and everything is ok.

    It seems a strange behaviour, very annoying if you need to build all your Tsql dinamically. Have you some explanation about?

    Here there's the code to reproduce the issue:

    CREATE TABLE [dbo].[MyTable1]([MyId] [int] NULL)

    INSERT INTO [dbo].[MyTable1] VALUES (1)

    GO

    CREATE TABLE [dbo].[MyTable2]([MyIdentity] [int] IDENTITY(1,1) NOT NULL, [ExternalId] [int] NOT NULL)

    INSERT INTO [dbo].[MyTable1] VALUES (2)

    INSERT INTO [dbo].[MyTable1] VALUES (3)

    GO

    CREATE VIEW [dbo].[MyView1] as SELECT MyId = 1

    GO

    CREATE VIEW [dbo].[MyView2] as SELECT * FROM MyTable2

    GO

    -- OK

    SELECT MyIdentity = convert(int, b.MyIdentity) INTO #t

    FROM MyTable1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    drop table #t

    -- KO

    SELECT MyIdentity = convert(int, b.MyIdentity) INTO #t

    FROM MyView1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    Thanks!

  • Literally just asked a very similar question like this today as I was seeing very close to the same thing

    As for your error message, according to Books Online here:

    When an existing identity column is selected into a new table, the new column inherits the IDENTITY property, unless one of the following conditions is true:

    • The SELECT statement contains a join.

    • Multiple SELECT statements are joined by using UNION.

    • The identity column is listed more than one time in the select list.

    • The identity column is part of an expression.

    • The identity column is from a remote data source.

    If any one of these conditions is true, the column is created NOT NULL instead of inheriting the IDENTITY property

    So you wrapping your identity column in an expression is getting rid of the identity property but it is also turning the column to NOT NULL, which you then try to insert a NULL into.

    P.S. that setidentity...(-7)? A red herring. As far as I can tell it shows up everytime you run an SELECT INTO.

    Your error is not because of that but because you are trying to insert a NULL.

  • Yes, my error is raised 'cause I'm inserting a null value in destination table create by 'first phase' of select into statement. But the reason why is visible by checking the property of that column: identity property is set, not 'non-nullable' only,

    The main issue is that by the first scenario (using a table) the result table has not identity set and column is nullable, by the second one (using a view) QEP changes and destination table too. More: converting data (the way to set nullability anyway) does not work in second scenario.

    Executing my little script the behaviour will be clear. It looks like a bug to me.

    I'm using this workaround:

    SELECT MyIdentity = coalesce(b.MyIdentity, NULL) INTO #t FROM MyView1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    instead of

    SELECT MyIdentity = coalesce(b.MyIdentity, NULL) INTO #t FROM MyView1 a

    LEFT JOIN MyView2 b ON a.MyId = b.ExternalId

    but it's not so satisfying (performance isssue possibility in my environment).

  • The trick here seems to be that while joining two tables is sufficient to prevent SQL Server from propagating the identity property, it really does have to be two tables.

    From the query plan it looks like your view must just be selecting a bunch of constants (the join is between a table scan and a constant scan); unlike joining two tables, that does not prevent the IDENTITY function from propagating, apparently (on that note, the setidentity scalar operator will not show up in every SELECT...INTO; only those where an identity is being set, strangely enough :-))

    You can of course implement workarounds by breaking any of the other rules required for IDENTITY to propagate (like using it in an expression, which you've done, or doing a UNION ALL with an empty set). If I absolutely had to use a workaround for an int column, I'd probably prefer just multiplying it by 1 instead of using COALESCE, but that's neither here nor there.

    In this particular case I would recommend none of the above, though. Instead, just explicitly create the temp table with column definitions as you want them to be; I don't see any real advantage to using SELECT...INTO in this case.

    If this isn't your actual case, and you actually have many columns, varying numbers of columns, or unknown column names, so that you have to do a SELECT * INTO, then UNION ALL with an empty result set is probably the simplest way. I still prefer explicitly defining things when possible, but I understand that sometimes it isn't.

    Again though, if what you've posted here is like your actual scenario, and you know what columns you're pulling, I'd just define the table explicitly.

    Here's some code illustrating the above:

    SET NOCOUNT ON;

    CREATE TABLE #t1 (ID int IDENTITY);

    CREATE TABLE #t2 (ID int IDENTITY);

    GO

    INSERT INTO #t1

    DEFAULT VALUES;

    GO 2

    INSERT INTO #t2

    DEFAULT VALUES;

    SELECT unique_column_name_1=#t2.ID

    INTO #t3

    FROM #t1 LEFT JOIN #t2 ON #t1.ID=#t2.ID;

    --Instead of using a view that generates a list of constants

    --I'll just use VALUES. It's the same issue, in that you're

    --not actually joining two tables, so it allows the identity feature

    --propagate, which runs into the error if you're trying to insert NULLs

    SELECT unique_column_name_2=#t2.ID

    INTO #t4

    FROM (VALUES(1),(2))t1(ID) LEFT JOIN #t2 ON t1.ID=#t2.ID;

    GO

    --A couple workarounds

    SELECT unique_column_name_3=#t2.ID*1

    INTO #t5

    FROM (VALUES(1),(2))t1(ID) LEFT JOIN #t2 ON t1.ID=#t2.ID;

    --This one works even if you don't know the columns

    SELECT #t2.*

    INTO #t6

    FROM (VALUES(1),(2))t1(ID) LEFT JOIN #t2 ON t1.ID=#t2.ID

    UNION ALL

    SELECT #t2.*

    FROM (VALUES(1),(2))t1(ID) LEFT JOIN #t2 ON t1.ID=#t2.ID

    WHERE 1=0;

    --If you really need to control column properties,

    --I don't think you're saving much by using SELECT...INTO

    --instead of just creating the table explicitly

    CREATE TABLE #t7 (unique_column_name_4 INT NULL);

    INSERT INTO #t7

    SELECT #t2.ID

    FROM (VALUES(1),(2))t1(ID) LEFT JOIN #t2 ON t1.ID=#t2.ID;

    SELECT table_name=LEFT(OBJECT_NAME(object_id,2),3),

    is_nullable,

    is_identity

    FROM tempdb.sys.columns

    WHERE OBJECT_NAME(object_id,2) LIKE '#t[3-7]%';

    DROP TABLE #t1,#t2,#t3,#t4,#t5,#t6,#t7;

    Cheers!

    EDIT: Forgot to include my code sample

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply