inserting data from one table to another and want to create primary key at same time

  • I am creating a table by using an insert into from another table

    then I am deleting any duplicate rows from that table

    then I try to create a primary key on column1(SN_Original)

    but I get an error saying

    Msg 8111, Level 16, State 1, Line 27

    Cannot define PRIMARY KEY constraint on nullable column in table 'lookuptable'

    is there a way round this

    here is my code

    Drop table lookuptable

    select dbo.bigtable.Software_Name_Original as SN_Original ,

    dbo.bigtable.Software_Name_Raw as SN_New

    into lookuptable

    from dbo.BigTable

    order by dbo.bigtable.Software_Name_Raw

    --Delete duplicate rows from lookup table based on SN_NEW (software_name_new)

    DELETE f

    FROM (

    select row_number() over (partition by SN_NEW order by SN_NEW) as rn from lookuptable

    ) AS f

    WHERE rn > 1

    ALTER TABLE lookuptable ADD CONSTRAINT SN_OriginalPK

    PRIMARY KEY CLUSTERED (SN_Original);

  • When you use

    Select Into

    the columns created would take the data type ,nullability etc. properties from the columns available in FROM clause.

    so make sure you have SOFTWARE_NAME_ORIGINAL as NOT NULL ... Since , primary key can't be created on Nullable columns ..

    ~ demonfox
    ___________________________________________________________________
    Wondering what I would do next , when I am done with this one :ermm:

  • Or , even better , don't use select into ..

    Just create a table first and then Insert values into it ..

    I find it better than Select into .

    ~ demonfox
    ___________________________________________________________________
    Wondering what I would do next , when I am done with this one :ermm:

  • demonfox (3/12/2013)


    Or , even better , don't use select into ..

    Just create a table first and then Insert values into it ..

    I find it better than Select into .

    Could you elaborate on this, please?

    SELECT INTO is "performance winner", why would creating table first be better?

    If you use SELECT INTO you cannot create PK or INDEX in time of "SELECT INTO", but after it's execution you can create whatever you like on a new table created.

    SELECT INTO will copy column definition from source (eg. IDENTITY if present) or from result of datatype casting:

    SELECT CAST(NULL as INT) Col1, CAST(NULL AS Varchar(40)) ... INTO ...

    will create two columns Col1 INT and Col2 varchar(40).

    SELECT INTO is the best way of creating temp tables in stored procedures whenever they required.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/12/2013)SELECT INTO is "performance winner", why would creating table first be better?

    Not always. You can achieve minimal logging with INSERT INTO...SELECT as well. Some people prefer to create the table first and there is no hard-and-fast rule that says you shouldn't go that route.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/12/2013)


    Eugene Elutin (3/12/2013)SELECT INTO is "performance winner", why would creating table first be better?

    Not always. You can achieve minimal logging with INSERT INTO...SELECT as well. Some people prefer to create the table first and there is no hard-and-fast rule that says you shouldn't go that route.

    There are few ways to achieve minimal logging for insertes with different prerequisites for each of the method. I wonder, is content of the "minimal logging" the same for all methods?

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • What do you mean by "content"?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Reminder: There's no such thing as "minimal logging" if the database is using the "Full" recovery model.

    Reminder: Temporarily shifting to "Simple" recovery destroys the log chain and either a full backup or differential backup will need to be taken immediately afterward changing the recovery model back to "FULL" to establish a new log chain.

    Reminder: Temporarily shifting to "Bulk Logged" recovery destroys the ability to do point-in-time restores for any log file that contains minimally logged operations.

    Further, opc.three is correct. Although SELECT/INTO is a miracle of performance even in the "FULL" recovery model, it's not always better if you need to add a clustered index or other indexes. Sometimes, it is more efficient to create the table and then do an INSERT/SELECT. There is no rule of thumb here, though. To know which is faster, you must test. Like anything else in SQL Server, "It Depends".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • demonfox (3/12/2013)


    When you use

    Select Into

    the columns created would take the data type ,nullability etc. properties from the columns available in FROM clause.

    so make sure you have SOFTWARE_NAME_ORIGINAL as NOT NULL ... Since , primary key can't be created on Nullable columns ..

    Not quite. SELECT/INTO will copy the IDENTITY property, the Data Type, and the data. It will NOT copy nullability. That's why the OP is currently having the problem he cited. I'll post the fix for this in a minute or two.

    [EDIT] I stand corrected. Not sure why I thought it wouldn't work in 2K5 but it does. Looking into why I made such a mistake.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/12/2013)


    demonfox (3/12/2013)


    When you use

    Select Into

    the columns created would take the data type ,nullability etc. properties from the columns available in FROM clause.

    so make sure you have SOFTWARE_NAME_ORIGINAL as NOT NULL ... Since , primary key can't be created on Nullable columns ..

    Not quite. SELECT/INTO will copy the IDENTITY property, the Data Type, and the data. It will NOT copy nullability. That's why the OP is currently having the problem he cited. I'll post the fix for this in a minute or two.

    What do you mean by "will NOT copy nullability"?

    As I'm aware, it does copy it from source column:

    CREATE TABLE Table1 (Col1 INT NULL, Col2 INT NOT NULL)

    SELECT * INTO Table2 FROM Table1

    exec sp_help Table2

    I agree that there is no rule of thum for using SELECT ... INTO over INSERT ... SELECT, however I always start with SELECT ... INTO before considering the second one, in most of cases, I have experience with, it does win...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • alan_lynch (3/11/2013)


    I am creating a table by using an insert into from another table

    then I am deleting any duplicate rows from that table

    then I try to create a primary key on column1(SN_Original)

    but I get an error saying

    Msg 8111, Level 16, State 1, Line 27

    Cannot define PRIMARY KEY constraint on nullable column in table 'lookuptable'

    is there a way round this

    here is my code

    Drop table lookuptable

    select dbo.bigtable.Software_Name_Original as SN_Original ,

    dbo.bigtable.Software_Name_Raw as SN_New

    into lookuptable

    from dbo.BigTable

    order by dbo.bigtable.Software_Name_Raw

    --Delete duplicate rows from lookup table based on SN_NEW (software_name_new)

    DELETE f

    FROM (

    select row_number() over (partition by SN_NEW order by SN_NEW) as rn from lookuptable

    ) AS f

    WHERE rn > 1

    ALTER TABLE lookuptable ADD CONSTRAINT SN_OriginalPK

    PRIMARY KEY CLUSTERED (SN_Original);

    Before I get into the simple code to do as you ask, I'd like to suggest a couple of things.

    1. Always use a 2 part naming convention. there's a slight performance advantage to it (especially for frequently executed GUI related procs) and it makes your code more bullet proof if someone ever creates another schema and creates a table with the same object name in that schema. That also means that you should correctly use table aliases for queries that address more than 1 table. It also means you need to stop using the 3 part naming convention especially since it's been deprecated for the last couple of revisions of SQL Server.

    2. Format your code for readability especially when it comes to indenting. You might actually be the next person that has to work on it. 😉

    3. Start using semi-colons where they belong for every statement. Leaving them out has also been deprecated for a couple of revisions of SQL Server.

    4. Break out Books Online (press the {f1} key in SSMS to get there) and read about CTEs. They're basically the same as the sub-query you wrote but reveal themselves in a more natural "top down" fashion.

    5. Only select what you need. DELETEs are expensive because you first had to load the data and then delete it. On top of that, the DELETEs get logged even if you're in the SIMPLE recovery mode because it's not ever minimally logged.

    With all of that in mind, here's the (untested) code that I'd use to replace your original code. I say "untested" only because I don't have your original table handy and you didn't provide any readily consumable data to test against. See the first link in my signature line below for future posts.

    Here's the code...

    --===== If the lookup table already exists, drop it.

    IF OBJECT_ID('dbo.LookupTable','U') IS NOT NULL

    DROP TABLE dbo.LookupTable

    ;

    --===== Create and populate the new lookup table on-the-fly.

    -- Since you're not creating an IDENTITY column here,

    -- and ORDER BY is a useless and only makes the code

    -- take longer an use more resources.

    -- Also, stop using the 3 part naming convention in the

    -- SELECT list... it's been deprecated.

    -- Also notice that there's no need to load a bunch of

    -- data and then delete it. That's another waste of

    -- clock cycles.

    WITH

    cteDupeCheck AS

    (

    SELECT RowNum = ROW_NUMBER() OVER (PARTITION BY Software_Name_Raw

    ORDER BY Software_Name_Raw)

    , SN_Original = Software_Name_Original

    , SN_New = Software_Name_Raw

    FROM dbo.BigTable

    )

    SELECT SN_Original = ISNULL(SN_Original,0) --ISNULL makes the column NOT NULL

    , SN_New

    INTO dbo.Lookuptable

    FROM cteDupeCheck

    WHERE RowNum = 1

    ;

    --===== Since we now have a NOT NULL column,

    -- add the desired unique clustered index.

    ALTER TABLE dbo.LookUpTable

    ADD CONSTRAINT PK_LookUpTable

    PRIMARY KEY CLUSTERED (SN_Original)

    ;

    Let us know how it works out for you. Feedback is our only reward. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eugene Elutin (3/12/2013)I agree that there is no rule of thum for using SELECT ... INTO over INSERT ... SELECT, however I always start with SELECT ... INTO before considering the second one, in most of cases, I have experience with, it does win...

    I start with SELECT...INTO as well (now anyway, thanks to Jeff explaining the situation a year or so back) but with the need for a PK and possibly other indexes, we should not dismiss INSERT INTO...SELECT straightaway.

    I think you and Jeff were typing at the same time and his post should clear the air on what was meant by 'not copying NULLability', a simple matter of perspective.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Eugene Elutin (3/12/2013)


    Jeff Moden (3/12/2013)


    demonfox (3/12/2013)


    When you use

    Select Into

    the columns created would take the data type ,nullability etc. properties from the columns available in FROM clause.

    so make sure you have SOFTWARE_NAME_ORIGINAL as NOT NULL ... Since , primary key can't be created on Nullable columns ..

    Not quite. SELECT/INTO will copy the IDENTITY property, the Data Type, and the data. It will NOT copy nullability. That's why the OP is currently having the problem he cited. I'll post the fix for this in a minute or two.

    What do you mean by "will NOT copy nullability"?

    As I'm aware, it does copy it from source column:

    CREATE TABLE Table1 (Col1 INT NULL, Col2 INT NOT NULL)

    SELECT * INTO Table2 FROM Table1

    exec sp_help Table2

    I agree that there is no rule of thum for using SELECT ... INTO over INSERT ... SELECT, however I always start with SELECT ... INTO before considering the second one, in most of cases, I have experience with, it does win...

    First, I absolutely agree. My experience has been that SELECT/INTO will usually win especially when using it to create a Temp Table (minimal logging there) and especially when no indexes are involved. In most cases, no indexes will be needed because, hopefully, you've only captured the data you really need and table scans will be as fast as seeks followed by a range scan (like what occurred in my "Hierchies on Steroids" articles... the heap was actually faster than a clustered table). I have, however, done some experiments with opc.three and there are times when the "create then fill" method does work faster than SELECt/INTO followed by an index creation. Like I said, "It Depends".

    On the second part, it would appear they fixed the copy of nullability problem in 2008 and above. Sorry for not being aware of that. If you run your code in 2005, here's the result you get. {EDIT} My apologies. I don't know why I thought SELECT/INTO wouldn't copy nullability. There was a reason but I'll be darned if I can remember what it was.

    Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation

    ----------- ---- -------- ------ ----- ----- -------- ------------------ -------------------- ---------

    Col1 int no 4 10 0 yes (n/a) (n/a) NULL

    Col2 int no 4 10 0 no (n/a) (n/a) NULL

    Still, I'll continue to use the ISNULL method to make the code bullet proof. It costs almost nothing in terms of performance to do so.

    As a sidebar, the ISNULL method will also strip the IDENTITY property off a column if you need to do such a thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/12/2013)


    .... If you run your code in 2005, here's the result you get.

    Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation

    ----------- ---- -------- ------ ----- ----- -------- ------------------ -------------------- ---------

    Col1 int no 4 10 0 yes (n/a) (n/a) NULL

    Col2 int no 4 10 0 no (n/a) (n/a) NULL

    ...

    So, it works in SQL2005 too 😉

    I agree on use of ISNULL to make column NOT NULL and strip away its IDENTITY if such thing is desired outcome.

    On creating indexes post table creation - agree as well, it's all depends.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/12/2013)


    Jeff Moden (3/12/2013)


    .... If you run your code in 2005, here's the result you get.

    Column_name Type Computed Length Prec Scale Nullable TrimTrailingBlanks FixedLenNullInSource Collation

    ----------- ---- -------- ------ ----- ----- -------- ------------------ -------------------- ---------

    Col1 int no 4 10 0 yes (n/a) (n/a) NULL

    Col2 int no 4 10 0 no (n/a) (n/a) NULL

    ...

    So, it works in SQL2005 too 😉

    I agree on use of ISNULL to make column NOT NULL and strip away its IDENTITY if such thing is desired outcome.

    On creating indexes post table creation - agree as well, it's all depends.

    Dammit. What's going on here? First, appologies for not drinking enough coffee before posting. :blush: These old eyes didn't see the absence of NOT on the first column. I'm not sure why I thought it wouldn't copy nullability and now I've got some digging to do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply