How to insert a million records into a table?

  • Hi

    Seeking help on the above question; I failed my first interview because of it. Please also provide a couple of examples of how to achieve this result, as it would be a big help for my research.

    Thanks in advance to whoever answers!

  • That's an easy one to search for yourself; you'll also learn more that way.

    Thanks

  • What was your answer?  Do you know why it was wrong?

    What's the job?  (I assume it's a job interview you failed...)

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • And... was that all there was to the question?  Did they identify the source of the data?  Did they identify any limits or extenuating circumstances?

    Better than that, did you have anything on your resume that said you know how to work with lots of data, or had tables that contained millions of rows, or knew how to do ETL or import data?

    I also have to agree with the others.  "Research" means finding stuff out on your own... not having others provide answers to interview questions for you.

    I will give you a starting point, though... unless there are some currently unknown limits or additional circumstances, the way to insert a million rows is the same as the way to insert just one.  Limits and additional circumstances will cause variations on that theme.

    Now that you know that, all you have to do now is be prepared to discuss the many variations.  And, if that's all the information they gave you for the question, then they may have dinged you for not asking about limits and additional circumstances.

    My answer to such a simply stated question with no additional information offered would have started with "It Depends", followed by the litany of limits, circumstances, and the effects each would have on the code and what the code should contain.
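    To make that starting point concrete, here's a minimal sketch (the table and column names are made up): with no limits or special circumstances in play, it's the same single set-based INSERT whether the source holds one row or a million.

    INSERT INTO dbo.TargetTable (SomeColumn)
    SELECT SomeColumn
    FROM dbo.SourceTable;  -- same statement for 1 row or 1,000,000 rows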

    While this type of question might seem a bit unfair, if you were interviewing for a senior position, the interviewers are under no obligation to be fair; they're looking for the best candidate they can get for the money.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here are two methods: a row-by-row loop (slow) and a single set-based insert (fast).

    -- Method 1 (slow): one INSERT per iteration of a WHILE loop.
    IF OBJECT_ID('tempdb..#SlowInserts','U') IS NOT NULL
        DROP TABLE #SlowInserts;
    GO
    CREATE TABLE #SlowInserts (Id int IDENTITY(1,1) NOT NULL, Comment varchar(50) NOT NULL);
    GO
    SET NOCOUNT ON;
    DECLARE @i int = 0;
    WHILE @i < 1000000
    BEGIN
        INSERT INTO #SlowInserts (Comment)
        SELECT NEWID();
        SET @i += 1;
    END;
    GO
    SELECT COUNT(*) FROM #SlowInserts;

    -- Method 2 (fast): one set-based INSERT.  Cross joining a ten-row CTE
    -- with itself six times yields 10^6 = 1,000,000 rows.
    IF OBJECT_ID('tempdb..#FastInserts','U') IS NOT NULL
        DROP TABLE #FastInserts;
    GO
    CREATE TABLE #FastInserts (Id int IDENTITY(1,1) NOT NULL, Comment varchar(36) NOT NULL);
    GO
    WITH X1 AS (SELECT * FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) T(C))
    INSERT INTO #FastInserts (Comment)
    SELECT TOP (1000000) NEWID()
    FROM X1 A, X1 B, X1 C, X1 D, X1 E, X1 F;

    SELECT COUNT(*) FROM #FastInserts;
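    For what it's worth, a common variation on the same set-based idea is to borrow a ready-made row source from the system catalog instead of building one with VALUES.  A sketch (assuming sys.all_columns holds a few thousand rows, so the cross join easily exceeds a million):

    INSERT INTO #FastInserts (Comment)
    SELECT TOP (1000000) NEWID()
    FROM sys.all_columns A
    CROSS JOIN sys.all_columns B;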


  • Re Jeff's comment "Did they identify any limits or extenuating circumstances?"

    I was tasked with importing over 15,000,000 rows of data, after first deleting a massive amount of existing data.  The problem was that the insert had to complete before staff started work that day.  The insert was overrunning and causing problems; the solution was to drop the indexes, insert the data, then rebuild the indexes.
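    In T-SQL terms, that pattern looks roughly like this (the index and table names are invented, and only nonclustered indexes get disabled, because disabling the clustered index would make the table unreadable):

    -- Disable nonclustered indexes so the load doesn't pay for index maintenance.
    ALTER INDEX IX_TillTransactions_TransDate ON dbo.TillTransactions DISABLE;

    -- ...bulk insert the 15,000,000+ rows here...

    -- Rebuild afterwards; this re-enables the disabled index in one set-based operation.
    ALTER INDEX ALL ON dbo.TillTransactions REBUILD;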

  • Kev Wisbey-443237 wrote:

    Re Jeff's comment "Did they identify any limits or extenuating circumstances?"

    I was tasked with importing over 15,000,000 rows of data, after first deleting a massive amount of existing data.  The problem was that the insert had to complete before staff started work that day.  The insert was overrunning and causing problems; the solution was to drop the indexes, insert the data, then rebuild the indexes.

    Cool... I love this kind of feedback.

    Just curious... you say that you imported over 15 million rows of data but that you first had to delete a massive amount of existing data.  That prompts me to ask some additional questions...

    1. How many rows were typically deleted?
    2. After the 15 Million Row import, how many total rows were left in the table?
    3. What was the criteria for the deletions?  Was it based on some temporal value or ???
    4. Did you drop the Clustered Index or was the table a heap?
    5. Was any replication or other use of the log (log shipping) required for this one table?
    6. What was the Recovery Model of the database set to and, if it was set to FULL, was the temporary use of BULK LOGGED allowed (see the sketch below)?  If not, could the table be moved to a different database, provided no code changes were required?
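    For context on question 6, minimal logging is why it matters.  A sketch of the temporary switch (the database, table, and path names are invented):

    ALTER DATABASE SalesDB SET RECOVERY BULK_LOGGED;

    -- With a TABLOCK hint (and the other minimal-logging prerequisites met),
    -- the insert can be minimally logged instead of fully logged.
    INSERT INTO dbo.BigTable WITH (TABLOCK) (SomeColumn)
    SELECT SomeColumn FROM dbo.StagingTable;

    ALTER DATABASE SalesDB SET RECOVERY FULL;

    -- Take a log backup right away: BULK_LOGGED sacrifices point-in-time restore
    -- for the interval, and the backup resumes normal log-chain coverage.
    BACKUP LOG SalesDB TO DISK = 'D:\Backups\SalesDB_log.trn';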

    --Jeff Moden

  • p.s.  The questions I asked above (and possibly more) are the kind of questions (obviously not identical, since there were no deletes) that I would have asked the interviewer if they had asked the simple question of "How to insert a million records into a table?" with no amplifying information... except that the first two words of my reply would have been "It Depends".

    --Jeff Moden

  • Jeff, thanks for that... getting a "Cool" from you, wow.  The problem was that we had to get all till transactions from a large group of outlets; in case there was any breakdown in an outlet's internet connection, the idea was to delete several days' transactions and reload the data.  As for how many rows were there afterwards, I honestly cannot remember (this was 2010).  It was a clustered index (no way would we have a heap) and, if I remember rightly, we had more than one index.  Re your points 5 & 6, as I was only involved in writing the SSIS package for the import, I cannot comment on those.

    Thanks, Kev, but... darn it all.  I was hoping you remembered more details because it sounds like a wicked interesting problem and I was going to set something up to explore the given method and some of my own.  One of the first things I cut my teeth on in SQL (circa '96) was loading shedloads of telephone data.

    Anyway, thank you again for the kind feedback and the information that you did remember.

    heh... p.s.  I never used DTS or SSIS for any of it.  Didn't even know such a thing existed back then, and that might not be all bad. 😀  A large part of many jobs in the years after that was replacing SSIS jobs with T-SQL jobs.

    --Jeff Moden
