Data Migration - Quickly Inserting New Data

  • Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columni

  • Hi Steve

    I too use this technique from time to time - it's a neat trick that has all sorts of uses. One slight improvement, which may take even less effort, is to treat any static text as a constant and add it to an out-of-the-way cell (use several cells if need be). In the concatenation formula you then simply refer to this cell using an absolute reference (F4 cycles through the possibilities). This way, if it turns out that you've made a typo, it's a simple matter to change the text in the 'constant' cell, and all the formulae that refer to it will update, saving you the trouble of correcting every wrong formula by hand.
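
    For example, a minimal sketch (the table and column names here are hypothetical): put the static text in a couple of spare cells and refer to them absolutely in the per-row formula.

    Cell F1:  INSERT INTO Customers (CustomerName) VALUES ('
    Cell F2:  ')
    Cell B1:  =$F$1 & A1 & $F$2

    Fill B1 down the column; if the static SQL turns out to be wrong, fix F1 or F2 once and every generated statement updates.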

    Cheers - Graham

  • I use this technique frequently. It is also handy for giving clients big chunks of data for them to edit so they don't have to wade through screen after screen of data to change one or two fields for each row.

    I did a database conversion that changed Standard Industrial Classification codes (SIC - a 4-digit ID of the industry a US business fits into) to North American Industry Classification System codes (NAICS - 6 digits). Since the new codes are more specific than the old ones, I used DTS to output the results of the conversion so the client could review the list and update it quickly in Excel. For the final load, I just fired up Excel and built a bunch of UPDATE statements that I ran in the database GUI. BTW, the database was Progress RDBMS and not even SQL Server; DTS was difficult to use in this case because Progress does not supply ODBC drivers, so third-party ones are required.
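
    A sketch of the kind of formula I mean (the table and column names here are made up): with the old SIC code in column A and the reviewed NAICS code in column B, each row generates its own statement.

    ="UPDATE business SET naics_code = '" & B2 & "' WHERE sic_code = '" & A2 & "'"

    Fill it down, copy the resulting column into the query tool, and run it.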

    Bryant E. Byrd, BSSE MCDBA MCAD
    Business Intelligence Administrator
    MSBI Administration Blog

  • Why didn't I think of that! Thanks for the lateral thinking, which I will certainly bear in mind for quick fixes.

  • I knew I wasn't the first to use it. Some interesting ideas, especially the static cell.

  • I had a similar situation, updating some old IDs with new IDs... my list contained around 1,000-1,500 rows in Excel...

    Save the sheet as a CSV first...

    I did a BULK INSERT into a temp table and then an UPDATE on the existing table with a join to the temp table, as in the sketch below. It was quick... and I don't think it's dirty! What do you think?
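
    A minimal sketch of what I mean (the file path, table, and column names are hypothetical):

    CREATE TABLE #id_map (
      old_id int,
      new_id int
    )

    BULK INSERT #id_map
    FROM 'C:\temp\id_map.csv'
      WITH (
        fieldterminator = ',',
        rowterminator = '\n'
      )

    UPDATE t SET
      some_id = m.new_id
    FROM dbo.some_table t
      JOIN #id_map m ON (t.some_id = m.old_id)

    DROP TABLE #id_map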

    Linto

  • Hi,

    You can also reverse this technique and use it to create data definitions in programming languages.

    E.g., run DESC table_name to get the table definition, copy and paste it into Excel, and manipulate it in different ways.
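
    For instance (a sketch, assuming the pasted definition lands with the column name in column A and its SQL type in column B), a per-row formula can emit field declarations, leaving the exact type mapping to you:

    ="private String " & A2 & "; // was " & B2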

    --Jeswanth

    --------------------------------

  • It's a dirty hack!

    Very similar to tricks I've seen done in Query Analyzer to generate statements. Worth knowing, maybe not the first thing I'd recommend.

  • Why so? Can you clarify your statement?

    --Jeswanth

    --------------------------------

  • Awesome technique, and perfectly legitimate. I used to do this with Unix text files into Oracle about 10 years ago, when there was nothing like DTS. Don't think of this as "quick and dirty". There's a reason there is a SQL "Insert" command.

    -- Mike Prascak

  • Steve - it's a good idea to add "UNION" at the end of every generated row (except the last one), so the rows chain into a single INSERT ... SELECT, as in the sketch below. In this case it's one transaction - so you're inserting "all the records or none".
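
    A minimal sketch of the pattern (the table and values are hypothetical); UNION ALL skips the duplicate-removal step that plain UNION performs:

    INSERT INTO Customers (CustomerName)
    SELECT 'Alice' UNION ALL
    SELECT 'Bob' UNION ALL
    SELECT 'Carol'

    If any row fails, the whole statement rolls back, so you get all the records or none.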

    Alex

  • Hi,

    Very nice technique indeed!

    There are SQL scripts to achieve the same thing, and they may come in handy in such situations; however, if you do not have one to hand, this is possibly the best technique!

    Learnt a new technique today!

    Thanks for sharing.

    Amol

  • What you outlined is a really useful technique; I have used it many times.

    However, sometimes you may need something a little more complicated. In those situations the OPENROWSET function may come in handy too.

    Here is the basic syntax to use in Query Analyzer:

    SELECT t.[col1] + ',' + t.[col2]
    FROM OPENROWSET('MSDASQL'
      , 'Driver={Microsoft Excel Driver (*.xls)};DBQ=C:\somepath\somefile.xls'
      , 'SELECT [col1], [col2] FROM [Sheet1$]') t

    The advantage is that you can treat the OPENROWSET fragment like any other subquery. This means you can join it with other tables (or even other spreadsheets) to pick out just the rows you need.

    Here is a fragment that would insert only rows that do not yet exist in the table:

    INSERT INTO TABLE1 (THE_NAME)
    SELECT t.[myFirst] + ',' + t.[mySecond]
    FROM OPENROWSET('MSDASQL'
      , 'Driver={Microsoft Excel Driver (*.xls)};DBQ=C:\somepath\somefile.xls'
      , 'SELECT [myFirst], [mySecond] FROM [Sheet1$]') t
      LEFT JOIN TABLE1 ON TABLE1.THE_NAME = (t.[myFirst] + ',' + t.[mySecond])
    WHERE TABLE1.THE_NAME IS NULL -- restrict to new names only

    Thanks for reminding us of a really useful technique!

  • Graham,

    Do you have an example of specifically how to use your suggestion? I am curious as to how you "treat static text as a constant and add it to an out of the way cell"?

    Thanks,

    Nate

  • In my mind that's way more work than just using BULK INSERT. I have a couple of scripts that I tweak when I need to process Excel data. If I save the data as CSV, then I use this script:

    /* create temp hierarchy table */
    CREATE TABLE #if_hierarchy (
      territory_id varchar(16),
      parent_territory_id varchar(16),
      territory_name varchar(40),
      forecast_rpt varchar(80),
      current_yr_act_rpt varchar(80),
      previous_yr_act_rpt varchar(80)
    )

    /* load temp hierarchy table */
    BULK INSERT #if_hierarchy
    FROM 'C:\Documents and Settings\jtlovell\My Documents\Clients\SQL\IF_hierarchy.csv'
      WITH (
        datafiletype = 'char',
        fieldterminator = ',',
        rowterminator = '\n'
      )

    /* load permanent hierarchy table */
    INSERT INTO dbo.if_hierarchy (
      update_time,
      parent_territory_id,
      territory_name,
      forecast_rpt,
      current_yr_act_rpt,
      previous_yr_act_rpt,
      territory_id,
      if_header_id
    )
    SELECT
      getdate() as update_time,
      h.parent_territory_id as parent_territory_id,
      h.territory_name as territory_name,
      h.forecast_rpt as forecast_rpt,
      h.current_yr_act_rpt as current_yr_act_rpt,
      h.previous_yr_act_rpt as previous_yr_act_rpt,
      h.territory_id as territory_id,
      1 as if_header_id
    FROM #if_hierarchy h

    /* clean up */
    DROP TABLE #if_hierarchy
    GO

    ---

    If there are commas in the data, then I save to a tab-delimited file and use this script:

    CREATE TABLE #reporting_continent (
      country_name varchar(50) primary key,
      region varchar(50),
      reporting_continent varchar(30)
    )
    GO

    /* load cross-reference table */
    BULK INSERT #reporting_continent
    FROM 'C:\Documents and Settings\jtlovell\My Documents\Clients\SQL\reporting_continent.txt'
      WITH (
        datafiletype = 'char',
        fieldterminator = '\t',
        rowterminator = '\n'
      )

    /* update the permanent tables from the cross-reference */
    UPDATE s SET
      reporting_continent = rc.reporting_continent
    FROM dbo.site s
      JOIN #reporting_continent rc ON (s.phys_country = rc.country_name)

    UPDATE ss SET
      reporting_continent = rc.reporting_continent
    FROM dbo.soldship ss
      JOIN #reporting_continent rc ON (ss.shipphys_country = rc.country_name)

    DROP TABLE #reporting_continent
    GO

    --

    The nice thing about this method is that it's easily repeatable.  So for data migrations I just put the data files and the scripts in the same folder and one script loads many Excel files.
