Bulk insert from file having varying number of columns

  • I have a csv file having different number of columns in each row.

    I will have to dump this data into a staging table having 20 columns.

    20131111,010103,1,Test , Test ,"HFirst",43,40,42

    20131111,010113,2,"Test HScriptTwo",43,40

    20131111,010123,3,"Testing HThree","HScript Three"

    It should store the data leaving the rest of the columns as NULL.

    It is currently filling all the 20 columns ignoring the line break, Writing to the columns continously.

    BULK INSERT SysRepTemp1

    FROM 'c:\SystemReports.csv'

    WITH

    (

    FIRSTROW=1,

    FIELDTERMINATOR = ',',

    ROWTERMINATOR = ''

    )

  • Dorthy Jeff Moden has a couple of posts on this, which i'd saved in the past:

    here's the specific threads in question:

    http://www.sqlservercentral.com/search/?q=%22Headerless+Ragged+Right%22

    basically, he uses some dos commands to make sure the file has headers if it didn't already, and then uses a text-file linked server which automatically treats the files as having NULLS for missing columns(which in turn make the columns for the import somewhere else.)

    those old posts refer to 32 bit linked servers usign the JET driver, so if you need a 64 bit text based linked server, it's like this example:

    select * from OpenRowset('MSDASQL',

    'Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\Data\;'

    ,'select top 10 * from C:\Data\MailItems.txt')

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thank you Jeff

  • Lowell (11/12/2013)


    Dorthy Jeff Moden has a couple of posts on this, which i'd saved in the past:

    here's the specific threads in question:

    http://www.sqlservercentral.com/search/?q=%22Headerless+Ragged+Right%22

    basically, he uses some dos commands to make sure the file has headers if it didn't already, and then uses a text-file linked server which automatically treats the files as having NULLS for missing columns(which in turn make the columns for the import somewhere else.)

    those old posts refer to 32 bit linked servers usign the JET driver, so if you need a 64 bit text based linked server, it's like this example:

    select * from OpenRowset('MSDASQL',

    'Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\Data\;'

    ,'select top 10 * from C:\Data\MailItems.txt')

    Sadly, there are 3 things that are now wrong with the method I posted... 32 bit limitations, performance, and privs. A flat load followed by a split and a crosstab are frequently more effective not to mention the fact that OPENROWSET requires "SA" privs to run. I'll see if I have the time to demo some code after work.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • here's my linked server example for a 64 bit folder full of text files that may help:

    a couple of prerequisites:

    install the AccessDatabaseEngine_x64.exe from microsoft:

    http://www.microsoft.com/en-us/download/details.aspx?id=13255

    make sure you open an Administrative command prompt window, and run it with the c:\Downloads\AccessDatabaseEngine_x64.exe /passive

    command line flag;

    this will force the install of the drivers, even if you have 32 bit office installed;

    otherwise you get some error about 32 bit Office preventing the install.

    After that is installed:

    --Required settings for the provider to work correctly as a linked server

    EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1

    GO

    EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1

    GO

    and then the code for the text based linked server:

    --#################################################################################################

    --Linked server Syntax for Folder Full Of Text Files

    --#################################################################################################

    --add a folder as a linked server to access all .txt and .csv files in the folder

    DECLARE @server sysname,

    @srvproduct nvarchar(256),

    @provider nvarchar(256),

    @datasrc nvarchar(100),

    @location nvarchar(100),

    @provstr nvarchar(100),

    @catalog sysname,

    @sql varchar(1000)

    SET @server = N'TxtSvr'

    SET @srvproduct = N'OLE DB Provider for ACE'

    SET @provider = N'Microsoft.ACE.OLEDB.12.0'

    SET @datasrc = N'C:\Data\'

    SET @provstr ='Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Data\;Extended Properties="text;HDR=YES;FMT=Delimited" '

    set @provstr = 'Text'

    EXEC sp_addlinkedserver @server,@srvproduct,@provider,@datasrc,@provstr,@provstr

    GO

    --===== Set up login mappings.

    EXEC dbo.sp_AddLinkedSrvLogin TxtSvr, FALSE, NULL, Admin, NULL

    GO

    --===== List the tables in the linked server which is really a list of

    -- file names in the directory. Note that the "#" sign in the

    -- Table_Name is where the period in the filename actually goes.

    EXEC dbo.sp_Tables_Ex TxtSvr

    GO

    --===== Query one of the files by using a four-part name.

    SELECT *

    FROM TxtSvr...[LegalName_NickName_List#txt]

    --===== Drop the text server

    EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'

    GO

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (11/12/2013)


    here's my linked server example for a 64 bit folder full of text files that may help:

    Very cool. I've not actually used the ACE drivers, yet (had other problems that would take too long to explain) but now I have a great example for how to use them to solve the ragged-right problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/12/2013)


    Lowell (11/12/2013)


    here's my linked server example for a 64 bit folder full of text files that may help:

    Very cool. I've not actually used the ACE drivers, yet (had other problems that would take too long to explain) but now I have a great example for how to use them to solve the ragged-right problem.

    yeah i went to some effort to update all my 32 bit examples that i collected over the years, and update them to work in the 64 bit environment too. there wasn't enough examples out there of working 64 bit linked servers.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks Jeff for your solution.

    Isn't there a way to get this through without using linked server? A little simpler solution?

    Please let me know.

  • Sorry for the late response. I lost track of this thread. Do you still need help on this?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Re: 64-bit ACE & 32-bit Office apps...

    If you install this, and you have 32-bit Office apps, it will probably mess things up for the Office apps. At least that's been my experience with 32-bit Office 2010 apps on 64-bit Windows 7.0. I wanted to work within 64-bit SQLExpress 2008R2, and thus installed 64-bit ACE engine (with the switch to do so), and it did indeed make the 64-bit ACE engine visible to SQL Express, but then...

    Excel 2010 - wanted to reinstall/reconfigure each start (and it made PowerQuery unavailable...)

    Project 2010 - wanted to reinstall/reconfigure, then failed ("out of memory"...really?)

    Word 2010 - wanted to reinstall, but the NoReReg=1 registry change worked for Word.

    "NoReReg = 1" did not work for Excel or Project...

    So..... I installed 32-bit SQL Express, uninstalled 64-bit ACE, and... Office apps are OK now.

    Reinstalled PowerQuery just to be safe.

    On a 64-bit SQL Server box, having 32-bit and 64-bit ACE installed, seemed to work... the 32-bit ACE engine is not visible to SQL Server, but the 64-bit engine will be.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply