Bulk Insert

  • Hello, I need to load the following data into a SQL table. This is how the vendor is able to provide it to us.

    CRCorp Daily Report,,,,,,

    ,,,,,,

    Facility,Location,Purchase Order #,Vendor,Inventory #,Date Ordered,Extended Cost

    09-Mowtown 495 CRST,09-402A Women's Imaging,327937,"BARD PERIPHERAL VASCULAR, INC.",113989,7/25/2014,650

    09-Mowtown 495 CRST,09-402A Women's Imaging,327936,"WB MASON CO., INC.",112664,7/25/2014,8.64

    01-Mowtown 499 CRST,01-302B Oncology,327947,McKesson General Medical,n/a,7/25/2014,129.02

    04-Rocky Hillside,04-301A OB/GYN(Closets),327931,CARDINAL 200 (MED/SURG) LLC,115098,7/25/2014,13.23

    04-Rocky Hillside,04-301A OB/GYN(Closets),327931,CARDINAL 200 (MED/SURG) LLC,111690,7/25/2014,11.79

    04-Rocky Hillside,04-301A OB/GYN(Closets),327932,McKesson General Medical,111399,7/25/2014,18.79

    04-Rocky Hillside,04-301A OB/GYN(Closets),327932,McKesson General Medical,111459,7/25/2014,16.8

    04-Rocky Hillside,04-301A OB/GYN(Closets),327932,McKesson General Medical,103946,7/25/2014,74.76

    04-Rocky Hillside,04-301A OB/GYN(Closets),327931,CARDINAL 200 (MED/SURG) LLC,110637,7/25/2014,41.44

    04-Rocky Hillside,04-301A OB/GYN(Closets),327931,CARDINAL 200 (MED/SURG) LLC,110818,7/25/2014,24.01

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,111160,7/25/2014,14.64

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,112804,7/25/2014,27.26

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,110361,7/25/2014,45.96

    14-Wartown RRB,14-102A OB/GYN,327934,McKesson General Medical,103946,7/25/2014,74.76

    14-Wartown RRB,14-102A OB/GYN,327934,McKesson General Medical,111820,7/25/2014,981.6

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,110610,7/25/2014,16.11

    14-Wartown RRB,14-102A OB/GYN,327934,McKesson General Medical,111553,7/25/2014,19.74

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,104769,7/25/2014,20.04

    14-Wartown RRB,14-102A OB/GYN,327933,CARDINAL 200 (MED/SURG) LLC,115098,7/25/2014,13.23

    14-Wartown RRB,14-102A OB/GYN,327934,McKesson General Medical,105923,7/25/2014,90.6

    14-Wartown RRB,14-102A OB/GYN,327934,McKesson General Medical,111425,7/25/2014,23.62

    It is comma separated with quotes around vendor names that contain a comma in them. There are a few.

    I have attempted to bulk insert it into this table with no luck.

    CREATE TABLE POMaster

    (Facility VARCHAR(75),

    Location VARCHAR(75),

    PONum INT,

    VendorNm INT,

    INVENTORYNUM VARCHAR(25),

    orderDte DATE,

    ExtendedPrice NUMERIC(10,2)

    )

    GO

    It does not like the double quotes. Any ideas on how to make this format work? Do I need a format file? A solution to make this work would be greatly appreciated.

    Thank you,

    David

  • Here is a quick solution, more generic than using a specific format file for the import.

    The first part is a CSV splitter function which is based on Jeff Moden's DelimitedSplit8K, then a simple format file for importing the file line by line. The last bit is the actual import.

    😎

    dbo.DELIMITED_CSV_SPLIT

    CREATE FUNCTION dbo.DELIMITED_CSV_SPLIT

    (

    /*********************************************************************

    Using Jeff Moden's DelimitedSplit8K as a base with the addition

    of a Text Qualifier parameter, @pTxtQualifier CHAR(1) = '"'

    *********************************************************************/

    @pString VARCHAR(8000)

    ,@pDelimiter CHAR(1)

    ,@pTxtQualifier CHAR(1)

    )

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    /*********************************************************************

    cteTally, inline Tally table returning a number sequence equivalent

    to the length of the input string.

    *********************************************************************/

    WITH E1(N) AS

    (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ),

    E2(N) AS (SELECT 1 FROM E1 a, E1 b),

    E4(N) AS (SELECT 1 FROM E2 a, E2 b),

    cteTally(N) AS

    (

    SELECT

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM E4

    ORDER BY 1 OFFSET 0 ROWS

    FETCH FIRST CAST(LEN(@pString) AS BIGINT) ROWS ONLY

    )

    /********************************************************************

    Retrieve the position (N) and the character code (chrCode)

    for all delimiters (@pDelimiter) and text qualifiers

    (@pTxtQualifier)

    ********************************************************************/

    ,ctePrimer(N,chrCode) AS

    (

    SELECT

    t.N

    ,UNICODE(SUBSTRING(@pString,t.N,1)) AS chrCode

    FROM cteTally t

    WHERE SUBSTRING(@pString,t.N,1) COLLATE Latin1_General_BIN

    = @pDelimiter COLLATE Latin1_General_BIN

    OR SUBSTRING(@pString,t.N,1) COLLATE Latin1_General_BIN

    = @pTxtQualifier COLLATE Latin1_General_BIN

    )

    /********************************************************************

    The cteStart encloses the string in virtual delimiters using

    Union All at the beginning and the end. The main body sets the

    IsDelim and IsTxQf flags.

    ********************************************************************/

    ,cteStart(N,IsDelim,IsTQA) AS

    (

    SELECT

    0 AS N

    ,1 AS IsDelim

    ,0 AS IsTxQf

    UNION ALL

    SELECT

    t.N

    ,(1 - SIGN(ABS(t.chrCode - UNICODE(@pDelimiter)))) AS IsDelim

    ,(1 - SIGN(ABS(t.chrCode - UNICODE(@pTxtQualifier)))) AS IsTxQf

    FROM ctePrimer t

    UNION ALL

    SELECT

    LEN(@pString) + 1 AS N

    ,1 AS IsDelim

    ,0 AS IsTxQf

    )

    /********************************************************************

    cteWorkSet:

    Position (N), Delimiter flag (IsDelim), Text Qualifier flag

    (IsTQA) and the running total of the number of appearances of

    Text Qualifiers. The delimiters which are inside Text Qualifiers

    are cancelled out by multiplying the IsDelim flag with the result

    of ( 1 + the running total of IsTQA ) mod 2.

    ********************************************************************/

    ,cteWorkSet(N,IsDelim,IsTQA) AS

    (

    SELECT

    cST.N

    ,cST.IsDelim * ((1+ SUM(cST.IsTQA) OVER

    (PARTITION BY (SELECT NULL) ORDER BY cST.N

    ROWS UNBOUNDED PRECEDING)) % 2) AS IsDelim

    ,((SUM(cST.IsTQA) OVER (PARTITION BY (SELECT NULL)

    ORDER BY cST.N ROWS UNBOUNDED PRECEDING)) % 2) AS IsTQA

    FROM cteStart cST

    ),

    /********************************************************************

    cteWSTQ:

    Using LEAD and LAG to retrieve the offsets for the Text Qualifiers

    and filtering the results by IsDelim = 1 or IsTQA = 1. The set now

    holds all the information needed for correctly splitting the text.

    ********************************************************************/

    cteWSTQ(P_START,IsDelim,NEXT_IsTQA,LAG_IsTQA) AS

    (

    SELECT

    cWS.N AS P_START

    ,cWS.IsDelim AS IsDelim

    ,LEAD(cWS.IsTQA,1,0) OVER (ORDER BY cWS.N) AS NEXT_IsTQA

    ,LAG(cWS.IsTQA,1,0) OVER (ORDER BY cWS.N) AS LAG_IsTQA

    FROM cteWorkSet cWS

    WHERE cWS.IsDelim = 1

    OR cWS.IsTQA = 1

    )

    /********************************************************************

    cteWSLEN:

    Calculate the start and the length of each field

    ********************************************************************/

    ,cteWSLEN(P_START,P_LEN) AS

    (

    SELECT

    (X.P_START + X.NEXT_IsTQA + SIGN(X.P_START)) AS P_START

    ,(LEAD(X.P_START,1,0) OVER (ORDER BY X.P_START) -

    ((X.P_START + X.NEXT_IsTQA) + SIGN(X.P_START) +

    LEAD(X.LAG_IsTQA,1,0) OVER (ORDER BY X.P_START))) AS P_LEN

    FROM cteWSTQ X WHERE X.IsDelim = 1

    )

    /********************************************************************

    Splitting the string using the output of the cteWSLEN, filtering

    it by the length being non-negative value. The NULLIF returns NULL

    if the field is empty.

    ********************************************************************/

    SELECT

    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber

    ,NULLIF(REPLACE(SUBSTRING(@pString,cWL.P_START,cWL.P_LEN),@pTxtQualifier+@pTxtQualifier,@pTxtQualifier),'') AS Item

    FROM cteWSLEN cWL

    WHERE cWL.P_LEN > -1

    GO

    IMPORT_BY_LINE_FMT.xml

    <?xml version="1.0"?>

    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <RECORD>

    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\r" MAX_LENGTH="8000"/>

    </RECORD>

    <ROW>

    <COLUMN SOURCE="1" NAME="RAW_LINE" xsi:type="SQLVARYCHAR"/>

    </ROW>

    </BCPFORMAT>

    Finally the import code

    DECLARE @FILE_PATH_NAME NVARCHAR(1024) = N'C:\ImportCSV\CRCorpDailyReport.txt';

    DECLARE @FORMAT_FILE NVARCHAR(1024) = N'C:\ImportCSV\IMPORT_BY_LINE_FMT.xml';

    DECLARE @pDelimiter CHAR(1) = ',';

    DECLARE @pTxtQualifier CHAR(1) = '"';

    DECLARE @IMPORT_LINES TABLE

    (

    IL_ID INT NOT NULL PRIMARY KEY CLUSTERED

    ,IL_LINE VARCHAR(8000) NOT NULL

    );

    DECLARE @SQL_STR NVARCHAR(MAX) = N'

    ;WITH HEADER AS

    (

    SELECT

    X.RAW_LINE

    FROM OPENROWSET

    (

    BULK ' + NCHAR(39) + @FILE_PATH_NAME + NCHAR(39) + N'

    ,FORMATFILE= ' + NCHAR(39) + @FORMAT_FILE + NCHAR(39) + N'

    ) AS X

    )

    SELECT

    ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) AS C_NO

    ,X.RAW_LINE

    FROM HEADER X

    ';

    INSERT INTO @IMPORT_LINES (IL_ID,IL_LINE)

    EXEC SP_EXECUTESQL @SQL_STR

    ;WITH BASE_DATA AS

    (

    SELECT

    IL.IL_ID

    ,C.ItemNumber

    ,C.Item

    FROM @IMPORT_LINES IL

    OUTER APPLY dbo.DELIMITED_CSV_SPLIT (IL.IL_LINE,@pDelimiter,@pTxtQualifier) AS C

    )

    SELECT

    BD.IL_ID

    ,MAX(CASE WHEN BD.ItemNumber = 1 THEN BD.Item END) AS COL_01

    ,MAX(CASE WHEN BD.ItemNumber = 2 THEN BD.Item END) AS COL_02

    ,MAX(CASE WHEN BD.ItemNumber = 3 THEN BD.Item END) AS COL_03

    ,MAX(CASE WHEN BD.ItemNumber = 4 THEN BD.Item END) AS COL_04

    ,MAX(CASE WHEN BD.ItemNumber = 5 THEN BD.Item END) AS COL_05

    ,MAX(CASE WHEN BD.ItemNumber = 6 THEN BD.Item END) AS COL_06

    ,MAX(CASE WHEN BD.ItemNumber = 7 THEN BD.Item END) AS COL_07

    FROM BASE_DATA BD

    GROUP BY BD.IL_ID

    Results

    IL_ID COL_01 COL_02 COL_03 COL_04 COL_05 COL_06 COL_07

    ----------- --------------------- ------------------------- ------------------- -------------------------------- -------------- -------------- ----------------

    1 CRCorp Daily Report NULL NULL NULL NULL NULL NULL

    2 Facility Location Purchase Order # Vendor Inventory # Date Ordered Extended Cost

    3 09-Mowtown 495 CRST 09-402A Women's Imaging 327937 BARD PERIPHERAL VASCULAR, INC. 113989 7/25/2014 650

    4 09-Mowtown 495 CRST 09-402A Women's Imaging 327936 WB MASON CO., INC. 112664 7/25/2014 8.64

    5 01-Mowtown 499 CRST 01-302B Oncology 327947 McKesson General Medical n/a 7/25/2014 129.02

    6 04-Rocky Hillside 04-301A OB/GYN(Closets) 327931 CARDINAL 200 (MED/SURG) LLC 115098 7/25/2014 13.23

    7 04-Rocky Hillside 04-301A OB/GYN(Closets) 327931 CARDINAL 200 (MED/SURG) LLC 111690 7/25/2014 11.79

    8 04-Rocky Hillside 04-301A OB/GYN(Closets) 327932 McKesson General Medical 111399 7/25/2014 18.79

    9 04-Rocky Hillside 04-301A OB/GYN(Closets) 327932 McKesson General Medical 111459 7/25/2014 16.8

    10 04-Rocky Hillside 04-301A OB/GYN(Closets) 327932 McKesson General Medical 103946 7/25/2014 74.76

    11 04-Rocky Hillside 04-301A OB/GYN(Closets) 327931 CARDINAL 200 (MED/SURG) LLC 110637 7/25/2014 41.44

    12 04-Rocky Hillside 04-301A OB/GYN(Closets) 327931 CARDINAL 200 (MED/SURG) LLC 110818 7/25/2014 24.01

    13 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 111160 7/25/2014 14.64

    14 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 112804 7/25/2014 27.26

    15 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 110361 7/25/2014 45.96

    16 14-Wartown RRB 14-102A OB/GYN 327934 McKesson General Medical 103946 7/25/2014 74.76

    17 14-Wartown RRB 14-102A OB/GYN 327934 McKesson General Medical 111820 7/25/2014 981.6

    18 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 110610 7/25/2014 16.11

    19 14-Wartown RRB 14-102A OB/GYN 327934 McKesson General Medical 111553 7/25/2014 19.74

    20 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 104769 7/25/2014 20.04

    21 14-Wartown RRB 14-102A OB/GYN 327933 CARDINAL 200 (MED/SURG) LLC 115098 7/25/2014 13.23

    22 14-Wartown RRB 14-102A OB/GYN 327934 McKesson General Medical 105923 7/25/2014 90.6

    23 14-Wartown RRB 14-102A OB/GYN 327934 McKesson General Medical 111425 7/25/2014 23.62

  • Attached to this post is a simple SSIS Package * that will load your file to (a slightly modified version of) your POMaster table. I think you will need some transforms to resolve some of your data points to system keys, e.g. the Purchase Order # in your file is alpha-numeric and in your table PONum is declared as an INT, but you'll get the idea. The intent was to show you how to setup an SSIS Flat File Connection Manager to parse a delimited file with text qualifiers and embedded delimiters.

    * the forum would not allow me to attach a file with a .dtsx extension, so I named it with a .txt extension, just drop the .txt on save.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply