Surely there is a way to import CSV data in this situation?

  • Ok, here goes:

    CREATE Function ChangeDelimiters(

    @LINEvarchar(max),

    @SEARCHchar(1),

    @REPLACEchar(1))

    Returns Table As Return

    -- Replace @SEARCH with @REPLACE when found in @LINE between a pair of double quotes (e.g. 1,2,3,"4,5",6,7 with SEARCH ',' and REPLACE '!' becomes 1,2,3,"4!5",6,7)

    -- Table with 10 dummy rows

    With A1 As(

    Select 1 N Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All

    Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1),

    -- Increase to 100000

    A2 As (Select 1 N From A1 a Cross Join A1 b),

    A3 As (Select 1 N From A2 a Cross Join A2 b),

    A4 As (Select 1 N From A3 a Cross Join A1 b),

    -- Table with one row per character in the line, numbered N

    TALLY As (Select Top (Len(@LINE)) ROW_NUMBER() Over (Order By A4.N) N From A4),

    -- Table with one row per character in the line, and a second column indicating the character itself

    ITEM_SPLIT As (

    SelectN As ITEM_ORDER,

    SubString(@LINE,N,1) As ITEM

    FromTALLY

    WhereN <= Len(@LINE)),

    -- Table with the start and end positions of the double quotes in @LINE

    QUOTE_REP As (

    Select((ROW_NUMBER() Over (Order By ITEM_ORDER)+1)/2) As QUOTE_POS, -- int (position+1)/2 will give start and end (1=1,2=1.5=1,3=2,4=2.5=2 etc)

    ITEM_ORDER,

    ITEM

    FromITEM_SPLIT

    WhereITEM = '"'),

    -- Table with one row for each replacement, giving start and end positions of quotes

    QUOTE_CHANGE As (

    SelectQUOTE_POS,

    Min(ITEM_ORDER) As START_POS,

    Max(ITEM_ORDER) As END_POS

    FromQUOTE_REP

    Group By QUOTE_POS),

    -- Table with one rows per character in the *altered* @LINE

    REJOIN As (

    SelectCase When qc.QUOTE_POS Is Not Null And its.ITEM = @SEARCH Then @REPLACE Else its.ITEM End As ITEM

    FromITEM_SPLIT its

    Left Join QUOTE_CHANGE qc On its.ITEM_ORDER Between qc.START_POS And qc.END_POS)

    -- Concat each row in REJOIN to form the result

    SelectRESULT = ((Select '' + ITEM From REJOIN For Xml Path(''),TYPE).value('.','varchar(max)'))

  • Brian - You'll get better performance with a properly constructed permanent tally table instead of a virtual one. Not sure how to do that? See The "Numbers" or "Tally" Table: What it is and how it replaces a loop.[/url]

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Wayne, it really depends. There seems to be a tipping point where a dynamic tally table beats a physical tally table.

    Don't have specifics, and it could just be my system or perception.

  • Do you remember when all this text qualified CSV stuff that everyone has a problem with used to be child's play if not a touch slow? 😉

    --===== Setup the Jet driver to look at a directory in the qualified text (true CSV) mode

    EXEC sp_addlinkedserver TxtSrv01, 'Jet 4.0', 'Microsoft.Jet.OLEDB.4.0', 'C:\Temp', NULL, 'Text';

    --===== Add a trusted login to the new linked server

    EXEC sp_addlinkedsrvlogin TxtSrv01, FALSE, NULL, NULL;

    --===== See a list of text files from the directory we setup on the linked server

    EXEC sp_tables_ex TxtSrv01;

    --===== Read the content of a given file name [TestInsert#txt] using the true CSV mode

    SELECT * FROM TxtSrv01...[TestInsert#txt];

    --===== All done... throw away the linked server

    EXEC sp_dropserver 'TxtSrv01','droplogins';TestInsert#txt]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 31 through 33 (of 33 total)

You must be logged in to reply to this topic. Login to reply