RE: Need to remove an arrow char at end of each file as well as a trailing comma in header

  • Since PowerShell is also "interpreted" rather than "compiled", I'm not sure that PowerShell would be any faster. Short of writing a compiled, automated file editor (which could be incredibly fast because it could make the change directly at the byte level), I'd be tempted to do a blob-style BULK INSERT and then use something like xp_CmdShell to write the file back out without the offending characters. Then turn around and do a normal columnar BULK INSERT. Even with those 3 steps, it's likely to be faster than using a batch file or a PowerShell script.

    Shifting gears a bit, someone has a strange notion of what a file should contain. The extra comma/EOF/arrow thing is undoubtedly someone's attempt to be clever about how to clearly indicate that the end of file has been reached instead of providing a control file with a rowcount in it. I'd find out who's making the files and tell them to fix it and to stop being "clever". 😉 If that's not possible and this is going to be an ongoing task, then it would definitely be worth having someone write a "file repair" SQLCLR routine as a preprocessor.
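    To make the "file repair" preprocessor idea concrete, here is a minimal sketch in Python rather than SQLCLR. It assumes the "arrow" is the Ctrl-Z/SUB byte (0x1A), which the thread doesn't confirm, and that the header line ends with a normal line terminator. The function name `repair_file` and the constant `EOF_MARKER` are made up for illustration.

```python
import os
import shutil

# Assumption: the "arrow" character is the Ctrl-Z/SUB byte (0x1A).
# Change this if a hex dump of the real file shows something else.
EOF_MARKER = b"\x1a"


def repair_file(src_path, dst_path, eof_marker=EOF_MARKER):
    """Copy src to dst without the header's trailing comma or the trailing marker."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        header = src.readline()
        # Split off the line terminator so only the header text is edited.
        text = header.rstrip(b"\r\n")
        newline = header[len(text):]
        if text.endswith(b","):
            text = text[:-1]
        dst.write(text + newline)
        # Stream the remaining rows unchanged, without loading the whole file.
        shutil.copyfileobj(src, dst)

    # Drop the marker only if it really is the last thing in the file.
    with open(dst_path, "rb+") as dst:
        dst.seek(0, os.SEEK_END)
        size = dst.tell()
        if size >= len(eof_marker):
            dst.seek(size - len(eof_marker))
            if dst.read(len(eof_marker)) == eof_marker:
                dst.truncate(size - len(eof_marker))
```

    Working in bytes rather than text means the code page question doesn't matter for the repair itself, and the repaired copy can go straight to a normal columnar BULK INSERT.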

    Either that, or you might be using the wrong "code page" for the file type. Some file encodings include a header of up to 4 bytes (such as the byte order mark on a Unicode or UTF-8 file), some contain a byte footer (some files really do end with an end-of-file character that is normally absorbed by the OS), and some contain both. It might be important to find out what file type/code page they're actually sending the file as.
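    One way to find out what is actually in the file is to look at its first and last few bytes. The sketch below is only an illustration: `inspect_file` is a hypothetical helper, and the check for 0x1A assumes the "arrow" is a Ctrl-Z end-of-file byte. A missing byte order mark doesn't prove anything about the code page; it only rules out the marked Unicode formats.

```python
import codecs

# Byte order marks. The UTF-32 entries come first because the UTF-32 LE
# mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def inspect_file(path, tail_len=4):
    """Report a leading byte order mark (if any) and the file's last bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(0, 2)  # jump to the end to get the file size
        size = f.tell()
        f.seek(max(0, size - tail_len))
        tail = f.read()
    bom = next((name for mark, name in BOMS if head.startswith(mark)), None)
    return {
        "bom": bom,                                # None if no mark was found
        "tail_hex": tail.hex(),                    # last bytes as hex digits
        "ends_with_ctrl_z": tail.endswith(b"\x1a"),
    }
```

    If `tail_hex` ends in something other than `1a`, the "arrow" is a different byte and any repair step should target that byte instead.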

    BTW, SQL Server's bulk import doesn't support the very common UTF-8 (code page 65001) until you get to SQL Server 2016.

    --Jeff Moden


