import flat file through SSIS

  • Marcelo,

    Did that do it for you or are you still having problems?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • hi there! sorry I didn't get back to you, but got caught up with work stuff. I briefly tried the code you sent against a txt file I created and it didn't work. I think it's because I used the sample on the thread instead of downloading the sample file in the temp folder of the server (I'm embarrassed to ask :doze: , but I've looked around and couldn't retrieve the txt file). How can I get to the temp folder?

    So, let's say it all works out OK using what you provided, which I wouldn't doubt. Does that mean I would have to have each text file formatted in a way where the delimiters are spaced, just as you showed in the post? ("123" | "John Smith" | "6050 N Washington Ave" ) I can see how that would work.

    Also, this sample was fairly small, but would it also apply to something at a larger scale, say 50+ fields and 10+ million records?

    I haven't tried this yet, but would SQL be able to export the delimited text files as in the example above?

    Last question: if you need to export data from SQL to delimited text files, what would be the most optimal set of delimiters to use? I think anything we choose will have issues one way or another, but what would be the 'one' that causes the fewest problems, or at least is the easiest to resolve?

    I searched online for some tips and reviews and most pointed to tab-delimited. Would a plain straight CSV be better?

    In my case, the issue is with the pipes and quotes, both being used as delimiters and also as part of the data.

    In any event, this is a work in progress for me, so I'll be around!

    Many thanks!

    marcelo

  • Chelo (8/26/2011)


    hi there! sorry I didn't get back to you, but got caught up with work stuff. I briefly tried the code you sent against a txt file I created and it didn't work. I think it's because I used the sample on the thread instead of downloading the sample file in the temp folder of the server (I'm embarrassed to ask :doze: , but I've looked around and couldn't retrieve the txt file). How can I get to the temp folder?

    I used the example data you provided and created a file in C:\Temp on my server. The code will work with a drive letter on physical drives on the server or as a UNC path to a given share.
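
    Just to show the general shape of it (this is a minimal sketch rather than the exact code posted earlier in the thread, and the staging table name is made up), a BULK INSERT against that sample file would look something along these lines:

    -- Minimal sketch: load the quoted, pipe-delimited sample file into a staging table.
    -- dbo.ImportStaging is a hypothetical table; the path can be a local drive or a UNC share.
    BULK INSERT dbo.ImportStaging
    FROM 'C:\Temp\Sample.txt'                  -- or '\\MyServer\MyShare\Sample.txt'
    WITH (
          FIELDTERMINATOR = '" | "',           -- multi-character terminator matching the sample data
          ROWTERMINATOR   = '\n'
         );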

    So, let's say it all works out OK using what you provided, which I wouldn't doubt. Does that mean I would have to have each text file formatted in a way where the delimiters are spaced, just as you showed in the post? ("123" | "John Smith" | "6050 N Washington Ave" ) I can see how that would work.

    No. Again, I was just following the example file you gave. If you'd like to attach an actual file (please, nothing that would violate anyone's privacy) for me to work with, I'd be happy to show you what needs to be done.

    Also, this sample was fairly small, but would it also apply to something at a larger scale, say 50+ fields and 10+ million records?

    Of course. I haven't timed such a thing in years but, about 4 years ago, I measured BULK INSERT for a job and it loaded at a rate of 5.1 million rows, 20 columns wide, every 60 seconds. Of course, your mileage may vary.
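
    For loads of that size, the usual additions are a table lock and a batch size. A hedged sketch, using the same hypothetical staging table as above:

    -- Minimal sketch for a large load; all names and sizes below are illustrative only.
    BULK INSERT dbo.ImportStaging
    FROM 'C:\Temp\BigFile.txt'
    WITH (
          FIELDTERMINATOR = '" | "',
          ROWTERMINATOR   = '\n',
          TABLOCK,                             -- a table lock can allow minimal logging on an empty heap
          BATCHSIZE       = 1000000            -- commit in 1M-row batches instead of one huge transaction
         );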

    I haven't tried this yet, but would SQL be able to export the delimited text files as in the example above?

    Yes, but that's a different subject. If you want to do it via T-SQL, it would also mean that I'd have to show you the proper way to use xp_CmdShell so that it doesn't become a security risk. Of course, you could schedule a script task to run it, without using xp_CmdShell, through a batch file and OSQL.
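
    Just so you can see the general shape of it (this isn't the full "proper way" write-up, and the server, database, table, and file names are placeholders), a T-SQL export through xp_CmdShell usually ends up looking something like this:

    -- Minimal sketch: export a pipe-delimited file with BCP called through xp_CmdShell.
    -- All object names below are hypothetical.
    DECLARE @cmd VARCHAR(4000);
    SELECT @cmd = 'bcp "SELECT CustomerID, CustomerName, Address FROM MyDb.dbo.Customers" '
                + 'queryout "C:\Temp\Customers.txt" -c -t"|" -S MyServer -T';
    EXEC xp_CmdShell @cmd;                     -- xp_CmdShell must be enabled and its permissions locked down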

    Last question: if you need to export data from SQL to delimited text files, what would be the most optimal set of delimiters to use? I think anything we choose will have issues one way or another, but what would be the 'one' that causes the fewest problems, or at least is the easiest to resolve?

    I searched online for some tips and reviews and most pointed to tab-delimited. Would a plain straight CSV be better?

    It depends... are there any commas in the data you're trying to export?

    In my case, the issue is with the pipes and quotes, both being used as delimiters and also as part of the data.

    The pipes and quotes don't constitute a problem. The code I posted solves the import side of that problem.

    Many thanks!

    marcelo

    You bet. Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff, thanks again for all the feedback! I had to put all this on hold till after my vacation (a good enough reason!) 🙂

    I gave the data upload a new try, but this time using the SQL Import/Export wizard, where I specified the delimiter to be the 3 characters together ("|"). Since this becomes the field delimiter, the downside of this option is that each row ends up with a quote as the first character of the first field and a quote as the last character of the last field. A 'replace' would clean those up, though. On the bright side, all the data was uploaded, and the quotes and pipe symbols embedded within the data didn't cause any trouble.
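
    For what it's worth, the cleanup I have in mind is something along these lines (the table and column names here are just placeholders, not the real ones):

    -- Minimal sketch: strip the stray leading quote from the first column and the
    -- stray trailing quote from the last column after the wizard load.
    UPDATE dbo.ImportStaging
       SET FirstCol = CASE WHEN LEFT(FirstCol, 1) = '"' THEN STUFF(FirstCol, 1, 1, '')       ELSE FirstCol END,
           LastCol  = CASE WHEN RIGHT(LastCol, 1) = '"' THEN LEFT(LastCol, LEN(LastCol) - 1) ELSE LastCol  END;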

    One problem I ran into was with the 'suggested' data types and field lengths. It seems SQL can only suggest types and lengths based on the first 10,000 records, which isn't quite enough when working with millions of records. It was a bit of trial and error, but I finally got it done.
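
    The fix itself is nothing fancy; it comes down to widening whichever columns the wizard guessed too small, something like this (placeholder names again):

    -- Minimal sketch: widen a column the wizard sized from only the first 10,000 rows.
    ALTER TABLE dbo.ImportStaging
        ALTER COLUMN Address VARCHAR(200) NULL;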

    thanks again for all your help!!!

    Best,

    m
