Working with stored text file in TSQL

  • I'm looking for a way to look into a field containing a text file and work with the content line by line. Our current environment will not allow file/OS access so the the only way i can reach this text file is to have it exist in a varchar(max) or whichever data type suggested.

    Thank you in advance for any advise...

    Keith

  • keith.westberg (3/20/2014)


    ...a field containing a text file and work with the content line by line

    What's actually stored in the "field"? The path to the text file or the actual content from the text file?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The text file itself.

  • In binary form or the actual character string of contents from within the text file?

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • a lot more information as far as details would help, but

    if the data is important line by line, then for me, each line should be stored as a row/column in a table.

    it sounds like you want to duplicate an existing process, but do it in SQL instead.

    so the right thing to do would be to import each line as a row of data, and chop that data up into relevant columns; that way you can better search and manipulate the data itself.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I apologize for not laying the groundwork... so here's the process.

    User uploads a file to website application. The site parses the text file and generates insert parameters for the called sproc on sql server 2008 r2. This process is repeated for every line found in text file besides the header line which is parsed for appropriate columns. Before the call is made to the db, the business logic scrubs the content to assure data quality and format. Now some text files can be 100k lines, in which case this app makes 100k db calls.

    My search is for a way for the user to simply upload the file to the db directly and kick off a process which would take place only on the db that walks the file line by line. Our users are NOT granted access to an area to drop files off for bulk loading. And the server admins will NOT provide a location for the db to reach out to use file system for its data source.

    The only option i can see with these limitations, is to allow the user to push the entire file to the database and work the process from there. Granted it would not be the sexiest thing on the block... but its worth a run.

    Any tips or sources for like process would be amazing... thank you

  • @mydoggiejessie: text, not binary.

  • @mydoggiejessie: Sorry -- it would be stored as actual character string if that's the way forward

  • keith.westberg (3/21/2014)


    I apologize for not laying the groundwork... so here's the process.

    User uploads a file to website application. The site parses the text file and generates insert parameters for the called sproc on sql server 2008 r2. This process is repeated for every line found in text file besides the header line which is parsed for appropriate columns. Before the call is made to the db, the business logic scrubs the content to assure data quality and format. Now some text files can be 100k lines, in which case this app makes 100k db calls.

    My search is for a way for the user to simply upload the file to the db directly and kick off a process which would take place only on the db that walks the file line by line. Our users are NOT granted access to an area to drop files off for bulk loading. And the server admins will NOT provide a location for the db to reach out to use file system for its data source.

    The only option i can see with these limitations, is to allow the user to push the entire file to the database and work the process from there. Granted it would not be the sexiest thing on the block... but its worth a run.

    Any tips or sources for like process would be amazing... thank you

    This won't help except, maybe, to know that you have someone that agrees with you but I just have to say that I'm totally amazed at some DBAs/Server Admins . Yes, DBAs/Server Admins absolutely need to be extremely concerned about security, size, and safety but DBAs/Server Admins aren't supposed to be roadblocks. They're (we're) supposed to be enablers of safely doing things the right way and uploading a file and then bulk loading it is absolutely the correct method for doing this rather than sending a 100k row file across the pipe one bloody row at a time. The system(s) will pay dearly when just a handful of users try to upload such files in the RBAR fashion that you've been constrained to. The web Server is going to take a beating, as well.

    There's absolutely no reason why an uploadable FTPS or similar site couldn't be constructed and offered to the users with logins and passwords that would preclude users from seeing each other's files.

    Maybe there are some mitigating factors that I'm not aware of but, from here, it seems that the DBAs/Server Admins have erected a totally unjustifiable roadblock for what appears to be an important process. I don't know if it's arrogance, ignorance, or slothfulness that has cause them to do this to you but they need to realize that you're all working for the same company and when it comes to handling data, they need to be enablers instead of nay sayers because they're the ones that know how to do this in a secure fashion and with performance in mind.

    I apologize for the rant but this kind of stuff just bugs the hell out of me.

    Shifting gears to the problem at hand, you said...

    User uploads a file to website application.

    What is the process for that happening? Is the file loaded onto the Website box itself and then the app reads from that file or ???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you Jeff... and I hear you. I wish this was the first time a "we don't do that here" has bit me.

    And again, I failed to give total clarity over the environment. This data load is performed by a single user and is done on the first of the month. So you are absolutely right on the "oh my god, how many calls will this app make" statement. The app is more of an analyst tool for comparing dates and values over time. The analyst(s) have been, for as long as i know, working other tasks involving other systems during the 1st to the 3rd, so I know for sure this process has been in place for at least 4 years.

    I can't explain in detail why the lock down, but it does exist and it is a hurdle I have been wanting to jump since i arrived. The process I mentioned is working, but with some adjustments. For instance we had to limit the file size to 30mg in order to avoid page timeouts.

    I have scoured the web looking for examples, articles, tips/tricks... and I get nothing that even come close to walking a varchar and working with the content from top to bottom. I even used my trusty expert exchange that I use when I'm really stuck, and even that left me empty.

    Is it not possible?

  • I'm still a bit confused though. You say the user is uploading a file... what is that process? Is the user uploading a file to the Web Server and then your app is loading from that file or ???

    I need to understand what the current process and file location is so I can try to come up with a better alternative.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yes Jeff. We are using asp.net 3.5 to present the users/analysts with the tools they require. The data provider uploads files on the 1st of the month. We use the native upload control for .Net. This control places the file in a predefined folder within the apps folder structure. Once it successfully saves to this folder, we begin walking the file to generate the values required for the sproc. When this walk reaches the end of the file, we close the file and delete it.

    Did I cover the details enough? Besides what's mentioned above, there really isn't much to it. We don't use ajax or any other eye candy so the controls and logic are pretty straight forward.

    Keith

  • If that's the case, then you could use SQLBULKCOPY in C# to load the rows into the database in batches much like BCP or BULK INSERT would. That will be MUCH faster than doing an INSERT per row.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well that looks promising. Let me do some reading and see where it leads. I think I may have looked at that once long ago, but shelved it because i needed to do a row by row validation and this if i recall was a firehouse with the all or nothing approach.

    Let me do some google csi and see how things have changed, if at all.

  • keith.westberg (3/25/2014)


    Well that looks promising. Let me do some reading and see where it leads. I think I may have looked at that once long ago, but shelved it because i needed to do a row by row validation and this if i recall was a firehouse with the all or nothing approach.

    Let me do some google csi and see how things have changed, if at all.

    I you need to, you can still do "row-by-row" validations. Load 10 or 50 K rows into an array, validate the data, and bulk it over to the server.

    A better way (IMHO) is to bulk all of it over to a staging table on the server and validate it there. Then copy the validated rows to their final resting places and bad data to an errata table for repairs or failure reporting.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply