To T-SQL or not to T-SQL?

  • sqldriver (2/17/2014)


    I think doing that bit of pre-processing with PS should be fine; it's just one line, even if it is ~75 times.

    It would be interesting to see which approach was faster. Preprocess the files to a single file per report type or import each file separately into a common table.

    Is the pre-processing route the way you're going to go on this?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sqldriver (2/15/2014)


    I really hate SSIS and was hoping to avoid it. I guess I could try doing this all in another language that handles large file operations better. Of course, since these are all going to garbage staging tables, I could always generate dynamic import statements with extra variables. Hm.

    Who cares about filling up transaction logs, anyway?! :w00t:

    Heh, sorry to hear, but this is precisely why I use things like SSIS: a ForEach Loop container over a Data Flow with a Derived Column to add the source filename in. You may not like SSIS, but to me, using PowerShell or another language for this is like going to Canada via Mexico.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • I appreciate that SSIS is a good tool for this situation, but I'm in kind of a weird environment where it's not accessible, on top of me not liking it, so it's not just a matter of preference.

    I can use PowerShell to kick out some SQL based on file names, which is essentially what I've done:

    # Generate one ALTER TABLE statement per report file.
    Get-ChildItem C:\Users\$env:username\Desktop\$mydir |
    Where-Object { $_.Name -match '^\w+\.txt$' } |        # escape the dot so it's a literal "."
    Foreach-Object {
        $dbn   = $_.Name.Substring(0,6)
        $table = $_.Name -replace '\.txt$', ''            # strip the extension, anchored at the end
        # T-SQL uses ADD, not ADD COLUMN
        "alter table $schema.s_$table add dbn varchar (100) default '$dbn';`n`n" |
            Out-File -FilePath C:\Users\$env:username\Desktop\SQL\$schema\Alter.sql -Append -Encoding default
    }

    So after I load all my files, I run this and it fills in the information for me. When I add the column before loading, the default doesn't populate.

    Of course, now I'm running into an issue where SQL thinks some reports have more columns in them than they actually do. Ho-hum. 😛

  • sqldriver (2/18/2014)


    I appreciate that SSIS is a good tool for this situation, but I'm in kind of a weird environment where it's not accessible, on top of me not liking it, so it's not just a matter of preference.

    Ick.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Evil Kraig F (2/18/2014)


    sqldriver (2/18/2014)


    I appreciate that SSIS is a good tool for this situation, but I'm in kind of a weird environment where it's not accessible, on top of me not liking it, so it's not just a matter of preference.

    Ick.

    You don't know the half of it.

    I guess I don't understand the need for any preprocessing here. Just load each group of files into a working table (heaps, to be sure) and do your validation there. The only time that you might run into a problem is if the first line of a file has different delimiters than the rest of the lines in the file.
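    To sketch what Jeff is describing, a load into a heap working table might look like this (table name, column layout, and file path are all hypothetical, and the terminators would need to match the actual files):

    ```sql
    -- Hypothetical staging heap: no clustered index, so BULK INSERT
    -- can load quickly and validation can happen afterward in T-SQL.
    CREATE TABLE dbo.s_Working
    (
        Col1 varchar(8000) NULL,
        Col2 varchar(8000) NULL   -- ...one column per delimited field
    );

    -- Load one file from the group; repeat per file in the group.
    BULK INSERT dbo.s_Working
    FROM 'C:\Import\Report_201402.txt'    -- hypothetical path
    WITH
    (
        FIELDTERMINATOR = '\t',
        ROWTERMINATOR   = '\n',
        FIRSTROW        = 2               -- skip the header line
    );
    ```

    Since the target is a heap with everything landing as varchar, oddball rows load without error and can be inspected with plain SELECTs before moving data on.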

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/18/2014)


    I guess I don't understand the need for any preprocessing here. Just load each group of files into a working table (heaps, to be sure) and do your validation there. The only time that you might run into a problem is if the first line of a file has different delimiters than the rest of the lines in the file.

    Hi Jeff,

    I'm not doing any pre-processing TO the files I'm loading now. What I'm using PS to do is:

    1) Upload my files to S3

    2) Take the headers from each file, strip out unusable characters from the headers (these animals sometimes start their headers off with numbers inside brackets), and generate create table statements from the fixed headers

    3) Create bulk inserts based on file names

    4) Create alter table statements to add two columns to each table with default values based on file names that get run after I load

    So it's really just using PS to create SQL statements that I run. I think it's my best option, given the shortcomings of my environment.
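    For reference, the statements PS spits out for steps 2 through 4 might look something like this (table, schema, and file names are made up for illustration):

    ```sql
    -- Step 2: CREATE TABLE generated from the cleaned-up header row
    CREATE TABLE staging.s_Report01
    (
        SomeHeader  varchar(8000) NULL,
        OtherHeader varchar(8000) NULL
    );

    -- Step 3: BULK INSERT generated from the file name
    BULK INSERT staging.s_Report01
    FROM 'C:\Users\me\Desktop\files\Report01.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- Step 4: ALTER TABLE run after the load; WITH VALUES makes the
    -- default populate the rows that are already in the table.
    ALTER TABLE staging.s_Report01
        ADD dbn varchar(100) DEFAULT 'Report' WITH VALUES;
    ```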

    Thanks

  • sqldriver (2/18/2014)


    Jeff Moden (2/18/2014)


    I guess I don't understand the need for any preprocessing here. Just load each group of files into a working table (heaps, to be sure) and do your validation there. The only time that you might run into a problem is if the first line of a file has different delimiters than the rest of the lines in the file.

    Hi Jeff,

    I'm not doing any pre-processing TO the files I'm loading now. What I'm using PS to do is:

    1) Upload my files to S3

    2) Take the headers from each file, strip out unusable characters from the headers (these animals sometimes start their headers off with numbers inside brackets), and generate create table statements from the fixed headers

    3) Create bulk inserts based on file names

    4) Create alter table statements to add two columns to each table with default values based on file names that get run after I load

    So it's really just using PS to create SQL statements that I run. I think it's my best option, given the shortcomings of my environment.

    Thanks

    To be honest, I'd use T-SQL to create the T-SQL statements. 🙂 It's called "dynamic SQL". 😉

    Anyway, it sounds like you're all set. Is that a true statement?
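    The dynamic SQL Jeff is alluding to could be sketched entirely in T-SQL by driving the ALTER generation off the system catalog instead of off file names (the `s_` table-name convention and the 6-character dbn are carried over from earlier in the thread; the schema filter is an assumption):

    ```sql
    -- Build one ALTER TABLE statement per staging table whose name
    -- starts with 's_', deriving the default from the table name.
    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql = @sql
        + N'ALTER TABLE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
        -- first 6 characters after the 's_' prefix, matching Substring(0,6)
        + N' ADD dbn varchar(100) DEFAULT ' + QUOTENAME(SUBSTRING(t.name, 3, 6), '''')
        + N' WITH VALUES;' + CHAR(10)
    FROM sys.tables AS t
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE t.name LIKE N's\_%' ESCAPE N'\';

    PRINT @sql;                      -- inspect the generated statements first...
    -- EXEC sys.sp_executesql @sql;  -- ...then run them
    ```

    QUOTENAME does the quote-doubling that would otherwise mean obsessively counting 's by hand.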

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/18/2014)


    sqldriver (2/18/2014)


    Jeff Moden (2/18/2014)


    I guess I don't understand the need for any preprocessing here. Just load each group of files into a working table (heaps, to be sure) and do your validation there. The only time that you might run into a problem is if the first line of a file has different delimiters than the rest of the lines in the file.

    Hi Jeff,

    I'm not doing any pre-processing TO the files I'm loading now. What I'm using PS to do is:

    1) Upload my files to S3

    2) Take the headers from each file, strip out unusable characters from the headers (these animals sometimes start their headers off with numbers inside brackets), and generate create table statements from the fixed headers

    3) Create bulk inserts based on file names

    4) Create alter table statements to add two columns to each table with default values based on file names that get run after I load

    So it's really just using PS to create SQL statements that I run. I think it's my best option, given the shortcomings of my environment.

    Thanks

    To be honest, I'd use T-SQL to create the T-SQL statements. 🙂 It's called "dynamic SQL". 😉

    Anyway, it sounds like you're all set. Is that a true statement?

    Would that I could. Few things in life bring me the joy that obsessively counting 's and +s and having QUOTENAME() on the clipboard does. But, like I've mentioned before, I'm not in a standard environment - server is on AWS 🙁

    So I'm kind of stuck using PS to grind out the dynamic SQL. It's enough to make a guy update his resume.

