• Eric M Russell - Friday, November 16, 2018 8:13 AM

    Jeff Moden - Friday, November 16, 2018 6:28 AM

    Perhaps a bit deeper than just a database option, I'd like to see some table options that would give NoSQL a run for their money when it comes to ETL and "Minimal Logging".
    ...

    Jeff, I have considered this 100 times in the past, and I was thinking this again just yesterday when an ETL job ingesting 200 million rows filled the log drive. SQL Server (and most any RDMS) is an excellent tool for querying, transforming, and modeling relational data, but it's mediocre at aggregate querying and simply terrible at bulk loading data. What I've done in the past was select / stage into flat text files, perform column transformation, analysis, and QA on the text files, and then finally batch load prepped and cleansed data from there into SQL Server tables. On those occasions where a table load went wrong and needed to be "rolled back", I would delete the load, in batches of 10,000 rows at a time using ETL RunID. I was ingesting 100s millions of rows on a low end 2 CPU / 16 GB RAM server back in the early 2000s with little DBA support, and it worked because I was circumventing most of the transaction logging during normal daily usage.

    Ditto that and that's exactly what I'm talking about.  I've rather grown to like the fact that I don't have to write any loops in SQL Server (SELECT is a loop behind the scenes) and it makes life easy except for all the bloody logging it does for updates or you happen to miss the mark for minimal logging on an insert.  The work arounds do work like you say but, lordy, it could be so much easier.

    That and fix BULK INSERT/BCP to automagically handle true CSV and create a BULK EXPORT.  Then we'd be rockin'. "D

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)