• Ol'SureHand (2/15/2009)


    Hope this question does not need its own thread ...

    Data warehousing practitioners seem to have a constraint on EVERY single column. The above code will only preserve necessary constraints, but needs doing before the table was (re)populated, thus each constraint will be checked for each row during the extraction.

    DBA types think it is better to import the data into the table, then run an UPDATE script on each column that essentially needs to be set (e.g. Update tblRepopulated SET thisColumn = 0 where thisColumn IS NULL).

    Which practice is more efficient with large data volumes?

    As soon as possible...

    If it is e.g. an SSIS package that imports your data and that is capable on handling the null to default switch, I would suggest to do it at that time.

    If the strategy is to load the data as is into a staging area and only then start to prepare it for the DWH environment, ... that may be a valid choice to disrupt your production OLTP as less as possible.

    It all depends on the timeframe you have to pull your data into your DWH.

    Rule no 1 says: "tell your system what you know"

    In this case that would mean ... don't load data that you know must be modified to a predefined value.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me