Best practice(s) for refreshing data in several tables

  • Mike Scalise wrote:

    Jeff, You ask some great questions. Are you suggesting that I create a separate database with only views (that are named with the original names of the tables I would've refreshed) and that simply query the source database for the columns and records I need? Any existing reports/queries against this target database would continue to work because the view names would be the table names (and I would have deleted the tables so the names would be available for the views). Do I have that right? Then there's no refreshing whatsoever. I'm not sure about performance and if indexed views are event a possibility. I think I need to specify WITH SCHEMABINDING when creating the views, so I don't know how that would work across databases and if I could replace the source database each day without an issue. Thanks, Mike

    Yes... That's the first possibility that I'm talking about and, instead of using views, use synonyms (same as a pass-through view).

    And, no... you don't need or even want "indexed views".  The underlying indexes of the tables will work just fine as if the tables were local.

    The only disadvantages are in this particular case is, if the reporting code isn't optimized and/or there's a very heavy load on the source tables, there will be performance issues.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, Jeff. I'm not at all opposed to using synonyms, but using views (with original table names) on the target would allow me to use the queries I would have used to refresh tables on the target (and therefore any reporting tools wouldn't really notice any structural difference or that they're actually querying views instead of tables).

    So if I use synonyms instead of views (and the reporting tools use those instead of the views behind-the-scenes), what would the synonyms themselves point to so that I could query specific fields and format data from the source? Obviously I can name the synonyms with the original table names (as I had planned with the views, so I'm not concerned about that).

    Does my question make sense or am I just missing something?

    Thanks,

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • Mike Scalise wrote:

    ...I'm currently restoring a backup of one of our production databases on another server daily...it serves as the source database for this ETL process I'm writing.

    ...My latest thought is to do a combination of DELETE (using EXISTS) + INSERT (using EXCEPT) to ensure the target tables always match the source. Depending on how fast (or slow) that process is, I may test the synonym trick that Phil mentioned before to make things even more efficient.

    Since your SOURCE database gets restored daily from another server, if you use VIEWS or SYNONYMS across databases then those items would not be accessible in the TARGET during the actual restore.  If that is acceptable to the users it's a valid solution, I use such methods frequently.

    If you can't have that downtime in the TARGET database, then the best option depends on how big each of those 500 tables are and what percentage of those rows change from day to day.  If the changes daily are few then some sort of script with UPDATE / INSERT statements (or MERGE statements) would work if you have a column that indicates a version or last changed date-time, or your EXISTS / EXCEPT script idea.

  • Mike Scalise wrote:

    ...So if I use synonyms instead of views (and the reporting tools use those instead of the views behind-the-scenes), what would the synonyms themselves point to so that I could query specific fields and format data from the source? Obviously I can name the synonyms with the original table names (as I had planned with the views, so I'm not concerned about that). Does my question make sense or am I just missing something? Thanks, Mike

    Wouldn't you still need a VIEW somewhere since you said the table definitions in the SOURCE and TARGET databases are slightly different?

  • Chris Harshman wrote:

    Mike Scalise wrote:

    ...So if I use synonyms instead of views (and the reporting tools use those instead of the views behind-the-scenes), what would the synonyms themselves point to so that I could query specific fields and format data from the source? Obviously I can name the synonyms with the original table names (as I had planned with the views, so I'm not concerned about that). Does my question make sense or am I just missing something? Thanks, Mike

    Wouldn't you still need a VIEW somewhere since you said the table definitions in the SOURCE and TARGET databases are slightly different?

    I think I would need views, which is why I'm wondering what the benefit of the synonym is at that point. Why wouldn't I just not have any tables whatsoever in the target database and instead just have a view for each table (named the same as the original table it represents). At that point, is there even a need for synonyms?

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

  • I came too late into this conversation and I hope I make sense.

    Option 2 is not bad.  However, if I had to take that route then I would first create a database called "Staging" or something like that. Then instead of having # or @ tables you can use the physical tables that can be used for compiling the records that will be insert into truncated tables of yours.  I would even take another step further on this and create some LOG tables in Staging for recording your events/errors/warning/etc.

    Hope this helps.

     

     

     

    Cheers,
    John Esraelo

  • John Esraelo-498130 wrote:

    I came too late into this conversation and I hope I make sense. Option 2 is not bad.  However, if I had to take that route then I would first create a database called "Staging" or something like that. Then instead of having # or @ tables you can use the physical tables that can be used for compiling the records that will be insert into truncated tables of yours.  I would even take another step further on this and create some LOG tables in Staging for recording your events/errors/warning/etc. Hope this helps.      

     

    This does help! Thank you for your suggestion/input.

     

    Mike

    Mike Scalise, PMP
    https://www.michaelscalise.com

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply