Going From Many Schemas To One Schema

  • Comments posted to this topic are about the item Going From Many Schemas To One Schema

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • It's an interesting technique - but not one I would recommend!

    The simple version has this caveat: it works "as long as they share the same primary key and the shared columns have the same names" (quote from the article). Verifying that is what takes time when preparing a data merge (be it a one-time operation or an ETL job that is to be run repeatedly). Building the actual statement is a matter of seconds, especially if you use SSMS's drag-and-drop technique to copy the column list from the object explorer. The technique shown in this article does indeed save you a couple of seconds and a few keystrokes, but at the price of having to use dynamic SQL (which means you have to run under a user account with elevated rights, and you incur the risk of injection attacks). And worse - it invites laziness. There might be situations where the same name has been used for essentially different columns; I'll exclude those when working manually, but they'll be included by this procedure. Or there might be some subtle difference in how data is represented, making some conversion required.
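
    For context, here is a minimal sketch of the kind of statement being discussed (my own reconstruction with placeholder names dbo.SourceTable and dbo.TargetTable, not the article's actual code). The QUOTENAME calls narrow the injection surface I mention, though they don't remove the need for caution:

    DECLARE @cols nvarchar(max), @sql nvarchar(max);

    -- Collect the column names the two tables share. (Schema filtering is
    -- omitted for brevity; on SQL Server 2017+ STRING_AGG is a cleaner way
    -- to build the list than this variable-concatenation idiom.)
    SELECT @cols = COALESCE(@cols + N', ', N'') + QUOTENAME(s.COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS AS s
    JOIN INFORMATION_SCHEMA.COLUMNS AS t
      ON t.COLUMN_NAME = s.COLUMN_NAME
    WHERE s.TABLE_NAME = N'SourceTable'
      AND t.TABLE_NAME = N'TargetTable';

    -- Build and run the INSERT over the shared column list.
    SET @sql = N'INSERT INTO dbo.TargetTable (' + @cols + N')'
             + N' SELECT ' + @cols + N' FROM dbo.SourceTable;';
    EXEC sys.sp_executesql @sql;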

    The advanced version circumvents some (not all!) of these problems. But I think that populating the table with column equivalences would take me more time than simply writing the INSERT, UPDATE, or MERGE statement.
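
    To make that overhead concrete: the equivalence table would need to be of roughly this shape (a hypothetical sketch; the article's actual table may differ), with one row to be filled in per renamed column:

    CREATE TABLE dbo.ColumnMap
    (
        SourceTable  sysname NOT NULL,
        SourceColumn sysname NOT NULL,
        TargetColumn sysname NOT NULL,
        PRIMARY KEY (SourceTable, SourceColumn)
    );

    -- One row per rename, e.g. CustNo in a legacy table feeds CustomerID.
    INSERT INTO dbo.ColumnMap (SourceTable, SourceColumn, TargetColumn)
    VALUES (N'LegacyCustomers', N'CustNo', N'CustomerID');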

    Therefore, I consider this article an interesting mental exercise and good information on the information schema tables, but not a technique to be used in a production system.

    On a final note: any system with a case-sensitive database collation will cause these queries to fail; the table and column names of the information schema tables are all uppercase.
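
    Concretely, the uppercase form below is safe under any collation, while a lowercase variant fails on a case-sensitive catalog (an illustrative example; SomeTable is a placeholder):

    -- Works regardless of collation: the names match the catalog exactly.
    SELECT COLUMN_NAME, DATA_TYPE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = N'SomeTable';

    -- On a case-sensitive collation this variant fails with
    -- "Invalid object name 'information_schema.columns'":
    -- SELECT column_name FROM information_schema.columns WHERE table_name = N'SomeTable';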


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • The use of the term "schema" in this article is confusing because the term has a specific meaning in SQL Server which differs from its use here. Perhaps using the term "table definition" would be clearer.

  • Hugo Kornelis (8/9/2010)


    It's an interesting technique - but not one I would recommend!

    The simple version has this caveat: it works "as long as they share the same primary key and the shared columns have the same names" (quote from the article). Verifying that is what takes time when preparing a data merge (be it a one-time operation or an ETL job that is to be run repeatedly). Building the actual statement is a matter of seconds, especially if you use SSMS's drag-and-drop technique to copy the column list from the object explorer. The technique shown in this article does indeed save you a couple of seconds and a few keystrokes, but at the price of having to use dynamic SQL (which means you have to run under a user account with elevated rights, and you incur the risk of injection attacks). And worse - it invites laziness. There might be situations where the same name has been used for essentially different columns; I'll exclude those when working manually, but they'll be included by this procedure. Or there might be some subtle difference in how data is represented, making some conversion required.

    You make some valid criticisms, and when dealing with a relatively small number of tables, I agree with you. Given just a small handful of tables, it is much better to hand-craft the statements, ensure they are correct and do exactly what you want, and avoid the dynamic SQL this technique entails.

    I first used this technique when moving over a dozen slight variations of a table into one consolidated analysis table; moreover, for certain business reasons, I expected many more slight variations of that table to be generated and need to be merged in as well. In that case, generating the statements dynamically was vastly easier to write and more maintainable than statically creating the insert code for every one of the variations.

    The second time I used it, I had around 30 tables, each with one of five slight variations of a schema, that all needed to be merged into an analysis table in a one-time ETL job.

    As to whether you should use it in production, I would certainly approach that with caution, as I try to avoid any dynamic SQL in production where that is possible. As mentioned, I have used this when moving data from production servers to analysis/reporting servers. I have also used it for ETL for data that was going into production, but that always had a staging location involved for final verification before moving it to production.
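
    For anyone curious, the consolidation loop was along these lines (a from-memory sketch with hypothetical names, not the production code): enumerate the variant tables, intersect each one's columns with the target's, and generate one INSERT per table.

    DECLARE @table sysname, @cols nvarchar(max), @sql nvarchar(max);

    DECLARE variant_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT TABLE_NAME
        FROM INFORMATION_SCHEMA.TABLES
        WHERE TABLE_TYPE = N'BASE TABLE'
          AND TABLE_NAME LIKE N'SalesData[_]%';  -- hypothetical naming pattern

    OPEN variant_cursor;
    FETCH NEXT FROM variant_cursor INTO @table;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Columns this variant shares with the consolidated target table.
        SET @cols = NULL;
        SELECT @cols = COALESCE(@cols + N', ', N'') + QUOTENAME(s.COLUMN_NAME)
        FROM INFORMATION_SCHEMA.COLUMNS AS s
        JOIN INFORMATION_SCHEMA.COLUMNS AS t
          ON t.COLUMN_NAME = s.COLUMN_NAME
        WHERE s.TABLE_NAME = @table
          AND t.TABLE_NAME = N'SalesAnalysis';

        -- One generated INSERT per variant table.
        SET @sql = N'INSERT INTO dbo.SalesAnalysis (' + @cols + N')'
                 + N' SELECT ' + @cols + N' FROM dbo.' + QUOTENAME(@table) + N';';
        EXEC sys.sp_executesql @sql;

        FETCH NEXT FROM variant_cursor INTO @table;
    END;

    CLOSE variant_cursor;
    DEALLOCATE variant_cursor;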

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • fahey.jonathan (8/9/2010)


    The use of the term "schema" in this article is confusing because the term has a specific meaning in SQL Server which differs from its use here. Perhaps using the term "table definition" would be clearer.

    You have a good point. I used the word schema as it is used in database theory, not as it is used specifically in SQL Server. Next time I talk about it I will be more careful to differentiate. Thanks for the suggestion!
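
    For anyone reading along, the distinction in SQL Server terms looks like this (illustrative names only): a schema is a namespace that owns objects, while I used "schema" for the definition of a table itself.

    -- A SQL Server schema: a namespace for objects.
    CREATE SCHEMA Staging;
    GO

    -- The article's sense of "schema": the table's definition.
    CREATE TABLE Staging.Customers
    (
        CustomerID   int           NOT NULL PRIMARY KEY,
        CustomerName nvarchar(100) NOT NULL
    );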

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • Thanks Timothy for taking the time to write this up and share it with us.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Good article sir. I still can't find the +1 button on this site though 😀

    Atlantis Interactive - SQL Server Tools
    My blog
    Why I wrote a sql query analyzer clone
