Muting the Immutable

  • David.Poole (1/19/2016)


    I can understand doing this for a very limited number of datasets. I think it would create problems for Kimball data warehouse systems, where we need slowly changing dimensions. External standards data such as ISO and ABI datasets do change, and you need control and auditability when someone changes them.

    I did do something like this with a union query to generate a huge range of numbers to act as a tally table, but to be honest I've fallen back to creating a physical tally table instead. Even a few hundred thousand records in a tally table are no real burden, particularly if compression and columnstore indexes are options.

    I've only used this approach (MERGE statements) for static data maintenance on small data sets, usually type data and state-flag definitions. For large data sets I'd probably write a command-line tool that gets executed in the post-deploy script; using MERGE statements for large sets would become unwieldy, IMO.
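    For reference, here is a minimal sketch of both tally options David describes above: a query that generates a number range on the fly (a cross-join variant of the union-query idea) and a persisted tally table. All object names are hypothetical.

    -- Virtual tally: 10 base rows cross-joined twice yields 10,000 numbers;
    -- add further CTE levels for a larger range.
    WITH L0 AS (SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(n)),
         L1 AS (SELECT a.n FROM L0 AS a CROSS JOIN L0 AS b),  -- 100 rows
         L2 AS (SELECT a.n FROM L1 AS a CROSS JOIN L1 AS b)   -- 10,000 rows
    SELECT TOP (10000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N
    FROM L2;

    -- Physical alternative: a persisted tally table is cheap at this scale,
    -- particularly with page compression (or a clustered columnstore index).
    CREATE TABLE dbo.Tally (N int NOT NULL PRIMARY KEY)
    WITH (DATA_COMPRESSION = PAGE);

    INSERT INTO dbo.Tally (N)
    SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;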

  • qbrt (1/19/2016)


    Phil Parkin (1/18/2016)


    Merge can be idempotent. I would recommend doing the UPDATE part of the MERGE only if an update is required. I do this via the

    WHERE NOT EXISTS (SELECT source INTERSECT SELECT target)

    technique, which handles differences and NULLs. Otherwise the UPDATE happens every time, regardless.

    In my case, I am using the post-deployment script feature of SSDT as part of the declarative maintenance of the database. That is, the MERGE statements are put in place to maintain the static data in its final form, so the UPDATE clauses must remain. These scripts are always part of the database project, and there are no stacks of scripts to maintain that must be executed in a specific order depending on which version of the database is being upgraded. That's the thinking that led me to use the IF EXISTS() statement, although removing them (as you suggest) from MERGE statements makes sense to me.

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements. I was thinking of some type of editor that ties data to a table (the data could come from different sources, such as JSON or XML files); this data gets compiled into the DACPAC (similar to a BACPAC), and the proper data-sync process is determined during deployment of the database. That way I don't have to worry about writing code and figuring out how to sync the data; the SSDT deploy tool will do that for me and maintain the integrity of the data. Of course, this is only for static data that can be validated at compile/build time.

    I do not understand what you mean by this:

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements.

    MERGE is a DML statement. Surrounding it with IF EXISTS() does not somehow change that.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.
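    For reference, a minimal sketch of the INTERSECT technique Phil describes above, written against a hypothetical dbo.OrderStatus lookup table (all names are invented for illustration). Inside a MERGE the predicate goes in the WHEN MATCHED clause; because INTERSECT compares rows as sets, two NULLs compare as equal, so the UPDATE fires only when a column value has actually changed:

    -- Idempotent MERGE for static lookup data.
    MERGE dbo.OrderStatus AS tgt
    USING (VALUES
            (1, N'Open'),
            (2, N'Shipped'),
            (3, N'Closed')
          ) AS src (StatusId, StatusName)
       ON tgt.StatusId = src.StatusId
    WHEN MATCHED AND NOT EXISTS (SELECT src.StatusName   -- update only on a real difference,
                                 INTERSECT               -- treating NULLs as equal
                                 SELECT tgt.StatusName)
        THEN UPDATE SET StatusName = src.StatusName
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (StatusId, StatusName) VALUES (src.StatusId, src.StatusName)
    WHEN NOT MATCHED BY SOURCE
        THEN DELETE;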

  • Phil Parkin (1/19/2016)


    qbrt (1/19/2016)


    Phil Parkin (1/18/2016)


    Merge can be idempotent. I would recommend doing the UPDATE part of the MERGE only if an update is required. I do this via the

    WHERE NOT EXISTS (SELECT source INTERSECT SELECT target)

    technique, which handles differences and NULLs. Otherwise the UPDATE happens every time, regardless.

    In my case, I am using the post-deployment script feature of SSDT as part of the declarative maintenance of the database. That is, the MERGE statements are put in place to maintain the static data in its final form, so the UPDATE clauses must remain. These scripts are always part of the database project, and there are no stacks of scripts to maintain that must be executed in a specific order depending on which version of the database is being upgraded. That's the thinking that led me to use the IF EXISTS() statement, although removing them (as you suggest) from MERGE statements makes sense to me.

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements. I was thinking of some type of editor that ties data to a table (the data could come from different sources, such as JSON or XML files); this data gets compiled into the DACPAC (similar to a BACPAC), and the proper data-sync process is determined during deployment of the database. That way I don't have to worry about writing code and figuring out how to sync the data; the SSDT deploy tool will do that for me and maintain the integrity of the data. Of course, this is only for static data that can be validated at compile/build time.

    I do not understand what you mean by this:

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements.

    MERGE is a DML statement. Surrounding it with IF EXISTS() does not somehow change that.

    **** The whole thing? Eeek! I need to brush up on my English.

    Right, MERGE is a DML statement. The point is that it would be awesome if I didn't have to use DML statements to maintain static data in my database project via the SSDT tooling, and the static data were instead part of the declared schema of the database. I.e., in the file that contains the CREATE TABLE DDL statements, I could perhaps attach a reference to a JSON file listing all the static records I wish that table to have, and the SSDT tool would make sure the table contains only those records upon a successful deploy.
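    As far as I know, SSDT offers no such feature, but as a hedged sketch of how close a post-deployment script can get today: on SQL Server 2016+ (compatibility level 130), OPENJSON can shred a JSON document of static rows into a rowset and feed the same idempotent MERGE. The JSON is inline here for brevity, and the table and all names are hypothetical:

    -- Post-deploy sketch: static rows declared as JSON, synced with MERGE.
    DECLARE @static nvarchar(max) = N'[
      { "StatusId": 1, "StatusName": "Open"    },
      { "StatusId": 2, "StatusName": "Shipped" },
      { "StatusId": 3, "StatusName": "Closed"  }
    ]';

    MERGE dbo.OrderStatus AS tgt
    USING (SELECT StatusId, StatusName
           FROM OPENJSON(@static)
                WITH (StatusId int, StatusName nvarchar(50))) AS src
       ON tgt.StatusId = src.StatusId
    WHEN MATCHED AND NOT EXISTS (SELECT src.StatusName INTERSECT SELECT tgt.StatusName)
        THEN UPDATE SET StatusName = src.StatusName
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (StatusId, StatusName) VALUES (src.StatusId, src.StatusName)
    WHEN NOT MATCHED BY SOURCE
        THEN DELETE;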

  • qbrt (1/19/2016)


    Phil Parkin (1/19/2016)


    qbrt (1/19/2016)


    Phil Parkin (1/18/2016)


    Merge can be idempotent. I would recommend doing the UPDATE part of the MERGE only if an update is required. I do this via the

    WHERE NOT EXISTS (SELECT source INTERSECT SELECT target)

    technique, which handles differences and NULLs. Otherwise the UPDATE happens every time, regardless.

    In my case, I am using the post-deployment script feature of SSDT as part of the declarative maintenance of the database. That is, the MERGE statements are put in place to maintain the static data in its final form, so the UPDATE clauses must remain. These scripts are always part of the database project, and there are no stacks of scripts to maintain that must be executed in a specific order depending on which version of the database is being upgraded. That's the thinking that led me to use the IF EXISTS() statement, although removing them (as you suggest) from MERGE statements makes sense to me.

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements. I was thinking of some type of editor that ties data to a table (the data could come from different sources, such as JSON or XML files); this data gets compiled into the DACPAC (similar to a BACPAC), and the proper data-sync process is determined during deployment of the database. That way I don't have to worry about writing code and figuring out how to sync the data; the SSDT deploy tool will do that for me and maintain the integrity of the data. Of course, this is only for static data that can be validated at compile/build time.

    I do not understand what you mean by this:

    I suspect that is the type of tooling that Phil was also insinuating in his article: a tool that would tie the static data to the schema in a declarative way without having to use DML statements.

    MERGE is a DML statement. Surrounding it with IF EXISTS() does not somehow change that.

    **** The whole thing? Eeek! I need to brush up on my English.

    Right, MERGE is a DML statement. The point is that it would be awesome if I didn't have to use DML statements to maintain static data in my database project via the SSDT tooling, and the static data were instead part of the declared schema of the database. I.e., in the file that contains the CREATE TABLE DDL statements, I could perhaps attach a reference to a JSON file listing all the static records I wish that table to have, and the SSDT tool would make sure the table contains only those records upon a successful deploy.

    Not the whole thing, just the bit I (re)quoted 🙂 Your English is just fine.

    OK, I misunderstood – now I know what you meant.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Has anybody on this thread heard anything about Microsoft adding data to the declarative nature of SSDT projects? Sure would be a neat feature.

    Hakim Ali
    www.sqlzen.com

    I think I remember playing around with an early attempt at handling data within SSDT projects, and it wasn't quite intuitive. Post-deploy and MERGE-type statements do work, especially with some sort of temp table to compare against, but it's not an easy problem to handle. We see some of this done within the "bacpac" structure, but that's a full-on "this data should be there" approach and doesn't necessarily take into account cases where you have a mixture of data. I know some systems have a "system-required" set of data as well as a "user-entered" set of data. You _need_ the system data, but you can't simply replace the table's contents with it, because that set of user data is there as well (see the sketch after this post).

    For our purposes, the post-deploy scripts have worked pretty well, but we haven't had to handle a large number of records/tables. If we had a large number of tables to maintain, we'd probably be looking at some other solution that could better handle comparing a static set of data with the data in the table and updating the table to match. Red Gate has an advantage here with Data Compare, but MS could likely do something along those lines with its limited data-compare facility in SSDT. Either way, if someone added new columns with required data, you'd need to manage that dataset.
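    One hedged sketch of handling that mixture: flag the deployment-owned rows and filter the WHEN NOT MATCHED BY SOURCE branch, so user-entered rows are never deleted by the sync. The IsSystem column and all other names here are invented for illustration, and the sketch assumes system and user rows never share a key:

    -- Sync only the "system-required" rows; leave user-entered rows alone.
    MERGE dbo.ReasonCode AS tgt
    USING (VALUES (1, N'Damaged'), (2, N'Lost')) AS src (CodeId, CodeName)
       ON tgt.CodeId = src.CodeId
    WHEN MATCHED AND NOT EXISTS (SELECT src.CodeName INTERSECT SELECT tgt.CodeName)
        THEN UPDATE SET CodeName = src.CodeName
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (CodeId, CodeName, IsSystem) VALUES (src.CodeId, src.CodeName, 1)
    WHEN NOT MATCHED BY SOURCE AND tgt.IsSystem = 1  -- never delete user data
        THEN DELETE;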
