How do I code a three-level table name in SSDT that will honour environment settings in a deployed job?

  • Hi,

    Say I have these databases on my server:

    Extract_Dev
    Extract_Prod
    Transform_Dev
    Transform_Prod
    Load_Dev
    Load_Prod

    And say my SSDT job has an SQL task such as:

    INSERT INTO Transform_Dev.dbo.SomeTable
    SELECT a.foo, b.bar
    FROM Extract_Dev.dbo.SomeTable a
    LEFT JOIN Load_Dev.dbo.Keys b
    ON a.blah=b.blah

    IOW, I am joining tables in two different databases, and inserting into a table in a third database.

    If I deploy this job using environments, then the connection string will change for the source and target databases based on the environment selected when the job is run (i.e. "dev" or "prod").  So far so good.  

    However, AFAIK that doesn't affect code within an Execute SQL task.  I can't work out how to write the above Execute SQL task without hard coded database names.

    Do I have to jump through hoops using parameters and dynamic SQL, or is there something simple I'm missing?

  • Scott In Sydney - Monday, January 8, 2018 8:13 PM


    Nothing else simple that I can think of. The query isn't going to know which environment it's being executed in when the package is moved.
    I'm curious though - do you have Development and Production on the same server?

    Sue

  • NEVER use 3- or 4-part naming in your queries.  You end up with a mess like the one you have right now. 

    Instead, create synonyms in each environment so that code can use a common 2-part name that points to the correct 3-part name.
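    For instance, a minimal sketch (object names are made up to match the databases above):

    ```sql
    -- In each environment's working database, point a common name
    -- at the environment-specific three-part name.

    -- Dev:
    CREATE SYNONYM dbo.ExtractSomeTable FOR Extract_Dev.dbo.SomeTable;

    -- Prod (same synonym name, different target):
    CREATE SYNONYM dbo.ExtractSomeTable FOR Extract_Prod.dbo.SomeTable;

    -- Queries then use the identical two-part name in both environments:
    SELECT foo FROM dbo.ExtractSomeTable;
    ```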

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for the replies...

    @sue:  Development and Production are on the same server, differentiated by separate databases.  Probably not best practice but not under my control.

    @jeff:  I've used synonyms occasionally, didn't think of using them here.  If I could create a synonym at the database level that might be acceptable.  However, I don't want to create a synonym at the object (i.e. table) level; for my application, that would require too many synonyms to too many tables and a maintenance hassle.

    Since my original post, the approach I'm now taking is to use views:  instead of embedding the code in SSDT, I'm creating a view on the server:

    CREATE VIEW blah
    AS
    SELECT a.foo, b.bar
    FROM Extract_Dev.dbo.SomeTable a
    LEFT JOIN Load_Dev.dbo.Keys b
    ON a.blah=b.blah

    My package is then:

    OLEDB Source (the view)
    Row Count
    OLEDB Target (fastload into target table)

    The Dev view joins Dev tables, the Prod view (once created) will join Prod tables, and the environment (once created) will select the correct connection string to source the correct view.

    What I wish SSDT supported is the idea of a connection alias property.  Say I have connections Extract.conmgr and Load.conmgr, and say they have a property "Alias", which I assign as "Extract" and "Load" respectively.  In my SSDT environment, those connections point to "Dev".  But, once deployed, I create environments Dev and Prod with the correct connection strings.

    Finally, my vision is that SSDT supported some syntax in the SQL queries where the alias could be referenced.  For example:

    SELECT a.foo, b.bar
    FROM <<Extract>>.dbo.SomeTable a
    LEFT JOIN <<Load>>.dbo.Keys b
    ON a.blah=b.blah

    SSDT would then replace the alias with the current DB_NAME() at runtime based on the environment under which the connection manager is running, before sending the resolved code to the server.

    As I see it, SSDT supports dynamic connections via environments if a connection manager is used, but this support is lacking with embedded SQL.  IMO this is a shortcoming in the product.  But hey, that's just my opinion.

    I'd add this as a suggestion for MS to consider but I'm not familiar with Microsoft Connect (I can't see where SSDT bugs/suggestions are to be entered).

  • You can configure parameters to control how the job executes.

    For an Execute SQL task, you want to use the parameters to define package variables with the database names.  In the SQL task you could use the database variables as input parameters to build a SQL query in a varchar variable, then execute it with sp_executesql.  An alternative would be to have another package variable that uses the configured variable values in a variable expression to create the same T-SQL query, then use an expression property on the Execute SQL task to set the SQL Command value to the query variable.

    If you're worried about SQL injection, a T-SQL script that builds the query is probably a better place to put validation code that looks at the input parameters.

    If the query-building process is more complex, or if you want to pass in one parameter and have some process derive all the server, database, and table names required from there, you might want to use a Script Task with a lot of .NET code to define the final query text.  Write this out to a package variable, and the following Execute SQL task (with an Expression property copying the package variable value to the SQL Command property) can execute it.  Or the query could be executed in the Script Task itself, depending on what you want to do with the results.
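    As a sketch of the sp_executesql variant (the variable names are hypothetical; in a real package the database names would arrive as input parameters from the configuration):

    ```sql
    DECLARE @ExtractDb   sysname = N'Extract_Dev',   -- from package variable
            @LoadDb      sysname = N'Load_Dev',      -- from package variable
            @TransformDb sysname = N'Transform_Dev', -- from package variable
            @sql         nvarchar(max);

    -- QUOTENAME brackets the names, which helps guard against injection.
    SET @sql = N'INSERT INTO ' + QUOTENAME(@TransformDb) + N'.dbo.SomeTable
    SELECT a.foo, b.bar
    FROM ' + QUOTENAME(@ExtractDb) + N'.dbo.SomeTable a
    LEFT JOIN ' + QUOTENAME(@LoadDb) + N'.dbo.Keys b
        ON a.blah = b.blah;';

    EXEC sys.sp_executesql @sql;
    ```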

    If you want to use a Data Flow task to do all the work, create several connection managers and reference them in the data flow.  The package configuration can directly set the ServerName and InitialCatalog properties of the connection managers, or the configuration can set package variable values and expressions on the connection managers can set connection manager properties from the package variables.  Like above, you could start with a Script Task that takes one configured package variable value (like "Dev", "Test", or "Prod") and sets as many other variables as you need with all the database names.  The Data Flow components are not dynamic, so this approach requires the same table names in every database.

    All of these alternatives are some form of dynamic SQL.  The only difference is where the decisions take place.  Also how much documentation it takes to explain what is going on.  You can imagine the fun looking at a strange package and thinking the InitialCatalog property on a connection manager is wrong.  But then you notice that it is set by an expression from a package variable.  If you check the entire package, you may find a Script Task that is setting that variable, based on another variable with a strange value.  Then you check the package configuration and find the original parameter is coming from an XML configuration file.  Will you then go another step and find that the XML config file value is being overridden by a commandline setting defined in the SQL Agent job that calls the package?

    It's great to find a way the package can automatically do whatever you want for any environment, based on as little as a single input parameter.  But when you document it imagine you're the guy that comes along six months later after the original author decided to go somewhere else where his genius is appreciated.

  • @scott Coleman

    Thanks for the reply.

    Do I have to jump through hoops using parameters and dynamic SQL, or is there something simple I'm missing? 

    I was hoping SSDT supported dynamic database names at run time for embedded SQL out-of-the-box, analogous to how it supports dynamic connections for connection managers via environment settings.

    I could create a project level parameter environment=dev|prod (we don't have test), and set the right value when I deploy the project, and then use dynamic SQL in all my embedded SQL (that requires three level names).  But I was hoping Microsoft had a better idea of how developers work and made this easier for us.

    (I'm relatively new to SSDT but am growing to like it - compared to the alternatives I have available.  But you have to admit it's a convoluted mishmash of technologies.  Perhaps MS will integrate the technologies better in the next release.  Still, what's the incentive if SSDT is a free download?)

  • You keep saying "SSDT" but it sounds like you mean "SSIS", is that right? SSDT and SSIS are not the same thing (SSIS is a tooling set included in SSDT, but SSDT is much more than SSIS).

    Anyway, it is possible to do "dynamic" SQL of a sort in SSIS, many (but not all) of the widgets let you execute SQL based on a package variable that can be based on an expression. That expression can reference package parameters, and those parameters can be linked to an "environment" when you deploy the package. And you can have different "environments" at deployment that set the parameter values differently, then when the package executes, it "inherits" the parameter values from the environment it is assigned.

    So, you create a package parameter (a string) that defines your database environment. Then, create a package variable that builds your SQL statement (set the "expression" of the variable to a string expression that builds your query), incorporating that parameter value, and then assign the variable to the widget as its query source. For the "Execute SQL Task" for example, that means setting the SQLSourceType to "Variable" and then setting the "SourceVariable" to the package variable you created. When the package runs, the variable gets assigned the expression that creates the SQL Statement to execute.
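    For example, with the naming convention above, the variable's expression might look something like this (assuming a project parameter named Env that holds "Dev" or "Prod"; the parameter name is made up):

    ```
    "INSERT INTO Transform_" + @[$Project::Env] + ".dbo.SomeTable " +
    "SELECT a.foo, b.bar " +
    "FROM Extract_" + @[$Project::Env] + ".dbo.SomeTable a " +
    "LEFT JOIN Load_" + @[$Project::Env] + ".dbo.Keys b " +
    "ON a.blah = b.blah"
    ```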

    Wasn't sure if you already know all this and it's not what you are trying to achieve, but thought I'd throw it out there. 🙂

  • Scott In Sydney - Sunday, January 14, 2018 3:25 PM

    @jeff:  I've used synonyms occasionally, didn't think of using them here.  If I could create a synonym at the database level that might be acceptable.  However, I don't want to create a synonym at the object (i.e. table) level; for my application, that would require too many synonyms to too many tables and a maintenance hassle.

    Since my original post, the approach I'm now taking is to use views:  instead of embedding the code in SSDT, I'm creating a view on the server:

    I guess I don't see the advantage with using views over synonyms for this.  They both need to be updated and views have code to be updated.  Synonyms don't.  That's your call though.

    --Jeff Moden



  • dmbaker - Tuesday, January 16, 2018 6:55 AM

    You keep saying "SSDT" but it sounds like you mean "SSIS", is that right? SSDT and SSIS are not the same thing (SSIS is a tooling set included in SSDT, but SSDT is much more than SSIS).


    Hi dmbaker,

    Thanks for your reply.  I'd like to get the terminology right if I'm incorrect:

    1)  I'm using SQL Server Data Tools (SSDT)  as my development environment to build packages (*.dtsx files).  I can test/execute those packages from within SSDT, and deploy them to SQL Server once ready for production.
    https://msdn.microsoft.com/en-us/library/hh272686%28v=vs.103%29.aspx?f=255&MSPPError=-2147217396
    https://docs.microsoft.com/en-us/sql/ssdt/download-sql-server-data-tools-ssdt

    2)  I believe there are three (?) deployment models for SSDT (file/package/project?).  The one I am using is the project deployment model, deploying the project en masse to <SQL Server>/Integration Services Catalogs/SSISDB/<Folder Name>/Projects/<Project Name>/Packages/<Package>.dtsx

    3) Once uploaded, I can create environments in .../SSISDB/<Folder Name>/Environments.  Within the environments, I can create variables, such as connection strings.  Within the project, I can configure the connection managers to use the connection strings.  Doing this requires that I specify an environment at run time.

    4)  FWIW, I tried creating a "wrapper package" to run multiple packages using the Execute Package task.  I didn't like the resultant (consolidated) logging within (SSIS?).  The logging I'm referring to is what I get when I RMB Project --> Reports --> All Executions.  So I'm executing the packages using a Powershell script which calls dtexec.  I'm hoping to use a scheduler (JAMS) to execute the packages once we go production (if I can convince my employer to buy it - I find SQL Agent to be lacking in features).  If so, then I know JAMS uses dtexec to execute packages, so the migration would be easier.  (I've attached the PS script for anyone curious).

    5) Within Object Explorer --> Connect, I have a choice for Integration Services, but there are no servers available in my environment.

    SSDT and SSIS are not the same thing (SSIS is a tooling set included in SSDT, but SSDT is much more than SSIS).

    Finally, AFAIK I was using the correct terminology.  I don't consider SSIS to be a part of SSDT; I thought SSIS was a component of SQL Server itself?  Instead, I consider SSDT to be the development environment for packages to be executed by SSIS, and SSIS to be much more than SSDT, not the other way around?

    I'll have to Google what "Connect to Integration Services" means, since I don't have an Integration Services service available to me.  Regardless, my current environment allows me to execute the deployed packages.

    (Sorry for going a bit off track...)

  • Jeff Moden - Tuesday, January 16, 2018 7:24 AM

    I guess I don't see the advantage with using views over synonyms for this.  They both need to be updated and views have code to be updated.  Synonyms don't.  That's your call though.

    Hi Jeff,

    (First of all, I really respect your work - I've read a lot of your posts.  I'm even using http://www.sqlservercentral.com/articles/Tally+Table/72993/ in my current project.  So thanks for your contributions to the community.  I know you're a "guru" and greatly respect your opinion.)

    When I investigated synonyms for this, it looked like I'd need to create 30+ synonyms to point to the tables in the other databases, just so I could use a two level name, and I'd have to create them in both dev and prod.

    While I'm using the view(s) as a source within SSDT to join across databases, I thought they might be useful in the future for colleagues doing ancillary work directly in SSMS; the views could be a source for their own adhoc queries.

    The view source code will be pretty static once in production.  Besides, I think it would be easier to update the views than update the embedded code in individual SSDT packages.  For example, RMB Database --> Tasks --> Generate Scripts --> Select Views --> write to single file.  One place to maintain the views.  And it's easy to change USE [Dev] to USE [Prod], and/or search-and-replace, to generate the production views.

    I haven't investigated whether there are any performance issues using views as a source vs. embedded SQL code within the SSDT packages, but I wouldn't think there would be any.

    Kind Regards...

  • Scott In Sydney - Tuesday, January 16, 2018 4:01 PM

    Hi dmbaker,

    Thanks for your reply.  I'd like to get the terminology right if I'm incorrect:

    3) Once uploaded, I can create environments in .../SSISDB/<Folder Name>/Environments.  Within the environments, I can create variables, such as connection strings.  Within the project, I can configure the connection managers to use the connection strings.  Doing this requires that I specify an environment at run time.

    5) Within Object Explorer --> Connect, I have a choice for Integration Services, but there are no servers available in my environment.

    Finally, AFAIK I was using the correct terminology.  I don't consider SSIS to be a part of SSDT; I thought SSIS was a component of SQL Server itself?  Instead, I consider SSDT to be the development environment for packages to be executed by SSIS, and SSIS to be much more that SSDT, not the other way around?

    With regard to #3, those parameters that you set up can also be used for other things, like expressions within the variables in your package. So, if you're doing dynamic SQL, that parameter you set up could drive the expression of a variable that is used in a SQL statement that you can subsequently execute. So, at runtime, via an environment parameter, you can drive dynamic SQL in your packages.

    With regard to #5, are you seeing that in what, Visual Studio? If you see that then you do have an installation of Integration Services (a.k.a. SSIS). I'm not exactly sure how it works, but I think SSDT installs a "lite" SSIS server on your local PC when you install SSDT. Because you installed SSDT, you have enough SSIS stuff installed to run your packages, but it's not a full-blown SSIS server installation.

    Terminology-wise, SSDT is the tooling suite within Visual Studio that is used to build database things.  It's a whole set of tools that you can use to build SSIS, SSAS, SSRS, and regular database stuff (tables, procs, etc.).  You use SSDT to build SSIS stuff, but you also use it to build Analysis Services cubes, Reporting Services reports, and databases in general.  So yes, SSIS itself is *not* a subset of SSDT; rather, the SSIS tooling in SSDT is a subset of SSDT.  Don't know that it's more proper, but I would say "I'm using SSDT to build an SSIS project."  But whatever, as soon as you mention "packages" or something SSIS-specific, people will figure out what you're talking about. 🙂

  • Jeff Moden - Tuesday, January 16, 2018 7:24 AM


    I guess I don't see the advantage with using views over synonyms for this.  They both need to be updated and views have code to be updated.  Synonyms don't.  That's your call though.

    Ditto.  I'm with Jeff here.  If it's a lot of work to create the synonyms, it's a one-time shot.

  • Scott In Sydney - Tuesday, January 16, 2018 4:19 PM

    ...When I investigated synonyms for this, it looked like I'd need to create 30+ synonyms to point to the tables in the other databases, just so I could use a two level name, and I'd have to create them in both dev and prod.


    Wouldn't you also need to write 30+ views that had different code in each environment (DEV, PROD, etc.)?  I also tend to use synonyms whenever a query uses multiple databases, so that I can deploy the same code to each environment and not have to worry about changing it.  One scenario where you would have to change views but not synonyms is a change in table design, such as adding a new column.  In your case, each view that references the altered table would need code changes to include the new column.  A synonym acts more like a pointer to the real object: it doesn't contain metadata about the object it points to, but a view does.
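    To illustrate with a sketch (object names are hypothetical): when a column is added to the underlying table, the synonym definition is untouched, while any view that lists columns must be altered:

    ```sql
    -- The synonym is only a name-to-object mapping; it needs no change
    -- when the DIM table gains a new column:
    CREATE SYNONYM dbo.DIM_Table FOR DataWarehouse.dbo.DIM_Table;

    -- The view, however, must be altered to expose the new column:
    ALTER VIEW dbo.vStagingWithSK AS
    SELECT a.foo, b.bar, b.NewColumn   -- NewColumn added by hand
    FROM dbo.SomeStagingTable a
    LEFT JOIN dbo.DIM_Table b ON a.blah = b.blah;
    ```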

  • (This post is morphing into best practice for code maintenance, rather than the original topic, but let's run with it...)

    Background:
    I have 30 staging tables with the same composite key (3 columns) across all of them.  I'm creating a DIM table in the DataWarehouse layer (separate db) containing the composite key, some other columns, and an identity column as a surrogate key.
    I am joining the staging tables with the DIM table on the composite key, inserting the SK.
    The target table is in the Transform layer (separate db).  Further transformation to be done on those tables (after adding the SK) before loading into the DW.
    So the three affected db's are Staging (source), DW (source), Transform (target)
    The original code in my first post was:

    INSERT INTO Transform.SomeTable
    SELECT whatever
    FROM Staging.SomeTable
    LEFT JOIN DataWarehouse.DIM_Table
    ON <composite keys>

    I've since changed that approach in SSDT to:

    OLEDB Source
    RowCount
    OLEDB Target


    So, with that change in approach, and thinking about it further, I only need one synonym: in Staging, pointing to DW.DIM_Table.  This would allow two-level names for the join.  And I only need a SELECT; I lose the INSERT and let SSDT handle that via the target destination.
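    That single synonym might look like this (database names are illustrative; the equivalent would be run in each environment's Staging database):

    ```sql
    -- In Staging_Dev:
    CREATE SYNONYM dbo.DIM_Table FOR DataWarehouse_Dev.dbo.DIM_Table;

    -- In Staging_Prod:
    CREATE SYNONYM dbo.DIM_Table FOR DataWarehouse_Prod.dbo.DIM_Table;

    -- The OLEDB Source query then needs only two-part names:
    SELECT s.foo, d.SurrogateKey
    FROM dbo.SomeStagingTable s
    LEFT JOIN dbo.DIM_Table d
        ON s.k1 = d.k1 AND s.k2 = d.k2 AND s.k3 = d.k3;
    ```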

    So, with that background in mind:

    Scenario #1:
    I create 30 views corresponding to the join described above.
    For now, ignore whether the view uses synonyms/two level names or three level names.
    I use those views as the source for my 30 packages in SSDT.
    With the use of package variables and naming conventions for the views and packages, I can quickly clone a package, remap the columns, and it "works" with no further code changes.
    Now, a new column is added to the DIM table, and I need it passed to the target table.  So it affects all 30 views.
    In SSMS, Tasks --> Generate Scripts, I script all the views to a single file, edit that file, and regenerate the views.

    Scenario #2:
    Similar to above, but instead of views, I embed the query as the OLEDB source.
    I will again use package variables, naming conventions, etc.  I may be able to make the query partially dynamic, but I can't see how I would make the list of columns dynamic, unless I use SELECT *, then use column mapping to restrict the columns to those in the target table.
    Now, a new column is added to the DIM table, and I need it passed to the target table.  So it affects all 30 packages.
    So I have to open each package, edit the code for the source.

    Both Scenarios:
    Admittedly, in both scenarios, I have to open all packages and remap columns from source to target to map the new column.
    I also have to add the new column to the target tables in SSMS.

    Which approach is easier?  I contend Scenario #1, but am open minded.

    @Chris Harshman

    One scenario where you would have to change views but not change synonyms would be in the case of a change in the table design, such as adding a new column. In your case, each view that references the altered table would need code changes to include the new column, but a synonym acts more like a pointer to the real object,

    How do synonyms help with respect to adding a new column?  Isn't a synonym just a pointer to an object, in this case a table?  Either way, you still have to either edit a view or edit an embedded query in a package.

    If using a view is a bad idea given the above, then fine, convince me.  I'm not yet convinced.  Having said that, I may change my views to use a synonym and two level names.
