June 18, 2008 at 3:30 pm
I have a project in which I am updating Active Directory group membership based on records in a database. I have about 20 domain groups that are updated on a nightly basis. An SSIS package has 20 corresponding Data Flow tasks that are almost identical except for the group and members.
This package works great, as is, but each time I add a new group to automate, I have to edit the SSIS package and deploy it. I want to improve this process by taking the data from the package and putting it into a database table. Then, when I need to add a group, I only need to add a record to the table.
My plan is to have a table with the following columns:
groupDN <-- distinguishedName of the Active Directory domain group
sqlCommand <-- A complete SELECT statement that returns a list of users from our business system
For each group I wish to maintain, I will add a record to the Groups table, along with the related SELECT statement that should be used to populate the domain group's members. Each of these SELECT statements is considerably different from the others.
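Just to make the idea concrete, here is a rough sketch of what I have in mind; the table name, column sizes, and the sample row below are only placeholders, not anything that exists yet:

CREATE TABLE dbo.GroupSync
(
    groupId    int IDENTITY(1,1) PRIMARY KEY,
    groupDN    nvarchar(400) NOT NULL,  -- distinguishedName of the AD group
    sqlCommand nvarchar(max) NOT NULL   -- SELECT that returns the group's members
);

INSERT INTO dbo.GroupSync (groupDN, sqlCommand)
VALUES ('CN=Payroll Users,OU=Groups,DC=example,DC=com',
        'SELECT employeeId FROM dbo.Employees WHERE department = ''Payroll''');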
I created an SSIS package that works to a point. In the Control Flow, I have an Execute SQL Task that retrieves all group records and places them into a variable, User::domainGroupQuery (package scope).
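The query in that Execute SQL Task is nothing fancier than something like this (again using the placeholder table name from the sketch above):

SELECT groupDN, sqlCommand
FROM dbo.GroupSync;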
Next, I have a Foreach loop (Foreach ADO Enumerator) that uses User::domainGroupQuery as the ADO object source variable.
Inside the Foreach loop, I have a simple Script Task that uses MsgBox() to display the groupDN and sqlCommand values. This works perfectly.
Now, within the Foreach loop, I want to add a Data Flow task. Inside the Data Flow task, I have an OLE DB source. This is where it falls apart.
I want to use "SQL command from variable" as the "Data access mode". But when I select the User::domainGroupQuery variable and try to view "Columns", an error is displayed.
How can I use an SQL command, stored in a database table, to control the source of an OLE DB Source within a Data Flow task? I want to keep the entire package as dynamic as possible.
Thanks for any help you can provide.
Mike
June 19, 2008 at 12:31 am
Any specific reason why it HAS to be a SQL command from variable?
Sure, it's possible to do; I have one that runs like a peach, but there are different ways.
For example, either a stored procedure (careful with the metadata there...) or, alternatively, a user-defined function, which allows for input parameters.
Using one of these as the source will do almost the same thing.
~PD
June 19, 2008 at 7:53 am
It doesn't have to be SQL from a variable, but with the number of groups I may have to add, it will be, by far, the most efficient solution.
If I can control the domain group updates by adding a record to a control table, it will be much more efficient to add new groups as they are requested.
I don't want to have to edit an SSIS package, stored procedure or view.
Also, having the group distinguished name and the SELECT statement hard-coded into a package or stored proc doesn't seem the "right" way to handle this.
The groups' distinguished names and the SELECT statements are just bits of data that should control the package.
Regardless, I can get it to work halfway; I just can't get the Data Flow task to "see" one record at a time.
June 19, 2008 at 7:59 am
The first image shows all of the domain groups I currently update using SSIS. As you can see, there is a complete Data Flow task for each group. This was a quick way to get the project completed, but now I want to fix it and do it correctly.
As you can see, there is quite a bit of redundancy and chance for error.
The second image shows the details of the Data Flow tasks. All of these Data Flow tasks are exactly the same except for the group distinguished name and the population of people who are added/removed from the group.
I want to reduce this down to a single Data Flow task using a database table as the source of the information.
June 20, 2008 at 12:24 am
I am not saying that you should hardcode anything.
All I am saying is that, in the same way you are building the SQL statement you are passing to your variable, you could build a table-valued function.
Unless the SQL statement is so drastically different.
But for the sake of argument, I tried a similar thing with a far simpler approach, just to satisfy myself that I can run a SQL statement from a variable.
These are the steps that I followed:
a) Created a new package in BIDS called DocPhilsDynamicSQLLoad and renamed the package.dtsx as the same
b) Added a variable called varDocPhilsSQLStatement. I set the value equal to "Select NationalIDNumber from HumanResources.Employee where ManagerID = 16" (in your case, you will create a scripting task to change the variable - I am a little lazy this morning)
c) Added an OLEDB connection called DocPhilsSQLEnvironment, and pointed this to AdventureWorks on my localhost
d) Added a text file connection called DocPhilsTextNationalIDNumbers, and pointed this to c:\TestDocPhilIDs.txt (one column called NationalIDNumber, four-byte signed int)
e) Added a dataflow task called TestTransfer. In the transfer, added an OLEDB source (DocPhilsSQLEnvironment) and Flatfile destination (DocPhilsTextNationalIDNumbers)
f) In the OLEDB source, changed the DataAccessMode to SQL Command from variable and added my user::varDocPhilsSQLStatement. My column was generated for me
g) Dragged the arrow to the flatfile destination, and fixed the metadata.
TADA, when I executed it, it ran the first time (12 rows exported). Then I played around, changed the ManagerID to 6, and it ran successfully again (8 rows exported).
My initial point is that you can do this in a function or stored proc as well and get the very same results. I just prefer having my queries in a central place, which SQL offers me. Also, it compiles prior to SSIS running, so it should be faster. For example, you could create a function that returns a table of NationalIDNumbers and accepts ManagerID as a parameter.
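Something along these lines, for example (just a quick sketch against AdventureWorks; the function name is made up):

CREATE FUNCTION dbo.ufn_GetNationalIDsByManager (@ManagerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT NationalIDNumber
    FROM HumanResources.Employee
    WHERE ManagerID = @ManagerID
);

-- the variable (or the OLEDB source) then only needs to hold:
-- SELECT NationalIDNumber FROM dbo.ufn_GetNationalIDsByManager(16)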
~PD
June 20, 2008 at 12:49 pm
pduplessis (6/20/2008)
Unless the SQL statement is so drastically different.
That is one of the issues. Although many of the SELECT statements are similar, several of them are completely different.
The only thing that is common is the single ID number that is returned for each query.
I will attempt your example and see how I fare.
Thanks for the suggestions.
M!ke