SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Audit transformation in SSIS


Audit transformation in SSIS

Author
Message
daniel-johnsson
daniel-johnsson
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 62
Hi everyone. This is my first time posting on this forum so let me know if you need more information in order to answer my question.

I'm working on a ETL-project and part of the assignent is to implement auditing. The audit transformation is working flawlessly, however I want to add an audit_id
that will correspond to the respective dimension tables.
Like I mentioned, I am able to retrieve data/information for every column in the audit transformation (contained in a table in my database) with an audit_id column that use
identity (1,1) but I am unable to join them to the dimension tables audit_id column.

I've been trying to solve it with a Lookup-transformation but it's been unsuccessful.

I appreciate any feedback or answer.
Thanks in advance!

// Daniel
Paul Hernández
Paul Hernández
Old Hand
Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)

Group: General Forum Members
Points: 376 Visits: 661
Hi Daniel,

I will need a little bit more details in order to help you.

- Which SQL Server version are you using?
- As far as I've understood, you have implemented your own audit process. How does it works? Which auditing information are you logging?

There is a built-in audit functionality in SQL Server 2008 which gathers the Names and IDs of your packages and tasks, but it is very basic.

Kind Regards

Paul Hernández
daniel-johnsson
daniel-johnsson
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 62
Thank you for your reply Paul.
I'm using SQL Server 2012 (SP1), Developer Edition.
I use a OLE DB source to extract some data from 3 different tables from a specific database. I then
connect this component with the Audit transformation component. In the audit I selected the following information to audit:
Execution start time, Machine name, User name, Package name and Task name.
After that I create a data flow path between my audit transformation and an OLE DB destination where I load my Audit_table. When I created the audit_table
I created a column called Audit_id with identity (1,1).

This is as far as I get. Like I mentioned in my previous post I am unable to join the Audit_id in my audit_table with my audit_id columns in my various dimension tables.
I have 6 dimension tables and 2 fact tables which currently have audit_id column with the value NULL.

I hope this is all the information you need and if not please let me know.

// Daniel
Phil Parkin
Phil Parkin
SSCoach
SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)

Group: General Forum Members
Points: 18195 Visits: 20392
daniel-johnsson (3/7/2013)
Thank you for your reply Paul.
I'm using SQL Server 2012 (SP1), Developer Edition.
I use a OLE DB source to extract some data from 3 different tables from a specific database. I then
connect this component with the Audit transformation component. In the audit I selected the following information to audit:
Execution start time, Machine name, User name, Package name and Task name.
After that I create a data flow path between my audit transformation and an OLE DB destination where I load my Audit_table. When I created the audit_table
I created a column called Audit_id with identity (1,1).

This is as far as I get. Like I mentioned in my previous post I am unable to join the Audit_id in my audit_table with my audit_id columns in my various dimension tables.
I have 6 dimension tables and 2 fact tables which currently have audit_id column with the value NULL.

I hope this is all the information you need and if not please let me know.

// Daniel


I don't think I fully understand the structure of what you are doing.

Are you saying that AuditId in your dim tables should be a many to one FK to AuditId in your Audit table?

Is it just a case of retrieving the AuditId for a single insert to your Audit table and then passing that into subsequent data flows?


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.

Please surround any code or links you post with the appropriate IFCode formatting tags. It helps readability a lot.
Paul Hernández
Paul Hernández
Old Hand
Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)Old Hand (376 reputation)

Group: General Forum Members
Points: 376 Visits: 661
Hi Daniel,
The problem here is how to look up the Audit_Id in your Audit table using one or more columns from your dimension or fact tables. You have the following information: Execution start time, Machine name, User name, Package name and Task name.

How you can relate this two table, good question. If you store the creation time of the columns (I always create a CretatedOn and UpdatedOn column for Dimension tables and CreatedOn column for Fact tables) you can make the following:

Get the Audit_Id where the CreateDT >Execution start time

But you must do it after an execution and before the next execution in order to obtain one and only one Audit_Id.

This approach does not work if you have different user, machines or packages running at the same time, it is just an idea and it seems to be not precise.

Other approach that definitely works is to drop your separated Audit table and add audit columns to each of your dimension tables and fact tables. It is redundant and space consuming but I think it not so bad at all.

Paul Hernández
daniel-johnsson
daniel-johnsson
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 62
Hi Phil,

Yes that is exactly what I want to accomplish. Basically that I have Audit_ID with identity (1,1) that works as a PK in my Audit table and
my respective dimension tables contain Audit_ID columns that should work as foreign keys to the Audit_ID PK in my Audit table.
And just what you mentioned, I want to retrieve the Audit_ID from my Audit table and pass it on through the data flow to give that value
to my various dimension tables.

Also I should mention that I'm auditing every row that's passing through but the only value that seperates these is the Audit_ID column so I currently end up with
128 rows which is identical (except for the Audit_ID). Can I use a Row Sampling component to filter it down to just 1 row or will this become a problem?

Again, let me know if it's not clear what I currently have and what I want to accomplish. I'd be glad to provide the information if I can. Still rather fresh to both
posting in SSC and working with SSIS.
daniel-johnsson
daniel-johnsson
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 62
Hi Paul,

It's true that I could simply drop my audit table and log all my auditing in my various dimension and fact tables but that would not be according
to my assignment I'm afraid. I would like to make a separate table that contains all my auditing.
It might work as a last resort if all else fails however.

I'm not entirely sure how your solution with CreatedOn and UpdatedOn works. Are you saying I should create the columns inside the package?
And would I then have to create this on every dimension table I'm loading, since I am auditing every dimension table?

I wouldn't mind doing that, I just want to make sure I understand your solution.
I appreciate the help!

// Daniel
Phil Parkin
Phil Parkin
SSCoach
SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)SSCoach (18K reputation)

Group: General Forum Members
Points: 18195 Visits: 20392
daniel-johnsson (3/8/2013)
Hi Phil,

Yes that is exactly what I want to accomplish. Basically that I have Audit_ID with identity (1,1) that works as a PK in my Audit table and
my respective dimension tables contain Audit_ID columns that should work as foreign keys to the Audit_ID PK in my Audit table.
And just what you mentioned, I want to retrieve the Audit_ID from my Audit table and pass it on through the data flow to give that value
to my various dimension tables.

Also I should mention that I'm auditing every row that's passing through but the only value that seperates these is the Audit_ID column so I currently end up with
128 rows which is identical (except for the Audit_ID). Can I use a Row Sampling component to filter it down to just 1 row or will this become a problem?

Again, let me know if it's not clear what I currently have and what I want to accomplish. I'd be glad to provide the information if I can. Still rather fresh to both
posting in SSC and working with SSIS.


I think that you could achieve this as follows:

1) Create a package-scoped integer variable in SSIS - 'AuditId', perhaps.
2) Insert an ExecuteSQL task before your data flow - this will insert a row into the audit table and return the new AuditId into your AuditId variable.
So the SQL would be something like

insert dbo.Audit (col1, col2)
select ...

select scope_identity()

3) Now you can add a derived column to any subsequent data flow and set its value to whatever is in the variable AuditId.


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.

Please surround any code or links you post with the appropriate IFCode formatting tags. It helps readability a lot.
Lynn Pettis
Lynn Pettis
SSC-Dedicated
SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)SSC-Dedicated (38K reputation)

Group: General Forum Members
Points: 38971 Visits: 38508
Phil Parkin (3/9/2013)
daniel-johnsson (3/8/2013)
Hi Phil,

Yes that is exactly what I want to accomplish. Basically that I have Audit_ID with identity (1,1) that works as a PK in my Audit table and
my respective dimension tables contain Audit_ID columns that should work as foreign keys to the Audit_ID PK in my Audit table.
And just what you mentioned, I want to retrieve the Audit_ID from my Audit table and pass it on through the data flow to give that value
to my various dimension tables.

Also I should mention that I'm auditing every row that's passing through but the only value that seperates these is the Audit_ID column so I currently end up with
128 rows which is identical (except for the Audit_ID). Can I use a Row Sampling component to filter it down to just 1 row or will this become a problem?

Again, let me know if it's not clear what I currently have and what I want to accomplish. I'd be glad to provide the information if I can. Still rather fresh to both
posting in SSC and working with SSIS.


I think that you could achieve this as follows:

1) Create a package-scoped integer variable in SSIS - 'AuditId', perhaps.
2) Insert an ExecuteSQL task before your data flow - this will insert a row into the audit table and return the new AuditId into your AuditId variable.
So the SQL would be something like

insert dbo.Audit (col1, col2)
select ...

select scope_identity()

3) Now you can add a derived column to any subsequent data flow and set its value to whatever is in the variable AuditId.







Instead of :



insert dbo.Audit (col1, col2)
select ...

select scope_identity()




Why not:



insert dbo.Audir(col1,col2)
output INSERTED.AuditID -- or what ever column is the Identity column
select ...




The output value(s) could also be put into a temporary table or table variable if needed first. It depends on how the INSERT is being done, is it a single row or multi-row entry.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
daniel-johnsson
daniel-johnsson
Grasshopper
Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)Grasshopper (21 reputation)

Group: General Forum Members
Points: 21 Visits: 62
Hi Phil and Lynn,

I have been trying to solve my problem with your instuctions but I'm still not able to work it out I'm afraid. Maybe I am missing something obvious, so I will explain a bit more in detail how far I have got.

I created an integer variable that I named AuditID and before I do my first load to my first staging table I insert a row in my Audit_table with the following t-sql:
SET IDENTITY_INSERT dbo.STAGE_AUDIT ON

INSERT INTO dbo.STAGE_AUDIT ([Audit ID], [Package Name], [Execution Start Time], [Machine name], [User name], [Task name])
VALUES (1, 'Staging', '2013-03-13', 'DANIEL-SSIS', 'ECUTBILDNING\Administrator', 'Stage Geography')

SELECT SCOPE_IDENTITY()

This way it inserts the same information as my Audit Transformation would. Maybe this is not how the SQL statement should look like, but I'm honestly unsure what the statement should look like.
Nevertheless, I'm successfully loading my variable with 1 in the Result Set section saying result name = 0, Variable name = User::AuditID. Also I'm using result set as single row.

From this Execute SQL Task my data flow continues into my first load Data Flow task which I can summarize like this:
A source where I retieve some data and insert it into my staging table.
Next to it I use the same source, retrieving the same data but before I connect it to my destination I use an Audit Transformation and also a Row Sampling to only have 1 row as the output since they are all identical except for their AuditID. I then connect it to my destination table, which is my Audit_Table.
Lastly I use a source which is my staging table that I loaded previously. My data path is connected with a Derived Column which looks as following:
I replace my audit_id column with the expression audit_id == @User::AuditID and then I connect this Derived Column with my destination, which is the same table as my source.

Sadly when I run this package my AuditID remains NULL in my staging table. Obviously I'm doing something wrong but that is why I'm here asking for help is it not Smile

Thanks again for your time and help, and I'm sorry if it's not very clear with the lack of images and the fact that it's all just text. Let me know if it's unclear or wether you need more information.

// Daniel
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search