Data Activity Tracking Using SQL Triggers

  • Comments posted to this topic are about the item Data Activity Tracking Using SQL Triggers

    Hemandu

  • This technique should come with a strong warning about understanding your database concurrency, because auditing with triggers increases transaction times, and can cause serious locking and/or latching issues.

    I've encountered this issue on both SQL Server (two projects, at transaction rates higher than 5,000/second) and PostgreSQL. This is especially true if you are auditing many fields.

    This technique can lead to what is called the Hot Page Problem, where the last page of the audit table sees too much contention at high TPS. You can mitigate that in certain cases by having a separate audit table per audited table, and/or by partitioning the audit table with a partition key that distributes the writes evenly across partitions (a sketch of that follows).
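
    For illustration only, here is a minimal sketch of that partitioning mitigation on SQL Server. All object and column names (dbo.AuditLog, pf_AuditBucket, ps_AuditBucket, PKValue, and so on) are assumed for the example, not taken from the article; the idea is simply to hash the audited row's key into a persisted bucket column and partition on it, so concurrent writers append to different pages instead of all piling onto the last one.

    CREATE PARTITION FUNCTION pf_AuditBucket (tinyint)
        AS RANGE RIGHT FOR VALUES (1, 2, 3, 4, 5, 6, 7);    -- 8 buckets: 0 through 7

    CREATE PARTITION SCHEME ps_AuditBucket
        AS PARTITION pf_AuditBucket ALL TO ([PRIMARY]);      -- one filegroup keeps the sketch simple

    CREATE TABLE dbo.AuditLog
    (
        AuditId    bigint IDENTITY(1,1) NOT NULL,
        TableName  sysname        NOT NULL,
        PKValue    nvarchar(400)  NOT NULL,
        ColumnName sysname        NOT NULL,
        OldValue   nvarchar(max)  NULL,
        NewValue   nvarchar(max)  NULL,
        UpdatedAt  datetime2(3)   NOT NULL DEFAULT SYSUTCDATETIME(),
        -- Hash bucket derived from the audited row's key, so concurrent sessions land on
        -- different partitions rather than all appending to the same "hot" last page.
        Bucket     AS CAST(ABS(CHECKSUM(PKValue) % 8) AS tinyint) PERSISTED NOT NULL,
        CONSTRAINT PK_AuditLog PRIMARY KEY CLUSTERED (Bucket, AuditId) ON ps_AuditBucket (Bucket)
    );

    Whether the extra bucket column is worth it depends entirely on measured contention; at modest transaction rates a plain ever-increasing clustered key is usually fine.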

  • I have a grand appreciation for anyone who steps up to the plate with an article.

    Shifting gears to the subject at hand, this is an old method that was written about many years ago.  The folks prior to me had implemented similar triggers and made copies of the Inserted and Deleted trigger tables into Temp Tables, just like you did, because MS didn't make the tables accessible in the scope of dynamic SQL.

    For a 4-column update on just 10,000 rows of a (then) 132-column-wide table, the trigger code took a whopping 4 minutes.

    This is NOT the way to do "by column" auditing and I strongly recommend against such dynamic code methods.

    The way I did it was to write code that examined the source table and generated the trigger from that.  Yes, that meant the trigger had 130 or so conditional IF UPDATE() INSERT statements.  After that, the time it took to execute the 4-column UPDATE that I spoke of dropped from 4 MINUTES to 400 MILLISECONDS.
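
    For anyone who hasn't seen that pattern, here is a hypothetical fragment of what one generated block might look like. The table, key, and column names (dbo.Customer, CustomerID, FirstName, LastName, dbo.AuditLog) are assumed for the example only; the generator simply emits a static block like this for every audited column, with no dynamic SQL and no temp-table copies of INSERTED or DELETED.

    IF UPDATE(FirstName)
        INSERT INTO dbo.AuditLog (TableName, PKValue, ColumnName, OldValue, NewValue, UpdatedAt)
        SELECT 'dbo.Customer', CONVERT(nvarchar(400), d.CustomerID), 'FirstName',
               d.FirstName, i.FirstName, SYSUTCDATETIME()
          FROM deleted  d
          JOIN inserted i ON i.CustomerID = d.CustomerID
         WHERE NOT EXISTS (SELECT d.FirstName INTERSECT SELECT i.FirstName); -- NULL-safe: log only rows whose value actually changed

    IF UPDATE(LastName)
        INSERT INTO dbo.AuditLog (TableName, PKValue, ColumnName, OldValue, NewValue, UpdatedAt)
        SELECT 'dbo.Customer', CONVERT(nvarchar(400), d.CustomerID), 'LastName',
               d.LastName, i.LastName, SYSUTCDATETIME()
          FROM deleted  d
          JOIN inserted i ON i.CustomerID = d.CustomerID
         WHERE NOT EXISTS (SELECT d.LastName INTERSECT SELECT i.LastName);
    -- ...one block per remaining audited column, emitted by the generator script.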

    You also jump through a fairly complex hoop with the following code to determine what the trigger action is...

    -- Select the action: Insert, Delete, or Update
    SET @Type = (CASE
                     WHEN EXISTS (SELECT * FROM INSERTED) AND EXISTS (SELECT * FROM DELETED)
                         THEN 'U'  -- Set Action to Updated.
                     WHEN EXISTS (SELECT * FROM INSERTED)
                         THEN 'I'  -- Set Action to Insert.
                     WHEN EXISTS (SELECT * FROM DELETED)
                         THEN 'D'  -- Set Action to Deleted.
                     ELSE NULL     -- Skip. It may have been a "failed delete".
                 END)

    Although it's not a really bad issue, you are hitting each trigger table twice.  If you use a bit of negative logic, it can be simplified quite a bit.

    SELECT @Type = CASE
                       WHEN NOT EXISTS (SELECT * FROM DELETED)  THEN 'I'
                       WHEN NOT EXISTS (SELECT * FROM INSERTED) THEN 'D'
                       ELSE 'U'
                   END

    The method also duplicates data to an extreme degree, and it is very difficult to assemble what a row looked like at a given point in time without a whole lot of work.  Since the "NewValue" for the latest update already lives in the source table, there is no real need to store it, and storing it is what causes that extreme duplication.  If you used that space, instead, for another datetime column, so that you have two dates and times, one for when the OldValue started being valid and one for when it was updated to something else, point-in-time (PiT) reconstruction of rows for reporting would be a lot easier (a sketch of that layout follows).
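
    As a minimal sketch of that layout (all table and column names here are assumed, not from the article), the space that would have held NewValue holds a second datetime instead, so every stored value carries the interval during which it was the live value:

    CREATE TABLE dbo.AuditHistory
    (
        TableName  sysname       NOT NULL,
        PKValue    nvarchar(400) NOT NULL,
        ColumnName sysname       NOT NULL,
        OldValue   nvarchar(max) NULL,
        ValidFrom  datetime2(3)  NOT NULL,  -- when OldValue became the current value
        ValidTo    datetime2(3)  NOT NULL   -- when it was replaced by a newer value
    );

    -- Point-in-time lookup: which value was current at @AsOf?
    DECLARE @AsOf datetime2(3) = '2020-06-01T00:00:00';
    SELECT OldValue
      FROM dbo.AuditHistory
     WHERE TableName  = 'dbo.Customer'
       AND PKValue    = '42'
       AND ColumnName = 'FirstName'
       AND @AsOf >= ValidFrom
       AND @AsOf <  ValidTo;
    -- If no row qualifies, the value at @AsOf is simply the one still sitting in the source table.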

    I'll also state that if you need code like this in your trigger...

    IF @PKCols IS NULL
    BEGIN
        RAISERROR('no PK on table %s', 16, -1, @TableName)
        RETURN
    END

    ... then the source table has some really serious design issues.

    So, while I really appreciate all the work you did to write this article, the premise of the article is seriously flawed.  You must NOT use any trigger that dumps the INSERTED or DELETED trigger tables to Temp Tables.  It's not just a casual "Code Smell".  If either of those is present, or you're doing any loops (and, yes, recursion is a form of bad looping) in your trigger code, your trigger will be one of the biggest performance issues you'll ever have.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi quickdraw,

    Thank you for your response. I have been using this concept with medium-sized databases, and we have not experienced any locking or latching issues. Our audit tables reside in separate databases, and one audit table is created per schema of the audited tables.

    Hemandu

  • Hi Jeff Moden,

    Thank you for your response. I am not sure I follow some of your concerns. Yes, this is an old method, and we have customized it to suit our project needs. Also, some of the code is there to catch certain errors for anyone using this trigger who may not have followed database design best practices.

    Implementing this in our projects, we have not seen any performance issues at all. But yes, there is always scope for improvement.

    Hemandu

  • Hemandu Malhotra wrote:

    Hi Jeff Moden,

    Thank you for your response. I am not sure I follow some of your concerns. Yes, this is an old method, and we have customized it to suit our project needs. Also, some of the code is there to catch certain errors for anyone using this trigger who may not have followed database design best practices.

    Implementing this in our projects, we have not seen any performance issues at all. But yes, there is always scope for improvement.

    Thanks for the feedback, Hemandu.

    The biggest problem I have with articles that "explain" this method is that the people writing about it (no offense, but yourself included) have no clue as to the damage it can cause in other environments.  I have first-hand experience with that.  The folks at the place I was talking about, where it took 4 minutes to update just 4 columns of 10,000 rows, got their trigger code from a similar article on this very site about a decade ago.

    I also try to explain to people that you should always "write for scale", because scale is going to happen.  It always does.

    Like I said, I really appreciate the work you put into this article and I'm not bad-mouthing you.  I'm trying to make people aware that this type of trigger is something to be avoided even for singleton updates, where it becomes part of the "SQL Death by a Thousand Cuts" that a whole lot of databases experience.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff Moden,

    No offense taken. I really appreciate your honest feedback.

    Thank you!

    Hemandu
