Does T-SQL have a means of iterating over the columns of a table?

  • We've got an unusual request, and I think my managers are approaching this from a programming POV rather than a database POV. Here's how it was proposed to me:

    We've got to capture the history of a record when it's updated or deleted. If this were a C#/VB.NET app then we could iterate, using a FOREACH, through all of the fields of a record, comparing old to new. However, whenever I and others who have some experience capturing history on a table have done this, we've used either the UPDATE or DELETE trigger to capture what's in the deleted (and, for an UPDATE, inserted) table(s) and did whatever it was that had to be done. Typically I just saved the whole record from the temporary deleted table into some other history/archive table. My colleague has done the same thing. Our boss is thinking more like C#/VB.NET.

    So is there even such an operator in T-SQL for iterating over the fields/columns of a record?

    Kindest Regards, Rod Connect with me on LinkedIn.

  • If the table has an implicit or explicit ordering, like an IDENTITY() column or a CreatedDate datetime column, you can compare old to new in a set-based operation by joining the table to itself.

    SQL has a cursor object, but it would still require the same logic to know which row comes first versus second, so the identity/created-date requirement still applies, and you could do the same thing a cursor would, just a couple of orders of magnitude slower.

    Show us the table structure and we can help flesh out a set-based operation.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!
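    As a minimal sketch of the set-based self-join described above (the table MyTable and its columns are hypothetical, not from this thread):

```sql
-- Hypothetical table dbo.MyTable(id IDENTITY, CreatedDate datetime, SomeColumn).
-- Number the rows by their ordering column, then join each row to its
-- predecessor so "old" and "new" values can be compared in a single pass.
WITH ordered AS
(
    SELECT id, SomeColumn,
           ROW_NUMBER() OVER (ORDER BY CreatedDate, id) AS rn
    FROM dbo.MyTable
)
SELECT curr.id,
       prev.SomeColumn AS old_value,
       curr.SomeColumn AS new_value
FROM ordered curr
JOIN ordered prev
  ON prev.rn = curr.rn - 1
WHERE EXISTS (SELECT prev.SomeColumn EXCEPT SELECT curr.SomeColumn);
```

    The EXISTS ... EXCEPT comparison treats two NULLs as equal, which a plain <> test would silently miss.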

  • Rod at work (4/26/2016)


    We've got to capture the history of a record, when its updated or deleted. If this were a C#/VB.NET app then we could iterate using a FOREACH through all of the fields of a record, comparing old to new.

    What's the point of doing that?

    How would the result of such a comparison affect the record to be stored in the Archive table?

    If the new and old values are different - OK, it's obvious, we record the old value.

    But what if it's the same?

    You still need to put something into the cell. What would it be? NULL?

    If you want NULLs instead of unchanged values - use NULLIF:

    SELECT NULLIF(d.ColName, i.ColName)
    FROM deleted d
    LEFT JOIN inserted i ON i.PK_Col = d.PK_Col

    But I'm not sure if it's such a bright idea.

    Only if you need to save it in some UNPIVOT form.

    _____________
    Code for TallyGenerator
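    A sketch of the UNPIVOT form alluded to above, assuming an illustrative key PK_Col and columns Col1/Col2 (none of these names come from the thread):

```sql
-- Inside an UPDATE trigger: NULLIF blanks out unchanged values, and UNPIVOT
-- silently drops NULL cells, leaving one (key, column name, old value) row
-- per changed column. The CASTs are needed because UNPIVOT requires a single
-- common type across the unpivoted columns.
SELECT u.PK_Col, u.ColumnName, u.OldValue
FROM
(
    SELECT d.PK_Col,
           NULLIF(CAST(d.Col1 AS varchar(100)), CAST(i.Col1 AS varchar(100))) AS Col1,
           NULLIF(CAST(d.Col2 AS varchar(100)), CAST(i.Col2 AS varchar(100))) AS Col2
    FROM deleted d
    LEFT JOIN inserted i ON i.PK_Col = d.PK_Col
) src
UNPIVOT ( OldValue FOR ColumnName IN (Col1, Col2) ) AS u;
```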

  • Rod at work (4/26/2016)


    We've got an unusual request, and I think my managers are approaching this from a programming POV rather than a database POV. Here's how it was proposed to me:

    We've got to capture the history of a record, when its updated or deleted. If this were a C#/VB.NET app then we could iterate using a FOREACH through all of the fields of a record, comparing old to new. However, whenever I and others who have some experience at capturing history on a table, have done this we've used either the UPDATE or DELETE trigger to capture what's in the deleted (and inserted for an UPDATE) table(s) and did whatever it was that had to be done. Typically I just saved the whole record from the temporary deleted table into some other history/archive table. My colleague has done the same thing. Our boss is thinking more like C#/VB.NET.

    So is there even such an operator in T-SQL for iterating over the fields/columns of a record?

    Is your boss thinking that you only want to capture the columns that have data changed in them instead of doing "whole row" auditing?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have come across a few clients over the years that had built their own "column-based" auditing. In every case the overhead of such systems (regardless of how they did the work) was quite onerous. So if you MUST have that, be sure you take it into consideration when you are determining what hardware you need. And you also need to make sure you do load testing that will represent your production system. Otherwise you could well get blindsided with unacceptable performance (and concurrency) when you roll this new auditing out.

    To do the column-based auditing you will need to look at the UPDATE() feature inside triggers. It allows you to test if a column has been updated. Note that updating a column to its same value still trips this (and I see clients doing that ALL THE TIME). Oh, and since you want to do all columns, it will be more efficient to use COLUMNS_UPDATED(). You probably want to automate the generation of this trigger code too. DAMHIKT 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
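    As a minimal trigger sketch of the UPDATE() test described above (the table, columns, and audit table are hypothetical):

```sql
-- UPDATE(col) is true whenever the column appears in the SET list of the
-- triggering statement, even if the value didn't actually change, so the
-- inserted/deleted comparison is still needed to filter same-value updates.
CREATE TRIGGER trg_Employee_AuditUpdate ON dbo.Employee
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF UPDATE(employee_name)
        INSERT INTO dbo.EmployeeAudit (employee_id, audit_column, old_value, audited_at)
        SELECT d.employee_id, 'employee_name', d.employee_name, GETDATE()
        FROM deleted d
        JOIN inserted i ON i.employee_id = d.employee_id
        WHERE EXISTS (SELECT d.employee_name EXCEPT SELECT i.employee_name);
END;
```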

  • Jeff Moden (4/26/2016)


    Rod at work (4/26/2016)


    We've got an unusual request, and I think my managers are approaching this from a programming POV rather than a database POV. Here's how it was proposed to me:

    We've got to capture the history of a record, when its updated or deleted. If this were a C#/VB.NET app then we could iterate using a FOREACH through all of the fields of a record, comparing old to new. However, whenever I and others who have some experience at capturing history on a table, have done this we've used either the UPDATE or DELETE trigger to capture what's in the deleted (and inserted for an UPDATE) table(s) and did whatever it was that had to be done. Typically I just saved the whole record from the temporary deleted table into some other history/archive table. My colleague has done the same thing. Our boss is thinking more like C#/VB.NET.

    So is there even such an operator in T-SQL for iterating over the fields/columns of a record?

    Is your boss thinking that you only want to capture the columns that have data changed in them instead of doing "whole row" auditing?

    Good question, Jeff. At first I thought he was looking at capturing the whole record if any change occurred on it. Late yesterday afternoon we had our weekly Skype meeting (members of the team are all around the state, so that's how we have to do our team meetings). During that meeting I learned that I was mistaken in my understanding of what's required. My boss is working on the requirements this morning, so I won't know what's required until later today, but I do know it isn't the simplistic "copy the whole modified row into some history table" approach.

    Kindest Regards, Rod

  • TheSQLGuru (4/27/2016)


    I have come across a few clients over the years that had built their own "column-based" auditing. In every case the overhead of such systems (regardless of how they did the work) was quite onerous. So if you MUST have that, be sure you take it into consideration when you are determining what hardware you need. And you also need to make sure you do load testing that will represent your production system. Otherwise you could well get blindsided with unacceptable performance (and concurrency) when you roll this new auditing out.

    To do the column-based auditing you will need to look at the UPDATE() feature inside triggers. It allows you to test if a column has been updated. Note that updating a column to its same value still trips this (and I see clients doing that ALL THE TIME). Oh, and since you want to do all columns, it will be more efficient to use COLUMNS_UPDATED(). You probably want to automate the generation of this trigger code too. DAMHIKT 🙂

    WOW Kevin, what you've said here really causes me some concern. At this point I don't know if my boss is thinking of saving each changed field to a separate record in a history table or if he's thinking of saving each changed field into the same record. This much I do know: it isn't what I thought it was going to be (saving the whole record) when I started this thread.

    BTW, his concern about saving the whole record that was modified is that it would take up a lot of disk space. I didn't say anything in our meeting, but I wonder how true that is these days. Back in the day we all had to be very concerned with disk space; today disk space is much cheaper. I know he also is thinking of doing this through the code, rather than in SQL Server itself. I'm comfortable doing it either way, but I'd rather do it the fastest way, and I suspect that would likely be some solution done in the database.

    I'm going to look more into the UPDATE() feature inside of triggers; I've not heard of it before, so I've got to research it. One thing that would be very helpful is if you could direct me to some research, blog, or MSDN/TechNet article describing the performance penalties for doing "column-based" auditing. We're trying very hard to replace some incredibly old apps (almost all using MS Access) with something that's faster. None of those use any auditing, so I imagine that if we came up with a nice, modern-looking app that sucks at performance our users are all going to scream, "Give me my old Access '97 app back!"

    Kindest Regards, Rod

  • As Kevin mentioned, this can be an issue.

    One shortcut is to calculate a hash across the row, and then only check UPDATE() if the hash is different. It's still work, though.

    I think if you get better requirements, we can make better suggestions. At the least, I'd suggest you look at Change Tracking (CT) and Change Data Capture (CDC), as this might help you determine what options are built in.

    There's no good way of iterating over a set of columns; as Sergiy mentioned, it doesn't usually make sense in T-SQL to do this.
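    A sketch of the row-hash shortcut suggested above (the column names are illustrative):

```sql
-- Illustrative only: hash the concatenated columns of each row and compare
-- hashes first; per-column comparison is then done only for rows that differ.
-- The '|' separator guards against value-boundary collisions like 'ab'+'c'
-- versus 'a'+'bc'. HASHBYTES('SHA2_256', ...) requires SQL Server 2012+.
SELECT i.employee_id
FROM inserted i
JOIN deleted d ON d.employee_id = i.employee_id
WHERE HASHBYTES('SHA2_256', CONCAT(i.employee_name, '|', i.employee_phone))
   <> HASHBYTES('SHA2_256', CONCAT(d.employee_name, '|', d.employee_phone));
```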

  • TheSQLGuru (4/27/2016)


    I have come across a few clients over the years that had built their own "column-based" auditing. In every case the overhead of such systems (regardless of how they did the work) was quite onerous. So if you MUST have that, be sure you take it into consideration when you are determining what hardware you need. And you also need to make sure you do load testing that will represent your production system. Otherwise you could well get blindsided with unacceptable performance (and concurrency) when you roll this new auditing out.

    To do the column-based auditing you will need to look at the UPDATE() feature inside triggers. It allows you to test if a column has been updated. Note that updating a column to it's same value still hits this (and I see clients doing that ALL THE TIME). Oh, since you want to do all columns it will be more efficient to do COLUMNS_UPDATED(). You probably want to automate the generation of this trigger code too. DAMHIKT 🙂

    We had that "onerous" problem with our column auditing system. Someone made the very serious mistake of trying to make a general purpose trigger that could be attached to any table and the results were disastrous for performance.

    Part of the reason why folks before me decided to use "column level auditing" is because they "designed" (poor choice of words, for sure) many tables with large numbers of columns (a bit denormalized, to say the least). The trouble was that the generic triggers were written using techniques that required the materialization of both the DELETED and INSERTED tables because they were out of scope using the chosen methods (SQLCLR in this particular case). To update just 4 columns on just 10,000 rows was taking 4 minutes.

    I replaced that with hard coded triggers that examined every column using UPDATE(column) and we got the previously mentioned times down to less than 800 milliseconds even on a table with 137 columns including comparisons between the INSERTED and DELETED logical tables to ensure that a value had actually changed.

    Of course, we didn't write those hard-coded triggers. Instead, I designed code to write the triggers for us. When the table schema is changed, we just rerun the stored procedure to rebuild the trigger for that table and then deploy the trigger.

    Unfortunately, and even though we only capture changes (never INSERTs, which would be a gross and unnecessary duplication of data), I couldn't talk them out of capturing both the "old value" and "new value" instead of capturing just the "new value", so the audit table has gotten quite large (heh... and I taught myself a lot about partitioning in the process). I did, however, talk them into using SQL_Variant for the datatype of those two columns so that there's virtually no conversion overhead normally related to converting different datatypes to VARCHAR(8000) and then back again on the way out.

    --Jeff Moden


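    Jeff's generator procedure isn't posted in this excerpt; purely as a rough sketch of the idea (not his code), the per-column trigger statements can be generated from the catalog views:

```sql
-- Rough sketch, not Jeff's procedure: build one IF UPDATE(...) block per
-- column of a hypothetical dbo.Employee table from sys.columns, then PRINT
-- the result for review before wrapping it in CREATE TRIGGER.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql
    + N'IF UPDATE(' + QUOTENAME(c.name) + N')' + CHAR(13) + CHAR(10)
    + N'    /* insert the old value of ' + QUOTENAME(c.name) + N' into the audit table */'
    + CHAR(13) + CHAR(10)
FROM sys.columns c
WHERE c.object_id = OBJECT_ID(N'dbo.Employee')
ORDER BY c.column_id;

PRINT @sql;
```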

  • Jeff Moden (4/27/2016)


    TheSQLGuru (4/27/2016)


    Of course, we didn't write those hard-coded triggers. Instead, I designed code to write the triggers for us. When the table schema is changed, we just rerun the stored procedure to rebuild the trigger for that table and then deploy the trigger.

    Are you able to share that code? I would be very interested.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Jeff, your reply was very interesting. I'm saving that for when I have a discussion with my boss about this. I am especially interested in the statistics you gave for when they tried to do things using a generic trigger. My boss likes things as generic as possible, so I suspect that he'd rather go that way. I'll mention how long it took to run an update!

    I am interested in seeing that stored procedure you said you wrote to help generate the trigger code. That might appease my boss, too. If you can share it, that is. If you can't, I understand.

    Kindest Regards, Rod

  • Rod at work (4/28/2016)


    Jeff, your reply was very interesting. I'm saving that for when I have a discussion with my boss about this. I am especially interested in the statistics you gave for when they tried to do things using a generic trigger. My boss likes things as generic as possible, so I suspect that he'd rather go that way. I'll mention how long it took to run an update!

    I am interested in seeing that stored procedure you said you wrote to help generate the trigger code. That might appease my boss, too. If you can share it, that is. If you can't, I understand.

    Rod...there is an old thread here

    http://www.sqlservercentral.com/Forums/Topic1544629-146-1.aspx

    where I posted a "generic trigger" and some discussion from Jeff... maybe worth a quick scan through.

    I do use a generic trigger, but only on tables that I call "master data", i.e. where there are few updates, I don't audit inserts, and deletes are not allowed.

    I never got around to using sql_variant, but that makes a lot of sense.

    I would like to see Jeff's stored procedure.


  • I'll try to post the code tonight. Have to do some cleanup on it to remove some of the proprietary/sensitive stuff.

    --Jeff Moden



  • I'm assuming what you want to do is roll your own Change Data Capture, where you are capturing only those columns that changed within a row, and then insert into an Entity-Attribute-Value (EAV) style audit table.

    This method below doesn't traverse across columns dynamically, and it would have to be retrofitted for each table within a trigger, but this would be my first stab at it. Assume that #inserted and #deleted are stand-ins for the inserted and deleted virtual tables provided within a trigger.

    create table EmployeeChangeAudit
    (
        EmployeeChangeAuditTime datetime not null,
        employee_id int not null,
        audit_column varchar(128) not null,  -- holds a column name; 128 matches sysname and keeps the PK within the index key-size limit
        column_value varchar(2000) not null,
        primary key ( EmployeeChangeAuditTime, employee_id, audit_column )
    );
    GO

    -- stand-ins for the inserted/deleted virtual tables available inside a trigger
    create table #inserted ( employee_id int, employee_name varchar(30), employee_phone int );
    create table #deleted  ( employee_id int, employee_name varchar(30), employee_phone int );

    INSERT INTO EmployeeChangeAudit
        ( EmployeeChangeAuditTime, employee_id, audit_column, column_value )
    SELECT getdate() as EmployeeChangeAuditTime, employee_id, audit_column, column_value
    FROM
    (
        select employee_id, 'employee_name' as audit_column,
               employee_name as column_value
        from #inserted
        EXCEPT
        select employee_id, 'employee_name', employee_name
        from #deleted
    ) as b
    UNION ALL
    SELECT getdate(), employee_id, audit_column, column_value
    FROM
    (
        select employee_id, 'employee_phone' as audit_column,
               cast(employee_phone as varchar(2000)) as column_value
        from #inserted
        EXCEPT
        select employee_id, 'employee_phone', cast(employee_phone as varchar(2000))
        from #deleted
    ) as c;

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I do my auditing via Pop Rivett's article on Simple-Talk. It records the before and after values of each field change, one value pair per record. I have no idea how well it performs under high volume since mine is a low volume system. I have the audit table in a database outside of my production data.

    -----
    Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson

Viewing 15 posts - 1 through 15 (of 59 total)
