Audit Trails and Logging Part II

  • Jack Corbett (6/10/2008)


    DCPeterson (6/10/2008)


    The most common problem I've seen when it comes to audit logs is that the requirements aren't well defined so you get something like this: "well, we don't know what we might want to see, so just capture everything..." Sure, I can do that, but then you will be buried under a mountain of useless audit data.

    This is the problem. Most audit trails that get created audit everything and then expect you to sort it out.

    Now I am not advocating this as the solution, but theoretically you could run your server-side trace to log the desired (well-filtered) selects to a table and, using a trigger, Service Broker, or another custom service, re-run each query to capture its results, using an account that the trace ignores (or whose logged queries the re-running process knows to skip). This takes care of missing ad hoc queries. Of course, you have now doubled the read load on your database.

    Needless to say, this is an interesting problem to consider, especially since I don't know how you could deny rights to the DBA. Then you would also have to protect the logged data, and wouldn't you need to log access to the logged data too? Seems like an endless loop :w00t:

    Yeah, this seems like a requirement out of a DBA nightmare! I don't think there is a GOOD way to do something like this. It's just plain ugly, no matter how you approach it.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • GSquared (6/10/2008)


    DCPeterson, on the subject of a "server-side trace": I'm not clear on what you mean by that. Are you talking about something outside the database, or some feature of SQL 2005/2000 that I'm unaware of? (I've read that SQL 2008 will offer more built-in auditing features, but haven't had a chance to play with those yet.) Or do you just mean running Profiler constantly?

    Just running Profiler constantly will give you a log, but the files will quickly fill up a lot of disk space, and can't be queried the way log tables can (at least, not easily). If you have Profiler inserting into tables, the performance hit is just about the same as using triggers, as is the space used.

    Could be done. Like any of the other methods, it has advantages and disadvantages. I can't think of an easy way to query such data to readily get the complete history of a particular row in a particular table, to run undo commands, or to track workflow data for management reports. On the other hand, you could get performance data right in your audit log, which could be useful for other things.

    First off, I think it helps to get the terminology correct. A trace refers to the built-in capability of SQL Server to capture defined data about defined events into a file. Profiler is a GUI that can be used to define a trace and view the results. When someone speaks of a "server-side trace", they mean a trace that is running on the server independently of Profiler. Profiler is a terrific tool, but it does impose some additional overhead, and it is not appropriate to run Profiler continuously on a production server.

    Many DBAs don't realize that, by default, there is a trace running all the time in SQL 2005. Just like in Profiler, a trace can be configured to capture as much or as little data as you want. Also, some of the events you could capture will impose much more overhead than others, so it pays to do your homework and define your trace as narrowly as possible, but no narrower...

    Trace files are actually very easy to query with the fn_trace_gettable function. Part of the trace definition allows you to control how many trace files will be generated, how big they will be, whether they will be overwritten, and so on, so you can control the amount of disk space used.
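
    For example, on SQL 2005 (where the sys.traces catalog view exists) you can locate the default trace and query it in place, something like this:

    -- Find the default trace; is_default = 1 marks it (SQL 2005)
    declare @TracePath nvarchar(260)

    select @TracePath = path
    from sys.traces
    where is_default = 1

    -- Read the file while the trace is running; DEFAULT tells the
    -- function to follow any rollover files automatically
    select TextData, LoginName, HostName, StartTime
    from fn_trace_gettable(@TracePath, default)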


  • Lee, I presume you're working with a healthcare system or somewhere outside the US.

    There may be some healthcare specific products out there.

    Logically, capturing the "pipe" to (and from) the database on the server (similar to an ODBC trace) would probably be the "safest". However, the ODBC trace itself is client-side (and is configured on the client), so it is not "safe" for audit purposes. Several years ago, I did hear of applications that captured the IP traffic for this purpose. It would probably have to be captured on the SQL Server.

    If the requirement is application specific and the application is to be developed, the easiest way would probably be to architect your application with a layer that allows you to perform additional actions, like capturing the results. However, the overriding issue may be how to ensure that all paths to the data are covered (other apps, ad hoc queries).

    Consider using security settings limited to a service account known only to your company's security administration group, and require all access to the database to go through a separate service running under this account. That service could then perform any logging required. You would have to notify the service who the actual user is, so that this can be logged.
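
    A minimal sketch of that idea, assuming a middle tier that passes the end user's name through (every object and parameter name here is hypothetical):

    -- Hypothetical wrapper proc: the only supported path to the data.
    -- Only the locked-down service account is granted execute rights.
    create procedure dbo.GetCustomerOrders
        @CustomerID int,
        @AppUser    sysname  -- real end user, supplied by the service
    as
    begin
        insert into dbo.SelectAudit (AppUser, ProcName, Parameters, RunAt)
        values (@AppUser, 'GetCustomerOrders',
                convert(varchar(20), @CustomerID), getdate())

        select OrderID, OrderDate, TotalAmount
        from dbo.Orders
        where CustomerID = @CustomerID
    end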

    If this is not an option, it *may* be possible to implement a proxy database ("log and forward") or alternatively, maybe a SQL trace file approach could be combined with a trace scanner that would repeat the SELECT queries to capture the results (against a replicated DB?). This would have some challenges of its own, I suspect.


    Regards,

    Tore Bostrup

  • I have had to satisfy a similar select-logging requirement, in an environment where management was worried about field sales reps selecting the database and taking it with them if they left the company. Of course we had a stored procedure to do the select, which would stop at 500 rows, but in addition I used the same mechanism as my trigger-based transaction logging, which selects the INSERTED and DELETED tables into a temp table and then loops through its columns. Since the select sproc was using a temp table to gather the results before returning them, I just passed that temp table to the same logging sproc.
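
    A stripped-down sketch of that pattern (all names hypothetical) relies on the fact that a temp table created in one procedure is visible to any procedure it calls:

    create procedure dbo.SearchAccounts
        @Region varchar(10)
    as
    begin
        -- Gather the results first, stopping at 500 rows
        select top 500 AccountID, AccountName, Balance
        into #Results
        from dbo.Accounts
        where Region = @Region

        -- The logging proc can read #Results because it runs in a
        -- child scope of the procedure that created the temp table
        declare @User sysname
        set @User = suser_sname()
        exec dbo.LogSelectResults @ProcName = 'SearchAccounts', @UserName = @User

        -- Finally, return the rows to the caller
        select AccountID, AccountName, Balance
        from #Results
    end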

  • SQL tracing in general, and server-side tracing in particular (simply because of its much lower overhead), is one of my favorite tools for a wide variety of tasks, ranging from auditing, performance tuning, and lock and deadlock investigations to looking under the hood at how a particular application accesses a system, or even how some of the SQL tools perform their tasks. Creating SQL traces using the sp_trace_create proc can be a little tricky for a first-timer, but creating your own scripts or procedures gives you much more control over how a trace works than using Profiler, so I highly recommend you give it a shot.

    Here's an example script that I frequently use as a starting point:

    declare @ProcReturn int,
            @TraceID int

    -- Create the trace; @Options = 2 enables file rollover
    exec @ProcReturn = sp_trace_create
         @TraceID = @TraceID out,
         @Options = 2,
         @TraceFile = N'e:\TraceData\LockingIssues',
         @MaxFileSize = null

    select @ProcReturn, @TraceID

    -- Turn on every SQL 2000 column (1 through 44) for each event
    declare @Column int,
            @On bit

    set @Column = 1
    set @On = 1

    while @Column <= 44
    begin
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 10, @ColumnID = @Column, @On = @On
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 12, @ColumnID = @Column, @On = @On
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 41, @ColumnID = @Column, @On = @On
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 43, @ColumnID = @Column, @On = @On
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 50, @ColumnID = @Column, @On = @On
        exec sp_trace_setevent @TraceID = @TraceID, @EventID = 59, @ColumnID = @Column, @On = @On

        set @Column = @Column + 1
    end

    -- Start the trace
    exec @ProcReturn = sp_trace_setstatus
         @TraceID = @TraceID,
         @status = 1

    select @ProcReturn, @TraceID

    This example captures all of the SQL 2000 columns for these events:

    RPC:Completed (10), SQL:BatchCompleted (12), SQL:StmtCompleted (41), SP:Completed (43), SQLTransaction (50), and Lock:Deadlock Chain (59).

    You can examine the contents of the trace after you've stopped it by using this select:

    select *
    from ::fn_trace_gettable('e:\TraceData\LockingIssues.trc', default)
    where EndTime between '2007-11-07 07:56:00.000' and '2007-11-07 08:05:00.000'
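
    And if you're on SQL 2005, you can translate the numeric EventClass values into readable names by joining to the sys.trace_events catalog view:

    select te.name as EventName, t.TextData, t.LoginName, t.StartTime
    from sys.fn_trace_gettable('e:\TraceData\LockingIssues.trc', default) t
    join sys.trace_events te
      on t.EventClass = te.trace_event_id
    order by t.StartTime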

    Regards,

    Eric

  • Eric Stimpson (6/10/2008)


    You can examine the contents of the trace after you've stopped it by using this select:

    Therein lies the rub, I think. A trace used for auditing would need to be running as long as the server is running or you would have to deny access to the database while you examine the log.

    Since I am not really familiar with server-side traces, can you output the results to a table like you can with the Profiler GUI? I would assume you can, but then you now have the performance hit of the inserts as G mentioned in one of his posts.

  • Jack Corbett (6/10/2008)


    Eric Stimpson (6/10/2008)


    You can examine the contents of the trace after you've stopped it by using this select:

    Therein lies the rub, I think. A trace used for auditing would need to be running as long as the server is running or you would have to deny access to the database while you examine the log.

    Since I am not really familiar with server-side traces, can you output the results to a table like you can with the Profiler GUI? I would assume you can, but then you now have the performance hit of the inserts as G mentioned in one of his posts.

    I was going to reply and correct that statement, but you beat me to it. No, the trace does NOT need to be stopped to examine the active trace file with the fn_trace_gettable function. And yes, the results of the query can be inserted into a table, just like any other table valued function.

    If the overhead of inserting the trace into a table is that much of a concern, you have options: you could write the trace file to a share and have a different instance grab and import it, for example.

    Also, for those of you who are not familiar or comfortable with setting up custom traces, you can actually use Profiler to set up a trace and then export the definition as a .sql file. This is a pretty good way to start cracking the somewhat arcane syntax of the trace events and so on...
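
    For example, once the trace file exists, pulling it into a table is a plain SELECT...INTO or INSERT...SELECT (the path and table name here are just illustrative):

    -- One-time load into a new table (SQL 2005 syntax)
    select * into dbo.AuditTrace
    from sys.fn_trace_gettable(N'e:\TraceData\Audit.trc', default)

    -- Later incremental loads; in practice you'd filter on EndTime
    -- (or similar) so rows already imported aren't loaded twice
    insert into dbo.AuditTrace
    select * from sys.fn_trace_gettable(N'e:\TraceData\Audit.trc', default)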


  • DC and others who brought up server traces as an option: Thanks. I wasn't aware of that option. Sounds like something to definitely investigate a bit.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I was going to reply and correct that statement, but you beat me to it. No, the trace does NOT need to be stopped to examine the active trace file with the fn_trace_gettable function. And yes, the results of the query can be inserted into a table, just like any other table valued function.

    In SQL 2000 when using fn_trace_gettable, if you do not stop the trace you cannot read the complete trace. In fact, there will be a sizeable number of records that have not yet been written to disk. Only stopping the trace forces all records to be written to the trace file and allows you to read the entire contents into a table. SQL 2005 does not have this issue. You can read the entire contents of the trace file even while it is running.

    So if you're on SQL 2000, all you need to do is start a new trace before you close the previous one, and you will not lose any transactions. Furthermore, your now-closed trace can include the call to sp_trace_setstatus, to show that a new trace was started. In a scenario where you need to trace continuously, you can create a job that runs once per day and uses the date in the trace filename. Even on SQL 2005 I would still use this method, as it allows the trace files from previous days to be moved to another server and compressed, archived, etc.
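
    A sketch of that daily job (paths are illustrative, the sp_trace_setevent calls from the earlier script are elided, and the old trace's ID would be looked up rather than hard-coded):

    declare @rc int,
            @NewTraceID int,
            @OldTraceID int,
            @TraceFile nvarchar(245),
            @MaxFileSize bigint

    set @MaxFileSize = 250  -- MB per rollover file
    set @OldTraceID = 2     -- illustrative; on 2005 you could look it up in sys.traces

    -- Date-stamped filename, e.g. e:\TraceData\Audit_20080610
    set @TraceFile = N'e:\TraceData\Audit_' + convert(nvarchar(8), getdate(), 112)

    -- Start the new trace first so no events fall into a gap
    exec @rc = sp_trace_create @TraceID = @NewTraceID output,
         @Options = 2, @TraceFile = @TraceFile, @MaxFileSize = @MaxFileSize
    -- ...repeat the sp_trace_setevent calls here...
    exec sp_trace_setstatus @TraceID = @NewTraceID, @status = 1  -- start

    -- Then stop and close yesterday's trace
    exec sp_trace_setstatus @TraceID = @OldTraceID, @status = 0  -- stop
    exec sp_trace_setstatus @TraceID = @OldTraceID, @status = 2  -- close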

  • Eric, that's a good point about 2000. Traces in SQL 2005 have been significantly improved in several ways, including being made generally less expensive. I just assumed we were talking about 2005, and I personally haven't worked with a 2000 instance for almost two years, so I just don't think about it much anymore... my mistake.


  • In my case, it was necessary not only to show that a certain user ran a certain query at a certain time (as the trace would do), but also to show which records resulted and the state of certain important columns at the time; hence the need for active logging from the select procedure. Also, management wanted real-time info and did not want to archive old info. Admittedly this was not a huge database, but effective normalization keeps the history table manageable.

  • Thomas Keller (6/10/2008)


    In my case, it was necessary not only to show that a certain user ran a certain query at a certain time (as the trace would do), but also to show which records resulted and the state of certain important columns at the time; hence the need for active logging from the select procedure. Also, management wanted real-time info and did not want to archive old info. Admittedly this was not a huge database, but effective normalization keeps the history table manageable.

    I have had databases where we needed to be able to track data changes and be able to present a view of the data as it existed at any point in time. This is best accomplished via triggers and archive/history tables. When combined with a trace that captures who ran what query and when, you could "recreate" what a given user saw at a given point in time without having to store the actual result set returned.
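
    A bare-bones sketch of that trigger/history-table pattern (table and column names are hypothetical, and insert auditing is omitted for brevity):

    -- History table: one pre-change image per modified row, stamped
    -- with who made the change and when
    create table dbo.CustomerHistory (
        CustomerID   int,
        CustomerName varchar(100),
        CreditLimit  money,
        ChangedBy    sysname  default suser_sname(),
        ChangedAt    datetime default getdate()
    )
    go

    create trigger trCustomerAudit on dbo.Customer
    after update, delete
    as
        -- Log the old image; the current image lives in the base table
        insert into dbo.CustomerHistory (CustomerID, CustomerName, CreditLimit)
        select CustomerID, CustomerName, CreditLimit
        from deleted
    go

    The state of a row as of a given time is then the oldest history image captured after that moment, falling back to the live row when no later change exists.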


  • Nice work! I've implemented similar processes and found that they work well. One thing to beware of with this approach: if you make a change to a base table, you need to be sure to update the audit table and trigger as well.

  • If you plan to be able to "recreate" a result set using audit tables and the underlying query, keep in mind that a join to a non-audited table could make it impossible to recreate the desired result set, even just to determine which rows were returned. You're stuck with either auditing every table that might be joined to the table you're concerned with, or relying on transaction log backups to restore the database to a particular point in time and rerun the query whenever a non-audited table is involved.

  • Somehow, I missed this article before. Awesome job, Gus. Huge amount of good info and some great technique comparisons!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
