This is one DBA’s tale of auditing his databases for changes. There will be many different ways that you can approach this task, and reach a solution. Your solution may be similar to or different from mine. I will explain how I reached my solution, and give you some good ideas of how to accomplish a similar task, for your own systems.
I must mention that my database houses financial data, and needs to be compliant with Sarbanes Oxley requirements. I’m not fanatical about SOX, but have accepted its existence, and have tried to coexist with it, peacefully. At the common denominator between what the auditor wants, and what I need to monitor, I think I found a happy medium. I need to know what changes occur in my databases, so that I can provide an auditable trail to the auditors, and I can reasonably assure myself that no unauthorized changes occur on my system.
With SQL Server 2008, some more functionality will be built into the DB system to allow for auditing these types of changes. I look forward to seeing how this will affect my system, but for now, this system works with SQL 2000 and SQL 2005. C2 auditing is always an option. Third party applications provide another alternative solution. I chose to create a home grown solution that allowed me the flexibility and control over monitoring.
General Overview of the Audit Database Changes System
My shop is under the watchful eye of Sarbanes-Oxley. Not only do I, as a DBA, need to know that changes occur in a controlled manner, changes need to remediated and an auditable trail provided of my actions as well as other actions. I may perform a task that alters the database, but was requested to do so by the change control process. I may roll code into the production environment, but there should have been a business owner request for this as well. I may be tracking down a bug in the system, but a change control process or bug tracking process must be followed in this circumstance as well.
First, let’s define what is meant by Database Changes. I refer too structural changes to database objects. Tables, Views, Stored Procedures, Functions. Not data changes. Not job alterations. Not modifications to users or logins.
Next, I do not presume to dictate how your Change Control process functions. Let’s assume that you do have some process in place, and that it is not only understood but followed. If it is not followed, disciplinary actions can and should occur to those that breach the Change Control process.
Next, the system needs to be minimally impacting, constant, and separate from the main server being monitored. We want business to continue as usual, but we need an overshadowing system that will know of all changes that occur on the monitored db. I have chosen SQL Server Trace to perform these actions. Not the Profiler, though you can use this to do the same task, but in a manual way. Trace allows me to decide what type of events I want to monitor, and then gather specific columns of information about these events. I can drill down to very specific events; even filter specific columns for specific data results. This ability allows me to determine what my business needs are, and ignore things that I do not care to monitor. The files generated by the Trace system are rather small, and can easily be used in a variety of ways once generated. I use the files to keep as a backup, and import them to a system to report.
Since we need to monitor each server, all day long, but want periodic results, I chose a window of 1 hour to be the largest period monitored. Thus a trace will generate a file for a 1 hour block, and then start a new file. Within each hour file, I can know fairly quickly of any breaches of process. Instead of waiting until the next day to process a day’s worth of data. So each hour a job fires off a trace, and then quits and repeats. Also each hour, another job will import this trace file into a database for reporting.
The way I decided to accomplish this task was as follows.
- I create a central Monitor Server with SQL Server installed on it.
- For each server I wish to monitor, a linked server will be created and tested.
- The Monitor Server will have a job that fires off, and reaches over to another server (via linked server), and starts a trace.
- The trace file will be stored locally, on the Monitor Server. The ability to create a file will need to be configured and tested.
- When the trace file has been finished, a process will import this data to a Database Changes database for reporting.
- Scrubbing of the data will occur, and data removed based according too business needs and rules.
- Someone will review this data
- DBA changes will be reviewed.
- non DBA changes will also be reviewed.
- Each item will be justified. A ticket or other Change Control documentation will be associated with an approved action.
- Non approved actions will be raised as potential breaches of the Change Control System.
- Lots of yelling and screaming and finger pointing will occur, until the change is understood and justified.
- Notification Reports can be generated for the Auditors of interested parties.
At one point, a simple email containing all the changes was sent out to needed parties. With many multiple servers to be monitored, this would prove to be a rather large email report. A one by one email was not an option. Reporting Services was implemented to produce a report, and thru a subscription, emails were sent out, with links to the report. Soon, we embedded the report in the email, and this caused problems with too large of emails. In the end, the data is simply reviewed by the reviewer in SQL.
Linked Server Justification
There are always issues with linked servers. I chose this easy method, because we can ensure security on the Monitor Server, disallowing anyone from accessing it, and being able to gain access to the monitored servers it monitors. We control the password changes on the monitored servers, and can alter them at will. We have a mix of SQL Server 2000 and 2005, and did not want to go the mail route on each server. Centrally gathering the data is easily accomplished with a central Monitor Server, and linked servers is an easy way to accomplish this task.
On the Monitor Server, there is an Audit database. Within this database, there are various tables, a few jobs, and a few stored procedures.
This is the main table containing data gathered from the Trace files from other servers. The data will sit here and be reported on.
The fields in the [AuditDatabaseChangeDetail] table are as follows.
This information is the core data we retrieve from the server that experienced the Database Change. It will provide ample information for you to investigate the infraction. These are all fields that are available to be traced for the Database Change events I will describe later. They should be self explanatory. If not, BOL can describe them. The ServerName field is important to maintain, so when you monitor multiple servers you can know where the breach occurred.
As each trace file is created, on each monitored server, I log this into a history table. This data is used to keep track of the status of the Trace file, and to document, historically, the traces performed. This information can be used to troubleshoot issues later.
The fields in the [AuditDatabaseChangeHistory] table include the following:
The [AuditDatabaseChangeHistoryID] field is an Identity to make the row unique. The [TraceFile] indicates not only the name of the trace file created, but the path it was created in. There is an [Imported] field that signals the state of the Trace File. The [DatabaseServerName] field indicates which Monitored Server we are watching. As a file is created, the initial value of the [Imported] field is set to 0 (zero), indicating it was created. When the file is successfully imported, it is set to 1. Other values will indicate various errors that occurred during processing. See the stored procedure [sp_ProcessDatabaseChangesTraces] for more details.
To keep track of the last run status of a Monitored Server, I created a simple table with the following fields.
These let me indicate the ServerName, last run date, and the status of success or failure of the last run. This is rarely used, but does come in handy when I need to spot check the system. If I think that a trace has not fired off appropriately, I can reference this table for the last point in time occurrence of a trace, and determine if its running or not. I consider this a cheating way to quickly monitor status.
This contains a ServerName field and an Enabled flag. These two simple fields allow the system to know if a Server is enabled to be processed by this system. Simple. Select the Enabled records, and viola! You now have the result set of servers to process.
This is an addition to the system that allows me to group servers by a Client and Type of Server. At one point, I had trouble with the job failing that processes the servers part way thru, because of a linked server failure. This would cause all subsequent servers to not be monitored. This table was introduced to group servers together. The fields of this table are as follows:
This table allows me to group a client’s servers together, and further subgroup them by ServerType. For example, I have a client with three types of servers, Workflow servers, Mailroom servers and Dataentry servers. I can call each of these separately to be processed, by Client and ServerType. Or I can call them to be processed all at once, by simply choosing the Client parameter, leaving the ServerType blank. Multiple clients can be called separately. Or I can call all servers to be processed by leaving off both parameters. These features allowed me to setup multiple steps in the processing job (described later) and continue to the next step on failure, allowing better processing.
This table allows me to setup multiple reports to be generated to needed parties. In most cases, I have narrowed it down to 2 groups of changes. Those changes that were performed by the DBA and those not performed by the DBA. The table contains the following fields:
For each [ServerName] I can have multiple [ReportType] . If this is an [Enabled] record, it will be processed by a job described later. The rest of the fields describe the type of email report that is being sent out. To whom it will go is indicated in the Recipients fields. The last field is the [Query] field, wherein you will write the specific sql to pull out the criteria you want. I have hard coded parameters for date into mine that will process a specific datetime range. A sample sql statement is below.
where StartTime Between @DateFrom and @DateTo
and LoginName in ( 'DOMAIN\TJay.Belt')
and ServerName = 'ServerName'
As you can tell from the above query, I am only looking at a certain rate range. Then, I further filter it by Login name, in this case, my domain login name. Add as many of these names as your DBA team supports. Then I am specifying the [Servername] also. This should produce all changes that I have made within the timeframe on said server. As you can imagine, the combinations are endless to what you will want to monitor.
In our case, we basically want to sanity check what the DBA’s have done, and what anyone else may have done. So those are the two queries we use against all the monitored servers involved. This way we know what the DBA’s have done, and what anyone who may not be authorized to make changes has done.
This table contains a lookup value for each of the events that are monitored in this system. These are found in books online, but I often found it nice to have available to join too and add more information about the particular events I was watching.
There are 3 fields in this table. An[ID] field that is the Event ID, a [Name] field that is the name of said event, and a [Description] field with the description of this event.
The fields in this table are as follows:
When a trace is started for these Database Changes, it was going to be a lot of single sql calls to the trace stored procedures for events and columns. So, I decided to store these into a table, and let a cursor cycle thru them, and call them one at a time. This minimizes sql code to create all these events, as well as makes it more customizable. If I want to add or remove events, it can easily be done thru these tables.
The fields in this table are as follows:
This table is similar to the above one. Just for Columns, not Events.
I created a few views to help me out in displaying the data a bit cleaner. These views convert the fields into varchars of specific sizes, so that they are displayed in a standard fashion in various reporting locations. The views that I have used are as follows:
These stored procedures started out living in the master db, and got named sp_ for that reason, so they could be called from elsewhere. But as the project grew, I moved them to an Audit database, but failed to rename them. There are other naming issues I have noticed along the life of this system. In places I name objects ‘Change’ and in others ‘Changes’. If I had the gumption to go back and redo it all, I would. But seeing how it is now in production, I lazily leave it the way it is. Take this chance to update it yourself.
This proc will start up the trace on the remote monitored server. There are a load of parameters to the proc that allow for a lot of customization. A name will be created for the file, based on the Servername and a Datetime stamp. History is created when the trace is started, so we can track the individual trace file status. We use a cursor to cycle thru the table [AuditDatabaseChangeEvents] and [AuditDatabaseChangeEventColumns] to gather data about events and columns and then call the Trace system stored procedures. Once all the events have been set and the columns have been set, we set any filter values that were passed in as parameters and some specific ones in the proc itself. Once this has all been done, the trace is ready to start, and we call sp_trace_setstatus to get it going. The trace will typically run for 1 hour, and quit. A new trace will start at that time from a job that will be described later.
This procedure is used to start the entire lot of enabled servers you want to monitor. It will cycle thru the tables [AuditDatabaseChangesConfig] and [AuditDatabaseChangesClientConfig], and find all the enabled servers, based on the client param you pass in, or it will simply get them all, if no client param was used. After getting each [Servername], it will call [sp_StartDatabaseChangesTrace] and start the traces, as described above. History is added to the [AuditdatabaseChangeTraceHistory] table.
Hourly, we will be not only starting a trace, we will be processing the traces that have previously run. We start this process by looking into the table [AuditDatabaseChangeHistory] and we find the records that were not imported yet. We can do this for all servers, or specify a ServerName. We grab the filename, and path from the history table, and check for file existence prior to loading the trace file. If the file exists, we update the History table with a status indicating we are importing the data. We then use the function ::fn_trace_gettable to import the data from the trace file. This data gets loaded into a staging table first, then into the table [AuditDatabaseChangeDetail]. As errors occur, we will note this in the [Imported] status field in the history table. If no errors occur, we simply update this status field at the end, letting all know that it was successfully imported.
In our case, we copy these trace files to multiple locations for other processing. A copy is sent to a central clearing house for reporting. Another copy is sent to a third party within our group that reviews our actions, mainly looking at the DBA’s actions.
As with the above proc that starts all the traces, this proc will process all, cycling thru the available servers by looking for enabled Servers in the table [AuditDatabaseChangesConfig] and calling the above proc [sp_ProcessClientDatabaseChangesTraces] to get things going. This allows for easy calling of 1 proc from a job to get all things going.
This proc allows me to process all the traces for a particular client. I can pass in a Client and ServerType as params, and it will cycle thru all available records from the table [AuditDatabaseChangesConfig] and calls [sp_ProcessDatabaseChangesTraces] for each client. This is the proc that is used from within the job that allows me to specify a subset of servers to process, allowing the job to continue if a particular Client has issues, so that the entire job does not fail.
This stored proc allows you to customize the items that you would like to be purged from the trace file, and not stored on the db server. Now, this seems a bit hypocritical to purge data that shows changes, for an audit… but hear me out, there is a valid reason. Since the trace will pick up all actions that occur for certain events, and it is indiscriminate at which actions it picks up, some of these may be acceptable to you. Some actions that occur, you may want to ignore, as they have been proved to be safe actions. You can remove them at this stage, and not have false positives in your reporting. For example, any time a read only user writes sql that creates a Temp table, the event [Create Table #temp] will be captured and stored. However, since this is a non issue, I do not want to store these actions. You may find others as well in your organizations. Add these here, but be careful that you do not purge too much valid data.
The purpose of this stored proc is to cycle thru all enabled records in the table [AuditDatabaseChangesEMailConfig] and calls the next proc, which will email the necessary individuals information about this server and its actions. From the config table, we pull vital information to send to the next proc. This uses a cursor to process thru the records, and allows for a single execution of this proc from a job, to satisfy the entire set of enabled servers to have their information emailed out.
This proc will create a formatted email message, inserting the values of the params passed in, and send the email out to the intended parties. The params for this are as follows :
@CCRecipients varchar(4000) = null,
@BCCRecipients varchar(4000) = null,
@Query nvarchar(4000) = null
The Recipients params are self explaining. The Servername will be used to display which server this data report refers too. The date is the time for when the report was ran. The ReportType is a description set in the config table, explained above. The query is what you supplied to produce this data. If no results are returned, no attachment of data will be in existence in the email, and the email will indicate there were no changes. If changes did occur, and data was returned from your query, this data will be attached in simple ascii format to the email.
This isn’t the best way, or only way to report on this information. But when I created this, and supported just a few servers, it was easy and did the trick. Later, I dismissed these procs, and used Reporting Services and subscriptions to send me data. Now, this task has been outsourced to a third party that handles it differently.
[_Audit - Database Changes / Client]
I feel impressed to justify my naming convention. I use the underscore to keep these jobs at the top of the list of jobs, sorted alphabetically. This is a strange habit I picked up years ago. Since the job is an Audit job, I have labeled it as such with the term ‘Audit’. ‘Database Changes / Client’ processes the Database Change by Client. The first incarnation of this job had 1 step. But with the implementation of the Client/ServerType options, I added as many steps as I had for that combination. This allows the job to continue processing, even if a single Client/ServerType combination failed.
Each step calls the proc [sp_StartAllDatabaseChangesTraces] and passed in specific Client and ServerType params. Some forgo the ServerType altogether, and simply call the Client param. Each step has been given the ability to continue to the next step on success or failure, to ensure it process all it can, even if it receives individual failures.
[_Audit - Process Files / Client]
This job will call the process procedures for each Client and ServerType you want. Multiple steps, like the job above has, are needed to ensure proper calling to each combination that you may have in your system. The last step of this job is to call the Purge Database Change proc. This will clean up any unneeded data that was imported.
The Rest of the Process
The rest is up to you. All other steps that you take from this point are not dictated by system jobs or timetables, other than the ones you indicate in your job requirements as a DBA. You will now have data being stored in an audit database. You can act upon it or not. It’s your choice. Since we have a third party reading this information, we do not act upon it as it comes into the system. We get daily reports from this individual that describes the items that have no explanation. They may be a DBA’s actions or otherwise. We can research the data in these tables to determine what happened, who did it, when, etc. And then act upon that information.
The important fact is that there is now detailed data describing changes in your monitored systems, sitting, waiting for you to do something. You or someone ‘must’ remediate them. Regardless of the reality of your solution, you need to document why these changes occurred. Who did what, when it was done, and what was done. Mark this item as resolved and give it good notes. You will find yourself referring to this information in the future, when you need too. Why did so-and-so make X change? You can now research it, and even mine data on its past changes. Now, when the Auditor’s come knocking on your door, you have a solution in place that allows you too easily provide them a report that shows all database changes, and any remediation actions taken.