Today we have a guest editorial from Andy Warren as Steve is on vacation. This was originally published on 6 Sept 2016.
I worked on a project recently where we wanted to consolidate the errors from an application error logging table stored across multiple databases into a monitoring repository (not SQL Server) for analysis and dashboarding. As we talked about options for accomplishing this, another member of the team remarked that all log data should go in files, period. My reply was that I thought it was far from that simple (and I made a note to write this up to get your opinions!).
Writing to a file is fast, no doubt about it, and has lower overhead than logged transactions to a table. Files in well-known formats, such as syslog and the Windows Event Log, are ingested directly by common log aggregation and correlation tools. As DBAs, we don't have to rebuild indexes on those files, run jobs to delete rows older than x days or x years (jobs that may require chunking or partitioning to work effectively), or worry about runaway error conditions that can log millions of rows in a short time. That adds up to a pretty good case for using files.
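To make the maintenance burden concrete, here's a minimal sketch of the kind of chunked purge job a table-based log tends to need. The table and column names (dbo.ErrorLog, LogDate) and the 90-day retention window are illustrative assumptions, not from the original project:

```sql
-- Hypothetical purge job: delete log rows older than 90 days in small
-- batches so transaction log growth and blocking stay manageable.
DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.ErrorLog
    WHERE LogDate < DATEADD(DAY, -90, SYSUTCDATETIME());

    IF @@ROWCOUNT < @BatchSize BREAK;  -- last partial batch; purge complete

    WAITFOR DELAY '00:00:01';          -- brief pause to reduce contention
END;
```

None of this exists for a plain file, where retention is often just deleting or rotating files older than a cutoff date.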
Is there a case for using a table? One big advantage of a table is usability. Querying a table is easy, far easier in most cases than querying a file directly. It's also typically faster, because we can add indexes to support the common queries. Tables allow a log design to have situation-specific columns, compared to the more generic log formats. Tables also let us easily layer on a security model so that certain users can see only certain kinds of rows. Some kinds of log data, such as multi-tenant application audit logs, are so critical that they must be stored in the database so that they get the same level of HA and DR coverage as the 'real' data.
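As an illustration of those advantages, here's a sketch of what such a log table might look like, with situation-specific columns and an index supporting a common query pattern. All names and column choices here are hypothetical, not the schema from the project described above:

```sql
-- Illustrative error log table: situation-specific columns a generic
-- file format wouldn't carry, plus room for a security model.
CREATE TABLE dbo.ApplicationErrorLog
(
    ErrorLogID   bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    LogDate      datetime2(3)   NOT NULL DEFAULT SYSUTCDATETIME(),
    DatabaseName sysname        NOT NULL,
    Severity     tinyint        NOT NULL,
    ErrorMessage nvarchar(2048) NOT NULL,
    TenantID     int            NULL  -- enables per-tenant row filtering
);

-- Index tuned for an assumed common query:
-- "recent errors for one database, newest first".
CREATE INDEX IX_ApplicationErrorLog_Database_LogDate
    ON dbo.ApplicationErrorLog (DatabaseName, LogDate DESC)
    INCLUDE (Severity, ErrorMessage);
```

The security layer could be as simple as a view filtered on TenantID, or Row-Level Security on newer versions of SQL Server; either way, it's a capability files don't offer natively.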
For me the main consideration is how the data will be used. If it's just going to be used for security analysis or the occasional debug scenario, I'm far more apt to recommend a file-based approach if the volume is high. If it's low, who cares, write it to a table. If the data is going to be queried frequently by users, then it has to be in a table, or in a system that supports SQL-ish queries with fast response times and the ability to apply security to what they can retrieve.
After writing this and thinking about it, I still don't think it's as simple as "all logs should be in files". Do you agree? Do you have different concerns when making the decision? I'm looking forward to the discussion.