Adding a primary key to the IIS ODBC logging table

  • All,

    I recently implemented IIS ODBC logging for our Windows Server 2008 R2 web server. I've referred to http://support.microsoft.com/kb/245243 which shows the required table structure. I have about 30,000 records in the table now (this is after I run a SP after hours each day to clean up records deemed irrelevant) and there is no primary key. I don't believe I can make the logtime field a primary key because it is not unique for each record (there are several entries with the exact same timestamp. I run queries against this table and the performance is less than ideal. I'm thinking that a primary key autonum column would be good but I'm not sure exactly how to implement it and not break the IIS ODBC logging. I can't find anything on this online (figured it should be common).

    I know that I can do something like:

    alter table InternetLog add ID INT IDENTITY

    but I'm not sure if this is the way to go and it doesn't create the IDs aligned with the logtime ascending or descending (which I think would be an ideal way to store it). Any suggestions are greatly appreciated!

  • i wouldn't be fixated on having a primary key just for the sake of having a primary key;

    if you will be searching by [logtime] date primarily , i would simply put a clustered index on the [logtime] field instead. a clustered index does not have to be unique.

    the idea is to make sure you can search the fields the fastest possible, not just make sure there is a PK.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (7/9/2012)


    i wouldn't be fixated on having a primary key just for the sake of having a primary key;

    if you will be searching by [logtime] date primarily , i would simply put a clustered index on the [logtime] field instead. a clustered index does not have to be unique.

    the idea is to make sure you can search the fields the fastest possible, not just make sure there is a PK.

    I agree with Lowell on this. I would create a clustered index on the logtime column and not worry about a primary key at this time.

    I did this when doing FTP logging to a table at a previous employer.

  • Thank you both for the super quick responses!

    Yes, I'm not necessarily after a primary key here - I mainly want to ensure quick searches. I will try and create a clustered index on the logdate column. In my queries I'll often run by a date range as well as finding text strings like '%somename%' for the target and parameter fields (or RIGHT() or LEFT()).

    For the T-SQL, could I do something like the following?

    CREATE CLUSTERED INDEX cix_InternetLog__logDate

    ON InternetLog (logDate)

    Also, how often would the index need to be rebuilt with about 5-6K records added every day? Any other suggestions are greatly appreciated.

  • techguy817 (7/9/2012)


    Also, how often would the index need to be rebuilt with about 5-6K records added every day? Any other suggestions are greatly appreciated.

    If you're only ever inserting into the table and logDate is ever-increasing, which it sounds like it should be, then never. If logDate is in fact ever-increasing go ahead and make the fillfactor 100 when creating the index to ensure as little wasted space in the index as possible.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks. So I'll need to run the following before creating the index, right?

    Use IISLogs;

    GO

    sp_configure 'show advanced options', 1;

    GO

    RECONFIGURE;

    GO

    sp_configure 'fill factor', 100;

    GO

    RECONFIGURE;

    GO

  • No, using sp_configure that way will change the instance-level default. That takes over when no fill factor is specified when creating the index, as you have done above. I recommend specifying the fill factor in the create index statement itself, like this:

    CREATE CLUSTERED INDEX cix_InternetLog__logDate

    ON InternetLog (logDate)

    WITH (FILLFACTOR = 100);

    0 is the default for new installations (0 is equivalent to 100). If you are in the habit of specifying the fill factor for each index creation (which is a good thing to get in the habit of doing) then you would say 100 in the CREATE INDEX statement. This protects us from someone having changed the instance-level default and us not getting what we expected, as well as compels us to think about the index usage patterns and what a good fill factor would be. In your scenario I recommend using 100 as a starting point.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I went ahead and ran that statement to create the index with the fill factor. I read online that when the rows for a clustered index are not unique SQL needs to create it's own, additional, internal column to identify each element in the b-tree structure. These are what my first 100 rows for this column look like:

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:22.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:23.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:48:24.000

    2012-05-11 10:49:41.000

    2012-05-11 10:49:54.000

    2012-05-11 10:49:58.000

    2012-05-11 10:50:01.000

    2012-05-11 10:55:53.000

    2012-05-11 11:51:32.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:12.000

    2012-05-11 14:02:13.000

    2012-05-11 14:02:13.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:02:14.000

    2012-05-11 14:05:24.000

    2012-05-11 15:58:12.000

    2012-05-11 15:58:12.000

    2012-05-11 15:58:12.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:13.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

    2012-05-11 15:58:14.000

  • That's true. It is known as a uniqueifier. Whether you provide the uniqueness or whether SQL Server must help, a clustered index entry must uniquely identify a row. So, if your chosen clustering key will not get you to a single row in all cases (your scenario) then you cannot use the UNIQUE modifier when creating the index. In this scenario SQL Server handles the uniqueness requirement for us under the covers. Nothing to be too concerned about. If you have doubts test with the non-unique clustering key, and also test where you have an identity with a separate index for your logdate and see the differences in terms of space used for the table and indexes, and performance of your specific queries.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks for clarifying. I must have done something really wrong here because now the target column doesn't appear to be showing the correct data. It looks like a lot of repeated data but it doesn't show the hundreds of downloaded .PPTX and .DOCX files that I query for (except for today). I can't imagine that this was related to the index but I don't recall changing anything else. I'll have to try and restore from backup.

  • Just so you can rule it out, adding (or dropping) an index will not modify data in your table in any way shape or form.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Ok thanks for confirming this. I didn't think so but many of the rows seemed to have redundant, overwritten data. I just did a restore, merged in the latest rows, and created the index.

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply