Adding a primary key to the IIS ODBC logging table

Question

Adding a primary key to the IIS ODBC logging table

techguy817

Valued Member

Points: 52
More actions
July 9, 2012 at 10:53 am

#277562

All,
I recently implemented IIS ODBC logging for our Windows Server 2008 R2 web server. I've referred to http://support.microsoft.com/kb/245243 which shows the required table structure. I have about 30,000 records in the table now (this is after I run a SP after hours each day to clean up records deemed irrelevant) and there is no primary key. I don't believe I can make the logtime field a primary key because it is not unique for each record (there are several entries with the exact same timestamp. I run queries against this table and the performance is less than ideal. I'm thinking that a primary key autonum column would be good but I'm not sure exactly how to implement it and not break the IIS ODBC logging. I can't find anything on this online (figured it should be common).
I know that I can do something like:
alter table InternetLog add ID INT IDENTITY
but I'm not sure if this is the way to go and it doesn't create the IDs aligned with the logtime ascending or descending (which I think would be an ideal way to store it). Any suggestions are greatly appreciated!

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Lowell SSC Guru Points: 323743 More actions · Answer 1

i wouldn't be fixated on having a primary key just for the sake of having a primary key;

if you will be searching by [logtime] date primarily , i would simply put a clustered index on the [logtime] field instead. a clustered index does not have to be unique.

the idea is to make sure you can search the fields the fastest possible, not just make sure there is a PK.

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Lynn Pettis SSC Guru Points: 442462 More actions · Answer 2

Lowell (7/9/2012)
i wouldn't be fixated on having a primary key just for the sake of having a primary key;
if you will be searching by [logtime] date primarily , i would simply put a clustered index on the [logtime] field instead. a clustered index does not have to be unique.
the idea is to make sure you can search the fields the fastest possible, not just make sure there is a PK.

I agree with Lowell on this. I would create a clustered index on the logtime column and not worry about a primary key at this time.

I did this when doing FTP logging to a table at a previous employer.

techguy817 Valued Member Points: 52 More actions · Answer 3

Thank you both for the super quick responses!

Yes, I'm not necessarily after a primary key here - I mainly want to ensure quick searches. I will try and create a clustered index on the logdate column. In my queries I'll often run by a date range as well as finding text strings like '%somename%' for the target and parameter fields (or RIGHT() or LEFT()).

For the T-SQL, could I do something like the following?

CREATE CLUSTERED INDEX cix_InternetLog__logDate

ON InternetLog (logDate)

Also, how often would the index need to be rebuilt with about 5-6K records added every day? Any other suggestions are greatly appreciated.

Orlando Colamatteo SSC Guru Points: 182289 More actions · Answer 4

techguy817 (7/9/2012)
Also, how often would the index need to be rebuilt with about 5-6K records added every day? Any other suggestions are greatly appreciated.

If you're only ever inserting into the table and logDate is ever-increasing, which it sounds like it should be, then never. If logDate is in fact ever-increasing go ahead and make the fillfactor 100 when creating the index to ensure as little wasted space in the index as possible.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

techguy817 Valued Member Points: 52 More actions · Answer 5

Thanks. So I'll need to run the following before creating the index, right?

Use IISLogs;

GO

sp_configure 'show advanced options', 1;

GO

RECONFIGURE;

GO

sp_configure 'fill factor', 100;

GO

RECONFIGURE;

GO

Orlando Colamatteo SSC Guru Points: 182289 More actions · Answer 6

No, using sp_configure that way will change the instance-level default. That takes over when no fill factor is specified when creating the index, as you have done above. I recommend specifying the fill factor in the create index statement itself, like this:

CREATE CLUSTERED INDEX cix_InternetLog__logDate

ON InternetLog (logDate)

WITH (FILLFACTOR = 100);

0 is the default for new installations (0 is equivalent to 100). If you are in the habit of specifying the fill factor for each index creation (which is a good thing to get in the habit of doing) then you would say 100 in the CREATE INDEX statement. This protects us from someone having changed the instance-level default and us not getting what we expected, as well as compels us to think about the index usage patterns and what a good fill factor would be. In your scenario I recommend using 100 as a starting point.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

techguy817 Valued Member Points: 52 More actions · Answer 7

I went ahead and ran that statement to create the index with the fill factor. I read online that when the rows for a clustered index are not unique SQL needs to create it's own, additional, internal column to identify each element in the b-tree structure. These are what my first 100 rows for this column look like:

2012-05-11 10:48:22.000

2012-05-11 10:48:23.000

2012-05-11 10:48:24.000

2012-05-11 10:49:41.000

2012-05-11 10:49:54.000

2012-05-11 10:49:58.000

2012-05-11 10:50:01.000

2012-05-11 10:55:53.000

2012-05-11 11:51:32.000

2012-05-11 14:02:12.000

2012-05-11 14:02:13.000

2012-05-11 14:02:14.000

2012-05-11 14:05:24.000

2012-05-11 15:58:12.000

2012-05-11 15:58:13.000

2012-05-11 15:58:14.000

Orlando Colamatteo SSC Guru Points: 182289 More actions · Answer 8

That's true. It is known as a uniqueifier. Whether you provide the uniqueness or whether SQL Server must help, a clustered index entry must uniquely identify a row. So, if your chosen clustering key will not get you to a single row in all cases (your scenario) then you cannot use the UNIQUE modifier when creating the index. In this scenario SQL Server handles the uniqueness requirement for us under the covers. Nothing to be too concerned about. If you have doubts test with the non-unique clustering key, and also test where you have an identity with a separate index for your logdate and see the differences in terms of space used for the table and indexes, and performance of your specific queries.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

techguy817 Valued Member Points: 52 More actions · Answer 9

Thanks for clarifying. I must have done something really wrong here because now the target column doesn't appear to be showing the correct data. It looks like a lot of repeated data but it doesn't show the hundreds of downloaded .PPTX and .DOCX files that I query for (except for today). I can't imagine that this was related to the index but I don't recall changing anything else. I'll have to try and restore from backup.

Orlando Colamatteo SSC Guru Points: 182289 More actions · Answer 10

Just so you can rule it out, adding (or dropping) an index will not modify data in your table in any way shape or form.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

techguy817 Valued Member Points: 52 More actions · Answer 11

Ok thanks for confirming this. I didn't think so but many of the rows seemed to have redundant, overwritten data. I just did a restore, merged in the latest rows, and created the index.