January 24, 2008 at 9:22 am
Hi,
I have a table that logs each visits on a web site. I would like to know the number of unique visitors. So I use a select distinct, but for a table of 25 M rows, the query take 3 to 5 seconds. Is there a way to speed that up?
The table :
CREATE TABLE [dbo].[tbl_visits](
[id] [int] IDENTITY(1,1) NOT NULL,
[startTime] [datetime] NULL CONSTRAINT [DF_tbl_visits_startTime] DEFAULT (getdate()),
[fk_tbl_ip_address_id] [int] NULL,
CONSTRAINT [PK_tbl_visits] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [idx_ip] ON [dbo].[tbl_visits]
(
[startTime] ASC,
[fk_tbl_ip_address_id] ASC
) ON [PRIMARY]
The query :
select count(distinct fk_tbl_ip_address_id) from tbl_visits
and startTime between '2007-01-11 00:00:00' and '2008-01-11 23:59:00'
The index is used and an seek is made. When I remove the distinct, it runs really fast. I try to think about a table to store each unique visitor, but it's not possible since the number of unique visitors changes depending of the date range (a visitor can be counted once in a day, but 5 times in a week)
Any idea how to speed that up?
Thanks for any help
Stephane
January 24, 2008 at 10:33 am
Some other ideas:
it sounds like you have to be able to report down to visits per day so you could
create a reporting table that gets updated each night for the ip activity by day
you could also insert / update to it on the fly instead of nightly based on needs
January 24, 2008 at 10:34 am
you could also insert the non distinct hits to a temp table and aggregate against that as a test.
January 24, 2008 at 10:46 am
Joel Ewald (1/24/2008)
you could also insert the non distinct hits to a temp table and aggregate against that as a test.
Hmm... how would you do that?
January 24, 2008 at 10:52 am
CREATE TABLE #Test ( fk_tbl_ip_address_id varchar(15) )
INSERT INTO #Test
select fk_tbl_ip_address_id
from tbl_visits
and startTime between '2007-01-11 00:00:00' and '2008-01-11 23:59:00'
select count(distinct fk_tbl_ip_address_id ) FROM #Test
you might even want to create an index against the temp table
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply