March 16, 2010 at 9:55 am
Hey guys.
I'm trying to select all rows of data that have the same ID. Basically I have a table where the same unique person could've arrived at the park at different times. I'm looking at eventually displaying a flag to show that someone has re-entered.
For now I'm simply having trouble displaying the records I want. I currently have ~650,000 to sort through, so my statement is very inefficient.
Below I'll create some random data similar to what I'm using: -
IF EXISTS (SELECT *
FROM sys.tables t
JOIN sys.schemas s
ON t.schema_id = s.schema_id
WHERE t.[Name] = 'Test_Data'
AND s.[Name] = 'dbo')
DROP TABLE dbo.test_data
CREATE TABLE dbo.test_data (
[ID] VARCHAR(128) NOT NULL,
[address] VARCHAR(250) NOT NULL,
[StartDate] DATETIME NOT NULL)
ON [PRIMARY]
DECLARE @maxRandomValue TINYINT,
@minRandomValue TINYINT,
@cnt INT,
@string VARCHAR(256),
@stringlen INT,
@index INT,
@fakeLength INT,
@fakeaddress VARCHAR(8000)
SET @maxRandomValue = 100
SET @minRandomValue = 1
/* CAREFUL! CURRENTLY CREATING 650,000 ROWS OF FAKE DATA */
SET @cnt = 650000
SELECT @string = 'abcdefghijklmnopqrstuvwxyz'
SELECT @stringlen = Len(@string)
WHILE @cnt > 0
BEGIN
SET @cnt = @cnt - 1
SET @fakeLength = Cast(((@maxRandomValue + 20) - (@minRandomValue + 5)) * Rand() + (@minRandomValue + 5) AS TINYINT)
SELECT @fakeaddress = ''
WHILE (@fakeLength > 0)
BEGIN
SELECT @index = (Abs(Checksum(Newid()))%@stringlen) + 1
SELECT @fakeaddress = @fakeaddress + Substring(@string,@index,1)
SELECT @fakeLength = @fakeLength - 1
END
INSERT INTO dbo.test_data
([ID],
[address],
[StartDate])
SELECT Cast(((@maxRandomValue + 1) - @minRandomValue) * Rand() + @minRandomValue AS TINYINT),
Cast(Cast(((@maxRandomValue + 91) - @minRandomValue) * Rand() + @minRandomValue AS TINYINT) AS VARCHAR) + ' ' + @fakeaddress,
Getdate() - ((18 * 365) + Rand() * (47 * 365))
END
GO
/* The above just creates some random data. */
Now, what I'm using to select what I need from this data worked fine on small ammounts of data but on large amounts such as the above table, it takes forever.
SELECT *
FROM [dbo].[test_data] AS a,
[dbo].[test_data] AS b
WHERE a.[ID] = b.[ID]
AND a.[StartDate] <> b.[StartDate];
Would love a pointer or two in the correct direction for this!
March 16, 2010 at 10:02 am
My apologies, I meant to give you some sample output.
I'd want something like: -
------------------------------------------------------------------------------------
ID - | -- Address - | - Start Date --------------- | ID - | -- Address - | - Start Date
------------------------------------------------------------------------------------
14 - | 15 iojwepoh | - 1976-02-26 18:35:20.500- | 14 - | 15 iojwepoh -| 1977-02-26 18:35:20.500
14 - | 15 iojwepoh | - 1977-02-26 18:35:20.500 -| 14 - | 15 iojwepoh-| 1979-02-26 18:35:20.500
March 16, 2010 at 10:06 am
I would consider creating an index - Something with ID, and startdate.
Also, reworked the last select.
SELECT *
FROM [dbo].[test_data] AS a
INNER JOIN [dbo].[test_data] AS b
ON a.[ID] = b.[ID]
WHERE a.[StartDate] <> b.[StartDate];
-- Cory
March 16, 2010 at 10:20 am
I would add an index as Cory suggests.
WITH DupCheck AS (
SELECT ID, COUNT(*) [COUNT]
FROM dbo.test_data
GROUP BY ID
HAVING COUNT(*) > 1
)
SELECT DupCheck.ID, Address, StartDate FROM DupCheck
LEFT JOIN dbo.test_data
ON DupCheck.ID = test_data.ID
ORDER BY DupCheck.ID
March 17, 2010 at 3:27 am
Thanks for that. I've added an ORDER BY StartDate so I can keep things orderly, then added a quick script to test the time taken for the script to run. Suffice to say, a massive improvement. 7 second execution time, in comparison to the previous several minutes.
DECLARE @time DATETIME
SET @time = Getdate();
WITH [DupCheck]
AS (SELECT [ID],
Count(* ) [COUNT]
FROM dbo.test_data
GROUP BY [ID]
HAVING Count(* ) > 1)
SELECT [DupCheck].[ID],
[Address],
[StartDate]
FROM [DupCheck]
LEFT JOIN dbo.test_data
ON [DupCheck].[ID] = [test_data].[ID]
ORDER BY [DupCheck].[ID], [StartDate]
SELECT Datediff(ms,@time,Getdate()) AS "time taken for grabbing data"
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy