Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Performance on 10 million row table Expand / Collapse
Author
Message
Posted Wednesday, September 18, 2013 4:03 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, November 30, 2013 2:38 PM
Points: 15, Visits: 57
Hello all,

I have a table, which is destined to rapidly grow. To give you an idea, the table is :

CREATE TABLE [dbo].[Production]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[OlympusID] [int] NOT NULL,
[DateTimeStamp] [datetime] NOT NULL,
[PackCount] [int] NOT NULL,
[CaseCount] [int] NOT NULL,
[WorkOrderID] [int] NOT NULL,
[State] [bit] NOT NULL,
[RowID] [INT] INT
)

There will be a query that will read counts occuring within a shift (ie 06:00 - 14:00), HOWEVER , since the counts occur every 5 minutes, I will need to include the last row before 06:00 AND the 1st row following 14:00.

Example:

Counts Time
10 05:54
309 06:02
...
..
25 13:58
91 14:02

So , the query will also need to include the above highlighted rows as well.
A solution that I have considered is a self join using rownumber().
I would join them as

select * from  (SELECT * FROM ProductionTest where datetime='Some day') A INNER JOIN ProductionTest B ON A.RowID=B.RowID-1

Bear in mind that the table will REALLY grow. Undoubtedly, the table with include indexes(probably on date and RowID).

However, do you see any other better way ?

As always, I will really appreciate your thoughts.

Thanks,
V
Post #1495829
Posted Wednesday, September 18, 2013 5:14 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:57 PM
Points: 41,558, Visits: 34,479
Why so complex?

Write the query in the simplest possible way. If that query form proves, in testing, to have performance problems, consider alternates.

SELECT <column names> FROM ProductionTest WHERE DateTimeStamp >= '<date> 05:55' and DateTimeStamp <= '<date> 14:05'

If the counts are logged every 5 minutes, then that will catch the one before 6AM and the one after 2PM and all in between. If the counts are logged at different intervals, then you'd need to adjust the times



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1495853
Posted Wednesday, September 18, 2013 5:26 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, November 30, 2013 2:38 PM
Points: 15, Visits: 57
Thanks for the reply,

So according to my example :


Counts Time
10 05:54
309 06:02
...
..
25 13:58
91 14:02


The point is that you don't want just records before 06:02 am, you want to include it in your query because you want to find the exact number of counts
YOU WOULD EXPECT to have at 06:00! So my point is not just to include 05:54 and its counts but to use it for calculation in order to estimate the number of counts at 06:00 precisely.

Does it make sense?

Apologies for not explaining
Post #1495861
Posted Wednesday, September 18, 2013 5:34 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:57 PM
Points: 41,558, Visits: 34,479
No, makes less sense.
The query i posted doesn't return just rows from before 6:02. It would return all rows between 5:55 and 14:05 inclusive.

If the intervals are every 5 minutes, why is there a count at 5:54 and another at 6:02? (8 minutes apart)



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1495867
Posted Wednesday, September 18, 2013 5:52 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, November 30, 2013 2:38 PM
Points: 15, Visits: 57

Either you are missing the point or I fail to explain

I will shortly post the script to avoid confusions.

Post #1495874
Posted Wednesday, September 18, 2013 4:04 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Today @ 7:54 AM
Points: 758, Visits: 633
I would do it with a UNION ALL query. One for the interval, and two SELECT TOP(1) queries to get the values just outside the interval. An index on DateTimeStamp is essential. And either clustered, or including Count.

Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Post #1496160
Posted Friday, September 20, 2013 4:35 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, November 30, 2013 2:38 PM
Points: 15, Visits: 57
Hello Erlang,

Many thanks for your reply. Couple of days ago, I found a solution, check the following please :

with cte_test (DatetimeStamp,Packcount,WorkOrderID,RowID)
AS
(
SELECT *,ROW_NUMBER() OVER (Order by X.Datetimestamp) RowID FROM
(
SELECT * FROM
(
select top(1) DateTimeSTamp,PackCount,WorkOrderID from Production
where DateTimeStamp<@StartTime and WorkOrderID=@WorkOrderID
order by DateTimeStamp DESC
) minusrow
UNION
select DateTimeSTamp,PackCount,WorkOrderID from Production
where DateTimeStamp>@StartTime and WorkOrderID=@WorkOrderID AND DateTimeStamp<=@EndTime
UNION
select * from
(
select top(1) DateTimeSTamp,PackCount,WorkOrderID from Production
where DateTimeStamp>@@EndTime and WorkOrderID=@WorkOrderID
order by DateTimeStamp ASC
) plusrow
) X
)


INSERT INTO @teml
select * from cte_test
SELECT * FROM @temp A INNER JOIN @temp B ON a.RowID=b.RowID-1)

Today, I visited the forum to check if there are any messages and I saw your solution and it seems we both thought exactly the same
Post #1497077
Posted Friday, September 20, 2013 4:53 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 1:57 PM
Points: 41,558, Visits: 34,479
UNION ALL, not UNION. Otherwise you're paying for an unnecessary distinct sort.


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1497078
Posted Friday, September 20, 2013 5:11 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, November 30, 2013 2:38 PM
Points: 15, Visits: 57
Thanks Gail,

That's a good spot. Will correct accordingly.

V
Post #1497082
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse