February 28, 2003 at 2:08 am
The situation is the following.
The server configuration is:
4x 2.0 GHz CPUs (2 MB L3 cache each)
2x Ultra SCSI 3 36 GB, RAID 1 (drives C, D)
3x Ultra SCSI 3 72 GB, RAID 5 (drive E)
1.5 GB RAM
Windows 2000 Server, SQL 2000 Enterprise
The database size is 10GB and is located on Drive E.
The logfile is 5GB and is located on drive C.
The IndexFile is 5GB and is located on drive D.
Tempdb is 2 GB and its log is 1 GB.
We have a table (fvalues) with 15 million records.
The structure of the table is:
Field1 integer, Field2 integer, Field3 integer, Field4 integer, Field5 float
The 15 million records are divided into 3 periods (5 million records each). Field1 holds the period number.
Field2 is the primary key (numbers from 1 to 15 million).
We have indexes on Field1 (period) and Field2 (primary key). The index on Field1 is clustered and is located on drive E.
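The layout described above can be sketched as DDL (a sketch only; the exact data types and index names are assumptions, as the post gives only the column names and types informally):

```sql
-- Hypothetical reconstruction of the schema described in the post.
CREATE TABLE fvalues (
    Field1 int   NOT NULL,  -- period number (1 to 3)
    Field2 int   NOT NULL,  -- primary key, 1 to 15,000,000
    Field3 int,
    Field4 int,
    Field5 float
)

-- Clustered index on the period column, placed on drive E.
CREATE CLUSTERED INDEX IX_fvalues_period ON fvalues (Field1)

-- Nonclustered primary key on Field2.
ALTER TABLE fvalues
    ADD CONSTRAINT PK_fvalues PRIMARY KEY NONCLUSTERED (Field2)
```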
Then we run a select statement in the query analyser:
select * from Fvalues where period=2
(It must retrieve 5 million records)
It takes around 1:20-1:50 (minutes:seconds) to execute.
Do you have any suggestions as to why it is so slow? Any suggestions to improve performance?
Thanks,
Spyros Christodoulou
February 28, 2003 at 3:51 am
Take a look at the execution plan and see if it is even using the clustered index on Field1. I don't think it will, because the column has only 3 distinct values and therefore very poor selectivity. You may actually see better performance if you make Field2 the clustered index and drop the one on Field1. It may be using the wrong index or just performing a table scan.
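In Query Analyzer the plan can be inspected with a showplan option; a minimal sketch, using the table and column names from the post:

```sql
-- Show the estimated plan as text instead of executing the query.
SET SHOWPLAN_TEXT ON
GO
SELECT * FROM fvalues WHERE Field1 = 2
GO
SET SHOWPLAN_TEXT OFF
GO
```

Look for "Clustered Index Seek" versus "Clustered Index Scan" or "Table Scan" in the output.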
March 3, 2003 at 12:15 am
Dear Antares686,
The select statement is executed on the SQL Server (it uses the clustered index). The reason we are using the period field (Field1) is to reduce the time needed to execute the query.
If we use:
select * from fvalues where field2<=5000000
in order to select 5 million records, it takes 8-9 minutes. Using Field1 (period), i.e. ...where period=1, we managed to reduce the time to 1:20-1:50 minutes.
The database is for testing and finding ways to optimise a database system that we are going to develop.
The actual amount of data we are going to have on the table is 2.5 million records per week times 104 weeks (two years of data -total 260 million records).
Most processing of data will occur for one period (2.5 million records), but there are cases that will need more than one period, perhaps up to 5 (12.5 million records, a very long process).
The test so far gave us results of 1:20-1:50 minutes for selecting 5 million records out of 15 million. (We are also going to try the same selection out of 260 million records to see the difference.)
What we are looking for is some way to improve the speed of the query.
Regards,
Spyros Christodoulou
March 3, 2003 at 3:56 am
Other possibilities could be insufficient memory, network bandwidth, other processes, the amount of memory on the receiving station, and a whole world of others.
To get an idea whether it is index related or not, run
SELECT TOP 5000000 * FROM fvalues
and see how long it takes. Also try
SET ROWCOUNT 5000000
SELECT * FROM fvalues
SET ROWCOUNT 0
If the performance does not improve, then you have some other limitation. These should give a good idea whether a hardware or network limitation is possible, since they just fetch rows without regard to any index.
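To break the elapsed time down further, the standard statistics options can be turned on around the query; a sketch (these SET options are part of SQL Server, the query is the one from the post):

```sql
-- Report logical/physical reads and CPU/elapsed time per statement.
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT * FROM fvalues WHERE Field1 = 2

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```

High elapsed time with low CPU and read counts would point at the client or network rather than the index.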
Also note
quote:
It uses the clustered index
Is this a seek or a scan? If a scan, then you should be able to drop the index and get the same results, since a clustered index scan is roughly equal to a table scan. As I said, an index really should have far more unique values to be useful.
March 3, 2003 at 7:39 am
Dear Antares686,
I found the problem. The Query Analyser grid is very, very slow. If I put the 5 million records from the query into a table in tempdb instead, it takes only 30 seconds.
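The timing approach described above can be sketched like this (the tempdb target table name is made up for illustration):

```sql
-- Write the result set into a tempdb table instead of the Query Analyzer
-- grid, so the measurement excludes client-side rendering of 5M rows.
SELECT *
INTO tempdb..fvalues_period2   -- hypothetical target table name
FROM fvalues
WHERE Field1 = 2
```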
Regards,
Spyros Christodoulou