166 days to create index

Question

166 days to create index

scott_lotus

SSCommitted

Points: 1504
More actions
April 29, 2014 at 1:46 am

#306795

I have a table containing 2,163,568,622 rows hosted on SQL server Standard 2008.
Attempted to create a new index:
--------------------------------------------------------------------------------------------------------------------------------------------------
Using TEMPDB = ON this ran for 2 hours 8 mins and only read 27m rows.
run time 02:08:12.174
rows 27,293,752
Web applications reported periodic timeout messages.
Cancelled.
--------------------------------------------------------------------------------------------------------------------------------------------------
Tried again with TEMPDB = OFF. Ran for 4 hours 34 mins and read 53m rows.
run time 04:35:09.826
row 53,166,762
Web application reported no timeout errors.
Cancelled.
--------------------------------------------------------------------------------------------------------------------------------------------------
TempDB disk space is 500GB , the table I am reading is larger but the resulting index size is smaller.
13m record reads per hours / 2,163,568,622 rows = 166 days to create this index !
5 other indexes on this table display about 70% fragmentation. Will try rebuilding these first. Previous rebuilds on existing indexes took approx 1 - 2 hours only.
Enterprise edition is not an option.
Index was suggested via tuning advisor and the dev team concur.
Thanks for ideas.
Scott
CREATE NONCLUSTERED INDEX [_dta_index_L_9_375672386__K3_K2_K10_1_6_7_9_11_12] ON [dbo].[L]
(
[DID] ASC,
[JID] ASC,
[Date] ASC
)
INCLUDE ( [LID],
[Ig],
[Sp],
[Od],
[TID],
[SID]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
CREATE TABLE [dbo].[L](
[LID] [int] IDENTITY(1,1) NOT NULL,
[JID] [int] NULL,
[DID] [int] NOT NULL,
[Lat] [decimal](20, 10) NULL,
[Long] [decimal](20, 10) NULL,
[Ig] [bit] NULL,
[Sp] [decimal](4, 1) NULL,
[Co] [decimal](4, 1) NULL,
[Od] [decimal](8, 2) NULL,
[Date] [datetime] NULL,
[TID] [int] NULL,
[SID] [int] NULL,
[In] [datetime] NULL,
[Icon] [int] NULL,
[Re] [bit] NULL,
[Lst] [int] NULL,
[Cel] [decimal](13, 8) NULL,
[Ch] [decimal](13, 8) NULL,
[St] [int] NULL,
[Ce] [int] NULL,
[La] [int] NULL,
[Co] [int] NULL,
[So] [decimal](13, 8) NULL,
CONSTRAINT [L_PK] PRIMARY KEY CLUSTERED

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 1

First, please, for clarity's sake, rename that index. Anytime I open a server and see _dta at the start of an index, I almost want to cry.

I would suggest you capture the wait statistics on the process as it's running in order to understand where your slow-down is coming from. I'd think it's either in memory or possibly waiting on I/O within tempdb, but understanding why and where is going to be a big help.

How much memory does the system have? Is tempdb and your database on the same disk? What's the clustered index on? I can see that you have one, but not the definition.

But, at the end of the day, you'll probably have to throw hardware at this problem. You're trying to sort a very large data set. You need memory and lots of fast disks to get it done.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

scott_lotus SSCommitted Points: 1504 More actions · Answer 2

Thanks for the reply.

"First, please, for clarity's sake, rename that index. Anytime I open a server and see _dta at the start of an index, I almost want to cry."

Can do - sorry.

"I would suggest you capture the wait statistics on the process as it's running in order to understand where your slow-down is coming from."

I captured the following dueing index creation using SP_WHO3.

---------------------------------------------------------------------------------

dd hh ss ss mss: 00 02:08:12.174

sql_text: <?query -- insert [dbo].[L] select * from [dbo].[L] option (maxdop 1) --?>

writes: 662

wait info: NULL

blocking session: NULL

TempDB = on.

---------------------------------------------------------------------------------

dd hh ss ss mss: 00 02:11:41.440

sql_text: <?query -- insert [dbo].[L] select * from [dbo].[L] option (maxdop 1)--?>

writes: 675

wait info: (26544ms)PREEMPTIVE_OS_WRITEFILEGATHER

blocking session :NULL

TempDB = on.

---------------------------------------------------------------------------------

dd hh ss ss mss: 00 00:01:29.580

sql_test: <?query -- insert [dbo].[L] select * from [dbo].[L] option (maxdop 1) --?>

writes: 5

wait info: (33ms)PAGEIOLATCH_SH:DM2:1(*)

blocking session: NULL

TempDB = off.

-----------------------------------------------------------------------------------

dd hh ss ss mss:00 04:35:09.826

sql_test: <?query -- insert [dbo].[L] select * from [dbo].[L] option (maxdop 1) --?>

writes: 4,084

wait info: NULL

blocking session: NULL

TempDB = off.

----------------------------------------------------------------------------------

"I'd think it's either in memory or possibly waiting on I/O within tempdb, but understanding why and where is going to be a big help."

I use Quest Spotlight to monitor blocking and locking performance.

The "WAIT STATISICS" tab is currently showing 75% MISC WAITS.

I will check it again during the index creation and publish the results.

Anything specific to look for ?

"How much memory does the system have? "

96GB

"Is tempdb and your database on the same disk?"

m: data 2.18tb (raid10)

n: logs 553gb (raid1)

t: tempdb 558gb (raid1)

"What's the clustered index on? I can see that you have one, but not the definition."

Clustered index on Primary Key LID (ASC).

NAME: LID

SORT: ASC

DATA TYPE: INT

Size: 4

Idenity: YES

Allow nulls: NO

ALTER TABLE [dbo].[L] ADD CONSTRAINT [L_PK] PRIMARY KEY CLUSTERED

(

[LID] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

GO

"But, at the end of the day, you'll probably have to throw hardware at this problem. You're trying to sort a very large data set. You need memory and lots of fast disks to get it done. "

15x600GB (SAS) disks in a NexSan SaysBoy. (1 hot spare).

Disk performance during daily operations is great.

SQL performance during daily operation is great.

- 18% CPU utilisation

- 64Gb of 96gb used

- cache hit rate is 82%.

- no deadlocking

- low number of blocked processes

- lock waits 2-5 seconds on Object Locks

Maintenance on a large single table:

- can rebuild in 1 - 2 hours but some writes will timeout.

- can delete 3000 rows per min before causing blocking problems (2.8m inserts per day in this table - another table has more but can truncate).

- cannot create new index tbl.L.

We are very heavy write and low read in general during normal operations.

Thanks for any advice.

Scott

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 3

scott_lotus (4/29/2014)
I use Quest Spotlight to monitor blocking and locking performance.
The "WAIT STATISICS" tab is currently showing 75% MISC WAITS.
I will check it again during the index creation and publish the results.
Anything specific to look for ?

Just what the actual waits are. You already have two that are interesting. PREEMPTIVE_OS_WRITEFILEGATHER is an indication of file growth occurring and your system is waiting on it. So, if you're growing files during this process, it'll certainly slow you down. You'd want to grow those files ahead of time so it's not waiting. You're also seeing PAGEIOLATCH_SH which is flat out an indication you're waiting on the disk for some other process to clear. But those waits seem to be associated with an INSERT query, not your index build. You need to see what the index build itself is waiting on.

ALTER TABLE [dbo].[L] ADD CONSTRAINT [L_PK] PRIMARY KEY CLUSTERED
(
[LID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Tiny thing, unlikely to help overall, but since your PK is on LID, you don't need to have that in the INCLUDE statement. SQL Server will just ignore it anyway since the key for the PK is automatically stored with the nonclustered index.

15x600GB (SAS) disks in a NexSan SaysBoy. (1 hot spare).
Disk performance during daily operations is great.
SQL performance during daily operation is great.
- 18% CPU utilisation
- 64Gb of 96gb used
- cache hit rate is 82%.
- no deadlocking
- low number of blocked processes
- lock waits 2-5 seconds on Object Locks
Maintenance on a large single table:
- can rebuild in 1 - 2 hours but some writes will timeout.
- can delete 3000 rows per min before causing blocking problems (2.8m inserts per day in this table - another table has more but can truncate).
- cannot create new index tbl.L.
We are very heavy write and low read in general during normal operations.
Thanks for any advice.
Scott

You have a reasonably hefty system. I'd think it could handle what you're adding to it, but you need to identify where the bottleneck is first, then understand why it's a bottleneck. Early, and incomplete, indications are that it's disk related, but you need to get more accurate information about the specific blocking occurring in your index creation script.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ScottPletcher SSC Guru Points: 100946 More actions · Answer 4

What indexes does SQL report are missing? What is the usage of existing indexes on that table?

There's a reasonable chance that the table should be clustered by [DID] rather than [LID], but obviously not enough info yet to decide that. And would need to know the insert pattern of [DID] column.

First, please run the queries below, after changing the db name and table name if needed, and post the results:

--!! chg to your db name

USE [<your_db_name>]

SET DEADLOCK_PRIORITY LOW --probably irrelevent, but just in case

DECLARE @list_missing_indexes bit

DECLARE @table_name_pattern sysname

SET @list_missing_indexes = 1 --1=list missing index(es); 0=don't.

--!! put your table name/name pattern here

SET @table_name_pattern = 'L'

PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)

--SELECT create_date FROM sys.databases WITH (NOLOCK) WHERE name = N'tempdb'

IF @list_missing_indexes = 1

BEGIN

SELECT

GETDATE() AS capture_date,

DB_NAME(mid.database_id) AS Db_Name,

OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,

mid.equality_columns, mid.inequality_columns, mid.included_columns,

ca1.sql_up_days AS days_in_use,

migs.*,

mid.statement, mid.object_id, mid.index_handle

FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)

CROSS APPLY (

SELECT DATEDIFF(DAY, create_date, GETDATE()) AS sql_up_days FROM sys.databases WHERE name = 'tempdb'

) AS ca1

LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON

mig.index_handle = mid.index_handle

LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON

migs.group_handle = mig.index_group_handle

--order by

--DB_NAME, Table_Name, equality_columns

WHERE

1 = 1

AND mid.database_id = DB_ID()

AND OBJECT_NAME(mid.object_id) LIKE @table_name_pattern

ORDER BY

--avg_total_user_cost * (user_seeks + user_scans) DESC,

Db_Name, Table_Name, equality_columns, inequality_columns

END --IF

-- list index usage stats (seeks, scans, etc.)

SELECT

ius2.row_num, DB_NAME() AS db_name,

i.name AS index_name,

OBJECT_NAME(i.object_id/*, DB_ID()*/) AS table_name,

i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,

dps.row_count,

SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,

ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,

ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,

fk.Reference_Count AS fk_ref_count,

FILEGROUP_NAME(i.data_space_id) AS filegroup_name,

ca1.sql_up_days AS days_in_use,

ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,

ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update

FROM sys.indexes i WITH (NOLOCK)

INNER JOIN sys.objects o WITH (NOLOCK) ON

o.object_id = i.object_id

CROSS APPLY (

SELECT DATEDIFF(DAY, create_date, GETDATE()) AS sql_up_days FROM sys.databases WHERE name = 'tempdb'

) AS ca1

OUTER APPLY (

SELECT

', ' + COL_NAME(object_id, ic.column_id)

FROM sys.index_columns ic

WHERE

ic.key_ordinal > 0 AND

ic.object_id = i.object_id AND

ic.index_id = i.index_id

ORDER BY

ic.key_ordinal

FOR XML PATH('')

) AS key_cols (key_cols)

OUTER APPLY (

SELECT

', ' + COL_NAME(object_id, ic.column_id)

FROM sys.index_columns ic

WHERE

ic.key_ordinal = 0 AND

ic.object_id = i.object_id AND

ic.index_id = i.index_id

ORDER BY

COL_NAME(object_id, ic.column_id)

FOR XML PATH('')

) AS nonkey_cols (nonkey_cols)

LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON

dps.object_id = i.object_id AND

dps.index_id = i.index_id

LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON

ius.database_id = DB_ID() AND

ius.object_id = i.object_id AND

ius.index_id = i.index_id

LEFT OUTER JOIN (

SELECT

database_id, object_id, MAX(user_scans) AS user_scans,

ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans

FROM sys.dm_db_index_usage_stats WITH (NOLOCK)

WHERE

database_id = DB_ID()

--AND index_id > 0

GROUP BY

database_id, object_id

) AS ius2 ON

ius2.database_id = DB_ID() AND

ius2.object_id = i.object_id

LEFT OUTER JOIN (

SELECT

referenced_object_id, COUNT(*) AS Reference_Count

FROM sys.foreign_keys

WHERE

is_disabled = 0

GROUP BY

referenced_object_id

) AS fk ON

fk.referenced_object_id = i.object_id

WHERE

i.object_id > 100 AND

i.is_hypothetical = 0 AND

i.type IN (0, 1, 2) AND

o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND

(

o.name LIKE @table_name_pattern AND

o.name NOT LIKE 'dtprop%' AND

o.name NOT LIKE 'filestream[_]' AND

o.name NOT LIKE 'MSpeer%' AND

o.name NOT LIKE 'MSpub%' AND

--o.name NOT LIKE 'queue[_]%' AND

o.name NOT LIKE 'sys%'

)

--AND OBJECT_NAME(i.object_id, DB_ID()) IN ('tbl1', 'tbl2', 'tbl3')

ORDER BY

--row_count DESC,

--ius.user_scans DESC,

--ius2.row_num, --user_scans+user_seeks

-- list clustered index first, if any, then other index(es)

db_name, table_name, CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END, index_name

PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 5

Have you tried using the ONLINE = ON option in your index creation statement?

ScottPletcher SSC Guru Points: 100946 More actions · Answer 6

Michael Valentine Jones (4/29/2014)
Have you tried using the ONLINE = ON option in your index creation statement?

I believe OP said that Enterprise Edition was not an option.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 7

As Grant said, it looks like the system might be waiting on file growth. Make an educated guess on how big the Leaf Level of the index is going to be and grow your database at least that much.

Also, if you're in the FULL recovery model, your log file is going to grow a lot on this one. Make a backup just before you're ready to try another rebuild and the set the server to the BULK LOGGED recovery model. Your point-in-time backups will take the hit of not being able to recover to a point in time because, in the BULK LOGGED mode, CREATE INDEX is minimally logged.

Also, what kind of table is this? Is it an "audit" or "order detail" type of table by any chance? Do you have a "Date_Created" column on this table and, regardless of type, is it "temporal" and the old rows are mostly static?

As for the new index, everyone on your team realizes that NCIs are a duplication of data that includes all keys, includes, and the PK columns, right? You sure you REALLY want to make an index that turns out to be 9 columns wide (PK includes in the INCLUDEs)? Are you sure that key lookups are going to hurt that bad? Are you sure that your backups can handle the extra load and that you actually have the time to do the backups each night. Are you really, really sure that you want another index and set of stats to maintain on this relatively large table?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Andy sql SSCrazy Eights Points: 9481 More actions · Answer 8

Maybe not relevant, but.... the column [LID] has datatype [int] with IDENTITY(1,1).

The table contains 2,163,568,622 rows.

Max size of an integer is 2,147,483,647.

I guess it's possible that negative numbers have been used, but otherwise I see a problem!

scott_lotus SSCommitted Points: 1504 More actions · Answer 9

Thank you again for the reply , very helpful.

Grant Fritchey (4/29/2014)
Just what the actual waits are. You already have two that are interesting. PREEMPTIVE_OS_WRITEFILEGATHER is an indication of file growth occurring and your system is waiting on it. So, if you're growing files during this process, it'll certainly slow you down. You'd want to grow those files ahead of time so it's not waiting. You're also seeing PAGEIOLATCH_SH which is flat out an indication you're waiting on the disk for some other process to clear. But those waits seem to be associated with an INSERT query, not your index build. You need to see what the index build itself is waiting on.

Ok will grow the MDF.

ScottPletcher (4/29/2014)
First, please run the queries below, after changing the db name and table name if needed, and post the results:

Results are as follows (unmasked names - apologies for masking before - hope this formatting is legible):

2014-04-30 08:37:19.390DM2LocationNULL[DateTimeInserted][DID]48339384434302014-04-30 00:02:00.940NULL30685.116185009395.1800NULLNULL00[DM2].[dbo].[Location]375672386339383

2014-04-30 08:37:19.390DM2 Location[DID][SentDate][LID], [JID], [Latitude], [Longitude], [Ignition], [Speed], [Course], [Odometer], [PacketTypeID], [PacketStatusID], [DateTimeInserted], [DisplayIcon], [RESETFlag], [LastAckTime], [CellVoltage], [CellChargeVoltage], [GSMStrength], [GSMCID], [GSMLacID], [CoulombCounter], [SourceVoltage]48211032723130302014-04-30 08:18:30.490NULL3.1263729958005593.4500NULLNULL00[DM2].[dbo].[Location]3756723862110326

79DM2 Location_PKLocation12166456225LID NULL5681059650464500791017587392014-04-30 08:37:15.4402014-04-30 00:02:00.9402014-04-30 08:37:19.3402014-04-30 08:37:19.3471PRIMARY480700NULL2014-04-26 00:54:22.787NULLNULL

79DM2_dta_index_Location_100_375672386__K2D_K7_1_3_4_5_6_8_9_10_11_12_13_14_15_16Location172166456225JID, SpeedCourse, DateTimeInserted, DID, DisplayIcon, Ignition, LastAckTime, Latitude, LID, Longitude, Odometer, PacketStatusID, PacketTypeID, RESETFlag, SentDate17193336001017587392014-04-30 08:37:19.283NULLNULL2014-04-30 08:37:19.3471PRIMARY480000NULLNULLNULLNULL

79DM2 _dta_index_Location_5_103671417__K2_K1_K3_4_5_6_7_8_9_10_11_12_13_14Location22166456225JID, LID, DID, Course, DateTimeInserted, DisplayIcon, Ignition, Latitude, Longitude, Odometer, PacketStatusID, PacketTypeID, SentDate, Speed50709411001017587392014-04-30 08:37:19.340NULLNULL2014-04-30 08:37:19.3471PRIMARY480000NULLNULLNULLNULL

79DM2 _dta_index_Location_5_103671417__K3_K1Location32166456225DID, LID NULL29265001017587222014-04-30 08:36:54.340NULLNULL2014-04-30 08:37:19.3471PRIMARY480000NULLNULLNULLNULL

79DM2 _dta_index_Location_did_sent_status_locid_invalidsLocation152166456225DID, SentDate, PacketStatusID, LID NULL11839657001017587222014-04-30 08:37:19.307NULLNULL2014-04-30 08:37:19.3471PRIMARY480000NULLNULLNULLNULL

I used this statement found in this forum which reports same results:

select * from

(select user_seeks * avg_total_user_cost * (avg_user_impact * 0.01) as index_advantage,

migs.* from sys.dm_db_missing_index_group_stats migs) as migs_adv

inner join sys.dm_db_missing_index_groups as mig on migs_adv.group_handle=

mig.index_group_handle

inner join sys.dm_db_missing_index_details as mid on mig.index_handle=mid.index_handle

where statement like '%location%'

order by migs_adv.index_advantage

896.941267673874211032723530702014-04-30 09:48:22.680NULL3.1264128343777193.4500NULLNULL002110327211032621103269375672386[DID][SentDate][LID], [JID], [Latitude], [Longitude], [Ignition], [Speed], [Course], [Odometer], [PacketTypeID], [PacketStatusID], [DateTimeInserted], [DisplayIcon], [RESETFlag], [LastAckTime], [CellVoltage], [CellChargeVoltage], [GSMStrength], [GSMCID], [GSMLacID], [CoulombCounter], [SourceVoltage][DM2].[dbo].[Location]

1255862.02415035339384434302014-04-30 00:02:00.940NULL30685.116185009395.1800NULLNULL003393843393833393839375672386NULL[DateTimeInserted][DID][DM2].[dbo].[Location]

Jeff Moden (4/29/2014)
As Grant said, it looks like the system might be waiting on file growth. Make an educated guess on how big the Leaf Level of the index is going to be and grow your database at least that much.

Good idea.

USE DM2

GO

ALTER DATABASE DM2

MODIFY FILE

(NAME = DM2,

SIZE = 1375GB)

GO

-- current size 1350004736KB

Jeff Moden (4/29/2014)
Also, if you're in the FULL recovery model ...

In SIMPLE MODE.

Jeff Moden (4/29/2014)
Also, what kind of table is this? Is it an "audit" or "order detail" type of table by any chance? Do you have a "Date_Created" column on this table and, regardless of type, is it "temporal" and the old rows are mostly static?

Logs location data , approx 2.5m rows per day.

6 hours a day we execute the following in a bid to reduce the table size.

delete top (3000) from location where -- DID is unallocated

This has been added fairly recently, causing index fragmentation.

Jeff Moden (4/29/2014)
As for the new index, everyone on your team realizes that NCIs are a duplication of data that includes all keys, includes, and the PK columns, right? You sure you REALLY want to make an index that turns out to be 9 columns wide (PK includes in the INCLUDEs)? Are you sure that key lookups are going to hurt that bad? Are you sure that your backups can handle the extra load and that you actually have the time to do the backups each night. Are you really, really sure that you want another index and set of stats to maintain on this relatively large table?

All good questions. Yes we understand the space considerations when adding NCIs. We have an overnight service gathering reporting data. This service timed out on a specific report (location table) after 20 mins if concurrent instances of this reports are running (WCF service appears to allow 5 max) but execute successfully one at a time. The service cursors which i believe is inefficient.

OPTION (MAXDOP 1) has been set to no effect. NOLOCK (which i understand is not best practice) has been tested to no effect (not blocking issues). Query appears to SCAN. Tunning Advisor recommended the index listed above. Independently the dev team came to the same conclusion.

Before attempting to add the index its clearer as a result of this post we should investigate:

- growing the mdf manually.

- rebuild existing indexes.

- amend the report service to allow administrative control over the number of concurrent executions and set to 1 for testing.

- investigate the removal of the cursor / making the query more efficient.

- add a statistics job to table location (disabled some time ago) , ensure this is run often

- analyse the query in tuning advisor again

If the index still needs to be added then:

- grow the MDF to the appropriate size.

- min number of columns

Thank you for taking the time to respond.

Scott

RTaylor2208 SSChampion Points: 13298 More actions · Answer 10

I may have missed some information as this is a long thread but I noticed that you have not set a MAXDOP value when creating your index, if the MAXDOP setting on the server is set to say 1 for example, you will be limiting the index creation statement to only use a single thread.

I have came across this scenario before and setting the MAXDOP = X value in the index creation statement to a higher value to utilise more CPU has dramatically reduced the index creation time.

Be sure though to not set the value to high that it affects the overall performance of the SQL instance.

MCITP SQL 2005, MCSA SQL 2012

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 11

I strongly recommend looking over the execution plan for the query and do your own investigation and tuning. The DTA is notorious for making poor choices at times and you seem to have multiple indexes from the DTA.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

scott_lotus SSCommitted Points: 1504 More actions · Answer 12

Grant Fritchey (4/30/2014)
I strongly recommend looking over the execution plan for the query and do your own investigation and tuning. The DTA is notorious for making poor choices at times and you seem to have multiple indexes from the DTA.

That is an interesting comment. I had read in the past to avoid creating your own indexes because DTA was more likely to do it correctly.

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 13

scott_lotus (4/30/2014)
That is an interesting comment. I had read in the past to avoid creating your own indexes because DTA was more likely to do it correctly.

I would love to know where you read that (assuming it wasn't on Microsoft documentation about the DTA). I really don't know anyone who suggests using the DTA except in cases where there is literally no one who knows anything about SQL Server and you're desperate for some hope of better performance. Some of its recommendations can even be dangerous for performance. The kindest thing I can say about it is, always thoroughly test its recommendations.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 14

@scott_lotus,

I agree with Grant and I believe that you'll find that most of the heavy hitters (on this site, anyway) are very leary of taking DTA at its word. Looking at the width of some of your indexes (all, apparently, DTA recommendations), they appear to be full covering indexes. While that's not necessarily bad, it's also not necessarily good and it may be that less wide indexes with the related key row lookups might not be that bad and could have the advantage of reducing you maintenance windows and table/index footprint without slowing down the related queries in a significant manner.

Of course, if you have the space and the maintenance window has the time, it all might be a moot point.

The reason why I asked what type of table this is is to get a feel for the temporal nature of the table. It would appear that this is, indeed, a temporal log table where the rows are inserted and never updated. It may be that your could benefit greatly from partitioning. Since you have the Standard Edition, that would require a partitioned View rather than a partitioned Table, but the effort may very well be worth the effort. Once automated, it'll save on just about all aspects of the table especially if you do something like partitioning by month and placing the table for each month on separate file groups. It would also give you the hidden advantage of being able to do things like piece-meal restores should that need ever raise its ugly head.

Partitioned Views have the disadvantage of requiring whole row inserts if you want the view to be updateable. You can get by the problem by writing an Instead-Of trigger for INSERTs so that you can still have an identity column on the underlying tables. It may also help your queries through "partition elimination" if one of the main criteria for your query is the date the rows were created. It would also substantially reduce index maintenance time if you truly do write rows and never update them. We could also show you how to "pack" the file groups once you did the empty DID deletes so that you don't have unneccessarily large file groups that reserve a lot of blank space forever.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)