Index spool to physical index for performance

Question

Index spool to physical index for performance

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply

matt.newman Hall of Fame Points: 3910 More actions · Answer 1

Thanks both for the actual plan info versus cached. I had thought the cached plan would update last run param values or first run after the cache was cleaned, not knowing it was just the estimated.

Attached select from plan.

Also, losing the nolocks won't work out all that well based on our current isolation level use and other architecture. Dirty reads are not the biggest threat here and I had not seen other issues arise from the nolock use. Most calls are flag locked with a bit field in the column if we need it, and run quite a few things single threaded, asynchronously. We use explicit locking when needed. I had looked at setting isolation level to snapshot but we do have some need in locking some things, at some times. Plus the biggest thing asking me to do is come into a place I worked at for a year and change the architecture they had been using since 2003 with millions of lines of code. I saw a couple "no-nos" when starting, but until they become reactive problems, I don't have much time by myself to react on everything coded here. SQL 2005 Standard, 1 server to rule them all - no reporting server and reports go from same box as OLTP.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

Suggestions:

Split the query into two. You've got a union all in the middle, break the query there and do two inserts. Preferably into a temp table not a table variable (table variable is going to mess with query costing anywhere it's later used)

There's a fair few index improvements that can be made here.

Index on [TransCharges] (inTChargeOrigTransID, inTChargeTransID) include (inTChargeAccessorialID, nmTChargeAmount)

p.s. Are those IDs identities? Can they ever be less than zero? Same question for inTransInvoiceID.

Index on TransCluster (inTransState, inTransType, dtTransInvoiceDate) include (inTransUID, inTransBillCustID, inTransInvoiceID, inTransBrandID)

Index on TransSteps (inTransStepTransStepTypeID,inTransStepState) include (inTransStepID, inTransStepTransID,inTransStepAppUserID)

As always, test carefully before throwing onto prod, but I think there will be a good improvement from that.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

matt.newman Hall of Fame Points: 3910 More actions · Answer 3

Tried the table variable versus temp table, table variable had given about a 2 minute time cut over temp table. The results that come back are usually only 1 to 30 records.

Tried the break on the insert for the union, part of pushing this to a table variable. Really didn't do anything to improve this. I was hoping the plan generator could get me a better plan for the select and it really had no affect.

The index on TransCharges is what I have been wondering how to approach and putting off due to disk cost of the size on the index for now and padding needed to keep from going nuts on page splits due to a lot of updates on that index. Although, what you suggest would not resort the order of the index frequently enough to really kill on performance and may be what I try out.

IDs are identities and keys. No, they will never be zero. Most are seeded identities. Same, seeded identifier PKEY.

I had not thought to look outside of TransCharges as it was my main focus. Good to get another perspective to make me step back. TransSteps is another couple GB index size 42.4 million record table. TransCluster is an index on Trans that I can't risk changing for this proc. Trans is a 4 GB table that ties into quite a bit that I am not in scope to alter.

I will definitely play on the stage box with the TransCharges index and will kick around TransSteps. Steps may not need to have a bigger fill factor either. Just need that storage space before going to prod though.

Thanks again for the fresh perspective and the execution plan cache stuff. Been tuning off cache for some time - good to better myself now!

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

matt.newman (4/19/2012)
IDs are identities and keys. No, they will never be zero. Most are seeded identities. Same, seeded identifier PKEY.

Then why are there multiple filters for <some table>ID > 0? If the intention is just to filter out nulls (from left joins), then the predicate should be specifically WHERE <some table>ID IS NOT NULL or the left joins should be turned into inner joins.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

matt.newman Hall of Fame Points: 3910 More actions · Answer 5

Ahh, I see. OrigTransID is original Trans ID which could be 0 if the current one is the original and there is no back dated lookup needed.

Sorry, "IDs" are linked to foreign keys are are primary keys but in this case that field is a foreign key that can be set to 0 for being "blank"- no previously linked Identifier.

--------

Index on [TransCharges] (inTChargeOrigTransID - foreign key linked to another tables PKEY identity, can be 0, inTChargeTransID foreign key equal to another tables PKEY identity equal to an actual record) include (inTChargeAccessorialID foreign key linked to another table, nmTChargeAmount)

p.s. Are those IDs identities? Can they ever be less than zero? Same question for inTransInvoiceID. <-- missed that one ID although yes is an identity, and no cannot be less than 0 can be 0 and is not a seeded column but foreign key and likely to be 0 in many cases.

Index on TransCluster (inTransState, inTransType, dtTransInvoiceDate) include (inTransUID pkey of table, inTransBillCustID foreign key, inTransInvoiceID foreign key, inTransBrandID foreign key)

Index on TransSteps (inTransStepTransStepTypeID resolver table foregin key,inTransStepState state is logically deleted 1 or 0) include (inTransStepID, inTransStepTransID,inTransStepAppUserID) - index would not break the table apart too much. TypeID is type table for resolver and state is logically delete state.

matt.newman Hall of Fame Points: 3910 More actions · Answer 6

You're awesome for the help, Gail. Thank you as always, very much. Just putting up some results in case you get as curious as I.

Index took just over 3 minutes to create on test server with no other load.

rowsreserveddataindex_sizeunused -- initial size without indexes

53133319 12553824 KB7525168 KB5020640 KB8016 KB

rowsreserveddataindex_sizeunused -- added the TransCharges index plus the inTChargeState as we always use this for logical deletion and always holds a record.

53133336 14610152 KB7525168 KB7076856 KB8128 KB

rowsreserveddataindex_sizeunused -- added the TransCharges index exactly.

53133336 16666264 KB7525168 KB9132920 KB8176 KB

Execution time for test environment on procedure call just over 23 minutes on initial index creation and 30 minutes after secondary. I would not trust too much on the results as in testing I would guess that with 2 GB mem on the test server that it dumped some stuff from cache to make the second index and then had to pull back from disk. Index spool still going on, for both index additions. I know that the statistics should auto renew and invalidate the proc but I am still running sp_recompile for the proc each time and not clearing the cache with freeprocache just yet, just in case. I also split the union to two inserts again to try and kick the query plan generation to a better standing. I didn't do an include on statistics or statistics io on considering the time taking to test. So really I did the changes but didn't keep a clean pallet each time which could have skewed results. The main goal was getting rid of that index Spool and then moving to other parts.

Basics from "sp_who2 active" running with index update (again, not going to set statistics or other reads yet as it adds overhead and I'm just going for the execution plan here):

CPUTimeDiskIO

1,029,6566,671,065

Went to utilize the missing indexes DMVs. Picked through them and found a close match. Took what I eyeballed as a possibly good match:

create index holyCow on TransCharges

([inTChargeTransID], [inTChargeChgCstType], [inTChargeTranSegmentID], [inTChargeState],

[inTChargeOrigTransID])

include ([inTChargeAccessorialID], [nmTChargeAmount])

^^ index that will improve performance. Note: Fill factor was not sized for the index in testing. The index has potential for page splits from updates. Would need to test correct padding allotment. Size determination does not factor in the padding. This was tested on Stage.

rowsreserveddataindex_sizeunused – initial size without index

53133319 12553824 KB7525168 KB5020640 KB8016 KB

rowsreserveddataindex_sizeunused – after index was added

53133349 14822448 KB7525168 KB7289136 KB8144 KB

Find missing indexes script (a modification of the index script by Jesse Roberge) that I used, for anyone curious:

SELECT

sys.objects.name AS object_name,

sys.dm_db_missing_index_group_stats.user_seeks,sys.objects.type,

partitions.Rows, partitions.SizeMB,

sys.dm_db_missing_index_details.equality_columns AS equality_columns,

sys.dm_db_missing_index_details.inequality_columns AS inequality_columns,

sys.dm_db_missing_index_details.included_columns AS included_columns,

sys.dm_db_missing_index_group_stats.unique_compiles, sys.dm_db_missing_index_group_stats.user_scans,

sys.dm_db_missing_index_group_stats.avg_total_user_cost, sys.dm_db_missing_index_group_stats.avg_user_impact,

sys.dm_db_missing_index_group_stats.last_user_seek, sys.dm_db_missing_index_group_stats.last_user_scan,

sys.dm_db_missing_index_group_stats.system_seeks, sys.dm_db_missing_index_group_stats.system_scans,

sys.dm_db_missing_index_group_stats.avg_total_system_cost, sys.dm_db_missing_index_group_stats.avg_system_impact,

sys.dm_db_missing_index_group_stats.last_system_seek, sys.dm_db_missing_index_group_stats.last_system_scan,

(CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.user_seeks)+CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.unique_compiles))*CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.avg_total_user_cost)*CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.avg_user_impact/100.0) AS Score

FROM

sys.objects

JOIN (

SELECT

object_id, SUM(CASE WHEN index_id BETWEEN 0 AND 1 THEN row_count ELSE 0 END) AS Rows,

CONVERT(numeric(19,3), CONVERT(numeric(19,3), SUM(in_row_reserved_page_count+lob_reserved_page_count+row_overflow_reserved_page_count))/CONVERT(numeric(19,3), 128)) AS SizeMB

FROM sys.dm_db_partition_stats

WHERE sys.dm_db_partition_stats.index_id BETWEEN 0 AND 1 --0=Heap; 1=Clustered; only 1 per table

GROUP BY object_id

) AS partitions ON sys.objects.object_id=partitions.object_id

JOIN sys.schemas ON sys.objects.schema_id=sys.schemas.schema_id

JOIN sys.dm_db_missing_index_details ON sys.objects.object_id=dm_db_missing_index_details.object_id

JOIN sys.dm_db_missing_index_groups ON sys.dm_db_missing_index_details.index_handle=sys.dm_db_missing_index_groups.index_handle

JOIN sys.dm_db_missing_index_group_stats ON sys.dm_db_missing_index_groups.index_group_handle=sys.dm_db_missing_index_group_stats.group_handle

WHERE

sys.dm_db_missing_index_details.database_id=DB_ID()

AND sys.schemas.name LIKE sys.schemas.name

AND sys.objects.name LIKE '%tablename%' /*sys.objects.name*/ /* table name to review */

/*AND partitions.SizeMB < 500*/ /* remove for report conditions - infrequently but heavy use calls */

ORDER BY sys.dm_db_missing_index_group_stats.user_seeks desc

I am still at a bit of an impass on if I should push this index out. For the up-time the server had the seek count was only at 723 but a similar index without inTChargeOrigTransID was at 10,729 seeks. Very low considering the update counts on the indexes:

(No column name)type_descuser_seeksuser_scansuser_lookupsuser_updates

IX_TransChargesCLUSTERED1168900804465038058621804295

IX_TransCharges_1NONCLUSTERED27950021428204

IX_TransCharges_2NONCLUSTERED49485061021428204

PK_TransChargesNONCLUSTERED1074332901016603526

found by running:

select db.name, tbl.name, isnull(ind.name,'NO CLUSTERED INDEX'), ind.type_desc,

user_seeks,user_scans,user_lookups,user_updates,

last_user_seek, last_user_scan, last_user_lookup,

system_seeks, system_scans, system_lookups, system_updates,

last_system_seek, last_system_scan, last_system_lookup, last_system_update

from sys.dm_db_index_usage_stats ust

left join sys.objects tbl on tbl.object_id = ust.object_id

join sys.databases db on db.database_id = ust.database_id

left join sys.indexes ind on tbl.object_id = ind.object_id and ust.index_id = ind.index_id

where

/* user_seeks < user_updates and */

/* user_scans < 100 and */

/* user_updates > 1000 and */

tbl.name = 'tablename' and

/* ind.type_desc = 'clustered' and user_scans > 1000 and */

db.name = db_name(db_id())

order by 8 desc,1,2,3

The index data on existing indexes makes me want to do a create with drop existing on IX_TransCharges_1 which has no included columns and looks like:

/****** Object: Index [IX_TransCharges_1] Script Date: 04/23/2012 10:03:36 ******/

CREATE NONCLUSTERED INDEX [IX_TransCharges_1] ON [dbo].[TransCharges]

(

[inTChargeState] ASC,

[inTChargeChgCstType] ASC,

[inTChargeOrigTransID] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

The confirmation that the desire for a spool is that an index be needed may give me the push to tweak the existing index unless maybe that's not the right path? I have documented the findings in our ticket system for after we get a hardware upgrade.

Thanks again!

matt.newman Hall of Fame Points: 3910 More actions · Answer 7

Thought this WAS attached, but I hit back by accident. Ah well:

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

Looks good. I could easily have missed something when I did the initial analysis, it's a huge query.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Orlando Colamatteo SSC Guru Points: 182276 More actions · Answer 9

opc.three (4/19/2012)
GilaMonster (4/19/2012)
opc.three (4/19/2012)
You can retrieve the actual plan for your session, check out this article for next time:
Viewing the Execution Plan of a Running Query in SQL Server
That gets an estimated plan, not an actual plan. The only way to get an actual execution plan is to run SET STATISTICS XML ON before running the query (which is what management studio does) or to use profiler and get the showplan xml event, but that has to be running before the query starts.
Hmmm...so no way to get the actual plan without running the query to completion? Typing this out I suppose that would make sense otherwise how would the "actual number of rows" be populated...if so, bummer...thanks

I was curious about this one because of the usefulness of being able to grab an actual execution plan vs. the estimated plan on a session other than your own and learned that we have the ability to grab an actual plan in SQL Server 2012 using Extended Events thanks to the efforts of Jonathan Kehayias:

Extended Events Action to collect actual execution plan

I know this thread is for 2005, but I thought it worth mentioning, for future reference and readers of this thread. That said, the actual plan is still only available after the query completes for the reasons I stated above but we have another way to get at it, and I suspect EE will be cheaper than Trace.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

matt.newman Hall of Fame Points: 3910 More actions · Answer 10

One of the reasons I side-tracked to actual plan gratitude was that I was bugging Jason Strate about how he was collecting the parameters just before that, and having trouble in finding where in the cached plans (was not coming from cached plans) that info was coming from (I did not know how to explain what I was fully looking for but Jason was giving me a pointer).

http://www.jasonstrate.com/resources/extended-event-resources/

At my company we have and audit tracking system in place, but I could see how it may be of benefit to set up an extended event session for catching actual execution plans, and mine out parameter values. I am a fan of audit capture to see who ran what with which parameters and when I saw Jason give some presentations for PASS MN using the actual execution plan and also thought of his presentation on extended events, I was looking forward to putting the two together to play around with parameters used in executions.

Though, again, we already have a system where I work to see who ran what with which values sent into the parameters which I am grateful for. Before working here I did not have an audit trail to reproduce some problems. I tune greatly off wait stats using Confio (love it) and can see a spike on wait stats for execution of a stored procedure. I like to link what I see in the data combed from there with the value that were put for the execution command to see why the procedure ran longer. Usually it's a longer date range in a report, but sometimes it's that blasted parameter sniffing issue and I need to get the right parameters to test where the problem is just to reproduce consistently.