July 2, 2011 at 3:48 am
(This applies to SQL Server 2008 R2 x64)
I am trying to figure out if the high CPU utilisation I'm seeing during ad-hoc T-SQL row inserts is related to a problem with the plan cache.
Don't think it's relevant, but the table I'm inserting into is...
[ID(int, PK)], [TimeStamp1(datetime,PK)], [TimeStamp2(datetime,PK)], [TimeStamp3(datetime,PK)], [TimeStamp4(datetime,PK)], [Data(float)]
My test case is an insert of 10,000 rows, which typically takes a few seconds.
I start my test with 'DBCC FREEPROCCACHE' followed by 'SELECT * FROM [syscacheobjects] ORDER BY usecounts DESC' to verify that my insert query is not known to the engine.
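For clarity, the reset-and-verify step is roughly the following ([MyTable] standing in for the real table name; the WHERE clause is only there to pick out this table's statements):

-- Flush the plan cache so the test starts from a known state
-- (test instance only - this clears cached plans for every database).
DBCC FREEPROCCACHE;

-- Confirm there is no cached plan for the insert yet.
SELECT cacheobjtype, objtype, usecounts, sql
FROM [syscacheobjects]
WHERE sql LIKE '%INSERT INTO [[]MyTable]%'
ORDER BY usecounts DESC;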
I then kick off my 10000 row insert test (C# app via ADO.net) which works in the following way...
Each row inserted by a statement like this...
INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data]) VALUES(0,CONVERT(DateTime, '2011-07-01 23:10:38.115', 120),CONVERT(DateTime, '2011-07-01 22:57:30.115', 120),CONVERT(DateTime, '2011-07-01 22:57:30.115', 120),CONVERT(DateTime, '2011-07-01 22:57:30.115', 120),4.89675543633633E+307);
The app concatenates INSERT statements into a 16 KB string command buffer until it is full, then sends the buffer to the database engine as a single batch. I think you get about 50 row inserts into a 16 KB buffer.
After the test completes, 'SELECT * FROM [syscacheobjects] ORDER BY usecounts DESC' shows the following row at the top, with 10000 'usecounts'. 'cacheobjtype' is 'Compiled Plan' and 'objtype' is 'prepared'
(@1 int,@2 varchar(8000),@3 varchar(8000),@4 varchar(8000),@5 varchar(8000),@6 float(53))INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data]) values(@1,CONVERT([datetime],@2,(120)),CONVERT([datetime],@3,(120)),CONVERT([datetime],@4,(120)),CONVERT([datetime],@5,(120)),@6)
So it seems to me from this evidence alone that the auto-parameterisation has worked perfectly and the query has been effectively cached.
But the fuzzy feeling goes away when I watch perfmon 'SQL Compilations/sec' and 'Batch Requests/sec' during the test. They are both the same (somewhere around 1000/sec). Everything I've read (which now feels like an awful lot for such a narrow subject) says to me that compilations/sec should be ~0 for this test.
So my question: why am I seeing such high compilations/sec when the plan cache seems to have things under control?
I've looked at 'ALTER DATABASE MyDatabase SET PARAMETERIZATION FORCED' but it's done nothing to help (that really felt like a stab in the dark).
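For completeness, the sequence would be something like this ('MyDatabase' standing in for the real name, with a sys.databases check added just to confirm the setting took):

-- Force parameterisation for all statements in the database (the default is SIMPLE).
ALTER DATABASE [MyDatabase] SET PARAMETERIZATION FORCED;

-- Confirm which mode is currently in effect.
SELECT name, is_parameterization_forced
FROM sys.databases
WHERE name = 'MyDatabase';

-- Put it back once it's clear it makes no difference.
ALTER DATABASE [MyDatabase] SET PARAMETERIZATION SIMPLE;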
I know the obvious questions are coming: why not use parameters or stored procedures? I know I could look at table-valued parameters / bulk inserts for large numbers of row inserts, but I just need to get to the bottom of whether there's anything that can be done right now with the application as it is.
James
July 2, 2011 at 1:46 pm
I am not an expert but figured I would kick off the conversation.
I want to know the root cause too, but I will still say that SQL Server is not as capable at handling a long series of INSERT statements as other RDBMSs I have worked with. Table-Valued Parameters are going to be a better move long-term, or, better yet, look at the SqlBulkCopy .NET class. Coming from C# it should be a cinch to refactor the app to leverage either method, but it sounds like you already know that...sorry, I still had to say it 😀
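To give a flavour of the TVP route, the T-SQL side would look something like the sketch below. The type and procedure names are made up and the column definitions are only a guess at your real schema, so treat it as a sketch rather than a drop-in:

-- Hypothetical table type mirroring the target table's columns.
CREATE TYPE dbo.MyTableRow AS TABLE
(
    [Id]         int      NOT NULL,
    [TimeStamp1] datetime NOT NULL,
    [TimeStamp2] datetime NOT NULL,
    [TimeStamp3] datetime NOT NULL,
    [TimeStamp4] datetime NOT NULL,
    [Data]       float    NOT NULL
);
GO

-- Hypothetical procedure that takes a whole batch of rows in one call.
CREATE PROCEDURE dbo.InsertMyTableBatch
    @Rows dbo.MyTableRow READONLY
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data])
    SELECT [Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data]
    FROM @Rows;
END
GO

The C# side then passes the whole batch as a single parameter of SqlDbType.Structured (a DataTable works), so the 10K rows arrive in one round trip against one cached plan.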
Back to root cause...
A couple trains of thought come to mind:
1. Thoughts of needless overhead calling CONVERT so many times
> Can I ask why you bother wrapping your date values in CONVERT functions? SQL Server will do an implicit conversion for you provided your string literals are properly formed dates, which it looks like they are from your INSERT INTO example. When I ran your sample INSERT 10,000 times it took 4 seconds consistently. When I ran it without the CONVERT functions it took 3 seconds consistently.
2. Thoughts of page splits generating high amounts of CPU
> What is the clustered index on the target table and what is its fill factor? (The query sketched just after these questions will show the index list and fill factors in one go.)
> What non-clustered indexes are on the table and what are their fill factors?
> Are you doing a sorted series insert per the clustered index?
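Something like this (table name assumed) will answer the first two questions in one pass:

-- Every index on the target table with its type and fill factor
-- (fill_factor = 0 means the server default, i.e. pages packed essentially full).
SELECT i.name AS index_name,
       i.type_desc,
       i.is_unique,
       i.fill_factor
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.MyTable')
  AND i.type > 0; -- exclude the heap entry if one exists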
As an aside, should you stick with ad hoc SQL, I would stay with a series of INSERT statements. 10K INSERT INTO statements with the CONVERT function in place took 4 seconds; it took 3 seconds when I removed the CONVERT functions. I also tested one INSERT INTO statement with no CONVERT and row constructors (8 seconds for 10 statements of 1K rows each), as well as an INSERT INTO...SELECT...UNION ALL (killed it after 2 minutes for 10K rows). The no-CONVERT and row-constructor forms I timed are sketched below.
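For reference, the two variants were shaped roughly like this (values copied from your earlier example; SQL Server 2008 caps a single VALUES constructor at 1,000 rows, hence 10 statements of 1K rows each in my row-constructor test):

-- Variant 1: single-row insert relying on implicit string-to-datetime conversion.
-- (Caveat: for the datetime type this literal format is still sensitive to
--  SET DATEFORMAT / language settings, whereas CONVERT(..., 120) is not.)
INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data])
VALUES(0, '2011-07-01 23:10:38.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', 4.89675543633633E+307);

-- Variant 2: one insert carrying many rows via a table value constructor.
INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data])
VALUES (1, '2011-07-01 23:10:38.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', 4.89675543633633E+307),
       (2, '2011-07-01 23:10:38.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', '2011-07-01 22:57:30.115', 4.89675543633633E+307);
       -- ...and so on, up to 1,000 row tuples per statement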
EDIT: adjust timings
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 2, 2011 at 1:48 pm
PS This is fresh in my mind due to another spirited thread, but here is your query to look at execution use counts, rewritten to use the current set of DMVs/DMFs:
SELECT cp.bucketid,
cp.refcounts,
cp.usecounts,
cp.size_in_bytes,
cp.memory_object_address,
cp.cacheobjtype,
cp.objtype,
cp.plan_handle,
cp.pool_id,
st.text
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
ORDER BY cp.usecounts DESC
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
July 3, 2011 at 7:00 am
Thanks very much for replying.
I was initially motivated to post because I was worried that the high Compilations/sec was causing the high CPU utilisation during inserts. To me, this behaviour is an anomaly, given the apparent use of the cached plan. Everything I read indicates that each query compilation takes up CPU time, so I'm keen to rule this out as the major contributor to my high CPU usage. I'm quite prepared to believe that the CPU time is actually being taken up by something else (as per your suggestions). It's just that I couldn't see a way of correlating the perfmon Compilations/sec with any event type in SQL Server Profiler. For instance, the 'TSQL' class of events in Profiler allows you to capture 'StmtRecompile', but there is nothing like 'StmtCompile'.
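One cross-check I can try is reading the counters straight from the engine instead of perfmon. If I understand the DMV correctly, the per-second counters are exposed as cumulative values in sys.dm_os_performance_counters, so snapshotting before and after the test and dividing the difference by the elapsed seconds gives the rate (the LIKE is there because the object name prefix differs between default and named instances):

-- Raw cumulative values behind 'SQL Compilations/sec' and 'Batch Requests/sec'.
-- Capture once before and once after the 10000-row test, then diff the two snapshots.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%'
  AND counter_name IN ('SQL Compilations/sec', 'Batch Requests/sec');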
I did find an interesting posting on dba.stackexchange. The answer by 'yrushka' shows that if you capture 'Performance: Performance statistics' and run your test (with an initially empty plan cache), you'll see two 'Performance statistics' rows show up in Profiler when a query is first compiled and added to the plan cache. On my 10,000-row insert test, I do indeed get these rows logged at the point where the database engine sees the query for the first time. The properties of the second 'Performance statistics' row show the auto-parameterised query. No more 'Performance statistics' rows are logged for the rest of the test, so again, this evidence points towards the database engine using the cached plan with no repeated compilations actually occurring.
This leads me to suspect that perfmon Compilations/sec counts the parse and recognition of an ad-hoc, non-parameterised query as a compilation, i.e. it counts as a compilation even if the ad-hoc query is subsequently matched to an auto-parameterised plan in the cache. Maybe the counter would stop showing high values if I changed the ad-hoc queries to be explicitly parameterised. I'll give this a go and see if it makes any difference. I want to get the fundamental usefulness of this counter back.
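If I try the explicitly parameterised form from the T-SQL side first, it would look roughly like this (the app would do the equivalent through SqlCommand parameters):

-- Explicitly parameterised insert; every execution should reuse one prepared plan
-- and, if the theory is right, stop driving the Compilations/sec counter.
EXEC sp_executesql
    N'INSERT INTO [MyTable]([Id],[TimeStamp1],[TimeStamp2],[TimeStamp3],[TimeStamp4],[Data])
      VALUES(@Id, @Ts1, @Ts2, @Ts3, @Ts4, @Data);',
    N'@Id int, @Ts1 datetime, @Ts2 datetime, @Ts3 datetime, @Ts4 datetime, @Data float',
    @Id = 0,
    @Ts1 = '2011-07-01 23:10:38.115',
    @Ts2 = '2011-07-01 22:57:30.115',
    @Ts3 = '2011-07-01 22:57:30.115',
    @Ts4 = '2011-07-01 22:57:30.115',
    @Data = 4.89675543633633E+307;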
On to your suggestions...
Removing the CONVERT()s seems like a good suggestion to me, but I was unable to get any measurable reduction in insert time by removing them from around the datetime literals. I tried upping the row count to 20,000 and still didn't find any measurable time saving. BTW, my 10,000-row insert is about 4 seconds too. I didn't write the app, but I imagine the developer reasoned that giving SQL Server the exact specification of how to parse the datetime literal (style 120) would a) possibly save it parsing time and b) form a formalised safety net for parameter passing. I don't really have an opinion on it though.
Just so I definitely understand your recommendation of using 'a series of INSERT statements' - are you saying that it's good to continue to batch up n INSERT statements in a string command buffer on the client and then despatch that to the database engine? I mean the alternative being to do one row insert per .net ExecuteNonQuery(). I've tried that before, a long time ago, and remember convincing myself it was slower. I've just switched over to that insert model, repeated the 10,000-row insert test, and found that it didn't measurably slow down the inserts. Maybe that's because I'm working with a local SQL Server instance? Maybe the overhead of a non-loopback network connection would start to show with so many 'chatty' statements being sent from client to database engine?
I'll look into the indexes and fill-factors next and may be back for some advice on those soon.
July 5, 2011 at 10:27 am
jimmerthy (7/3/2011)
Just so I definitely understand your recommendation of using 'a series of INSERT statements' - are you saying that it's good to continue to batch up n INSERT statements in a string command buffer on the client and then despatch that to the database engine?
Yes.
I mean the alternative being to do one row insert per .net ExecuteNonQuery().
No, definitely a worse option than sending 10K at a time...and from my testing 10K separate inserts per batch outperformed both one INSERT followed by a 10K-row constructor and one INSERT followed by a 10K-row SELECT...UNION ALL set.
I'll look into the indexes and fill-factors next and may be back for some advice on those soon.
Sure. Please post back the results.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato