T-SQL Variable vs String Equivalent
Posted Monday, April 8, 2013 2:13 PM


Very true. However, life is shades of gray, never black & white.

Many of the applications that run on my boxes are vendor applications.

Some are vendors that do not even exist any more.

I have a limited staff (2) and resources to manage > 50 production servers.

Given that fact, I only worry about the ones that take a long time for a single run, or a medium time but run many times.

In short, I kill the large elephants but a lion or two might still be out there.

But if they do not bother anyone, I do not worry about them.


Good luck in your endeavors!

John



John Miner
Crafty DBA
www.craftydba.com
Post #1440018
Posted Monday, April 8, 2013 3:54 PM


Heh... there are no shades of gray. Either the code is fast or slow. Whether you choose to work on it or not is a decision.

Anyway, I've seen it on my home machine where the literal is always faster than the variable. That's using 2K5 Developer's Edition (virtually identical to the Enterprise Edition) on a 32-bit box. I'm not there right now or I'd show you the runs from that.

On the 2K5 Enterprise Edition 64-bit box at work, the two pieces of code take turns winning within milliseconds of each other on a million-row table. Note that I used "dump" variables to take the display out of the picture, because I don't want to measure the time to display. The execution plans for the two are identical, including % of batch.

As a bit of a sidebar, using % of batch to figure out which piece of code is more efficient is a very bad thing to do. Even after an actual run, it's still an estimate and I've seen it be 100% incorrect. I even gave Grant Fritchey the example and he included it in his APress book on execution plans.
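This isn't the example from the book, but here's a minimal sketch of the effect (my own illustration, not code from this thread): a recursive CTE gets a tiny estimated cost because the optimizer guesses low on the recursion depth, so "% of batch" makes it look nearly free next to a set-based equivalent, even though the clock says otherwise.

SET STATISTICS TIME ON;

-- Query 1: recursive CTE counting to a million. Its estimated cost (and so
-- its "% of batch") is tiny because the optimizer guesses low on recursion depth.
WITH cte AS
(
 SELECT 1 AS N
 UNION ALL
 SELECT N + 1 FROM cte WHERE N < 1000000
)
SELECT MAX(N) FROM cte
OPTION (MAXRECURSION 0);

-- Query 2: set-based row source. Its estimated cost dwarfs the CTE's in the plan,
-- yet STATISTICS TIME usually shows it finishing far sooner.
SELECT MAX(N)
  FROM (SELECT TOP 1000000
               N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
          FROM master.sys.all_columns ac1
         CROSS JOIN master.sys.all_columns ac2) t;

SET STATISTICS TIME OFF;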

Trying to test such problems with only 2 rows won't get anyone anywhere. It's easy to make a million row test table for such things in seconds. Here's some code that does just such a thing for this problem.

-- DROP TABLE tempdb.dbo.Table7
-- Build a million-row test table with a roughly 50/50 split of 'Yes'/'No'.
SELECT TOP 1000000
        PK_Value    = IDENTITY(INT,1,1)
      , Field_Value = CASE WHEN ABS(CHECKSUM(NEWID()))%2 = 0 THEN 'No' ELSE 'Yes' END
   INTO tempdb.dbo.Table7
   FROM master.sys.all_columns ac1
  CROSS JOIN master.sys.all_columns ac2
;
-- Clustered PK on the surrogate key...
ALTER TABLE tempdb.dbo.Table7
  ADD PRIMARY KEY CLUSTERED (PK_Value)
;
-- ...and a nonclustered index on the column being filtered.
CREATE INDEX IX_Table7_Field_Value
    ON tempdb.dbo.Table7 (Field_Value)
;

Here's the code I used to test with. Like I said, I used variables to take the display time out of the picture. It's also worth noting that you should never use SET STATISTICS when testing User Defined Functions, because it adds a huge amount of time to Scalar Functions in particular.

DECLARE @PK_Value INT
      , @Field_Value CHAR(3) -- "dump" variables to absorb the output
;
SET STATISTICS TIME ON
;
DECLARE @Yes VARCHAR(3)
SELECT @Yes = 'Yes'
;
-- Test 1: filter on the variable.
SELECT @PK_Value    = PK_Value
     , @Field_Value = Field_Value
  FROM tempdb.dbo.Table7
 WHERE Field_Value = @Yes
;
-- Test 2: filter on the literal.
SELECT @PK_Value    = PK_Value
     , @Field_Value = Field_Value
  FROM tempdb.dbo.Table7
 WHERE Field_Value = 'Yes'
;
GO
SET STATISTICS TIME OFF
;
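If you do need to time a scalar UDF, a simple workaround (just a sketch; dbo.SomeScalarFunction is a hypothetical stand-in, not a function from this thread) is to bracket the run with GETDATE() instead of using SET STATISTICS:

DECLARE @StartTime DATETIME, @Bitbucket INT;
SELECT @StartTime = GETDATE();
SELECT @Bitbucket = dbo.SomeScalarFunction(PK_Value) -- hypothetical scalar UDF
  FROM tempdb.dbo.Table7;
SELECT DATEDIFF(ms, @StartTime, GETDATE()) AS ElapsedMS;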



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1440040
Posted Tuesday, April 9, 2013 8:18 PM
On my desktop (2k8) both queries perform equally:

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 141 ms, elapsed time = 148 ms.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 156 ms, elapsed time = 148 ms.

Times may vary from run to run by a couple of ms, but they are always in line with each other.

That's probably because the "generic" execution plan chosen for the @Yes version is identical to the "sniffing" plan.
Both are cached and reused.
So SQL Server effectively runs the same code against the same set of data - no difference is to be expected.
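If anyone wants to confirm the reuse, here's a quick sketch against the plan cache DMVs (run the test batch a couple of times first):

SELECT cp.objtype, cp.usecounts, st.text
  FROM sys.dm_exec_cached_plans cp
 CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
 WHERE st.text LIKE '%Table7%'
   AND st.text NOT LIKE '%dm_exec_cached_plans%'; -- exclude this query itself
-- usecounts > 1 on a plan confirms it's being reused between runs.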

But that's specific to this particular table structure and data distribution.

If I change the cardinality by using this:
,       Field_Value = CASE WHEN ABS(CHECKSUM(NEWID()))%20 = 1 THEN 'Yes' ELSE 'No' END

then the literal constant outperforms the variable.
It's still a < 5 ms difference on my machine, but with a total execution time of around 20 ms it's noticeable.

In the reverse case (90% 'Yes' and 10% 'No') the times are basically the same:
 SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 289 ms.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 297 ms, elapsed time = 287 ms.

....

SQL Server Execution Times:
CPU time = 281 ms, elapsed time = 297 ms.

(1 row(s) affected)

SQL Server Execution Times:
CPU time = 312 ms, elapsed time = 302 ms.

Post #1440630
Posted Tuesday, April 9, 2013 8:27 PM
sqlpadawan_1 (4/8/2013)
Imagine a SELECT in a proc that returns active widgets; the WHERE is always going to be the same. So do you code your WHERE clause as WHERE Active = 'YES' or as WHERE Active = @Yes? If I can consistently prove that the variable code uses more CPU and is slower, then in my mind it's an open and shut case. What does the variable declaration buy you? I'm either lazy or efficient (probably lazy), but in my mind this just clouds the code.


If you are lazy (good on you!), create a view ActiveWidgets with the hardcoded literal and use it in all relevant procedures.
Be careful not to put any calculations or conversions into that view; keep it to just the row filter (and possibly a limited column list).
Otherwise you're going to create horrible performance issues.
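Something like this, as a sketch (dbo.Widgets and its columns are made up for illustration):

CREATE VIEW dbo.ActiveWidgets
AS
SELECT WidgetID, WidgetName   -- limited column list; no calculations or conversions
  FROM dbo.Widgets            -- hypothetical base table
 WHERE Active = 'YES';        -- the literal is hardcoded once and reused everywhere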
Post #1440631
Posted Wednesday, April 10, 2013 3:15 AM


j.miner (4/8/2013)
Maybe I'm misunderstanding param sniffing, but isn't the optimizer going to generate a plan and keep it until something changes (index rebuild/reorg, stats update, table records added/deleted, etc.), and then generate a new plan based on the updated objects? Since the variable value never changes, param sniffing won't come into play here?

It's got nothing to do with having the wrong plan in cache, but everything to do with the estimated row counts.

When you use a literal, SQL Server estimates the row count from the statistics histogram. When using a local variable, SQL Server can't sniff a value, so it has no choice but to do a thumb suck from the density vector: row count divided by the number of distinct values in the column. With two distinct values that's row count / 2; with the three distinct values in the example below it's row count / 3.

Consider this example:

I have a table with 100,000 rows. Let's say 90% of the rows contain the value 'Yes'. When using a literal the estimated number of rows will be 90,000. When using a variable the estimated number of rows is 33,333.3

The following code illustrates this:

CREATE TABLE #Test (ID INT, Value VARCHAR(3));

-- Generate 100,000 sequential IDs.
INSERT INTO #Test (ID)
SELECT TOP 100000
       ROW_NUMBER() OVER (ORDER BY a.object_id)
  FROM sys.all_columns a
 CROSS JOIN sys.all_columns b;

-- Skew the distribution: 5% 'No', 5% 'Mbe', 90% 'Yes' (3 distinct values).
UPDATE #Test SET Value = 'No'  WHERE ID <= 5000;
UPDATE #Test SET Value = 'Mbe' WHERE ID <= 10000 AND ID > 5000;
UPDATE #Test SET Value = 'Yes' WHERE ID > 10000;

DECLARE @Yes VARCHAR(3) = 'Yes';

-- Capture the actual plans so the estimated row counts can be compared.
SET STATISTICS XML ON;

SELECT * FROM #Test WHERE Value = 'Yes'; -- histogram estimate: ~90,000 rows
SELECT * FROM #Test WHERE Value = @Yes;  -- density estimate: 100,000 / 3 = 33,333.3 rows

SET STATISTICS XML OFF;

DROP TABLE #Test;
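If you comment out the DROP TABLE, you can also see where the 33,333.3 guess comes from by checking the density vector of the auto-created statistics (a sketch; assumes auto-create statistics is on, and the statistics name is system-generated):

-- Find the system-generated statistics name on the Value column...
SELECT s.name
  FROM tempdb.sys.stats s
 WHERE s.object_id = OBJECT_ID('tempdb..#Test');

-- ...then plug it into this. "All density" should show 0.3333333 (3 distinct values),
-- and 100,000 * 0.3333333 = 33,333.3.
-- DBCC SHOW_STATISTICS ('tempdb..#Test', <statistics name here>) WITH DENSITY_VECTOR;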





The SQL Guy @ blogspot

@SeanPearceSQL

About Me
Post #1440709
Posted Wednesday, April 10, 2013 3:36 AM


If you examine the two plans in the first post of this thread, the first one (for the query with a constant predicate) has been considered for simple parameterization (WHERE [field_value]=@1). Simple parameterization promotes plan re-use: you probably wouldn't want two separate plans cached for WHERE [field_value] = 'Yes' and WHERE [field_value] = 'No' unless the distribution statistics were wildly different. There's a good chance that SQL Server will use the same plan for both queries.
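As an aside (a sketch, not something from the posts above): if you do want value-specific estimates while keeping the variable, OPTION (RECOMPILE) lets the optimizer embed the variable's runtime value at compile time, so the estimate comes from the histogram just as it does for a literal:

DECLARE @Yes VARCHAR(3) = 'Yes';
SELECT PK_Value, Field_Value
  FROM tempdb.dbo.Table7
 WHERE Field_Value = @Yes
OPTION (RECOMPILE); -- recompiled per execution with @Yes treated as a known value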

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1440715