Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««123

REPLACE() and NULL Expand / Collapse
Author
Message
Posted Monday, July 12, 2010 12:57 PM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 8:48 AM
Points: 1,304, Visits: 1,672
webrunner (7/12/2010)

I guess my question is why does the REPLACE function replace anything if the string pattern (second parameter) isn't in the string expression (first parameter)?

It doesn't. It checks for NULL before doing any replacing. This is because of the special quality of NULL. As posted before, unknown in -> unknown out.

The problem comes because we are used to thinking of replace as two operations:
1. Look for a search string within the given string.
2. If found, replace with replacement string.

The conditional there allows us to ignore an unknown or invalid third parameter if there's no replacement to be made.

But that's procedural thinking. Declarative thinking would phrase it more like* "The given string with replacement string in place of all instances of search string".

There's no conditional, so all inputs have to be considered in the function. Consequently, since one of the inputs is unknown, the result should be unknown as well.

* or some such; I'm certainly not an expert on declarative syntax.
Post #950987
Posted Monday, July 12, 2010 1:09 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 10:35 AM
Points: 2,278, Visits: 3,056
I personally think the reason the function returns null is because of NULL concatenation math. I think the underlying code uses loops and concatentates the string piece by piece similar to the construct below. Because an empty string has no length the code cant iterate through the string correctly, thus the returned value is NULL. Even if you could determine the length of the string, the NULL concatenation would make the entire string NULL anyway.

I think it looks something like this under the hood. (Obviously the construct is not TSQL, but the construct should be logically similar to this)

DECLARE @i INT,
@LenStr INT,
@Str VARCHAR(8000),
@Search VARCHAR(8000),
@ReplaceWith VARCHAR(8000),
@NewStr VARCHAR(8000)

SET @Str = 'Hello World ->>/ this is coolz World<<-'
SET @LenStr = DATALENGTH(@Str)
SET @Search = 'World'
SET @ReplaceWith = 'Worldz'--change to null to see what happens
SET @i = 1
SET @NewStr = ''

WHILE @i < @LenStr+1
BEGIN
IF SUBSTRING(@Str,@i,DATALENGTH(@Search)) = @Search
BEGIN
SET @NewStr = @NewStr + @ReplaceWith
SET @i = @i + DATALENGTH(@Search)
END
ELSE
BEGIN
SET @NewStr = @NewStr + SUBSTRING(@Str,@i,1)
SET @i = @i + 1
END
END

SELECT @NewStr





My blog: http://jahaines.blogspot.com
Post #950992
Posted Monday, July 12, 2010 1:27 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 1:04 PM
Points: 2,347, Visits: 2,691
sknox (7/12/2010)
webrunner (7/12/2010)

I guess my question is why does the REPLACE function replace anything if the string pattern (second parameter) isn't in the string expression (first parameter)?

It doesn't. It checks for NULL before doing any replacing. This is because of the special quality of NULL. As posted before, unknown in -> unknown out.

The problem comes because we are used to thinking of replace as two operations:
1. Look for a search string within the given string.
2. If found, replace with replacement string.

The conditional there allows us to ignore an unknown or invalid third parameter if there's no replacement to be made.

But that's procedural thinking. Declarative thinking would phrase it more like* "The given string with replacement string in place of all instances of search string".

There's no conditional, so all inputs have to be considered in the function. Consequently, since one of the inputs is unknown, the result should be unknown as well.

* or some such; I'm certainly not an expert on declarative syntax.


Thanks. I think this code illustrates your explanation, if I understand correctly, that if any parameter passed in is NULL, NULL has to be returned for all rows because unknown in -> unknown out.

-- Create a temp table to store the values.
CREATE TABLE #NullTest (pk int, val varchar(15))

-- Insert the values, including one row with NULL.
INSERT INTO #NullTest VALUES (1, 'aaa');
INSERT INTO #NullTest VALUES (2, NULL);
INSERT INTO #NullTest VALUES (3, 'bbb');
INSERT INTO #NullTest VALUES (4, 'ccc');

-- SELECT 1
SELECT pk, val FROM #NullTest

-- SELECT 2
SELECT pk, REPLACE(val, 'zzz', 'ddd')
FROM #NullTest

-- SELECT 3
SELECT pk, REPLACE(val, 'zzz', NULL)
FROM #NullTest

-- SELECT 4
SELECT pk, REPLACE(val, 'zzz', NULL)
FROM #NullTest
WHERE val IS NOT NULL

-- Drop the temp table.
DROP TABLE #NullTest

In fact, SELECTs 3 and 4 return all NULLs even if there are no rows with a NULL value for the val column. Change NULL to 'eee' in the INSERT statement for pk = 2, for example.

What's funny is that even if you filter out the NULL values (as in SELECT 4 above), you still get all NULLs, except in the example above you get 3 NULL rows instead of 4.

Seems like overkill, but honestly I don't understand it 100% so I can't say that doing it any other way would be better.

Thanks again,
webrunner


-------------------
"Operator! Give me the number for 911!" - Homer Simpson

"A SQL query walks into a bar and sees two tables. He walks up to them and says 'Can I join you?'"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html
Post #950998
Posted Monday, July 12, 2010 2:19 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Sunday, April 27, 2014 6:26 AM
Points: 1,521, Visits: 3,036
Adam Haines (7/12/2010)
I personally think the reason the function returns null is because of NULL concatenation math. I think the underlying code uses loops and concatentates the string piece by piece similar to the construct below. Because an empty string has no length the code cant iterate through the string correctly, thus the returned value is NULL. Even if you could determine the length of the string, the NULL concatenation would make the entire string NULL anyway.

But the REPLACE works just fine with empty strings -- those that have zero length. SQL does not consider strings with NULL value as the same thing as empty strings.
Declare @SearchMe       VarChar(50)
Declare @FindMe VarChar(50)
Declare @UseMeInstead VarChar(50)

Set @SearchMe = 'This is a simple sentence.'
Set @FindMe = 'simple'
Set @UseMeInstead = 'dull'
Select #1NormalReplace = Replace(@SearchMe, @FindMe, @UseMeInstead)

Set @SearchMe = 'This is a simple sentence.'
Set @FindMe = 'simple '
Set @UseMeInstead = ''
Select #2ReplaceWithEmpty = Replace(@SearchMe, @FindMe, @UseMeInstead)

Set @SearchMe = ''
Set @FindMe = 'simple '
Set @UseMeInstead = 'non-existent'
Select #3SearchTheEmptyString = Replace(@SearchMe, @FindMe, @UseMeInstead)

Set @SearchMe = 'This is a simple sentence.'
Set @FindMe = ''
Set @UseMeInstead = 'will not be used'
Select #4FindEmpty = Replace(@SearchMe, @FindMe, @UseMeInstead)

Results:
#1NormalReplace
--------------------------
This is a dull sentence.

(1 row(s) affected)

#2ReplaceWithEmpty
--------------------------
This is a sentence.

(1 row(s) affected)

#3SearchTheEmptyString
--------------------------


(1 row(s) affected)

#4FindEmpty
--------------------------
This is a simple sentence.

(1 row(s) affected)
Post #951026
Posted Monday, July 12, 2010 2:50 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 10:35 AM
Points: 2,278, Visits: 3,056
Correct, so my assumption is the code has some control flow logic checks that make the loop exit early if NULL values are detected, but it doesnt make sense to go through the motions, if the end result is going to be NULL.

I suspect something like this is in place.

WHILE @i < @LenStr+1 AND DATALENGTH(@Search) > 0 AND DATALENGTH(@ReplaceWith) > 0

My main point was that even if you were to get past the NULL parameters it would do you no good because NULL cannot concenate with strings to produce a value.




My blog: http://jahaines.blogspot.com
Post #951049
Posted Monday, July 12, 2010 2:56 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Monday, June 16, 2014 9:38 AM
Points: 2,163, Visits: 2,189
Good question, and brings up something I haven't run into before. Thanks!
Post #951053
Posted Wednesday, July 14, 2010 9:41 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, July 24, 2014 4:28 AM
Points: 1,100, Visits: 4,898
Yet another example of NULL causing unexpected results in a query!

(I would say we should stop using NULL but I know we're not supposed to suggest that )

The lesson is, once again, Beware of NULL.
Post #952464
Posted Friday, July 23, 2010 11:51 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, November 25, 2013 9:09 AM
Points: 216, Visits: 118
Use of the below could be argued as an acceptable solution.
if (@SearchMe is not null and @FindMe is not null and @ReplaceMe is not null) Select #1NormalReplace = Replace(@SearchMe, @FindMe, @ReplaceMe)

We have all tried to explain that NULL does not equal 0 or NULL to an end user and often the response I've gotten is "That's STUPID!" This raises the question, What does null really mean? It is more involved than Schrödinger's cat delima. Null brings into question is there a cat or radioactive isotope in the box. My spin is that NULL means "I don't know", or that it has not intentionally been assigned a value by the system or a user. A value of 0 or empty string can be inferred as an intentional assignment. Often we may want to establish a starting point (e.g. shopping cart total). Here a default should be used and not allow nulls. Other times you may have a Yes/No question. Do you use a default? By using a default you have forced an answer to all questions. In fact the answer to the question may be "I don't know" or "I abstain".

In my opinion NULLs serve a purpose. They are the love child of two people arguing semantics.
Post #958146
Posted Tuesday, July 27, 2010 7:10 AM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Yesterday @ 9:14 AM
Points: 8,679, Visits: 9,205
timothy.bates (7/23/2010)
We have all tried to explain that NULL does not equal 0 or NULL to an end user and often the response I've gotten is "That's STUPID!" This raises the question, What does null really mean? It is more involved than Schrödinger's cat delima. Null brings into question is there a cat or radioactive isotope in the box. My spin is that NULL means "I don't know", or that it has not intentionally been assigned a value by the system or a user. A value of 0 or empty string can be inferred as an intentional assignment. Often we may want to establish a starting point (e.g. shopping cart total). Here a default should be used and not allow nulls. Other times you may have a Yes/No question. Do you use a default? By using a default you have forced an answer to all questions. In fact the answer to the question may be "I don't know" or "I abstain".
In my opinion NULLs serve a purpose. They are the love child of two people arguing semantics.


Null in SQL means only one thing: the absence of a value. In an execution context, local variables can fail to have a value (eg when being assigned in a select statement, or when they have been declared but no value has ever been assigned) - if they do, they are null. NULL occurs in a table when there is no value in the database for this column and row. It doesn't of course indicate why no value is present - just that none is.

In particular it doesn't mean that the value is unknown - it may be known but bad code failed to enter it when it should, it may be known but there hasn't been time to enter it yet, it may be known but secret and withheld from the database, it may be be that the attribute is not present because for this row it doesn't exist (eg wife's maiden name for a bachelor, that is inapplicable as opposed to unknown - although in SQL, which doesn't have a separate NULL to cater for this case, it is usually bad practise to have a schema with columns that are sometimes inapplicable), and so on.

It is unfortunate that the ANSI and ISO people persist in using a definition of NULL that claims that it is a value, when it is not a value but rather an indicator that no value is present. I think that that definition has caused some of the confusion that surrounds NULL. It's even more of a pity that some DBMS suppliers (including Microsoft) continue to treat NULLs in expressions by default as values equivalent, depending on context, to a 0 integer or zero length string or whatever seems to fit in the value domain; this has caused most of the confusion over NULL; the one cheering thing on that front is that soon the present default treatment of NULL in expressions in SQL Server will cease to be the default and won't even be able to be selected, only the proper treatment of NULL will be possible.

Reasonable arguments can be made either way for how REPLACE ought to operate when there second parameter matches nothing in the first and the third parameter is NULL. "NULL in means NULL out" (as seen in several earlier comments) isn't one of those arguments - operators like COALESCE and ISNULL are obvious counterexamples to that argument. Personally I like the simplicity of the current behaviour but dislike the hoops I sometimes have to jump through to programme around it - but despite those hoops (and some severe doubts about the complications the current behavious causes to any functional semantics for T-SQL) I think the current behaviour is probably best - KISS usually wins.


Tom
Post #959363
« Prev Topic | Next Topic »

Add to briefcase «««123

Permissions Expand / Collapse