Blog Post

Concatenation Truncation – Are your strings being truncated?

,

2017-09-25 21_18_34

 

It wasn’t until recently that i paid attention to truncation during concatenation, but that’s because whenever I used concatenation before it was only using short data lengths so I didn’t see the effects of truncation.

When I first noticed that truncation can occur I was building a Huge piece of dynamic SQL , Don’t worry this was just a setup script designed to be ran just once and then saved off for later use on other servers where required, it’s not like I was going to run this thing every couple of minutes on Production or anything ??

So there I was building this massive VARCHAR(MAX) string and concatenated at various points in my code were Database names of the datatype NVARCHAR(128).

The interesting part was that I was expecting SQL server to use my largest data type – the VARCHAR(MAX) and just concatenate the NVARCHAR(128) values into it

this was not the case – what actually happened was my string of  VARCHAR(MAX) characters being truncated down to an NVARCHAR(4000)!

There is a reason for this and its all to do with Data Type Precedence in this case the NVARCHAR is preceding my VARCHAR unless of course I explicitly convert the NVARCHAR to a VARCHAR.

Normally for me this would go undetected as the length of string that I would normally concatenate is way below 4000 chars but as i said earlier – this was a massive piece of dynamic SQL approx 36000 characters in length.

To illustrate the issue better I have put together some examples which show when truncation can occur.

 

For my NVARCHAR Value I will be using the results of the below query :

SELECT TOP 1
name
FROM sys.databases
ORDER BY name ASC

 

Result: AdventureWorks2014  (18 Characters in Length)

 

I will be using this query throughout the examples so that we return a real result from a table, then we will be concatenating the results of two REPLICATE functions just so that we can keep things tidy for the demonstration (and to save me having to count out 4000 Characters of text ?? )

The results will be inserted into a table so that we can demonstrate the final Datatype used.

 

Example 1:

--Concatenate NVARCHAR(128) , NVARCHAR(4000) and NVARCHAR(4000) 
SELECT TOP 1 
name + REPLICATE(N'A',4000) + REPLICATE(N'A',4000) as MyConcatenation 
INTO Results 
FROM sys.databases 
ORDER BY name ASC
--Check the Length of the Column we are expecting 8018 Chars - we get 4000
SELECT LEN(MyConcatenation) AS MyConcatenation_ColumnLength
FROM Results
--Check the DataLength for the Datatype created - NVARCHAR(4000)
SELECT 
Columns.name AS [ColumnName],
Columns.column_id AS [ID],
Types.name AS [DataType],
CAST(CASE WHEN UserTypes.name IN (N'nchar', N'nvarchar') AND Columns.max_length <> -1 THEN Columns.max_length/2 ELSE Columns.max_length END AS int) AS [Length],
Columns.is_nullable AS [Nullable]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS Columns ON Columns.object_id=tbl.object_id
LEFT OUTER JOIN sys.types AS Types ON Types.user_type_id = Columns.user_type_id
LEFT OUTER JOIN sys.types AS UserTypes ON (UserTypes.user_type_id = Columns.system_type_id 
and UserTypes.user_type_id = UserTypes.system_type_id) or 
((UserTypes.system_type_id = Columns.system_type_id) and (UserTypes.user_type_id = Columns.user_type_id) and (UserTypes.is_user_defined = 0) and (UserTypes.is_assembly_type = 1)) 
WHERE tbl.name= N'Results'
DROP TABLE Results

2017-09-25 20_24_23

 

 

So being Naive I wrongly assumed that this would automatically be converted to an NVARCHAR(MAX) , SQL will not do this for us and rightly so – just because I am concatenating two values that together go beyond the 4000 Character Limit for NVARCHAR (non MAX) SQL is going to simply treat this as an NVARCHAR(4000).

 

 

Example 2:

--Next up Explicit NVARCHAR(MAX) Conversion
SELECT TOP 1 
CAST(name AS NVARCHAR(MAX)) + REPLICATE(N'A',4000) + REPLICATE(N'A',4000) as MyConcatenation 
INTO Results 
FROM sys.databases 
ORDER BY name ASC
--Check the Length of the Column, expecting 8018 and we have 8018
SELECT LEN(MyConcatenation) AS MyConcatenation_ColumnLength
FROM Results
--Check the DataLength for the Datatype created - NVARCHAR(MAX) 
SELECT 
Columns.name AS [ColumnName],
Columns.column_id AS [ID],
Types.name AS [DataType],
CAST(CASE WHEN UserTypes.name IN (N'nchar', N'nvarchar') AND Columns.max_length <> -1 THEN Columns.max_length/2 ELSE Columns.max_length END AS int) AS [Length],
Columns.is_nullable AS [Nullable]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS Columns ON Columns.object_id=tbl.object_id
LEFT OUTER JOIN sys.types AS Types ON Types.user_type_id = Columns.user_type_id
LEFT OUTER JOIN sys.types AS UserTypes ON (UserTypes.user_type_id = Columns.system_type_id and UserTypes.user_type_id = UserTypes.system_type_id) or ((UserTypes.system_type_id = Columns.system_type_id) and (UserTypes.user_type_id = Columns.user_type_id) and (UserTypes.is_user_defined = 0) and (UserTypes.is_assembly_type = 1)) 
WHERE tbl.name= N'Results'
DROP TABLE Results

2017-09-25 20_30_33

 

This time we get the 8018 Characters as expected and we also get our NVARCHAR(MAX) .

 

So What about VARCHAR? how does that behave?

 

Onto Example 3:

 

--IMPLICIT CONVERSION, Truncation to NVARCHAR(4000) will occur 
SELECT TOP 1 
name + REPLICATE('A',4000) + REPLICATE('A',4000) as MyConcatenation 
INTO Results 
FROM sys.databases ORDER BY name ASC
--Check the Length of the Column 
SELECT LEN(MyConcatenation) AS MyConcatenation_ColumnLength
FROM Results
--Check the DataLength for the Datatype created
SELECT 
Columns.name AS [ColumnName],
Columns.column_id AS [ID],
Types.name AS [DataType],
CAST(CASE WHEN UserTypes.name IN (N'nchar', N'nvarchar') AND Columns.max_length <> -1 THEN Columns.max_length/2 ELSE Columns.max_length END AS int) AS [Length],
Columns.is_nullable AS [Nullable]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS Columns ON Columns.object_id=tbl.object_id
LEFT OUTER JOIN sys.types AS Types ON Types.user_type_id = Columns.user_type_id
LEFT OUTER JOIN sys.types AS UserTypes ON (UserTypes.user_type_id = Columns.system_type_id and UserTypes.user_type_id = UserTypes.system_type_id) or ((UserTypes.system_type_id = Columns.system_type_id) and (UserTypes.user_type_id = Columns.user_type_id) and (UserTypes.is_user_defined = 0) and (UserTypes.is_assembly_type = 1)) 
WHERE tbl.name= N'Results'
DROP TABLE Results

 

2017-09-25 20_39_54

 

Interesting….Here is that Data type precedence I mentioned about earlier

 

 

So this time using VARCHAR we can see that Data type precedence occurred converting our two VARCHAR(4000) values to NVARCHAR(4000) and as we can see in the results this has truncated our data.

 

 

So What happens when we explicity convert the NVARCHAR(128) to VARCHAR(MAX)?

 

Example 4:

--Explicit VARCHAR(MAX) Conversion
SELECT TOP 1
CAST(name AS VARCHAR(MAX)) + REPLICATE('A',4000) + REPLICATE('A',4000) as MyConcatenation
INTO Results
FROM sys.databases ORDER BY name ASC
--Check the Length of the Column
SELECT LEN(MyConcatenation) AS MyConcatenation_ColumnLength
FROM Results
 
--Check the DataLength for the Datatype created
SELECT
Columns.name AS [ColumnName],
Columns.column_id AS [ID],
Types.name AS [DataType],
CAST(CASE WHEN UserTypes.name IN (N'nchar', N'nvarchar') AND Columns.max_length <> -1 THEN Columns.max_length/2 ELSE Columns.max_length END AS int) AS [Length],
Columns.is_nullable AS [Nullable]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS Columns ON Columns.object_id=tbl.object_id
LEFT OUTER JOIN sys.types AS Types ON Types.user_type_id = Columns.user_type_id
LEFT OUTER JOIN sys.types AS UserTypes ON (UserTypes.user_type_id = Columns.system_type_id and UserTypes.user_type_id = UserTypes.system_type_id) or ((UserTypes.system_type_id = Columns.system_type_id) and (UserTypes.user_type_id = Columns.user_type_id) and (UserTypes.is_user_defined = 0) and (UserTypes.is_assembly_type = 1))
WHERE tbl.name= N'Results'
 
DROP TABLE Results

 

2017-09-25 20_38_41

 

Total control – 8018 characters and the target datatype and length is correct.

One thing I did find quite frustrating was that I couldn’t see these Implicit conversions in any of the execution plans, the only way I would see them occur was by running the results into a table and then reading out the metadata from the table.

There are also a situation where I find that the above mentioned examples do not occur which I find very odd , here is an example where Data type Precedence occurs BUT truncation does not:

 

DECLARE @String VARCHAR(MAX) = REPLICATE('A',4000)
DECLARE @Name NVARCHAR(128) = 'SQLUNDERCOVER'
 
SELECT @String + @Name AS MyConcatenation
INTO Results;
SELECT LEN(MyConcatenation) AS MyConcatenation_ColumnLength
FROM Results;
 
--Check the DataLength for the Datatype created
SELECT
Columns.name AS [ColumnName],
Columns.column_id AS [ID],
Types.name AS [DataType],
CAST(CASE WHEN UserTypes.name IN (N'nchar', N'nvarchar') AND Columns.max_length <> -1 THEN Columns.max_length/2 ELSE Columns.max_length END AS int) AS [Length],
Columns.is_nullable AS [Nullable]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS Columns ON Columns.object_id=tbl.object_id
LEFT OUTER JOIN sys.types AS Types ON Types.user_type_id = Columns.user_type_id
LEFT OUTER JOIN sys.types AS UserTypes ON (UserTypes.user_type_id = Columns.system_type_id and UserTypes.user_type_id = UserTypes.system_type_id) or ((UserTypes.system_type_id = Columns.system_type_id) and (UserTypes.user_type_id = Columns.user_type_id) and (UserTypes.is_user_defined = 0) and (UserTypes.is_assembly_type = 1))
WHERE tbl.name= N'Results'
DROP TABLE Results
 

 

2017-09-25 21_36_21

 

How Bizzare – this Time  NVARCHAR has taken Precendence and has a data length of MAX so truncation did not occur!  very odd.

 

Thanks for reading ??

 

 

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating