SQLServerCentral Article

The Case of the Space at the End

,

As DBA's and developers, we are not always present at the genesis of an object or system. We initially spend a lot of time familiarizing ourselves with not only the database, instance structures and their ecosystem, but the context of the data within our gilded vaults of data. You may think, "I know where the database is, I understand the system it's installed on, I have the script that created this object, and I know what the data means, so what else is there to know?" If you're familiar with the deep, dark inner workings of SQL Server, a lot.

Recently, an attempt was made to replace an existing ETL process with a new one, shiny new software and all. The team implementing this new system was having trouble replicating the exact data generated by the existing process. More specifically, the original process was leaving trailing spaces at the end of varchar columns in a table. They had created a temporary replica of the table but couldn't produce the same results. It's important to note: I don't know HOW they created them, but we'll get to that very shortly.

Their support couldn't figure it out, so it was suggested that they consult their local, friendly DBA. Being a semi-competent DBA, the first thing I checked was the default setting for ANSI_PADDING on the database. Nope, ANSI_PADDING was set to off. In retrospect, that answer should have been obvious, as both tables existed on the same database, but functioned differently. So how come one table would pad the values and the other wouldn’t?

Some of you with more advanced knowledge already know the answer. More importantly, if you're like me, and couldn't think of an answer immediately, hopefully the solution to this problem will help you solve yours.

For reference:

  • SSMS Version: Microsoft SQL Server Management Studio 13.0.16106.4
  • Instance Version: Microsoft SQL Server 2016 (SP1) - 13.0.4001.0 (X64)
  • Database Collation: SQL_Latin1_General_CP1_CI_AS

Let's do some detective work and drill down into a very important setting.

First things first - Let's run this statement and verify ANSI_PADDING is off by default in this database

SELECT DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiPaddingEnabled');

Returned 0. Definitely off! Always good to double check.

Now let's create a table from SSMS, changing nothing and making no alterations to the ANSI_PADDING setting. We'll create char and varchar columns. Remember, char should pad the values to the full length with spaces. That will be important to remember.

IF EXISTS (SELECT * FROM sys.tables where name = 'TEST_SSMS_Default')
DROP TABLE [dbo].[TEST_SSMS_Default]
GO
CREATE TABLE [dbo].[TEST_SSMS_Default]
(
[ID]                 INT IDENTITY(1,1)
,[Column_char]       CHAR(10)
,[Column_varchar]    VARCHAR(20)
) ON [PRIMARY]
GO

Now let's add some data, and see what happens:

INSERT INTO [dbo].[TEST_SSMS_Default]
       ([Column_char], [Column_varchar])
VALUES
('Adam', 'St. Pierre'),
('Adam      ', 'St. Pierre     ');

SELECT
       [ID]
       ,[Column_char]
       ,REPLACE ([Column_char], ' ', '*')       AS [Column_char_Show_Spaces]
       ,LEN ([Column_char])                     AS [Column_char_Len]
       ,DATALENGTH ([Column_char])              AS [Column_char_DataLength]
       ,[Column_varchar]
       ,REPLACE ([Column_varchar], ' ', '*')    AS [Column_varchar_Show_Spaces]
       ,LEN([Column_varchar])                   AS [Column_varchar_Len]
       ,DATALENGTH ([Column_varchar])           AS [Column_varchar_DataLength]
FROM
       [dbo].[TEST_SSMS_Default]

Here are our results:

The first record had no trailing spaces added in the INSERT statement, while the second one did for our varchar column value. Two things of note:

  • LEN() and DATALENGTH() return different values for length. LEN() will include only characters, whereas DATALENGTH() will include the trailing spaces (Or in the case of char, the spaces used to pad the value).
  • We can see that the varchar column preserved the trailing spaces in our second row, without making any explicit statement and despite the fact that the default value is OFF for the database.

We can confirm that ANSI_PADDING is, in fact, turned on at the table level:

SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] AS [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              AND c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_SSMS_Default')

So how did that happen with the database default value set to OFF? Well, as it turns out, SSMS has certain default settings you can toggle on or off for every query execution, and one of them is, you guessed it:

Great! So now we know how running a query in SSMS created a table with ANSI_PADDING set to ON. The above screenshot is a capture of the default settings for SSMS. Note that if you change this, it won't take effect until you restart the program.

Now that we know how it got there, can we override the SSMS setting? We sure can! Let's create a new table with padding off, then turn it back on, insert the exact same values, and verify the output is what we expect.

IF EXISTS (SELECT * FROM sys.tables where name = 'TEST_ANSI_OFF')
DROP TABLE  [dbo].[TEST_ANSI_OFF]
GO
SET ANSI_PADDING OFF;
CREATE TABLE [dbo].[TEST_ANSI_OFF]
(
       [ID]                 INT IDENTITY(1,1)
       ,[Column_char]       CHAR(10)
       ,[Column_varchar]    VARCHAR(20)
       ) ON [PRIMARY]
GO

SET ANSI_PADDING ON;
INSERT INTO [dbo].[TEST_ANSI_OFF]
       ([Column_char], [Column_varchar])
VALUES
       ('Adam', 'St. Pierre'),
       ('Adam      ', 'St. Pierre     ');
SELECT
       [ID]
       ,[Column_char]
       ,REPLACE ([Column_char], ' ', '*')       AS [Column_char_Show_Spaces]
       ,LEN ([Column_char])                     AS [Column_char_Len]
       ,DATALENGTH ([Column_char])              AS [Column_char_DataLength]
       ,[Column_varchar]
       ,REPLACE ([Column_varchar], ' ', '*')    AS [Column_varchar_Show_Spaces]
       ,LEN([Column_varchar])                   AS [Column_varchar_Len]
       ,DATALENGTH ([Column_varchar])           AS [Column_varchar_DataLength]
FROM
       [dbo].[TEST_ANSI_OFF]

SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] as [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              and c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_ANSI_OFF')

We can see from the second set of results that ANSI_PADDING is indeed off and our varchar column now doesn't retain the trailing spaces, as we expected.

But what's this?! Our char column, despite having a fixed length of 10, no longer has trailing spaces either! Which begs the question: Can we alter an individual column's ANSI_PADDING flag after it's been created? Let's give it a try!

SET ANSI_PADDING ON;
ALTER TABLE [dbo].[TEST_ANSI_OFF]
ALTER COLUMN [Column_char] CHAR(10) NULL;
GO
SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] as [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              and c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_ANSI_OFF')

Sure enough, Column_char now has ANSI_PADDING turned on. What happens when we add another value? Let's make sure ANSI_PADDING is turned ON too.

SET ANSI_PADDING ON;
INSERT INTO [dbo].[TEST_ANSI_OFF]
([Column_char], [Column_varchar])
VALUES
('John      ', 'Smith     ');
SELECT
       [ID]
       ,[Column_char]
       ,REPLACE ([Column_char], ' ', '*')       AS [Column_char_Show_Spaces]
       ,LEN ([Column_char])                     AS [Column_char_Len]
       ,DATALENGTH ([Column_char])              AS [Column_char_DataLength]
       ,[Column_varchar]
       ,REPLACE ([Column_varchar], ' ', '*')    AS [Column_varchar_Show_Spaces]
       ,LEN([Column_varchar])                   AS [Column_varchar_Len]
       ,DATALENGTH ([Column_varchar])           AS [Column_varchar_DataLength]
FROM
       [dbo].[TEST_ANSI_OFF]

The char column now has its values padded for all existing values as well as our new one, and the varchar column functions exactly the same.

Now let’s look at what happens if we try to change the column back with ANSI_PADDING OFF:

SET ANSI_PADDING OFF;
ALTER TABLE [dbo].[TEST_ANSI_OFF]
ALTER COLUMN [Column_char] CHAR(10) NULL;
GO
SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] as [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              and c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_ANSI_OFF')

Well that’s odd. It won’t let us turn it off. Maybe it’s because of the existing values already in the table? Let’s truncate and find out.

SET ANSI_PADDING OFF;
TRUNCATE TABLE [dbo].[TEST_ANSI_OFF]
GO
ALTER TABLE [dbo].[TEST_ANSI_OFF]
ALTER COLUMN [Column_char] CHAR(10) NULL;
GO
SELECT * FROM  [dbo].[TEST_ANSI_OFF];
SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] as [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END
       AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              and c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_ANSI_OFF')

So even with no values in the table, once you turn ON ANSI_PADDING, you can’t turn it off. Note that this holds true for the varchar column as well.

So buyer beware! There is one other important thing to remember about ANSI standards when considering whether or not to turn padding on for existing tables, or off for new ones. From the KB article:

https://support.microsoft.com/en-us/help/316626/inf-how-sql-server-compares-strings-with-trailing-spaces

SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, , General rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.

The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs. Because the purpose of the LIKE predicate, by definition, is to facilitate pattern searches rather than simple string equality tests, this does not violate the section of the ANSI SQL-92 specification mentioned earlier.

In a nutshell, this means that if you have existing columns where padding is on and perform certain operations where the padding is off, you could encounter issues depending on your operation. It also doesn’t mention DATALENGTH(), in which the padding DOES matter.

I had originally tried to write a comparison using variable declarations, one declared with the padding ON and another with the padding OFF, but only one option will be valid in a single session. Good to know!

Let’s set up an example table with some values:

IF EXISTS (SELECT * FROM sys.tables where name = 'TEST_ANSI_PADDING')
       DROP TABLE  [dbo].[TEST_ANSI_PADDING]
GO
SET ANSI_PADDING OFF;
CREATE TABLE [dbo].[TEST_ANSI_PADDING]
(
       [ID]                 INT IDENTITY(1,1)
       ,[PADDING_OFF]       CHAR(10)
) ON [PRIMARY]
GO
SET ANSI_PADDING ON;
ALTER TABLE [dbo].[TEST_ANSI_PADDING]
ADD [PADDING_ON] CHAR(10);
GO
SELECT
       c.[name]  AS [Column_Name]
       ,t.[name] as [Column_Type]
       ,CASE c.is_ansi_padded
              WHEN 1 THEN 'On'
              ELSE 'Off'
       END
       AS [ANSI_PADDING]
FROM
       sys.[all_columns] c
       INNER JOIN
       sys.[types] t ON
              c.[system_type_id] = t.[system_type_id]
              and c.[user_type_id] = t.[user_type_id]
WHERE
       c.[object_id] = object_id('TEST_ANSI_PADDING')

INSERT INTO [dbo].[TEST_ANSI_PADDING]
       ([PADDING_OFF], [PADDING_ON])
VALUES
       ('Adam      ', 'Adam      ' );
SELECT
       [ID]
       ,[PADDING_OFF]
       ,REPLACE ([PADDING_OFF], ' ', '*') AS [PADDING_OFF_Show_Spaces]
       ,LEN ([PADDING_OFF])              AS [PADDING_OFF_Len]
       ,DATALENGTH ([PADDING_OFF])       AS [PADDING_OFF_DataLength]
       ,[PADDING_ON]
       ,REPLACE ([PADDING_ON], ' ', '*') AS [PADDING_ON_Show_Spaces]
       ,LEN([PADDING_ON])                AS [PADDING_ON_Len]
       ,DATALENGTH ([PADDING_ON])        AS [PADDING_ON_DataLength]
FROM
       [dbo].[TEST_ANSI_PADDING]

Now let’s run some comparisons:

SET ANSI_PADDING OFF;
SELECT
       'LEN() ' AS [Operation]
       ,CASE
              WHEN (LEN(PADDING_OFF) = LEN(PADDING_ON)) THEN
              'OK'
              ELSE
              'Failed!'
       END AS [Result]
FROM
       [dbo].[TEST_ANSI_PADDING]
UNION ALL
SELECT
       'DATALENGTH() ' AS [Operation]
       ,CASE
              WHEN (DATALENGTH(PADDING_OFF) = DATALENGTH(PADDING_ON)) THEN
              'OK'
              ELSE
              'Failed!'
       END AS [Result]
FROM
       [dbo].[TEST_ANSI_PADDING]
UNION ALL
SELECT
       ' =  Comparison:' AS [Operation]
       ,CASE
              WHEN (PADDING_OFF = PADDING_ON) THEN
              'OK'
              ELSE
              'Failed!'
       END AS [Result]
FROM
       [dbo].[TEST_ANSI_PADDING]
UNION ALL
SELECT
       ' LIKE, with PADDING_ON on the LEFT side of the operation:' AS [Operation]
       ,CASE
              WHEN (PADDING_ON LIKE PADDING_OFF) THEN
              'OK'
              ELSE
              'Failed!'
       END AS [Result]
FROM
       [dbo].[TEST_ANSI_PADDING]
UNION ALL
SELECT
       ' LIKE, with PADDING_ON on the RIGHT side of the operation:' AS [Operation]
       ,CASE
              WHEN (PADDING_OFF LIKE PADDING_ON) THEN
              'OK'
              ELSE
              'Failed!'
       END AS [Result]
FROM
       [dbo].[TEST_ANSI_PADDING]

And of course, our outputs for our simple comaprisions are exactly what we’d expect:

Take Aways

Here are a few things to know:

  • The default settings of your tools (SSMS or otherwise) matter, and can impact future queries against an object after the time of creation.
  • ANSI_PADDING can be turned ON AFTER creation, but not OFF.
  • Variable declarations can also be set with padding ON or OFF, but not BOTH within the same session.
  • Depending on how the column is used you can conceivably save some extra coding by not storing unneeded trailing spaces on char columns, provided you know what operations will be performed against this data.. It depends!

I'm uncertain if having the ANSI_PADDING OFF will net you any savings on space for char columns, though. I'll leave that to someone who's better at generating large sample sets to investigate.

I hope you have found this article interesting and informative. ANSI_PADDING is just one setting that matters at the time of creation, but there are others - Knowledge of HOW your tools create objects is just as important as WHAT you create.

Adam wrote this article to try ANSI if he can PAD his resume…. On.

Rate

4.83 (24)

You rated this post out of 5. Change rating

Share

Share

Rate

4.83 (24)

You rated this post out of 5. Change rating