Changing NVARCHAR to VARCHAR...

  • Dear Experts,
    As part of tuning a database, I want to reduce the high physical reads by reducing the number of pages, as an administrator. The dev team is also working to improve the code. Since nvarchar takes double the space plus 2 bytes for each row, can I hope to gain any improvement by converting the type to varchar, as we don't intend to use the multi-lingual feature? Thanks in advance.
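    For example, DATALENGTH shows the per-value difference I mean (a minimal sketch, just to illustrate the 2-bytes-per-character point):

        DECLARE @v varchar(50)  = 'Performance tuning',
                @n nvarchar(50) = N'Performance tuning';

        -- DATALENGTH returns storage bytes: 18 for the varchar value, 36 for the nvarchar value
        SELECT DATALENGTH(@v) AS varchar_bytes,
               DATALENGTH(@n) AS nvarchar_bytes;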
  • Sure, you might be able to reduce the total size of your table, and yes, that would theoretically decrease the maximum size of your database and backups, and if you really don't need nvarchar, sure, you don't have to use it.  Will it improve your overall day-to-day performance?  That entirely depends: what makes you think that the nvarchar columns are a significant cause of your high page reads?

  • Arsh - Tuesday, October 31, 2017 12:44 PM

    Dear Experts,
    As part of tuning a database, I want to reduce the high physical reads by reducing the number of pages, as an administrator. The dev team is also working to improve the code. Since nvarchar takes double the space plus 2 bytes for each row, can I hope to gain any improvement by converting the type to varchar, as we don't intend to use the multi-lingual feature? Thanks in advance.

    I normally find that such conversions don't lead to much improvement (read: futile); when I see such suggestions I go looking for schema-level problems. What is the exact cause of the problem you are trying to mitigate?
    😎
    Further, does the server have proper memory allocations, are the IO subsystems up to standard, etc.?

  • Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    64GB is the same as my current work laptop, very meager in this day and age; you may want to look at expanding that and adding buffer pool extensions.
    😎
    What is the ratio of the stale/active data?
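
    If you do go down the buffer pool extension route, it is a single statement (64-bit SQL Server 2014 or later, Standard or Enterprise); the file path and size below are only placeholders:

        -- the BPE file should sit on fast SSD storage, sized at a multiple of max server memory
        ALTER SERVER CONFIGURATION
        SET BUFFER POOL EXTENSION ON
            (FILENAME = 'F:\BPE\BufferPoolExtension.bpe', SIZE = 128 GB);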

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    Depending on the ratio of character-based columns to non-character-based columns, it could require up to twice the space.  Just remember that things like temporal and numeric datatypes not only DON'T require 2 bytes per character, they're usually <= 8 bytes, with exceptions for things like BIGINT, FLOAT, MONEY, and DECIMAL datatypes, etc.

    The real key is that the NVARCHAR may have very little to do with a large number of page I/Os.  It's probably because of a fair bit of poorly written code and poor or missing indexes.

    Also, if you do the simple math, 64GB for 20 processors is a terribly meager 3.2GB per processor.  That's not much, especially in the presence of poorly written code, etc.  My first step would be to get that up to a minimum of 128GB and, if you have the Enterprise Edition of SQL Server, max out the box with memory.  Additional memory is, hands down, the best investment with the highest ROI that you can get.

    Still, it's not a performance panacea.  Even adding impossible amounts of memory might only get you a 2X performance improvement.  Fixing poor code can easily get you somewhere between a very typical 60X and 1000X improvement.  Oddly enough, good code also reduces the memory requirement.
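
    As a sanity check on the page I/O claim, the cumulative wait stats are a reasonable place to start (just a sketch; the numbers accumulate from the last restart or a manual clear):

        SELECT wait_type,
               waiting_tasks_count,
               wait_time_ms,
               signal_wait_time_ms
        FROM   sys.dm_os_wait_stats
        WHERE  wait_type LIKE N'PAGEIOLATCH%'
        ORDER BY wait_time_ms DESC;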

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings...

    IMPLICIT CONVERSIONS...

    SQL Server will let you mix & match CHAR, VARCHAR, NCHAR & NVARCHAR datatypes and never throw any sort of warning. You actually have to check the execution plan and look for the cardinality warnings.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction... Both will kill SARGability. It's no different than putting functions around column predicates.

    Worse, actually... it's easy to spot functions in a WHERE clause. ANSI/Unicode mismatches are much tougher to spot.
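
    If you want to hunt the mismatches down, one rough approach is to search the plan cache for CONVERT_IMPLICIT (a sketch; it is not a cheap query, so run it in a quiet window):

        SELECT TOP (50)
               qs.execution_count,
               qs.total_worker_time / qs.execution_count AS avg_cpu_us,
               st.text       AS sql_text,
               qp.query_plan
        FROM   sys.dm_exec_query_stats qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    st
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
        WHERE  CAST(qp.query_plan AS nvarchar(MAX)) LIKE N'%CONVERT_IMPLICIT%'
        ORDER BY qs.total_worker_time DESC;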

  • Jason A. Long - Tuesday, October 31, 2017 9:48 PM

    Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings... IMPLICIT CONVERSIONS...

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.
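
    A minimal sketch of the two directions (dbo.SomeTable, varchar_col and nvarchar_col are hypothetical names):

        DECLARE @nvalue nvarchar(20) = N'abc',
                @vvalue varchar(20)  = 'abc';

        -- nvarchar outranks varchar in data type precedence, so the varchar side is the one converted
        SELECT * FROM dbo.SomeTable WHERE varchar_col  = @nvalue; -- column converted: the seek is often lost, especially on SQL collations
        SELECT * FROM dbo.SomeTable WHERE nvarchar_col = @vvalue; -- parameter converted: a seek is still possible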

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Tuesday, October 31, 2017 10:10 PM

    Jason A. Long - Tuesday, October 31, 2017 9:48 PM

    Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings... IMPLICIT CONVERSIONS...

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.

    Gail,
    I would have sworn I'd seen a scan result from not adding N'' to an NVARCHAR...
    Then again, I'm not aware of you ever having been wrong... about anything... so I did a quick test and, yup, you're still batting 1.000.
    I stand corrected and I thank you for that much needed clarification.

    EDIT: Just for clarification... Even if I had been 100% correct, I wouldn't ever recommend altering a column's data type without making the requisite changes to the other referencing objects and application code.
    On that particular point, it was never my intention to suggest otherwise. 

    proof is in the execution plan...
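    -- Build a 1,000,000 row test table holding the same random 20-character string as both CHAR(20) and NCHAR(20)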
    WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
        cte_Tally (n) AS (
            SELECT TOP (1000000)
                ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
            FROM
                cte_n3 a CROSS JOIN cte_n3 b
            )
    SELECT
        rn = ISNULL(t.n, 0),
        ascii_text = ISNULL(CAST(CONCAT(v01.let, v02.let, v03.let, v04.let, v05.let, v06.let, v07.let, v08.let, v09.let, v10.let, ' ',
                v11.let, v12.let, v13.let, v14.let, v15.let, v16.let, v17.let, v18.let, v19.let, v20.let) AS CHAR(20)), ''),
        unicode_text = ISNULL(CAST(CONCAT(v01.let, v02.let, v03.let, v04.let, v05.let, v06.let, v07.let, v08.let, v09.let, v10.let, ' ',
                v11.let, v12.let, v13.let, v14.let, v15.let, v16.let, v17.let, v18.let, v19.let, v20.let) AS NCHAR(20)), N'')
        INTO dbo.ascii_unicode_test
    FROM
        cte_tally t
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)) ) v01 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v02 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v03 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v04 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v05 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v06 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v07 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v08 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v09 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v10 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)) ) v11 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v12 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v13 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v14 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v15 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v16 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v17 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v18 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v19 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v20 (let);

        --=======================================================================    

        ALTER TABLE dbo.ascii_unicode_test ADD
            CONSTRAINT pk_ascii_unicode_test PRIMARY KEY CLUSTERED (rn)
            WITH (FILLFACTOR = 100) ON [PRIMARY];

        CREATE NONCLUSTERED INDEX ix_ascii
            ON dbo.ascii_unicode_test (ascii_text)
            INCLUDE (rn, unicode_text)
            WITH (ONLINE = ON, FILLFACTOR = 100) ON [PRIMARY];

        CREATE NONCLUSTERED INDEX ix_unicode
            ON dbo.ascii_unicode_test (unicode_text)
            INCLUDE (rn, ascii_text)
            WITH (ONLINE = ON, FILLFACTOR = 100) ON [PRIMARY];

        SELECT c.object_id, c.name, t.name, c.is_nullable
        FROM sys.columns c JOIN sys.types t ON c.system_type_id = t.system_type_id
        WHERE c.object_id = OBJECT_ID(N'dbo.ascii_unicode_test');
        /*
            object_id    column_name     type_name    is_nullable
            1989582126    rn                bigint    0
            1989582126    ascii_text        char    0
            1989582126    unicode_text    nchar    0
        */

        SELECT SERVERPROPERTY('collation') -- SQL_Latin1_General_CP1_CI_AS
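        -- Compare matched vs. mismatched literal types against each column; check the actual
        -- execution plans for seeks vs. scans and for CONVERT_IMPLICIT wrapped around the column.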

        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text = 'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text = N'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text = 'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text = N'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text LIKE 'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text LIKE N'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text LIKE 'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text LIKE N'Nmfkg%';
        GO

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    Can you post the DDL for the table that's growing the most, including the indices, please?
    😎

  • GilaMonster - Tuesday, October 31, 2017 10:10 PM

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.

    Thank you Gail for the valuable insights. Yes, I agree that the code has to be changed accordingly. Now the real question is: can I hope to get any performance benefit by converting from nvarchar to varchar, if I know that the application is not going to be multi-lingual? It was an offshore project; I spent time analysing it but am not currently working on it. The application faces severe bottlenecks when some reports are run.

  • Eirikur,
    It was an offshore project; I spent time analysing it but am not currently working on it. There were many tables like this in that database. The NVARCHARs are everywhere, and of longer lengths than required. Wherever a character or alphanumeric field is needed, there's an NVARCHAR. Don't the extra pages for each row get loaded into memory unnecessarily? My idea is to find out whether getting rid of NVARCHAR helps in reducing the pages and hence gaining some performance. How am I wrong in thinking so? I agree memory has to be at least doubled, but it's a Standard edition. Thank you.
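
    For what it's worth, this is how I have been taking inventory of the nvarchar/nchar columns to see where the bulk sits (a read-only sketch against the catalog views):

        SELECT  s.name       AS schema_name,
                o.name       AS table_name,
                c.name       AS column_name,
                t.name       AS data_type,
                c.max_length AS max_length_bytes   -- 2 bytes per declared character; -1 means MAX
        FROM    sys.columns c
        JOIN    sys.objects o ON o.object_id = c.object_id
        JOIN    sys.schemas s ON s.schema_id = o.schema_id
        JOIN    sys.types   t ON t.user_type_id = c.user_type_id
        WHERE   t.name IN (N'nvarchar', N'nchar')
          AND   o.type = 'U'
        ORDER BY s.name, o.name, c.name;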
  • Jason A. Long - Wednesday, November 1, 2017 12:25 AM

    EDIT: Just for clarification... Even if I had been 100% correct, I wouldn't ever recommend altering a column's data type without making the requisite changes to the other referencing objects and application code.

    Oh, absolutely agreed.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Arsh - Wednesday, November 1, 2017 3:42 AM

    Now the real question is: can I hope to get any performance benefit by converting from nvarchar to varchar, if I know that the application is not going to be multi-lingual?

    Unlikely. You might get some minor gains, but they're unlikely to be particularly good for the amount of effort and work that this change will require.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • What SQL version are you on? Have you considered compression?
    😎
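
    For reference, sp_estimate_data_compression_savings will size up the benefit per table before anything is changed (the table name below is a placeholder; note that before SQL Server 2016 SP1 data compression needs Enterprise Edition):

        EXEC sys.sp_estimate_data_compression_savings
             @schema_name      = N'dbo',
             @object_name      = N'BigReportingTable',  -- placeholder
             @index_id         = NULL,                  -- all indexes
             @partition_number = NULL,                  -- all partitions
             @data_compression = N'PAGE';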
