Changing NVARCHAR to VARCHAR...

  • Dear Experts,
    As part of tuning a database, I want to reduce the high physical reads by reducing the number of pages, as an administrator. The dev team is also working to improve the code. Since nvarchar takes double the space plus 2 bytes for each row, can I hope to gain any improvement by converting the type to varchar, as we don't intend to use the multi-lingual feature? Thanks in advance.
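    For example, DATALENGTH shows the per-value difference I mean (a minimal sketch, just to illustrate the 2-bytes-per-character point):

        DECLARE @v varchar(50)  = 'Performance tuning',
                @n nvarchar(50) = N'Performance tuning';

        -- DATALENGTH returns storage bytes: 18 for the varchar value, 36 for the nvarchar value
        SELECT DATALENGTH(@v) AS varchar_bytes,
               DATALENGTH(@n) AS nvarchar_bytes;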
  • Sure, you might be able to reduce the total size of your table, and yes, that would theoretically decrease the maximum size of your database and backups, and if you really don't need nvarchar, sure, you don't have to use it.  Will it improve your overall day-to-day performance?  That entirely depends: what makes you think that the nvarchar columns are a significant cause of your high page reads?

  • Arsh - Tuesday, October 31, 2017 12:44 PM

    Dear Experts,
    As part of tuning a database, I want to reduce the high physical reads by reducing the number of pages, as an administrator. The dev team is also working to improve the code. Since nvarchar takes double the space plus 2 bytes for each row, can I hope to gain any improvement by converting the type to varchar, as we don't intend to use the multi-lingual feature? Thanks in advance.

    I normally find that such conversions don't lead to much improvement (read: futile); when I see such suggestions I go looking for schema-level problems. What is the exact cause of the problem you are trying to mitigate?
    😎
    Further, does the server have proper memory allocations, are the IO subsystems up to standard, etc.?

  • Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    64GB is the same as my current work laptop, very meager in this day and age; you may want to look at expanding that and adding buffer pool extensions.
    😎
    What is the ratio of the stale/active data?
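
    If you do go down the buffer pool extension route, it is a single statement (64-bit SQL Server 2014 or later, Standard or Enterprise); the file path and size below are only placeholders:

        -- the BPE file should sit on fast SSD storage, sized at a multiple of max server memory
        ALTER SERVER CONFIGURATION
        SET BUFFER POOL EXTENSION ON
            (FILENAME = 'F:\BPE\BufferPoolExtension.bpe', SIZE = 128 GB);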

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    Depending on the ratio of character-based columns to non-character-based columns, it could require up to twice the space.  Just remember that things like temporal and numeric datatypes not only DON'T require 2 bytes per character, they're usually <= 8 bytes, with exceptions for things like BIGINT, FLOAT, MONEY, and DECIMAL datatypes, etc.

    The real key is that the NVARCHAR may have very little to do with a large number of page I/Os.  It's probably because of a fair bit of poorly written code and poor or missing indexes.

    Also, if you do the simple math, 64GB for 20 processors is a terribly meager 3.2GB per processor.  That's not much, especially in the presence of poorly written code, etc.  My first step would be to get that up to a minimum of 128GB and, if you have the Enterprise Edition of SQL Server, max out the box with memory.  Additional memory is, hands down, the best investment with the highest ROI that you can get.

    Still, it's not a performance panacea.  Even adding impossible amounts of memory might only get you a 2X performance improvement.  Fixing poor code can easily get you somewhere between a very typical 60X and 1000X improvement.  Oddly enough, good code also reduces the memory requirement.
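
    As a sanity check on the page I/O claim, the cumulative wait stats are a reasonable place to start (just a sketch; the numbers accumulate from the last restart or a manual clear):

        SELECT wait_type,
               waiting_tasks_count,
               wait_time_ms,
               signal_wait_time_ms
        FROM   sys.dm_os_wait_stats
        WHERE  wait_type LIKE N'PAGEIOLATCH%'
        ORDER BY wait_time_ms DESC;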

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings...

    IMPLICIT CONVERSIONS...

    SQL Server will let you mix & match CHAR, VARCHAR, NCHAR & NVARCHAR datatypes and never throw any sort of warning. You actually have to check the execution plan and look for the cardinality warnings.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction... Both will kill SARGability. It's no different than putting functions around column predicates.

    Worse, actually... it's easy to spot functions in a WHERE clause. ANSI/Unicode mismatches are much tougher to spot.
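
    If you want to hunt the mismatches down, one rough approach is to search the plan cache for CONVERT_IMPLICIT (a sketch; it is not a cheap query, so run it in a quiet window):

        SELECT TOP (50)
               qs.execution_count,
               qs.total_worker_time / qs.execution_count AS avg_cpu_us,
               st.text       AS sql_text,
               qp.query_plan
        FROM   sys.dm_exec_query_stats qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    st
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
        WHERE  CAST(qp.query_plan AS nvarchar(MAX)) LIKE N'%CONVERT_IMPLICIT%'
        ORDER BY qs.total_worker_time DESC;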

  • Jason A. Long - Tuesday, October 31, 2017 9:48 PM

    Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings... IMPLICIT CONVERSIONS...

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.
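
    A minimal sketch of the two directions (dbo.SomeTable, varchar_col and nvarchar_col are hypothetical names):

        DECLARE @nvalue nvarchar(20) = N'abc',
                @vvalue varchar(20)  = 'abc';

        -- nvarchar outranks varchar in data type precedence, so the varchar side is the one converted
        SELECT * FROM dbo.SomeTable WHERE varchar_col  = @nvalue; -- column converted: the seek is often lost, especially on SQL collations
        SELECT * FROM dbo.SomeTable WHERE nvarchar_col = @vvalue; -- parameter converted: a seek is still possible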

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Tuesday, October 31, 2017 10:10 PM

    Jason A. Long - Tuesday, October 31, 2017 9:48 PM

    Getting rid of NVARCHARs has a benefit that (IMO) far outweighs any potential space savings... IMPLICIT CONVERSIONS...

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.

    Gail,
    I would have sworn I'd seen a scan result from not adding N'' to an NVARCHAR...
    Then again, I'm not aware of you ever having been wrong... about anything... so I did a quick test and, yup, you're still batting 1.000.
    I stand corrected and I thank you for that much needed clarification.

    EDIT: Just for clarification... Even if I had been 100% correct, I wouldn't ever recommend altering a column's data type without making the requisite changes to the other referencing objects and application code.
    On that particular point, it was never my intention to suggest otherwise. 

    proof is in the execution plan...
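    -- Build a 1,000,000 row test table holding the same random 20-character string as both CHAR(20) and NCHAR(20)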
    WITH
        cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)),
        cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
        cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
        cte_Tally (n) AS (
            SELECT TOP (1000000)
                ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
            FROM
                cte_n3 a CROSS JOIN cte_n3 b
            )
    SELECT
        rn = ISNULL(t.n, 0),
        ascii_text = ISNULL(CAST(CONCAT(v01.let, v02.let, v03.let, v04.let, v05.let, v06.let, v07.let, v08.let, v09.let, v10.let, ' ',
                v11.let, v12.let, v13.let, v14.let, v15.let, v16.let, v17.let, v18.let, v19.let, v20.let) AS CHAR(20)), ''),
        unicode_text = ISNULL(CAST(CONCAT(v01.let, v02.let, v03.let, v04.let, v05.let, v06.let, v07.let, v08.let, v09.let, v10.let, ' ',
                v11.let, v12.let, v13.let, v14.let, v15.let, v16.let, v17.let, v18.let, v19.let, v20.let) AS NCHAR(20)), N'')
        INTO dbo.ascii_unicode_test
    FROM
        cte_tally t
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)) ) v01 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v02 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v03 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v04 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v05 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v06 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v07 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v08 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v09 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v10 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)) ) v11 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v12 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v13 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v14 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v15 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v16 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v17 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v18 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v19 (let)
        CROSS APPLY ( VALUES (CHAR(ABS(CHECKSUM(NEWID())) % 26 + 97)) ) v20 (let);

        --=======================================================================    

        ALTER TABLE dbo.ascii_unicode_test ADD
            CONSTRAINT pk_ascii_unicode_test PRIMARY KEY CLUSTERED (rn)
            WITH (FILLFACTOR = 100) ON [PRIMARY];

        CREATE NONCLUSTERED INDEX ix_ascii
            ON dbo.ascii_unicode_test (ascii_text)
            INCLUDE (rn, unicode_text)
            WITH (ONLINE = ON, FILLFACTOR = 100) ON [PRIMARY];

        CREATE NONCLUSTERED INDEX ix_unicode
            ON dbo.ascii_unicode_test (unicode_text)
            INCLUDE (rn, ascii_text)
            WITH (ONLINE = ON, FILLFACTOR = 100) ON [PRIMARY];

        SELECT c.object_id, c.name, t.name, c.is_nullable
        FROM sys.columns c JOIN sys.types t ON c.system_type_id = t.system_type_id
        WHERE c.object_id = OBJECT_ID(N'dbo.ascii_unicode_test');
        /*
            object_id    column_name     type_name    is_nullable
            1989582126    rn                bigint    0
            1989582126    ascii_text        char    0
            1989582126    unicode_text    nchar    0
        */

        SELECT SERVERPROPERTY('collation') -- SQL_Latin1_General_CP1_CI_AS
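        -- Compare matched vs. mismatched literal types against each column; check the actual
        -- execution plans for seeks vs. scans and for CONVERT_IMPLICIT wrapped around the column.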

        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text = 'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text = N'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text = 'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text = N'Nmfkgkakan Fccgcujlk';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text LIKE 'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.ascii_text LIKE N'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text LIKE 'Nmfkg%';
        GO
        SELECT * FROM dbo.ascii_unicode_test aut WHERE aut.unicode_text LIKE N'Nmfkg%';
        GO

  • Arsh - Tuesday, October 31, 2017 2:09 PM

    Thanks Eirikur and ZZartin. It's a 600 GB database growing rapidly. The biggest tables, used for BI, are over 50 million rows, with the biggest at 150 million and growing at a rate of over a million rows per day. The server has 64 GB RAM and 20 processors in total and still has performance issues. Over time, stats collection and a lot of study show huge PAGEIO waits. The architecture shows extensive use of nvarchar everywhere, though there is no multilingual requirement. My understanding is that since nvarchar uses double the space versus varchar, wouldn't it occupy many more pages than required? Please correct me if I am wrong.

    Can you post the DDL for the table that's growing the most, including the indices, please?
    😎

  • GilaMonster - Tuesday, October 31, 2017 10:10 PM

    I'd disagree. Changing the column is more likely to cause exactly that.

    The net result is that developers who are used to VARCHAR will omit the Unicode marker, and those who are accustomed to Unicode will use N'SomeValue' against VARCHAR columns... The result is equally bad in either direction...

    No. If the column is nvarchar and the parameter passed is varchar, SQL will implicitly convert the parameter to match the column. This may result in a slight estimation error, nothing more.
    If the column is varchar and the parameter nvarchar, SQL will implicitly convert the column to match the parameter, which depending on the collation may well get you a full table scan.

    Hence, converting the column from nvarchar to varchar without fixing the code calling it could easily make the performance problem much, much worse.

    Thank you Gail for the valuable insights. Yes, I agree that the code has to be changed accordingly. Now the real question is: can I hope to get any performance benefit by converting from nvarchar to varchar, if I know that the application is not going to be multi-lingual? It was an offshore project; I spent time analysing it but am not currently working on it. The application faces severe bottlenecks when some reports are run.

  • Eirikur,
    It was an offshore project; I spent time analysing it but am not currently working on it. There were many tables like this in that database. The NVARCHARs are everywhere, and of longer lengths than required. Wherever a character or alphanumeric field is needed, there's an NVARCHAR. Don't the extra pages for each row get loaded into memory unnecessarily? My idea is to find out whether getting rid of NVARCHAR helps in reducing the pages and hence gaining some performance. How am I wrong in thinking so? I agree memory has to be at least doubled, but it's a Standard edition. Thank you.
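
    For what it's worth, this is how I have been taking inventory of the nvarchar/nchar columns to see where the bulk sits (a read-only sketch against the catalog views):

        SELECT  s.name       AS schema_name,
                o.name       AS table_name,
                c.name       AS column_name,
                t.name       AS data_type,
                c.max_length AS max_length_bytes   -- 2 bytes per declared character; -1 means MAX
        FROM    sys.columns c
        JOIN    sys.objects o ON o.object_id = c.object_id
        JOIN    sys.schemas s ON s.schema_id = o.schema_id
        JOIN    sys.types   t ON t.user_type_id = c.user_type_id
        WHERE   t.name IN (N'nvarchar', N'nchar')
          AND   o.type = 'U'
        ORDER BY s.name, o.name, c.name;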
  • Jason A. Long - Wednesday, November 1, 2017 12:25 AM

    EDIT: Just for clarification... Even if I had been 100% correct, I wouldn't ever recommend altering a column's data type without making the requisite changes to the other referencing objects and application code.

    Oh, absolutely agreed.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Arsh - Wednesday, November 1, 2017 3:42 AM

    Now the real question is: can I hope to get any performance benefit by converting from nvarchar to varchar, if I know that the application is not going to be multi-lingual?

    Unlikely. You might get some minor gains, but they're unlikely to be particularly good for the amount of effort and work that this change will require.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • What SQL version are you on? Have you considered compression?
    😎
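
    For reference, sp_estimate_data_compression_savings will size up the benefit per table before anything is changed (the table name below is a placeholder; note that before SQL Server 2016 SP1 data compression needs Enterprise Edition):

        EXEC sys.sp_estimate_data_compression_savings
             @schema_name      = N'dbo',
             @object_name      = N'BigReportingTable',  -- placeholder
             @index_id         = NULL,                  -- all indexes
             @partition_number = NULL,                  -- all partitions
             @data_compression = N'PAGE';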
