Row size, indexes and varchar(max)

  • I have a table with 100m+ rows. We're seeing very bad performance on queries as the data grows. I noticed that the row size is pretty big (10,190 bytes) and I'm thinking that this is affecting indexes/index performance.

    The table has a bunch of columns set to the wrong data types (lots of ints where tinyints are more appropriate, etc.). I went and updated the table to change what I could.

    The original row size was 10,190 bytes, and I was able to get it down to 10,090 by adjusting ints to smallint or tinyint.

    There are two columns set to varchar(2048). I changed those to varchar(max) and the row size went down to about 6000.

    I used the queries found here http://www.sqlservercentral.com/Forums/Topic226699-5-1.aspx to get the row size.

    My question is: does changing the varchar(2048) columns to varchar(max) help indexing/performance when those columns are not used often? What about getting the row size below 8,000?
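    For anyone wanting to double-check the numbers, a sketch of reading the actual record sizes straight from the engine instead of estimating them ('dbo.MyBigTable' is a placeholder for the real table name):

    ```sql
    -- Average and maximum record size per index, measured from the pages
    -- themselves. 'DETAILED' scans every page, so run it off-hours on a
    -- 100m+ row table.
    SELECT index_id,
           avg_record_size_in_bytes,
           max_record_size_in_bytes,
           page_count
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID('dbo.MyBigTable'), NULL, NULL, 'DETAILED');
    ```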

  • titan2782 (4/21/2011)


    I have a table with 100m+ rows. We're seeing very bad performance on queries as the data grows. I noticed that the row size is pretty big (10190) and I'm thinking that this is affecting indexes/index performance.

    the table has a bunch of columns set to wrong data types (lots of ints where tinyints are more appropriate, etc). I went and updated the table to change what I could.

    Original row size is 10190 and I was able to get it down to 10090 by adjusting ints to smallint or tinyint.

    There are two columns set to varchar(2048). I changed those to varchar(max) and the row size went down to about 6000.

    I used the queries found here http://www.sqlservercentral.com/Forums/Topic226699-5-1.aspx to get the row size.

    My question is: does changing the varchar(2048) columns to varchar(max) help indexing/performance when those columns are not used often? What about getting the row size below 8,000?

    Are the columns that you've re-defined as varchar(max) part of any indexes?

    --------------------------------------
    When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions.
    --------------------------------------
    It’s unpleasantly like being drunk.
    What’s so unpleasant about being drunk?
    You ask a glass of water. -- Douglas Adams

  • No, they are infrequently used columns.

  • Have you executed DBCC CLEANTABLE?

    Read about it here to see if it could help:

    http://technet.microsoft.com/en-us/library/ms174418(SQL.90).aspx
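    A minimal sketch of the call (the database and table names are placeholders; the optional batch size caps how many rows are processed per transaction):

    ```sql
    -- Reclaims space left behind by dropped variable-length columns,
    -- processing 100,000 rows per transaction to keep the log manageable.
    DBCC CLEANTABLE ('MyDatabase', 'dbo.MyBigTable', 100000);
    ```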

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you - before posting a question please read
    Before posting a performance problem please read

  • titan2782 (4/21/2011)


    no. they are infrequently used columns

    Oh good. You should look into making them SPARSE columns. You can save a lot of space that way.

    http://msdn.microsoft.com/en-us/library/cc280604.aspx
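    Marking a column SPARSE is just an ALTER COLUMN (a sketch; the table and column names are made up, and the column has to be nullable):

    ```sql
    -- NULLs in a SPARSE column take no storage at all, at the cost of
    -- extra overhead on the non-NULL values - it only pays off when the
    -- large majority of rows are NULL.
    ALTER TABLE dbo.MyBigTable
        ALTER COLUMN Notes1 varchar(2048) SPARSE NULL;

    ALTER TABLE dbo.MyBigTable
        ALTER COLUMN Notes2 varchar(2048) SPARSE NULL;
    ```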

    If they aren't in the index, it shouldn't make any difference to the indexing. One thing you might want to look at to improve performance with your indexes, if you're sure you have the right key columns, is included columns.

    http://msdn.microsoft.com/en-us/library/ms190806.aspx

    If you have frequently used queries that don't return those varchar(max) columns, you can add the columns they do return as included columns and get a good performance boost.
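    As a sketch (every column name here is hypothetical), a covering index puts the filtered columns in the key and the returned columns in INCLUDE, so the query never has to touch the wide base rows at all:

    ```sql
    CREATE NONCLUSTERED INDEX IX_MyBigTable_Covering
        ON dbo.MyBigTable (StatusID, CreatedDate)   -- columns you filter/sort on
        INCLUDE (CustomerID, Amount);               -- columns the query returns
    ```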

    Have you done any analysis to see how those varchar(max) columns are used? If you do some checking you might be able to knock them down below what they were to begin with.


  • Thanks for that info. After I made the changes, the query actually runs slower now, even though the row size is down to about 6 KB and I dropped and rebuilt all the indexes. I guess row size wasn't the problem after all.

    I'm going to apply the sparse attribute to them though because the majority of the rows have nulls in those columns.

  • Then perhaps you can take a look at the execution plan and go from there...
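    For example, wrapping the slow query like this (the SELECT below is just a stand-in) gives logical-read and CPU numbers you can compare before and after each change, alongside the actual plan from SSMS:

    ```sql
    SET STATISTICS IO, TIME ON;

    SELECT COUNT(*)               -- stand-in for the actual slow query
    FROM dbo.MyBigTable
    WHERE StatusID = 1;

    SET STATISTICS IO, TIME OFF;
    ```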

  • titan2782 (4/22/2011)


    Thanks for that info. After I made the changes, the query actually runs slower now, even though the row size is down to about 6 KB and I dropped and rebuilt all the indexes. I guess row size wasn't the problem after all.

    I'm going to apply the sparse attribute to them though because the majority of the rows have nulls in those columns.

    Did you try the included columns in the indexes?

    Is your query using the indexes?


  • titan2782 (4/21/2011)


    There are two columns set to varchar(2048). I changed those to varchar(max) and the row size went down to about 6000. I used the queries found here http://www.sqlservercentral.com/Forums/Topic226699-5-1.aspx to get the row size. My question is: does changing the varchar(2048) columns to varchar(max) help indexing/performance when those columns are not used often? What about getting row size below 8000?

    That script does not correctly account for MAX data types - it uses -1 as the length! Changing VARCHAR(n) to VARCHAR(MAX) does not reduce the storage needed, because VARCHAR(MAX) values that fit in the row are stored exactly the same as VARCHAR(n) values.
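    By default, MAX values up to 8,000 bytes stay in-row just like varchar(n). If the goal is genuinely to shrink the in-row record, the table can be told to push MAX values off-row instead (a sketch; the table name is a placeholder):

    ```sql
    -- Only a 16-byte pointer stays in the data row; reading the column
    -- then costs an extra lookup. Existing values move off-row only when
    -- they are next updated.
    EXEC sp_tableoption 'dbo.MyBigTable', 'large value types out of row', 1;
    ```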

    If you would like a more complete answer, please post the definition (CREATE TABLE) statement. Feel free to obfuscate the column names if you need to. The idea of using SPARSE columns may be a very good one, but it would still be nice to see exactly what we are talking about. For variable-length and/or NULLable columns, please also tell us what the average length of the data actually stored in that column is, and how likely the data is to be NULL.
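    A quick way to gather those numbers (the table and column names are placeholders):

    ```sql
    -- Average stored length (AVG ignores NULLs) and NULL counts for the
    -- two varchar columns in question.
    SELECT COUNT(*)                                        AS total_rows,
           AVG(CAST(DATALENGTH(Notes1) AS bigint))         AS avg_len_notes1,
           SUM(CASE WHEN Notes1 IS NULL THEN 1 ELSE 0 END) AS null_notes1,
           AVG(CAST(DATALENGTH(Notes2) AS bigint))         AS avg_len_notes2,
           SUM(CASE WHEN Notes2 IS NULL THEN 1 ELSE 0 END) AS null_notes2
    FROM dbo.MyBigTable;
    ```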

  • Data types are probably pretty far down on the list of reasons why your query performance is poor. Look to indexing and query construction first.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
