nvarchar(max) data and performance

  • Hi,

    They are going to put a lot of records with nvarchar(max) values into a table (around 3 million). The table will grow from 3GB to 300GB and make the database four times its current size.

    I understand there are performance implications from this, as the data will occupy most of the buffer cache (the server has only 128GB of RAM), and since most of it will not fit in the cache, the disk is going to suffer a lot of read overhead.

    Is there any way to minimise that performance impact? E.g. putting a table onto a separate LUN or something like this?

    Thanks.

  • Roust_m (11/14/2011)


    Hi,

    They are going to put a lot of records with nvarchar(max) values into a table (around 3 million). The table will grow from 3GB to 300GB and make the database four times its current size.

    I understand there are performance implications from this, as the data will occupy most of the buffer cache (the server has only 128GB of RAM), and since most of it will not fit in the cache, the disk is going to suffer a lot of read overhead.

    Is there any way to minimise that performance impact? E.g. putting a table onto a separate LUN or something like this?

    Thanks.

    ONLY 128 GB????

    None of my servers has that much RAM!

    Putting the table on a separate LUN will help with the I/O, but won't mitigate buffer cache contention.

    You'd better isolate this column from the rest of the table if buffer cache is a major concern.

    What are you planning to do with this nvarchar(max) column?

    -- Gianluca Sartori

  • Gianluca Sartori (11/15/2011)


    Roust_m (11/14/2011)


    Hi,

    They are going to put a lot of records with nvarchar(max) values into a table (around 3 million). The table will grow from 3GB to 300GB and make the database four times its current size.

    I understand there are performance implications from this, as the data will occupy most of the buffer cache (the server has only 128GB of RAM), and since most of it will not fit in the cache, the disk is going to suffer a lot of read overhead.

    Is there any way to minimise that performance impact? E.g. putting a table onto a separate LUN or something like this?

    Thanks.

    ONLY 128 GB????

    None of my servers has that much RAM!

    Putting the table on a separate LUN will help with the I/O, but won't mitigate buffer cache contention.

    You'd better isolate this column from the rest of the table if buffer cache is a major concern.

    What are you planning to do with this nvarchar(max) column?

    How do I isolate the column from the rest of the table?

    The column will have some product description content.

  • Couple of things here will help.

    For starters, make sure you use the TEXTIMAGE_ON for the table creation to move the BLOB data to its own filegroup. You can shove that filegroup anywhere, but it'll help keep your database's fragmentation and space usage to something normal. BLOBs tend to really screw up the primary DB if you don't do this.
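
    Something like this, give or take (table, column, and filegroup names here are just for illustration, and the LOB filegroup has to exist in the database already):

        -- Hypothetical names; assumes a filegroup named LOB_FG was already
        -- added with ALTER DATABASE ... ADD FILEGROUP / ADD FILE.
        CREATE TABLE dbo.ProductDescriptions
        (
            ProductID   int           NOT NULL PRIMARY KEY,
            Description nvarchar(max) NULL
        )
        ON [PRIMARY]            -- in-row data stays on the default filegroup
        TEXTIMAGE_ON [LOB_FG];  -- LOB pages get allocated from the dedicated filegroup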

    Be aware you can't use TEXTIMAGE_ON for a separate filegroup if you intend to use partitioning.

    If you put the column in a separate filegroup, the cache should only be affected when you use SELECT * or directly reference this column. Make sure all queries that deal with this data only retrieve a few rows at a time. This will help mitigate the issue.

    Alternatively, especially if you want to partition, create a 1-to-1 related table off the primary table that stores the BLOB data and is only used when that specific data is required. The data itself should only cause you serious problems in memory if your queries are pulling back huge volumes of rows AND your BLOB data is In_Row_Data. Avoid that unless the MAX is for exceptions (1 in 10,000 or less, as my personal guide) instead of the rule. You will most likely want this all stored as LOB_DATA; see the sketch below.
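
    Roughly, the 1-to-1 side table looks like this (names are illustrative; the sp_tableoption call is the standard way to push large value types out of row):

        -- Parent table keeps the hot, small columns.
        CREATE TABLE dbo.Products
        (
            ProductID int           NOT NULL PRIMARY KEY,
            Name      nvarchar(200) NOT NULL
        );

        -- Side table holds only the LOB; the shared PK enforces the 1-to-1.
        CREATE TABLE dbo.ProductDescription
        (
            ProductID   int NOT NULL PRIMARY KEY
                REFERENCES dbo.Products (ProductID),
            Description nvarchar(max) NOT NULL
        );

        -- Keep the nvarchar(max) values in LOB_DATA pages instead of in-row:
        EXEC sp_tableoption 'dbo.ProductDescription',
             'large value types out of row', 1;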

    A bit more information from Books Online (copy paste to your BOL): ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_0evalplan/html/5b6751a2-e67e-432c-8999-eb6382d960c4.htm

    Warning: that starts getting pretty heavily into the internals and makes a few assumptions about some other knowledge you'll have (extents, pages, GAM/SGAM, etc.).
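
    If you want to see where a table's pages actually live, the allocation unit catalog views show the split between in-row and LOB storage. A quick sketch (dbo.ProductDescription is just a placeholder name):

        -- type_desc will be IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA.
        SELECT  au.type_desc, au.total_pages, au.used_pages
        FROM    sys.allocation_units AS au
        JOIN    sys.partitions       AS p
                -- LOB allocation units hang off partition_id, the others off hobt_id
                ON au.container_id IN (p.hobt_id, p.partition_id)
        WHERE   p.object_id = OBJECT_ID('dbo.ProductDescription');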

    The easiest way to avoid most of the problems is, as previously suggested: create a table on a separate filegroup dedicated to LOB data and give it a 1-to-1 relationship with the primary table.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

    I should have mentioned one thing. I try to avoid LOBs like the plague. Obviously I haven't been able to all the time and have worked with them, but they are just a royal PITA. The only time they should really be stored in the DB is when they need to stay in lockstep with the other data in the database and be restored/backed up simultaneously and in sync.

    For any other reason, I do my best to store them as real files and simply store links to the files on the fileshare in-row, like any other varchar() column.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

    I was also thinking of using FILESTREAM to store the nvarchar(max) values on the file system. This should fix the memory issue, as the buffer cache is not used with FILESTREAM, and the data fragmentation issue as well. The only thing is we are using replication, so I am not sure if filestreamed data can be replicated.
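
    From what I've read, FILESTREAM only applies to varbinary(max) columns, so the text would have to be stored as encoded bytes. Something like this, I assume (names invented; the database would need a FILESTREAM filegroup configured first):

        CREATE TABLE dbo.ProductDescriptionFS
        (
            ProductID   int NOT NULL PRIMARY KEY,
            -- FILESTREAM tables require a unique ROWGUIDCOL column:
            RowGuid     uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE
                        DEFAULT NEWID(),
            Description varbinary(max) FILESTREAM NULL
        );

        -- Text goes in and out as bytes, e.g.:
        INSERT dbo.ProductDescriptionFS (ProductID, Description)
        VALUES (1, CAST(N'Sample product description' AS varbinary(max)));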

  • The only thing is we are using replication, so I am not sure if filestreamed data can be replicated.

    Yes, you can, but with some limitations.

    Using FILESTREAM with Other SQL Server Features

    http://technet.microsoft.com/en-us/library/bb895334.aspx#Replication

  • Dev (11/16/2011)


    The only thing is we are using replication, so I am not sure if filestreamed data can be replicated.

    Yes, you can, but with some limitations.

    Using FILESTREAM with Other SQL Server Features

    http://technet.microsoft.com/en-us/library/bb895334.aspx#Replication

    The only other concern is the fact that a typical file size is around 60-80KB, and Microsoft recommends using FILESTREAM for files larger than 1MB.

    So it may be better to store them in the table, but on the other hand, having the database size quadruple just because of one table is not good.

    The only other concern is the fact that a typical file size is around 60-80KB, and Microsoft recommends using FILESTREAM for files larger than 1MB.

    So it may be better to store them in the table, but on the other hand, having the database size quadruple just because of one table is not good.

    Practically, we can't avoid trade-offs. I am more inclined to store them in the table.
