Varchar Column type estimates

  • I just realized sql server is estimating different data size when my column is varchar(7000) with table 72 million records, it is estimating about 250GB. I understand how it is getting to that number . Is there a better way to handle this other than reducing the column width because i could receive data that is 300 bytes and sometimes even upto 7000 bytes. My concern is though the real data is less 7000 characters , optimizer is estimating 250GB of data size which is making it reserve 250GB of space in RAM when in reality data size is only 17GB.

  • How about using BULK INSERT ?

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Friday, March 3, 2017 9:58 PM

    How about using BULK INSERT ?

    These are select statements.

  • curious_sqldba - Friday, March 3, 2017 2:19 PM

    I just realized sql server is estimating different data size when my column is varchar(7000) with table 72 million records, it is estimating about 250GB. I understand how it is getting to that number . Is there a better way to handle this other than reducing the column width because i could receive data that is 300 bytes and sometimes even upto 7000 bytes. My concern is though the real data is less 7000 characters , optimizer is estimating 250GB of data size which is making it reserve 250GB of space in RAM when in reality data size is only 17GB.

    Of course, "It Depends" on your query and a couple of other things.  If you're talking about a query that does NOT include the VARCHAR(7000) column but still relies on the clustered index, then the creation of a "duplicate" non-clustered index (same keys as the CI) along with (possibly) a couple of INCLUDEs (in other words, a "covering index) may be the answer to both the reduction of memory estimates and a serious increase in performance.  Of course, remember that any non clustered index is a duplication of data that affects backups, restores, etc, and will take time to build on such a large table.  If you have index maintenance routines, it will affect those, as well.

    As a bit of a sidebar, I've not rebuilt any indexes on any of my servers since 17 Jan 2016 (over a year now) and, although some disk space is wasted, it's not much and performance has only gotten better (especially in the first month).  If you do decide to do index maintenance on a regular basis, I recommend it be done base of "average percent of page used" rather than percent of fragmentation.

    Of course, no matter what you do with your indexes, it's uber important to keep statistics up to date.  You can have the best indexes in the world and still suffer greatly if the stats aren't up to date.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wouldn't be able to create any index on column with varchar(7000). I will try to post sample code here soon.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply