When is 8000 too small?

  • Comments posted to this topic are about the item When is 8000 too small?

  • My guess would be that 8000 is too small as soon as you need 8001. 😉

    While it would be nice to maybe have more than 8060 bytes, it'll get seriously abused. Also, I'd just as soon not have to go through 2 years of service packs to fix all the bugs that could be induced by such a change.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I take it that the 8060-byte limit has its roots in Sybase. Not all database systems have such a limitation. I had the opportunity to work with one of those many years ago, as part of a head-to-head competition between MS SQL Server 6.5 and Borland InterBase. InterBase is a good product with many interesting features (like storing multidimensional arrays -- why, I'm not sure). It didn't have a fixed page size: if your record was 64K long, it didn't care.

    The main reason InterBase lost the competition was that the lead programmer determined it was easier to code against SQL Server than InterBase using Visual Basic (can't remember if it was VB 5 or VB 6, but I think it was VB 6). Coding to the InterBase ODBC driver was extremely bloody, according to this individual.

  • A limitation is always bad. If someone wants to store an entire book in one cell why not let them? In the end I'm sure that will provide us with more new handy algorithms and functions.

  • Have a look in the resource database - it ain't going away anytime soon... how many times could MS hard-code 8000?

    With 2005/2008 it's easy to create unlimited versions of string functions via CLR (C#, etc.); what annoyed me was the way you have to do it. A SqlString parameter needs a [SqlFacet(MaxSize = -1)], or you use SqlChars with .ToString() in your code when performing string functions. Otherwise it defaults to 8000 bytes (4000 nchars) - why?

    But the worst part is that these limits don't appear in the documentation until AFTER someone complains about them.

    You mentioned HASHBYTES; case in point, it was some time before the doco was updated to reflect this limitation.
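
    A quick way to see this one in action (a hedged sketch; the behavior described applies to versions before SQL Server 2016, which finally lifted the HASHBYTES input limit). Note that REPLICATE itself silently caps its output at 8000 bytes unless its first argument is already a (MAX) type - yet another appearance of the same hard-coded limit:

        DECLARE @short varchar(8000);
        DECLARE @long  varchar(max);

        SET @short = REPLICATE('a', 8000);
        SET @long  = REPLICATE(CAST('a' AS varchar(max)), 8001);

        SELECT HASHBYTES('SHA1', @short);  -- works: input is within the 8000-byte limit
        SELECT HASHBYTES('SHA1', @long);   -- errors on 2005/2008; only 2016+ accepts longer input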

  • rlobbe (2/11/2010)


    With 2005/2008 it's easy to create unlimited versions of string functions via CLR (C#, etc.); what annoyed me was the way you have to do it. A SqlString parameter needs a [SqlFacet(MaxSize = -1)], or you use SqlChars with .ToString() in your code when performing string functions. Otherwise it defaults to 8000 bytes (4000 nchars) - why?

    I agree with you in the sense that SQLCLR can allow us to get around this limitation in most places, although many people are still CLR-shy. But I do have a function in my SQL# project (in the Free version at http://www.SQLsharp.com/) that does exactly what HASHBYTES does (I call it Util_Hash). It does what is available in .Net, so it does not do MD2, MD4, or SHA, but it does do MD5 and SHA1, which are available in HASHBYTES. It also does SHA256, SHA384, and SHA512 (which are not available in HASHBYTES).

    Regarding the requirement of [SqlFacet(MaxSize = -1)] or SqlChars, that is not 100% true. SqlFacet is used by the Visual Studio Deploy function. You can always map a SqlString to NVARCHAR(MAX) without the SqlFacet by creating your own T-SQL Proc that points to the .Net method, OR by simply ALTERing the auto-generated Proc. Which means you can also change it the other way: if you specify [SqlFacet(MaxSize = 2500)] in your .Net code, you can still change the definition in the Proc to be NVARCHAR(500) or NVARCHAR(MAX). Just FYI.
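
    For anyone who hasn't wired this up by hand, a minimal sketch of that point (the assembly, class, and method names below are hypothetical, and the .Net method is assumed to take and return a SqlString): it is the T-SQL wrapper, not the .Net attribute, that ultimately fixes the parameter and return types.

        -- Map the SqlString parameter/return to NVARCHAR(MAX) in the wrapper,
        -- regardless of whether any SqlFacet attribute was used in the C# code
        CREATE FUNCTION dbo.clr_ReverseString (@input NVARCHAR(MAX))
        RETURNS NVARCHAR(MAX)
        AS EXTERNAL NAME [MyClrLibrary].[StringUtils].[ReverseString];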

    SQL# - https://SQLsharp.com/ (SQLCLR library of over 340 functions and procedures)
    Sql Quantum Lift - https://SqlQuantumLift.com/ (company)
    Sql Quantum Leap - https://SqlQuantumLeap.com/ (blog)
    Info sites: Collations • Module Signing • SQLCLR

  • This has already been resolved.

    If you want to work with strings greater than 8K then declare them as XML or TEXT.

    In Access use a Note data type.

    A varchar is, by definition, a variable-length mutable string with a max of 8000 characters.

    To the educated cross-platform developer, wanting to Extract, Transform, or Load a string value larger than 8K into a varchar is like storing a Dec(24,16) in an Integer value: it is bad code, and data loss will occur (see the sketch at the end of this post).

    I guess Hugo Kornelis would like us to stop using advanced concepts such as memory-efficient, well-defined data types. Perhaps we should also go back to GOTO statements and cursors.
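
    (A minimal sketch of the analogy above, just to make the data-loss point concrete; both assignments below lose data silently, one numerically and one by string truncation.)

        DECLARE @i int;
        DECLARE @v varchar(8000);

        SET @i = CAST(CAST(1234.5678901234567890 AS dec(24,16)) AS int);  -- fractional part is lost
        SET @v = REPLICATE(CAST('x' AS varchar(max)), 8001);              -- the 8001st character is lost

        SELECT @i AS TruncatedInt, DATALENGTH(@v) AS BytesKept;           -- 1234, 8000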

  • Having the page size be a fixed 8KB for over ten years is, in and of itself, something that deserves serious rethinking: are all the reasons it was chosen then still valid now? Will they still be valid in another 5 years, given storage innovations? Are there any new reasons for an 8KB size, or for another size, that should be considered?

    I would also note that both DB2 and Oracle allow the DBA to select one _or more_ page (block) sizes to suit their particular application. A brief survey shows that the minimum sizes have at times been 4KB and 2KB, respectively, and that both have supported a maximum of 32KB. SQL Server's fixed block size is significantly more restrictive than that, even before adding in the fact that both DB2 and Oracle allow multiple page (block) sizes to be used together!

    For instance, given page compression, it is logical that larger page sizes would generally be better; a smaller relative portion of each page would need to be allocated to the dictionary and other overhead.

    I can certainly understand that there are arguments for 8KB (or nKB) as the best compromise for most use... but why not set whatever the "best compromise" for "most use" is as the default, and then allow us to change that size to be, perhaps, the best choice for each of our very particular uses?

  • Having also worked with Oracle for 25 years, Oracle has had user-specified database pages (e.g., 2K to 32K) for decades. See DB_BLOCK_SIZE. Rows were never limited to the page size.

    Additionally, all of the character functions (e.g., SUBSTR) work the same regardless of the underlying character data type (CHAR, VARCHAR, NCHAR, NVARCHAR, CLOB, NCLOB).

    So the SQL Server row length limit of 8000 bytes, albeit somewhat fixed in SQL Server 2005, along with the limitations on numerous built-in functions, has been a major source of frustration when I'm developing in SQL Server. Honestly, SQL Server is about a decade behind other major database vendors.


    (PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.

  • Indeed. Even though SQL Server has Sybase roots, Sybase still allows one to select a page size like Oracle and DB2 (which someone mentioned above). In my previous life as a Sybase DBA we used 2K pages. 4K pages were considered 'big'.

    I think MS should allow multiple page sizes like the other vendors. Maybe not multiple page sizes within one database, but at database creation time allow us to choose a default page size for that database.

    I know of one scenario where 8K is too large for us. For block-level replication via 3rd-party vendors, the 8K page size has been a thorn in our side. Random updates on a large table cause pages to replicate across the WAN multiple times, which can quickly saturate our link. 2K or even 4K pages would really help us in this situation.

    Just my .02

  • SQL Server hasn't always been 8k. When it was ported/based on Sybase, it was 2K pages that we dealt with. I believe v7 jumped to the 8k pages, which was a huge step.

  • jparker2 (2/11/2010)


    This has already been resolved.

    If you want to work with strings greater than 8K then declare them as XML or TEXT.

    In Access use a Note data type.

    Text has been deprecated. varchar(max) is what you use, which allows 2^31 - 1 bytes. (http://msdn.microsoft.com/en-us/library/ms176089.aspx) That's a good change, but if all the code and functions we have don't handle that 8001st byte, that's an issue.

    XML doesn't work everywhere, and I'm not sure I think that's the answer.
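
    To make that 8001st-byte problem concrete (a hedged sketch): even with varchar(max) available, an expression built only from plain varchar operands is still evaluated as varchar(8000) and truncates silently, no matter what it is assigned to.

        DECLARE @v varchar(max);

        SET @v = REPLICATE('x', 5000) + REPLICATE('x', 5000);
        SELECT DATALENGTH(@v);   -- 8000: the intermediate result was capped

        SET @v = CAST(REPLICATE('x', 5000) AS varchar(max)) + REPLICATE('x', 5000);
        SELECT DATALENGTH(@v);   -- 10000: one (MAX) operand lifts the whole expression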

  • I remember that and you are correct.

    I believe that 8060 bytes was selected as a performance measure because of the default disk block size and the I/O for reads/writes to disk.

    Similar to the post from the DBA who would get better WAN performance with 4K pages, since Ethernet has a max of around 4K per packet.

  • IMO this discussion should involve hardware designers and consider their thoughts on the sizes of clusters on the drives we will have in the near future. As 32K clusters are becoming common, it would make sense to offer a variable limit of 8-32K, with a warning that if you go over the cluster size of your drive, you can expect serious performance degradation. That decision should be ours.

  • This is definitely a place where the OS and SQL Server should align the partitions, and align the cluster/page size when you create a DB as well.
