billy-yons and billy-yons of rows

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 714668

    Comments posted to this topic are about the item billy-yons and billy-yons of rows

  • Danny Ocean

    SSCertifiable

    Points: 6098

    nice and easy question !!!

    Thanks
    Vinay Kumar
    -----------------------------------------------------------------
    Keep Learning - Keep Growing !!!
    www.GrowWithSql.com

  • M&M

    SSC-Insane

    Points: 21679

    Nice one. Steve.

    M&M

  • Toreador

    SSChampion

    Points: 11223

    I hope those who answered 32767 were joking!

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    Toreador (1/20/2012)


    I hope those who answered 32767 were joking!

    Maybe they were misreading "rows" for "columns"?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • rfr.ferrari

    SSCertifiable

    Points: 6879

    easy question!!!

    Thanks Steve!


    [font="Times New Roman"]rfr.ferrari[/font]
    DBA - SQL Server 2008
    MCITP | MCTS

    remember is live or suffer twice!
    the period you fastest growing is the most difficult period of your life!
  • Richard Warr

    SSCertifiable

    Points: 6955

    I wonder if there are any limits peripheral to this though?

    For example, when I look at Table Properties in SSMS and select Storage it tells e how many rows are in the table. Presumably this is some sort of numeric field which has an upper boundary.

    Or from http://msdn.microsoft.com/en-us/library/ms175997.aspx we see that COUNT() returns an Integer. What happens to SELECT COUNT(1) FROM myTable when myTable contains more rows than an integer can store?

    _____________________________________________________________________
    MCSA SQL Server 2012

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    Richard Warr (1/20/2012)


    Or from http://msdn.microsoft.com/en-us/library/ms175997.aspx we see that COUNT() returns an Integer. What happens to SELECT COUNT(1) FROM myTable when myTable contains more rows than an integer can store?

    You get an error message.

    So if you ever need to get a rowcount on a table that might exceed 2,147,483,647 rows, use COUNT_BIG() instead - this function returns bigint, with an upper limit of 9,223,372,036,854,775,807 rows. If you exceed that, I don't know what happens - but in that case, I would definitely like 0.01% of your hardware budget! 😉


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Richard Warr

    SSCertifiable

    Points: 6955

    Yes, I was forgetting COUNT_BIG.

    As for your last comment perhaps a better answer to the question would be "Limited by how much money you have" 😉

    _____________________________________________________________________
    MCSA SQL Server 2012

  • Paul White

    SSC Guru

    Points: 150341

    I like these sorts of questions, nice one Steve.

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    Hugo Kornelis (1/20/2012)


    Richard Warr (1/20/2012)


    Or from http://msdn.microsoft.com/en-us/library/ms175997.aspx we see that COUNT() returns an Integer. What happens to SELECT COUNT(1) FROM myTable when myTable contains more rows than an integer can store?

    You get an error message.

    So if you ever need to get a rowcount on a table that might exceed 2,147,483,647 rows, use COUNT_BIG() instead - this function returns bigint, with an upper limit of 9,223,372,036,854,775,807 rows. If you exceed that, I don't know what happens - but in that case, I would definitely like 0.01% of your hardware budget! 😉

    Just for kicks, I decided to do the math. (See "Estimating the size of a clustered index" in Books Online)

    A relational table should always have a primary key to be useful. To store more than 9,223,372,036,854,775,807 with primary key violations, the primary key must be at least 8 bytes, so the smallest possible table to exceed the max of COUNT_BIG() has a single 8-byte (e.g bigint) column.

    The row size is the 8 bytes for the data, plus three for the NULL bitmap, 4 for row header overhead, and 2 for the row's entry in the slot array; that makes 14 total. This means that SQL Server can store at most 8,096 / 14 = 578 (rounded down) rows on a single data page. For 9,223,372,036,854,775,807 + 1 rows, that makes 15,957,391,067,222,796 (rounded up) pages for the leaf level of the clustered index.

    For the non-leaf pages, no NULL bitmap is used if the column is not nullable, but the overhead increases a bit - here, the size for each row is 8 (data) + 1 (overhead) + 6 (child page pointer) + 2 (slot array) = 17 bytes; this means SQL Server can fit 8,096 / 17 = 476 (rounded down) index rows on a page.

    For the first level above the leaf pages, we would need 15,957,391,067,222,796 / 476 = 33,523,930,813,494 (rounded up) pages.

    For the second level above the leaf pages, we would need 15,957,391,067,222,796 / (476 ^ 2) = 70,428,426,079 (rounded up) pages.

    For the third level above the leaf pages, we would need 15,957,391,067,222,796 / (476 ^ 3) = 147,958,879 (rounded up) pages.

    For the fourth level above the leaf pages, we would need 15,957,391,067,222,796 / (476 ^ 4) = 310,838 (rounded up) pages.

    For the fifth level above the leaf pages, we would need 15,957,391,067,222,796 / (476 ^ 5) = 653 (rounded up) pages.

    For the sixth level above the leaf pages, we would need 15,957,391,067,222,796 / (476 ^ 6) = 2 (rounded up) pages.

    And for the root level, seventh level above the leaf pages, we would need one more page.

    Adding all these numbers together gives me a total requirement of 15,990,985,574,732,742 8K pages to store this table. That is approximately 122,000,000,000 TB. If you want to build that using HP 6400/8400 Enterprise Virtual Array machines (see http://h10010.www1.hp.com/wwpc/us/en/sm/WF25a/12169-304616-304648-304648-304648-3900918.html?dnr=1 for specifications) at their maximum capacity of 628 TB a piece, you would need to have almost 200 million of those machines.

    Unfortunately, you would have to wait until Microsoft releases an improved an improved version of SQL Server - with its current supported maximum database size of "only" 524,272 terabytes, it seriously falls short for this kind of challenge.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • paul.knibbs

    SSCoach

    Points: 15270

    Hugo Kornelis (1/20/2012)


    with its current supported maximum database size of "only" 524,272 terabytes, it seriously falls short for this kind of challenge.

    What actually imposes that limit? It seems a kind of odd number--if we count 8k pages then a 46-bit number would hold enough for a database that size, and even if we counted bytes then 59-bit would be enough.

  • Richard Warr

    SSCertifiable

    Points: 6955

    Hugo Kornelis (1/20/2012)


    Just for kicks, I decided to do the math. ....................

    Thanks, best post I've seen all week. I love stuff like that. 🙂

    _____________________________________________________________________
    MCSA SQL Server 2012

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    paul.knibbs (1/20/2012)


    Hugo Kornelis (1/20/2012)


    with its current supported maximum database size of "only" 524,272 terabytes, it seriously falls short for this kind of challenge.

    What actually imposes that limit? It seems a kind of odd number--if we count 8k pages then a 46-bit number would hold enough for a database that size, and even if we counted bytes then 59-bit would be enough.

    My guess is that this limitation is the logical consequence of two other limitations: the max size of 16 TB for each data file, and the maximum number of 32,767 data files. 16 * 32,767 = 524,272.

    The maximum number of data files is the maximum number in an unsigned 2-byte integer (smallint, in SQL Server terms).

    I don't know the exact cause of the 16 TB limit for each file. Maybe it has to do with the structure of special pages that apply to the entire file (like GAM, SGAM and PFS pages, or IAM pages). Or maybe it is simply a file size limitation imposed by the Windows OS or the NTFS internals?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • paul.knibbs

    SSCoach

    Points: 15270

    Hugo Kornelis (1/20/2012)


    I don't know the exact cause of the 16 TB limit for each file. Maybe it has to do with the structure of special pages that apply to the entire file (like GAM, SGAM and PFS pages, or IAM pages). Or maybe it is simply a file size limitation imposed by the Windows OS or the NTFS internals?

    Or it could be that a signed 32-bit integer would allow for 2147483647 pages, which is 16Tb minus a page...no idea if that's the case or not, though!

Viewing 15 posts - 1 through 15 (of 50 total)

You must be logged in to reply to this topic. Login to reply