Can SQL Server do ‘Big’?

  • Comments posted to this topic are about the item Can SQL Server do ‘Big’?

    Best wishes,
    Phil Factor

  • I have seen one of those limits breached a number of times. It's the number of parameters (2,100) which applies to any query.

    Amusingly I've had developers complain that this is not high enough and that there must be something wrong with SQL as it won't let them do what they want. It's not that big or complex a DB either (1TB or so, 700 tables).

    The queries themselves don't come via a stored proc but via nHibernate. What you typically see is a query being passed in as follows:

    SELECT blah FROM dbo.Table WHERE MyParam IN (@P0, @P1...@P2100)

    This all depends on how many params get passed in, so if I have 2,099 I get

    SELECT blah FROM dbo.Table WHERE MyParam IN (@P0, @P1...@P2099)

    and the hash of the query changes, so we compile a new execution plan and so on.

    Mix this with another approach, where the size of the values being passed in is stated explicitly, and you get yet another variation for each parameter length. If I pass in a surname with 7 characters I get one execution plan, and if I pass in a surname with 6 characters I get another. You can get both together for extra fun!
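
    For what it's worth, the usual workaround for both the 2,100-parameter ceiling and the ever-changing query hash is to pass the whole list as a single table-valued parameter (SQL 2008 and later). This is just a sketch reusing the dbo.Table/MyParam names from the example above; the type and procedure names are made up, and whether nHibernate can be persuaded to send a TVP depends on the driver and mapping layer:

    -- Hypothetical names; the point is that the statement text (and its plan)
    -- stays the same no matter how many values are supplied.
    CREATE TYPE dbo.MyParamList AS TABLE (Value nvarchar(50) NOT NULL PRIMARY KEY);
    GO
    CREATE PROCEDURE dbo.GetRowsByMyParam
        @Values dbo.MyParamList READONLY
    AS
    BEGIN
        SELECT t.*
        FROM dbo.[Table] AS t
        INNER JOIN @Values AS v
            ON v.Value = t.MyParam;
    END;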

  • I have commonly talked to people at the NTSSUG who have very large SQL Server environments (mainly medical and financial), but I personally haven't had a client that requires something of that scale.

    Where I encounter colleagues with insanely huge data requirements is in the physics, weather, astronomy and biology realms. They truly have "Big Data" issues. Most of their tools are not well known in the Microsoft realm (outside of folks like Buck Woody), but I believe they could really benefit someone with an open mind who loves data. There's some really cool software that's been developed to deal with their issues, and I'm incorporating it into the "normal" business data world.

  • 2,100 parameters for a stored procedure?

    I can't imagine writing a stored procedure that takes in 2,100 parameters. I haven't even written a C# method that takes more than 25 parameters. 2,100 parameters would make my head spin.

  • Nothing in my world comes anywhere close to "BIG" so far....

  • When I worked for a pharmaceutical giant handling the Clinical Trials database, it was close to that number, but the company used Oracle, so it doesn't really apply.

  • Only a few tens of gigs, but the foreign key limit was reached, which gave us some headaches.

    Also, the 128-character limit on object names (sysname) is way too low. With several tens of thousands of objects, naming objects after their functionality is mandatory if we are to find what we need rapidly.
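
    If anyone wants to see how close they are to the same wall, a quick sketch against sys.foreign_keys (current database only, nothing clever) shows how many incoming foreign key references each table carries relative to the 253-per-table limit mentioned elsewhere in this thread:

    -- Count the foreign keys that reference each table, busiest tables first.
    SELECT OBJECT_SCHEMA_NAME(fk.referenced_object_id) AS schema_name,
           OBJECT_NAME(fk.referenced_object_id) AS referenced_table,
           COUNT(*) AS incoming_fk_count
    FROM sys.foreign_keys AS fk
    GROUP BY fk.referenced_object_id
    ORDER BY incoming_fk_count DESC;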

  • Well, I can't say that I've ever had the need for 2,100 parameters in a stored procedure, but I have built and worked on some fairly large systems. One in particular, in a large manufacturing corporation, encompasses about 30 databases hosting around 12,000+ objects and spanning two different servers. Believe it or not, most of it is still running on SQL 2000. We have talked about converting many times, but the feeling of the IT department is "if it's not broke, don't fix it". The system is quite complex and has remote database queries that ferry data between the SQL 2000 system and the SQL 2005 system. On SQL 2005, we have a very large data warehouse that is used to generate Cognos reports from data that is aggregated, pivoted, diced, sliced, massaged, and re-purposed in tables for quick access. Some of the tables contain 2 million plus records, so speed is a factor for the management relying on the reports.

    I thought our 20 GB reporting database was big. But then again, when I think of "big data", I'm thinking of Amazon or Facebook with their dozens if not hundreds of replicated servers running all over the globe with the speed requirements of a teenager with a smartphone. I'm always a bit mystified by how they accomplish these seemingly lightning fast data retrievals against such a behemoth system.

    Hats off to the analysts that have to keep all that straight. Documentation will only get you so far. The rest relies on specialized skills and experience.
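
    For anyone curious what that "re-purposed in tables for quick access" pattern can look like, here's a rough sketch with made-up names (dbo.SalesDetail and dbo.SalesSummary); the idea is simply that the reports read a few thousand pre-grouped rows instead of scanning millions of detail rows, with the refresh run on a schedule:

    -- Rebuild the (hypothetical) summary table from the detail table.
    TRUNCATE TABLE dbo.SalesSummary;

    INSERT INTO dbo.SalesSummary (PlantCode, FiscalMonth, TotalQty, TotalValue)
    SELECT PlantCode,
           DATEADD(MONTH, DATEDIFF(MONTH, 0, OrderDate), 0),  -- first day of the month; works on SQL 2000/2005 too
           SUM(Qty),
           SUM(Qty * UnitPrice)
    FROM dbo.SalesDetail
    GROUP BY PlantCode, DATEADD(MONTH, DATEDIFF(MONTH, 0, OrderDate), 0);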

    Jerry Boutot, MCAD MCP, MTA
    Jerry Boutot Official

  • I can appreciate domain-specific big data requirements in the sciences, or in aggregating many streams of sensor data. However, I wonder if there exists a properly normalized schema that needs 253 foreign key references on a table. If you know how that happens, please do share - I'm sure it would make an interesting article (*nudge*)

    ryan.offord (9/20/2013)


    I have seen one of those limits breached a number of times. It's the number of parameters (2,100) which applies to any query.

    Amusingly I've had developers complain that this is not high enough and that there must be something wrong with SQL as it won't let them do what they want. It's not that big or complex a DB either (1TB or so, 700 tables).

    The queries themselves don't come via a stored proc but via nHibernate. What you typically see is a query being passed in as follows:

    SELECT blah FROM dbo.Table WHERE MyParam IN (@P0, @P1...@P2100)

    This all depends on how many params get passed in, so if I have 2,099 I get

    SELECT blah FROM dbo.Table WHERE MyParam IN (@P0, @P1...@P2099)

    and the hash of the query changes, so we compile a new execution plan and so on.

    Mix this with another approach, where the size of the values being passed in is stated explicitly, and you get yet another variation for each parameter length. If I pass in a surname with 7 characters I get one execution plan, and if I pass in a surname with 6 characters I get another. You can get both together for extra fun!

    Of course, that kind of automated error requires a framework. I haven't heard from developers who love nHibernate, but I do hear from DBAs who hate it. It seems to actively abuse the database.

    No doubt there are many other applications that "worked" as a prototype and then scaled up to production, where they "sorta worked" as long as there was enough surplus resource on the server to accommodate the inefficient design/approach. Unfortunately, the cost to re-engineer often exceeds the cost of more server resources, so the poorly-written applications end up with dedicated servers, while the done-right-the-first-time applications lose deadlock contests (for example) to the apps that haven't yet been isolated. 🙂
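
    One partial mitigation for the plan-cache side of it (server-wide, so test first, and it only eases the symptom rather than fixing the framework) is the "optimize for ad hoc workloads" option, which caches only a small stub for a query shape until that shape is seen a second time:

    -- Cache compiled-plan stubs for single-use ad hoc queries instead of full plans.
    EXEC sys.sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sys.sp_configure 'optimize for ad hoc workloads', 1;
    RECONFIGURE;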

  • My largest database isn't big based on size, but I'd count it big based on the number of objects. I'm pushing 120,000 tables, with a minimum of 50,000 more in the near future. Before anyone criticizes the design, it's a 3rd-party product. I only support it; I didn't create it.

    Tim

  • Before anyone criticizes the design, it's a 3rd-party product. I only support it; I didn't create it.

    Around 90% of all running software is in the maintenance/support phase (and that figure may have risen since I came across it).

    So most of us fall into this category (maintaining software that has already passed through several hands).

    Don't worry, we are almost all in the same situation 😉

  • I worked for a client that believed, and repeatedly stated, that their database was the biggest SQL Server installation in Europe.

    It isn't.

    Unfortunately, it led them to believe that they were "special", which in turn led to poor practices and a self-belief that any downtime was due to the extreme nature of their systems. The truth, in my never humble opinion, is that it is a large database; however, their systems are poorly designed and their database poorly maintained, leading them to use the relatively large size as an excuse.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • There's big by design and big by lack of design.

    Big doesn't necessarily mean complex or difficult; it really does depend on your use case. For example, the Microsoft Fast Track implementation is great if what you need is a fast scan to satisfy a query on a dimensional mart that has been loaded sequentially. It isn't so great for a large number of joins.

    I have seen a 15,000+ table database and it was unwieldy from a human usability perspective. No-one is going to remember what that many tables are for!

    Remember that some database systems have one database containing multiple schemas, whereas SQL Server can have many databases, each with multiple schemas. 15,000 tables across an enterprise doesn't look so big if all you have to work with is one database. If it is spread across 100 databases then a 150-table average isn't so bad.
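
    If you want a quick feel for how your own table counts spread across an instance, something like this rough sketch will do (it assumes you have access to each database, and it only looks at online ones):

    -- Count user tables in every online database on the instance.
    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql += N' UNION ALL SELECT ' + QUOTENAME(name, '''')
                 + N' AS database_name, COUNT(*) AS table_count FROM '
                 + QUOTENAME(name) + N'.sys.tables'
    FROM sys.databases
    WHERE state_desc = N'ONLINE';

    SET @sql = STUFF(@sql, 1, LEN(N' UNION ALL'), N'');  -- drop the leading UNION ALL
    EXEC sys.sp_executesql @sql;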
