RE: Guest Editorial: That ain't a Database, it's a Spreadsheet

  • I don't think your editorial (like many I have seen on this topic) is realistic. Sure, it would be great on Day One to "know" the size of your final production database - but even that statement is flawed. Think about it - is the size of a database ever fixed? No, it is always fluid. So what does "final production database" really mean? Is that the database on Day One, Day One Hundred, or Day One Thousand?

    Maybe in telecommunications work you can say "We have X number of customers who make X number of calls..." and from that extrapolate expected size and load - but let me pose a question that, in my experience, is more the rule than the exception...

    Let's say you work for a large company - it doesn't matter what they do. How many JPG image files are there RIGHT NOW on your entire company network? Do you know? Can you guess? Is it 100, maybe 1,000, maybe 1 million, maybe 10 million? The fact is, you don't know, and you likely cannot find out easily. So how do you design and test a database to track them? How do you load up test records to "match" that expected load? Suppose you were counting these JPG image files - how many are actually duplicates of the same file? You cannot know that any better than you know the total number.

    Another example - imagine you got a job tomorrow at a company with a 250-node network, and you were asked to design a SQL database to track all Word documents on every server and workstation in the company. How would you go about assessing how many records you would be dealing with to start? How many new Word documents are generated each day? Each month? Each year? You cannot simply "pick a number" with any degree of certainty.

    This is something our companies deal with all the time. We work in businesses and applications where you cannot simply "pick" a number and say "this is where we're going to be when we're running in production"... It's simplistic to assume most businesses are quantifiable - in fact, the majority are not. I think good design and testing are vital, but presenting them as some sort of science where you just "figure" the total load you will be dealing with is not real-world thinking for the majority of businesses.

    Plan for the worst, build for the best, but don't presume you can "know" anything on Day One, because 99.99% of the time you will be wrong - unless you are fortunate enough to be working in a field where these numbers are truly known on Day One.

    There's no such thing as dumb questions, only poorly thought-out answers...