RE: Guest Editorial: That ain't a Database, it's a Spreadsheet

  • The statement has been made, in essence, that one cannot know the real size of an intended data system during dev and test. I must disagree. The business of the client almost always defines the needs, whether or not those in charge recognize them. Again, the business, not the representative or owner of the business, defines the needs. In both examples given in the previous post, one could have run a small development spike (borrowing from Extreme Programming lingo here) to assess an average of at least a percentage of the share involved.

For example: I was involved in a 6-month contract to manage the entire data flow, from man-hours to production costs to delivery bill-of-lading documents, for a specific buy-out and deployment of product to over 300 new locations. The company had no idea whether it would even be a profitable venture, let alone how they would manage production output. So we took a sampling from about 10% of the locations, obtained data for those, assessed cost on each basic service, associated man-hours with each service, and were able to report the expected value, cost, and man-hours of the entire job, not to mention the expected size of the data project itself. The data fed the database, controlled the actual output flow of the work, and gave the client specific estimates of the manpower required to do the job.

One can and must be able to estimate the general size of the project. One may not be able to put a real count on the number of rows or the relative disk size. However, with enough experience one should very well know the sheer scope of the information, given the client's business size and expected growth. Test beyond that margin. When in doubt, assume it will be larger than anticipated.

So, instead of hard count estimates, I tend to lean on about five categorical sizes: small, medium, large, very large, and horrendously large. This consideration is given to every table in the test system.
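
To make the sampling idea concrete, here is a rough sketch of how a 10% sample could be scaled up to a per-table size category. Everything specific in it is my own assumption for illustration: the function names, the row-count thresholds for each category, the 2x safety factor, and the sample numbers are all made up and do not come from the actual project.

```python
# Hypothetical category boundaries (upper bound on rows, label).
# These thresholds are illustrative assumptions only.
CATEGORIES = [
    (10_000, "small"),
    (1_000_000, "medium"),
    (100_000_000, "large"),
    (10_000_000_000, "very large"),
]


def extrapolate_rows(sample_counts, total_locations, safety_factor=2.0):
    """Scale the average rows per sampled location up to all locations.

    safety_factor pads the estimate, on the principle that the real
    system will probably be larger than anticipated (assumed value).
    """
    if not sample_counts:
        raise ValueError("need at least one sampled location")
    per_location = sum(sample_counts) / len(sample_counts)
    return round(per_location * total_locations * safety_factor)


def size_category(estimated_rows):
    """Map an estimated row count onto one of the five size labels."""
    for upper_bound, label in CATEGORIES:
        if estimated_rows < upper_bound:
            return label
    return "horrendously large"


# Made-up rows per sampled location for two hypothetical tables.
sampled = {
    "work_order": [1000, 1200, 1400],
    "work_order_history": [50000, 60000, 70000],
}

for table, counts in sampled.items():
    rows = extrapolate_rows(counts, total_locations=300)
    print(table, rows, size_category(rows))
```

The category, not the exact number, is what drives how much test data to generate for each table, so the thresholds only need to be roughly right for the client's line of business.
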
Some tables are lookup tables and as such remain small even in systems that are overall horrendously large. Some tables, such as transactional history on frequently changed data, are expected to have an extremely high row count but be relatively narrow, so query time is affected but inserts are not (especially frequent in many-to-many relationships). And then there are some tables that are so wide that one even considers breaking them into sections.

The point is, there is no reason to believe one cannot make at least a very good educated decision about the expected size of the database, both in row count and physical disk space, and add a little (or a lot) extra just to make sure the ironworks float. Know your client. Know the industry. Know generics about business data in general. If one does not know these, one should not be leading a project.

The show stopper, more often than not, is that clients tend to err on the side of too skimpy, especially when it comes to hardware, and then on both good DBA and admin management. Many clients do not want to spend anything until something is failing. I have seen a little too often the lofty goals of a client get mangled by the infrastructure they did not think was critical, thinking they'd just hire someone when they got into a jam. It takes years off their growth margins and costs them in lost opportunity.

As an aside, in the example I gave above, that client had actually lost a major contract because their internal data guys could not give them the answers that would prove the job manageable and cost-worthy, nor lay out the structure of the management flow of data. They simply could not see outside the box of their current billing and production systems. A year and a half later, another similar job came up, and one of the owners decided to get help. Within two weeks we gave them what they needed to make the necessary decisions and get the job done.
My contract was immediately extended from a 2-week assessment to the full 6 months of deployment to manage the data flow. Trucks were loaded and delivered by the new system's lading orders. They made millions. We made a year's salary in 6 months. It was win-win... although their internal IT people were never happy about it, they could say nothing.