﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2008 / T-SQL (SS2K8)  / Best way to effienctly store hugh amounts of data / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Fri, 24 May 2013 17:36:08 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]aaron.reese (10/3/2012)[/b][hr]out of interest then;  if I was selling books (ISBN, Author, Pages), wine (bottle size, ABV,Vinyard,grape,acidity) and shirts (material,collar type, sleeve length), how would you recommend storing the attributes for those products so that they were both easy to insert and easy to extract.[/quote]This is an easy problem to run into... particularly if you want to help out, say, webcomic authors sell random stuff off their site.  Two mugs, a few shirts, some comics... etc.These guys (and gals) will sell anything that makes sense to them, it's their business.  How do you fit in Deelybobbers into that list when they fad again?  Semi-generic makes sense here.To the question itself, it's a question of where do you want to do the processing?  In the end, most of it will be on the front end.  To accurately display and explain the information coming from such disparate entities the front end needs to have some kind of brain to deal with it.At the DB level, you primarily store the core information.  Price, quantity/shipment, cost, etc.  What's generic to each and every item.  Then, for attributes that have no business affect, you EAV store (personal preference).  The reason I do it this way is that the database does NOT care about this information, it just stores it for the front end.But I want to know how many medium vs. large shirts I sell!!!  Well, that's great, you approach that backwards and get your medium shirt ProdIDs, Large Shirt Prod IDs, then drop them back to the main table and get your real information.  Yes, it's painful, and requires a ton of ad-hoc querying, and you probably just blew out performance, but you'll get there.The other option is the generic column description above, where the front end re-interprets the information.  I dislike these, as it's far too easy to end up with a mess if a single item is 'misplaced'.  Also, most of the EAV data is only required when you're listing descriptive information on a single product, thus a single call to the EAV table with the ProductID and Type returns everything the front end needs to display the product's information in an intelligent manner.  This is actually even WORSE for ad-hoc querying, because trying to decipher the generic columns for a particular product type usually has you going through reams of external documentation.  Depending on indexes however it may be faster once you're done with the development... of an ad-hoc.However, a lot of this discussion isn't about variable product information, it's about storing things like state in an EAV for a client.  That is poor use of the design.  EAV has its place, and primarily (in my opinion) it's for data the database won't care about except in one-off requests for the front end's programming.</description><pubDate>Thu, 04 Oct 2012 15:04:50 GMT</pubDate><dc:creator>Evil Kraig F</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]bteraberry (10/4/2012)[/b][hr][quote][b]TheSQLGuru (10/4/2012)[/b][hr]2) Having a table per object type sold is unworkable - how many different types of objects does Amazon sell??[/quote]So if you're selling products you have to design a schema that would work for Amazon.com?  What if you're only selling three different types of items?I disagree with your dogmatism.  The correct answer is that "it depends".  I've seen a lot of developers come up with very complex solutions to support a flexibility they don't really need and they end up wasting a lot of time in the process.[/quote]It is easy to show points by taking things to the extreme.  But in this case say you only sell 3 types of things and you code that.  Now you sell a 4th thing.  You will likely have to modify the database schema, load processes, application code to know about that additional product table.  Then a 5th, etc.  I imagine there are few entities of any size that sell things that don't add new products/product types (or at least want to)! :-)And I thought my dogmatism was actually anti-dogmatism, with several proposed solutions and guidances about potential pitfalls.  And I ABSOLUTELY agree with your statement that developers sometimes (often?) build solutions that are overly/unnecessarily complex (or use new/shiney "crap" to do so) and wind up with pain because of those decisions.  Hell - I make quite a bit of money cleaning up those messes!!  :hehe:</description><pubDate>Thu, 04 Oct 2012 14:18:54 GMT</pubDate><dc:creator>TheSQLGuru</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]TheSQLGuru (10/4/2012)[/b][hr]2) Having a table per object type sold is unworkable - how many different types of objects does Amazon sell??[/quote]So if you're selling products you have to design a schema that would work for Amazon.com?  What if you're only selling three different types of items?I disagree with your dogmatism.  The correct answer is that "it depends".  I've seen a lot of developers come up with very complex solutions to support a flexibility they don't really need and they end up wasting a lot of time in the process.</description><pubDate>Thu, 04 Oct 2012 10:57:23 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]aaron.reese (10/3/2012)[/b][hr]out of interest then;  if I was selling books (ISBN, Author, Pages), wine (bottle size, ABV,Vinyard,grape,acidity) and shirts (material,collar type, sleeve length), how would you recommend storing the attributes for those products so that they were both easy to insert and easy to extract.[/quote]1) As I have said before, if you do EAV "properly" (which is VERY difficult/tedious), it really can be an efficient and "proper" design for this type of varied data collections.  What you currently have is neither efficient nor proper on a variety of fronts.2) Having a table per object type sold is unworkable - how many different types of objects does Amazon sell??3) Sparse Columns come to mind as a potential solution for what you are trying to do, especially if you "genericize" them and let the application logic know what column is what attribute for each type of object.  Again, that gets "difficult" for the app, but will SIGNIFICANTLY reduce the total columns you would need and also reduce the number of times over the life of the app that you will need to add columns to the table (unless you just start out with some very high number such as 1000).  For example, have 5 Attribute columns: Attribute1, Attribute2, Attribute3, Attribute4, Attribute5.  For Books, Attribute1 would be ISBN, Attribute2 would be Author, etc. But for Wine Attribute1 would be bottle size and Attribute2 would be ABV, etc.I note that Sparse columns do carry some downsides of primarily being xml-ish in nature and have quite a few caveats, limitations and provisos.  Also, if you did go with a generic set you might have few-enough of them to not really benefit from their overhead.</description><pubDate>Thu, 04 Oct 2012 08:03:02 GMT</pubDate><dc:creator>TheSQLGuru</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Maddave (10/3/2012)[/b][hr]Thanks guys,  all very interesting stuff. I'm getting more in to how this application is implemented and it's becoming scary.  As Sean Lange mentioned, the developers have implemented the columns as varchar(max) and there is a separate table with each datatype listed for every specific column, so that it is known what data is being inserted.I have also found that there is a parent/child relationship for some types of data so, for one "product", lets say, it can have multiple child products, which get read from a table and inserted into another table.  When the product data is inserted into the databases, the parent is inserted first, then all child records for that item are inserted into the same table, but with a parent id column set to the parent.The next thing that happens is, each specific attribute for each child, is inserted into another table. So for example, the values, type = "addrLine1" and value = "123 5th avenue" are inserted, then, type = "addrLine2" and value = "New York" is inserted into another row of  the same table, with the product id of the child.  The "type" coluum is a foreign key of the only table the developers want to maintain, by adding new attributes when a different product comes along.  All this does, is lead to the one table having about 10 or more rows, (therefore inserts too) for each attribute of just one item! This is bad enough as it is, but the child records could be in their millions for just one product!  So what could potentially be a million single inserts, actually gets multiplied by the number of specific inserts for each child attributes!  For example, 17000 products where meant to be added to the databases yersterday, each had 1000's and some had millions of child items. This 17,000 transactions took 4 hours to complete!!Once I get the code out, I'll post it and see what you think! It's quite shocking.[/quote]The big question I have:  Are you doing anything at all with the attribute data other than storing it and returning it back to the application?  Are you using those attributes to filter queries, compile statistics, etc?  If not, and if all you are doing is passing all the attributes back to the application, I would look at storing that data in an XML or JSON format in a single column in the parent table.  (I prefer JSON since it is more efficient.)  We have an application that works with Twitter data and we keep all the data that we want to present but that we never filter by in a JSON string because the attributes included with the tweet record are highly variable based on a number of factors.  Queries are run based on attributes like username, date, etc, that are broken out into columns in the parent table and the rest of the data is conveyed via JSON to the application.If the attribute values [i]are[/i] relevant to querying and whatnot, something I would look at is the use of sparse columns.  Microsoft implemented that strategy in the database architecture that supports SharePoint.  They put all their "things" into a single table and have a boatload of all possible attributes.  Any individual item will only use a tiny fraction of the attributes and so the majority of the data in those columns is null.  Without knowing the specifics of what your system is doing, it sounds as though your needs may be similar.Good luck.</description><pubDate>Wed, 03 Oct 2012 12:04:08 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]aaron.reese (10/3/2012)[/b][hr]out of interest then;  if I was selling books (ISBN, Author, Pages), wine (bottle size, ABV,Vinyard,grape,acidity) and shirts (material,collar type, sleeve length), how would you recommend storing the attributes for those products so that they were both easy to insert and easy to extract.[/quote]If you're going to try to cram such disparate entities into a single table you need to have a very very good reason to do so.  In your scenario I don't see any reason why you wouldn't have separate tables for the completely different items you're selling.  You can still query in such a way as to combine different items from the different tables without sacrificing sensible design.</description><pubDate>Wed, 03 Oct 2012 11:45:49 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>out of interest then;  if I was selling books (ISBN, Author, Pages), wine (bottle size, ABV,Vinyard,grape,acidity) and shirts (material,collar type, sleeve length), how would you recommend storing the attributes for those products so that they were both easy to insert and easy to extract.</description><pubDate>Wed, 03 Oct 2012 09:53:07 GMT</pubDate><dc:creator>aaron.reese</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Maddave (10/3/2012)[/b][hr]Thanks guys,  all very interesting stuff. I'm getting more in to how this application is implemented and it's becoming scary.  As Sean Lange mentioned, the developers have implemented the columns as varchar(max) and there is a separate table with each datatype listed for every specific column, so that it is known what data is being inserted.I have also found that there is a parent/child relationship for some types of data so, for one "product", lets say, it can have multiple child products, which get read from a table and inserted into another table.  When the product data is inserted into the databases, the parent is inserted first, then all child records for that item are inserted into the same table, but with a parent id column set to the parent.The next thing that happens is, each specific attribute for each child, is inserted into another table. So for example, the values, type = "addrLine1" and value = "123 5th avenue" are inserted, then, type = "addrLine2" and value = "New York" is inserted into another row of  the same table, with the product id of the child.  The "type" coluum is a foreign key of the only table the developers want to maintain, by adding new attributes when a different product comes along.  All this does, is lead to the one table having about 10 or more rows, (therefore inserts too) for each attribute of just one item! This is bad enough as it is, but the child records could be in their millions for just one product!  So what could potentially be a million single inserts, actually gets multiplied by the number of specific inserts for each child attributes!  For example, 17000 products where meant to be added to the databases yersterday, each had 1000's and some had millions of child items. This 17,000 transactions took 4 hours to complete!!Once I get the code out, I'll post it and see what you think! It's quite shocking.[/quote]YIKES!!!!! :w00t::w00t::w00t:It sounds like you may be able to write the next chapter in the article I posted. As a few others have said in this thread, EAV is useful for extremely specific situations. It is by no means a replacement for relational data in an OLTP system. I hope you are able to either talk some sense into these people. Your 17,000 product insert should take a few seconds...4 hours if unbelievable!!!</description><pubDate>Wed, 03 Oct 2012 08:21:19 GMT</pubDate><dc:creator>Sean Lange</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Maddave (10/3/2012)[/b][hr]Thanks guys,  all very interesting stuff. I'm getting more in to how this application is implemented and it's becoming scary.  As Sean Lange mentioned, the developers have implemented the columns as varchar(max) and there is a separate table with each datatype listed for every specific column, so that it is known what data is being inserted.I have also found that there is a parent/child relationship for some types of data so, for one "product", lets say, it can have multiple child products, which get read from a table and inserted into another table.  When the product data is inserted into the databases, the parent is inserted first, then all child records for that item are inserted into the same table, but with a parent id column set to the parent.The next thing that happens is, each specific attribute for each child, is inserted into another table. So for example, the values, type = "addrLine1" and value = "123 5th avenue" are inserted, then, type = "addrLine2" and value = "New York" is inserted into another row of  the same table, with the product id of the child.  The "type" coluum is a foreign key of the only table the developers want to maintain, by adding new attributes when a different product comes along.  All this does, is lead to the one table having about 10 or more rows, (therefore inserts too) for each attribute of just one item! This is bad enough as it is, but the child records could be in their millions for just one product!  So what could potentially be a million single inserts, actually gets multiplied by the number of specific inserts for each child attributes!  For example, 17000 products where meant to be added to the databases yersterday, each had 1000's and some had millions of child items. This 17,000 transactions took 4 hours to complete!!Once I get the code out, I'll post it and see what you think! It's quite shocking.[/quote]Quite a bit of unfortunate stuff going on here.  Good luck with a) getting it to perform/scale and b) getting them to do a major refactor to allow a) to actually happen. Oh, and drop me a line if you need help with either of those!! :cool:</description><pubDate>Wed, 03 Oct 2012 08:10:57 GMT</pubDate><dc:creator>TheSQLGuru</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>Thanks guys,  all very interesting stuff. I'm getting more in to how this application is implemented and it's becoming scary.  As Sean Lange mentioned, the developers have implemented the columns as varchar(max) and there is a separate table with each datatype listed for every specific column, so that it is known what data is being inserted.I have also found that there is a parent/child relationship for some types of data so, for one "product", lets say, it can have multiple child products, which get read from a table and inserted into another table.  When the product data is inserted into the databases, the parent is inserted first, then all child records for that item are inserted into the same table, but with a parent id column set to the parent.The next thing that happens is, each specific attribute for each child, is inserted into another table. So for example, the values, type = "addrLine1" and value = "123 5th avenue" are inserted, then, type = "addrLine2" and value = "New York" is inserted into another row of  the same table, with the product id of the child.  The "type" coluum is a foreign key of the only table the developers want to maintain, by adding new attributes when a different product comes along.  All this does, is lead to the one table having about 10 or more rows, (therefore inserts too) for each attribute of just one item! This is bad enough as it is, but the child records could be in their millions for just one product!  So what could potentially be a million single inserts, actually gets multiplied by the number of specific inserts for each child attributes!  For example, 17000 products where meant to be added to the databases yersterday, each had 1000's and some had millions of child items. This 17,000 transactions took 4 hours to complete!!Once I get the code out, I'll post it and see what you think! It's quite shocking.</description><pubDate>Wed, 03 Oct 2012 02:10:27 GMT</pubDate><dc:creator>Maddave</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]bteraberry (10/2/2012)[/b][hr]One minor point ... even if you decide to use what they've established, you can make the inserts quite a bit more efficient by ditching the cursor loop.  The same result can be achieved in a set-based manner.  Every cursor that is instantiated has a performance cost and that cost will add up if you're looking at 300 transactions a second.[/quote]Agreed.  I would imagine that a table valued parameter would be the simple solution here for efficient batched, single-hit insert.</description><pubDate>Tue, 02 Oct 2012 13:00:39 GMT</pubDate><dc:creator>TheSQLGuru</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>One minor point ... even if you decide to use what they've established, you can make the inserts quite a bit more efficient by ditching the cursor loop.  The same result can be achieved in a set-based manner.  Every cursor that is instantiated has a performance cost and that cost will add up if you're looking at 300 transactions a second.</description><pubDate>Tue, 02 Oct 2012 12:19:20 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>EAV "can" be very fast and efficient if you use it for single-table storage.  But that then requires EXTRAORDINARY effort on the application side to handle virtually all of the data processing needs and even when you do that it is useful for a pretty narrow class of apps.  There is at least one Web2.0 company doing this, and I have another client using EAV to very good effect as well in a system that uses EAV constructs for a few tables.If you are going to do pure EAV storage and jump through the required hoops on the application side to manage the data, it is honestly a waste to spend the money for SQL Server.  Use one of the many free storage systems available.</description><pubDate>Tue, 02 Oct 2012 11:31:17 GMT</pubDate><dc:creator>TheSQLGuru</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Sean Lange (10/1/2012)[/b][hr][quote][b]Jeff Moden (10/1/2012)[/b][hr]Perhaps it's just the way I've been using them but I've had no such problems with a properly constructed EAV.  Certainly, I've not had to do a 20 or 30 table self join.[/quote]Unfortunately the OP has already eluded to at least 10 self joins. One of the big challenges in this type of system is RI. Foreign keys and constraints can be a serious PITA to deal with. I have also seen them have real challenges with datatypes because the value columns will just be a varchar(max) and then you have to cast everything into the correct datatype. I have see other implementations where there was a datatype column to indicate what the datatype of the value column is.I agree that properly constructed they can be useful and even preferred in some cases. The system described by the OP does not sound like that to me. It sounds like the beginning of a nightmare. That is just my $.02[/quote]I'll typically use SQL_Variant for such things.  Although DRI is a bit of a pain (I agree that FK's really aren't practical here), the problem that properly written and used EAVs (and NVPs) solve usually makes it worth it if you do indeed allow such a thing.  There are places, especially in reporting, where they pop up and most people dont even think of the tables as an EAV or NVP.</description><pubDate>Mon, 01 Oct 2012 15:44:46 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Jeff Moden (10/1/2012)[/b][hr]Perhaps it's just the way I've been using them but I've had no such problems with a properly constructed EAV.  Certainly, I've not had to do a 20 or 30 table self join.[/quote]Unfortunately the OP has already eluded to at least 10 self joins. One of the big challenges in this type of system is RI. Foreign keys and constraints can be a serious PITA to deal with. I have also seen them have real challenges with datatypes because the value columns will just be a varchar(max) and then you have to cast everything into the correct datatype. I have see other implementations where there was a datatype column to indicate what the datatype of the value column is.I agree that properly constructed they can be useful and even preferred in some cases. The system described by the OP does not sound like that to me. It sounds like the beginning of a nightmare. That is just my $.02</description><pubDate>Mon, 01 Oct 2012 11:55:26 GMT</pubDate><dc:creator>Sean Lange</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>Perhaps it's just the way I've been using them but I've had no such problems with a properly constructed EAV.  Certainly, I've not had to do a 20 or 30 table self join.</description><pubDate>Mon, 01 Oct 2012 11:45:32 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>This will be an interesting one!I can understand why EAV is not popular, but what other setup will let folks enter new attributes without defining new table columns? I know one application I've encountered uses this setup so that new attributes and their values can be defined without a DBA having to update tables. What other method will work in this situation if EAVs are not used?</description><pubDate>Mon, 01 Oct 2012 11:10:04 GMT</pubDate><dc:creator>patrickmcginnis59</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>Thanks guys.  That article was scary reading.  Very familiar and even some of the table names mentioned are the same as the ones the devs have deployed.I really need to understand the application more before coming up with a solution, but I am glad others agree that the current deployment is completely not right!Thanks.</description><pubDate>Fri, 28 Sep 2012 08:58:29 GMT</pubDate><dc:creator>Maddave</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>[quote][b]Sean Lange (9/28/2012)[/b][hr]What you are describing is what is known as EAV (Entity Value Attribute). This is a horrible design in almost every sense of the word. There are a few cases where this type of pattern is ok but in general avoid it like the plague. You are absolutely right that it is going to fall on its face.Check out this article explaining a generic EAV system, sounds like it is somewhat similar to what you have going on. [url=http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/]http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/[/url]It doesn't take a rocket surgeon to figure out that once you have 20-30 attributes performance is a joke when you have to create queries with dynamic sql and 20-30 self joins. The best thing that can be done for performance is a complete rewrite. That isn't always possible of course but you are in for a long battle to get it straightened out.[/quote]+1I had the misfortune to work on one of these EAV systems some years ago. Everything bad you read about these happened - poor performance, integrity issues, spiralling query complexity ... avoid if possible. Rewriting will be your best option.</description><pubDate>Fri, 28 Sep 2012 08:19:52 GMT</pubDate><dc:creator>Mark-101232</dc:creator></item><item><title>RE: Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>What you are describing is what is known as EAV (Entity Value Attribute). This is a horrible design in almost every sense of the word. There are a few cases where this type of pattern is ok but in general avoid it like the plague. You are absolutely right that it is going to fall on its face.Check out this article explaining a generic EAV system, sounds like it is somewhat similar to what you have going on. [url=http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/]http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/[/url]It doesn't take a rocket surgeon to figure out that once you have 20-30 attributes performance is a joke when you have to create queries with dynamic sql and 20-30 self joins. The best thing that can be done for performance is a complete rewrite. That isn't always possible of course but you are in for a long battle to get it straightened out.</description><pubDate>Fri, 28 Sep 2012 07:36:12 GMT</pubDate><dc:creator>Sean Lange</dc:creator></item><item><title>Best way to effienctly store hugh amounts of data</title><link>http://www.sqlservercentral.com/Forums/Topic1365688-392-1.aspx</link><description>Would be interested to hear on some of the best ways to approach this situation.I've recently started at a company as a DBA that has lot's of developers, but has never had a DBA or database developer. They have recently started developing products which receive huge amounts of data, very quickly (300 transactions a second), which needs to be stored, and queried for 5 years.  They had already created the database before I came along and are now looking to test it and see if improvements can be made.One way they have implemented the insert process of the main bulk of the transactions is as follows.  In an effort to keep the solution "generic", so that they can adapt it to fit multiple solutions, they have the following process flow.A top level products table receives a product line information. An Id, product type etc.Each product has attributes which are associated with it.  A lookup is done on the product attributes table to get the attributes just for the product type.Then a row is inserted into the productdetails table for each product attribute in a cursor loop process. For each attribute type, insert the corresponding value for that attribute type of the product into the table.Summary:Product header inserted.select all attribute types on attributes table based product type.For each attribute type, insert into productdetails table the products value of that attribute.The idea is that if a new product comes along with a whole new set of unique attributes,, then they just add the new attribute types to the attributes table and the information can be stored.To me this seems very inefficient as one product, with lots of details, is being split to on average 10 inserts into one table.  They predict there will be about 9 billion transactions a year, so on average that is going to be 90 billion rows in one table!! This has to be queried too.I'm thinking it's gonna fall on it's face!I realise this is a very open question with not much information, but any pointers on the best way to improve this would be great!</description><pubDate>Fri, 28 Sep 2012 02:46:57 GMT</pubDate><dc:creator>Maddave</dc:creator></item></channel></rss>