Large Table Design

  • Hello everyone, and thanks in advance for your help!

     

    I’m creating a table which will have around 100 columns, consisting of unique records keyed on a RecordID field.  I will be importing roughly 80,000 to 100,000 records per day from CSV files into a temp table, using a stored procedure that executes a BULK INSERT command.  After importing the records into the temp table I will be altering some of the field data for easier querying, such as converting date strings to dates and Y/N values to bit fields, and then pushing the rows into my Big table (a rough sketch of this step follows the filter example below).  Once entered, the records will never change.  The data will be queried by multiple users with a stored procedure that takes 20 possible parameters.  I’ve written 20 WHERE clause filters such as:

    WHERE (@NameID IS NULL OR @NameID = tblBig.NameID)
      AND (@DeptID IS NULL OR @DeptID = tblBig.DeptID)
      AND (@StartDate IS NULL OR tblBig.StartDate >= @StartDate)
      AND (@EndDate IS NULL OR tblBig.EndDate <= @EndDate) …
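
    For reference, my import step currently looks roughly like this (the file path and most column names are simplified placeholders):

    CREATE TABLE #Staging
    (
        RecordID  nvarchar(44) NOT NULL,
        NameID    int          NULL,
        DeptID    int          NULL,
        StartDate varchar(8)   NULL,   -- date arrives as a yyyymmdd string
        IsActive  char(1)      NULL    -- arrives as 'Y' or 'N'
        -- ... remaining columns ...
    );

    BULK INSERT #Staging
    FROM 'C:\Imports\DailyFeed.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- Convert the strings on the way into the big table.
    INSERT INTO tblBig (RecordID, NameID, DeptID, StartDate, IsActive /* ... */)
    SELECT RecordID,
           NameID,
           DeptID,
           CONVERT(datetime, StartDate, 112),             -- string to datetime
           CASE WHEN IsActive = 'Y' THEN 1 ELSE 0 END     -- Y/N to bit
           -- ... remaining columns ...
    FROM #Staging;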

     

    My Big table does not have any indexes, but it does have a unique RecordID field of type nvarchar(44), which I cannot alter.

    After loading over 9 million test records, 100,000 at a time, I am unhappy with the speed of both the import process and the querying stored procedure.

     

    How can I design the table and stored procedures so that I can get optimal performance on both importing and querying?

    Will a Fill Factor of 100 slow down the importing or querying?

    Should I place an index on each of the 20 columns used in the WHERE clause, and would that slow the importing to a crawl?

    I’ve seen some interesting code using dynamic SQL for WHERE clauses.  Is that the direction to follow?

     

    Thanks for any ideas!

    -Rob

  • You may drop the indexes before bulk-inserting the data, then rebuild them afterwards.  The fill factor may also be changed to another value, depending on how much data you expect the next import to add.
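
    A minimal sketch of that pattern (the index name and the import procedure here are hypothetical):

    -- Drop the index, run the daily load, then rebuild the index.
    DROP INDEX tblBig.IX_tblBig_StartDate;   -- SQL 2000 syntax; on 2005+ use DROP INDEX IX_tblBig_StartDate ON tblBig

    EXEC dbo.usp_DailyImport;   -- hypothetical: the BULK INSERT / transform procedure

    -- Rebuild with FILLFACTOR = 100 if the rows never change after loading,
    -- or a lower value if the pages will see further inserts or updates.
    CREATE INDEX IX_tblBig_StartDate ON tblBig (StartDate) WITH FILLFACTOR = 100;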

    It is hard to comment on your queries without seeing them.  Retrieving data from a large table is always time-consuming.  You could run the Index Tuning Wizard, or look at the estimated execution plan, to optimize your queries.

  • Rob

    If you don't have any indexes on your table then you can't have a fill factor.  I would recommend starting off by creating a clustered index at the very least.  You will have to make a judgement about which column(s) to create it on.  Perhaps one of the date columns would be a good idea if a lot of the queries involve searching on date ranges.  Have a read about covering indexes and see if one or more of these would work for you.  As SQL ORACLE says, it may be quicker to drop your indexes before the import and recreate them afterwards.  If you do that then don't set your fill factor(s) to anything other than 100.
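
    For example (index names invented here, and assuming StartDate is the column most queries search on):

    -- Cluster on the range-search column; new rows arrive in roughly date
    -- order, so inserts go to the end of the table.
    CREATE CLUSTERED INDEX CIX_tblBig_StartDate ON tblBig (StartDate);

    -- Enforce RecordID uniqueness separately.
    CREATE UNIQUE NONCLUSTERED INDEX UIX_tblBig_RecordID ON tblBig (RecordID);

    -- A covering index: if a common query filters on DeptID and NameID and
    -- selects only these columns, it can be answered from the index alone.
    CREATE NONCLUSTERED INDEX IX_tblBig_Dept_Name ON tblBig (DeptID, NameID, StartDate);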

    I don't recommend dynamic SQL - it causes more problems than it solves.  For more information on that, search this site or the internet for The Curse and Blessings of Dynamic SQL.

    Another thing to consider, if you understand the use of the parameters to your stored procedure, is to create several versions of the stored procedure, each one with a different parameter set.  That way, the query optimizer stands a better chance of choosing the best execution plan.
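
    As a sketch, a thin outer procedure could route to parameter-specific versions (names invented, and only two of the twenty parameters shown):

    -- Each specialized procedure has a simple, sargable WHERE clause, so the
    -- optimizer compiles a plan suited to that parameter set.
    CREATE PROCEDURE dbo.GetBigByName @NameID int
    AS
        SELECT * FROM tblBig WHERE NameID = @NameID;
    GO

    CREATE PROCEDURE dbo.GetBigByDateRange @StartDate datetime, @EndDate datetime
    AS
        SELECT * FROM tblBig WHERE StartDate >= @StartDate AND EndDate <= @EndDate;
    GO

    -- The outer procedure only routes; it runs no big query itself.
    CREATE PROCEDURE dbo.GetBig
        @NameID int = NULL, @StartDate datetime = NULL, @EndDate datetime = NULL
    AS
        IF @NameID IS NOT NULL
            EXEC dbo.GetBigByName @NameID;
        ELSE IF @StartDate IS NOT NULL AND @EndDate IS NOT NULL
            EXEC dbo.GetBigByDateRange @StartDate, @EndDate;
        -- ... one branch per supported parameter combination ...
    GO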

    John

  • The answer to your problem is data partitioning, possibly both vertical and horizontal.  You're talking about adding roughly 3 million rows a month, and that volume needs partitioning.
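
    On SQL Server 2005 or later, native table partitioning looks roughly like the sketch below (boundary dates are only illustrative); on SQL Server 2000 the equivalent is a partitioned view over monthly tables with CHECK constraints.

    -- Monthly horizontal partitioning on the date column (SQL Server 2005+).
    CREATE PARTITION FUNCTION pfMonthly (datetime)
    AS RANGE RIGHT FOR VALUES ('20060101', '20060201', '20060301');

    CREATE PARTITION SCHEME psMonthly
    AS PARTITION pfMonthly ALL TO ([PRIMARY]);

    -- Create (or rebuild) the big table on the partition scheme so each
    -- month's rows land in their own partition.
    CREATE TABLE tblBigPartitioned
    (
        RecordID  nvarchar(44) NOT NULL,
        StartDate datetime     NOT NULL
        -- ... remaining columns ...
    ) ON psMonthly (StartDate);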

    You also need indexes to avoid table scanning when running queries. Each index added will slow insert performance, however.

    Disk subsystem performance will also have a serious effect: increase the number of spindles in the data array and use RAID 10.  I'm not going into the maths, but I suggest you examine your disk subsystem with regard to performance.

    You could probably manage the import and transform in one hop, which would help.
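
    For example, on SQL Server 2005 or later, OPENROWSET(BULK ...) can read the file and apply the conversions in a single INSERT (the file and format-file paths are placeholders):

    -- One-hop import: no staging table, one statement.
    INSERT INTO tblBig (RecordID, NameID, StartDate, IsActive /* ... */)
    SELECT src.RecordID,
           src.NameID,
           CONVERT(datetime, src.StartDate, 112),         -- assumes yyyymmdd strings
           CASE WHEN src.IsActive = 'Y' THEN 1 ELSE 0 END -- Y/N to bit
    FROM OPENROWSET(BULK 'C:\Imports\DailyFeed.csv',
                    FORMATFILE = 'C:\Imports\DailyFeed.fmt') AS src;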

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/
