How a new partition Function apply for Current Data

  • yesssssss

    Index on shceme . (Not on table)

    I found the command .

    CREATE NONCLUSTERED INDEX [NonClusteredIndex-MydateFiled]

    ON [dbo].[MyTable] ([MydateFiled]

    ON PSchemaLog ([MydateFiled])

    Do I choose the right fields ? Whats is the relation between Fields for table and for scheme in this command ?

    Should I remove cluster index befor this ? (Because cluster index is on id and partitioning is on datefield)

  • MotivateMan1394 (4/21/2015)


    yesssssss

    Index on shceme . (Not on table)

    No! Table on partition scheme.

    If you want to partition the table, you need to partition the actual table, not one index on it. If the table has a clustered index, that means rebuilding the clustered index on the partition scheme.

    I would strongly suggest at this point you go back to the documentation and do a lot more reading on partitioning before you go any further.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • This is just an example.

    Take note of the columns chosen as the indexing column, as well as partitioning column.

    Be very clear that you know the difference between the partitioning and indexing column.

    I chose to make my partitioning column the same across the table and the indexes to ensure the tables and indexes are partition aligned.

    In the example, the partitions are all on the same filegroup since we did not need disk throughput, just needed to be able to drop old partitions (easy way of getting rid of old data).

    CREATE PARTITION FUNCTION DateKeyRangeMyTable(INT) AS

    RANGE RIGHT FOR VALUES

    ( 20100101,20110101,20120101,20130101,20140101,20150101,20160101,20170101,20180101,

    20190101,20200101,20210101,20220101)

    GO

    CREATE PARTITION SCHEME CustomerScheme AS

    PARTITION DateKeyRangeMyTable ALL TO

    ([DatamartTableFileGroup])

    CREATE PARTITION SCHEME IX_CustomerScheme AS

    PARTITION DateKeyRangeMyTable ALL TO

    ([DatamartIndexFileGroup])

    CREATE TABLE [dbo].[CustomerPurchaseDetail](

    [CUSTMRID] [bigint] Not NULL,

    [DATE_CYCLE] [int] NULL,

    [HighcardinalityID] [bigint] NULL

    ) ON CustomerScheme (DATE_CYC) ;

    CREATE CLUSTERED INDEX CLIX_Customer ON CustomerPurchaseDetail

    (DATE_CYC) WITH (FILLFACTOR = 100) ON CustomerScheme(DATE_CYCLE)

    CREATE UNIQUE NONCLUSTERED INDEX [idx_CUSTMRID] ON [dbo].[CustomerPurchaseDetail]

    (

    [CUSTMRID] ASC

    )WITH (FILLFACTOR = 100) ON IX_CustomerScheme(DATE_CYCLE)

    GO

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • You should probably demonstrate your database structure and the type of queries you are running.

    As a Data Warehouse company with a 3TB database, we installed SQL Server 2014 and configured our primary tables as Column Store Indexes and this was a HUGE performance boost for us. 30min queries will take 10secs for example.

    Column Store Indexes has a lot of limitations, but it works really well if you use it correctly.

  • Hi again

    Thank you All

    I Did partitioning successfuly.

    ------------------------------------

    But Unlike the proposed regulation I could have a better performance.

    Database 1 : (Old db Without Partitioning - and 2 index : key cluster index on id - and non cluster on date )

    select * from Table1 where RecDate between '2014-12-01' And '2014-12-05' And ClickUrl Like '%Tablet%'

    Result :

    Time = 1 Hour And 16 Mins

    output = 584,224 Records

    Database 2 : (New db With Partitioning - and 2 index : key Non cluster index on id - and cluster Index on Partition schema) - Partition on date - Every Month in a Partition

    select * from Table1 where RecDate between '2014-12-01' And '2014-12-05' And ClickUrl Like '%Tablet%'

    Result :

    Time = 27 Mins !!!!

    output = 584,224 Records

    I Think this is Better Perfromance. (About One hour less time)

  • I think the improvement is solely due to the changing of the clusterd index to be the date key as the non clustered index on date probably would not have been used due to select * lookup cost being too great for very large amount of rows.

    So I think the improvement is due to the clustered index key all by itself and not to do with the partitioning concept.

    You might find that an unpartitioned table would have been faster if clustered on the date key due to better use of paralellism.

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • MadAdmin (5/3/2015)


    I think the improvement is solely due to the changing of the clusterd index to be the date key as the non clustered index on date probably would not have been used

    This, exactly.

    If you want to claim the performance improvement is due to partitioning, then the clustered index and nonclustered index keys need to be the same before and after. Otherwise you're comparing apples and lemons.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • MotivateMan1394 (5/3/2015)


    Hi again

    Thank you All

    I Did partitioning successfuly.

    ------------------------------------

    But Unlike the proposed regulation I could have a better performance.

    Database 1 : (Old db Without Partitioning - and 2 index : key cluster index on id - and non cluster on date )

    select * from Table1 where RecDate between '2014-12-01' And '2014-12-05' And ClickUrl Like '%Tablet%'

    Result :

    Time = 1 Hour And 16 Mins

    output = 584,224 Records

    Database 2 : (New db With Partitioning - and 2 index : key Non cluster index on id - and cluster Index on Partition schema) - Partition on date - Every Month in a Partition

    select * from Table1 where RecDate between '2014-12-01' And '2014-12-05' And ClickUrl Like '%Tablet%'

    Result :

    Time = 27 Mins !!!!

    output = 584,224 Records

    I Think this is Better Perfromance. (About One hour less time)

    I agree with Gail and the MadAdmin. This performance improvement isn't because of partitioning. It's because of the change to the Clustered Index you did to support partitioning and the query itself. The reason why it's still painfully slow even with a Clustered Index Seek present is because the seek is only for the first row, which it would probably have been able to do more quickly in a monolithic table. After that, it still has to read through (scan) all of the of the Clustered Index at the leaf level to do the non-sargable comparison to find things that are LIKE %Tablet%.

    The SELECT * is probably returning more data than you need, as well. Are you sure that you need all of the columns from the table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Again

    Why do you know every thing ? :crazy:

    And Why am I careless? 😀 :blush:

    Yes You Are right. I changed the Index and These are The results : (MeanWhile Last Time my machine was very busy And results were unreal)

    In Partition Case :

    1 min and 16 sec

    In Non Partition Case : (with the same index)

    2 min and 22 sec

    There are nearly equal . Index was very important.

    But at the end :

    What was the main purpose of the partitioning? If it couldn't help me in a 300 GB Database, What is the use ?

  • The purpose of partitioning is manageability. Some examples:

    You want to archive old data, then you switch out the partition containing the old data with a metadata only operation, without executing huge delete queries.

    You want to append new data, then you bulk insert to a new partition and you switch it in with a metadata only operation (same as above).

    You can have performance benefits from partitioning when all the queries that access a partitioned table use the partitioning key as a filter predicate: the optimizer will eliminate the partitions filtered out by the predicate and work on a single partition rather than the whole set of partitions.

    Partitioning without ensuring that partition elimination can happen is often a no-go. Moreover, to ensure index alignment, you have to add the partitioning key to the indexes, often making them bigger, hence slower to scan/seek/update.

    Long story short: if your application was not designed from the start for partitioning, it will likely perform worse on a partioned table.

    -- Gianluca Sartori

  • MotivateMan1394 (5/6/2015)


    What was the main purpose of the partitioning? If it couldn't help me in a 300 GB Database, What is the use ?

    From my first reply to this thread:

    GilaMonster (4/15/2015)


    partitioning is for maintenance and fast data loads/deletes

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply