Fragmented index

  • zach_john (5/26/2011)


    Ninja's_RGR'us (5/25/2011)


    I'm no SSIS guru but I know this for sure, you can commit the whole process in batches and unless you specifically ask for an ordered insert you won't get any order in the insert. And with 1.5 M you can defenitely fragment the heck out of that table..

    True but the data ID created is incremented and inserted in order in 5K batches, the next 5K batch sequentially followed.

    I am swimming (treading water really) in the deep end here...but I still think we need more info before we can rule it out. A batch of 5K containing a set of IDs, even if they are a complete set of the next 5K in the sequence from the MAX in the table to MAX+5K, if inserted in an order other than the order of the clustered index will still create fragmentation.

    Are you using SSIS, OLE DB Destinations with FastLoad, and are are not setting the ORDER in the FastLoadOptions to match the order of the clustered index?

    Some demo code I created to see the out-of-order inserts creating fragmentation:

    SET NOCOUNT ON ;

    GO

    USE tempdb

    GO

    -- re/build some objects

    IF OBJECT_ID(N'tempdb..#tmp') > 0

    DROP TABLE #tmp ;

    GO

    IF OBJECT_ID(N'dbo.a') > 0

    DROP TABLE dbo.a ;

    GO

    CREATE TABLE dbo.a

    (

    id INT NOT NULL

    PRIMARY KEY WITH ( FILLFACTOR = 100 )

    ) ;

    GO

    -- fill a temp table with a bunch of ids

    WITH cte ( test_num )

    AS (

    SELECT TOP 100000

    o1.name

    FROM master.sys.objects o1

    CROSS JOIN master.sys.objects o2

    CROSS JOIN master.sys.objects o3

    )

    SELECT IDENTITY( INT,1,1 ) AS id

    INTO #tmp

    FROM cte

    ORDER BY test_num ;

    DECLARE @i INT = 0 ;

    WHILE ( @i < 6 )

    BEGIN

    -- insert a batch of ids into our concrete table

    INSERT INTO dbo.a

    (

    id

    )

    SELECT id

    FROM #tmp

    WHERE id BETWEEN ( @i * 10000 ) + 1 AND ( @i * 10000 ) + 10000

    -- in the order of the clustered index

    ORDER BY id ;

    SELECT 'after ordered insert' AS insert_type,

    --@i AS i,

    --( @i * 10000 ) + 1 AS low_range,

    --( @i * 10000 ) + 10000 AS high_range,

    --index_type_desc,

    index_depth,

    CAST(avg_fragmentation_in_percent AS DECIMAL(8, 6)) AS avg_fragmentation_in_percent,

    fragment_count,

    CAST(avg_fragment_size_in_pages AS DECIMAL(8, 6)) AS avg_fragment_size_in_pages,

    page_count

    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.a'), 1, 1, NULL) ;

    SET @i += 1 ;

    END

    -- insert a batch of ids into our concrete table

    INSERT INTO dbo.a

    (

    id

    )

    SELECT id

    FROM #tmp

    WHERE id BETWEEN ( @i * 10000 ) + 1 AND ( @i * 10000 ) + 10000

    -- in random order

    ORDER BY NEWID() ;

    SELECT 'UNORDERED INSERT!!!!' AS insert_type,

    --@i AS i,

    --( @i * 10000 ) + 1 AS low_range,

    --( @i * 10000 ) + 10000 AS high_range,

    --index_type_desc,

    index_depth,

    CAST(avg_fragmentation_in_percent AS DECIMAL(8, 6)) AS avg_fragmentation_in_percent,

    fragment_count,

    CAST(avg_fragment_size_in_pages AS DECIMAL(8, 6)) AS avg_fragment_size_in_pages,

    page_count

    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.a'), 1, 1, NULL) ;

    SET @i += 1 ;

    -- insert a batch of ids into our concrete table

    INSERT INTO dbo.a

    (

    id

    )

    SELECT id

    FROM #tmp

    WHERE id BETWEEN ( @i * 10000 ) + 1 AND ( @i * 10000 ) + 10000

    -- in random order

    ORDER BY NEWID() ;

    SELECT 'UNORDERED INSERT!!!!' AS insert_type,

    --@i AS i,

    --( @i * 10000 ) + 1 AS low_range,

    --( @i * 10000 ) + 10000 AS high_range,

    --index_type_desc,

    index_depth,

    CAST(avg_fragmentation_in_percent AS DECIMAL(8, 6)) AS avg_fragmentation_in_percent,

    fragment_count,

    CAST(avg_fragment_size_in_pages AS DECIMAL(8, 6)) AS avg_fragment_size_in_pages,

    page_count

    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.a'), 1, 1, NULL) ;

    SET @i += 1 ;

    -- insert a batch of ids into our concrete table

    INSERT INTO dbo.a

    (

    id

    )

    SELECT id

    FROM #tmp

    WHERE id BETWEEN ( @i * 10000 ) + 1 AND ( @i * 10000 ) + 10000

    -- in random order

    ORDER BY NEWID() ;

    SELECT 'UNORDERED INSERT!!!!' AS insert_type,

    --@i AS i,

    --( @i * 10000 ) + 1 AS low_range,

    --( @i * 10000 ) + 10000 AS high_range,

    --index_type_desc,

    index_depth,

    CAST(avg_fragmentation_in_percent AS DECIMAL(8, 6)) AS avg_fragmentation_in_percent,

    fragment_count,

    CAST(avg_fragment_size_in_pages AS DECIMAL(8, 6)) AS avg_fragment_size_in_pages,

    page_count

    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.a'), 1, 1, NULL) ;

    GO

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Perhaps this is a multi-threaded insert by SSIS, in which case rows can be interspersed and thus lead to nasty frag? Try forcing single thread.

    Also, given width of actual data, CHAR(xx) would be more appropriate datatype than varCHAR(xx).

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Can you modify the table? Where I work, I have complete control of the database. I would separate the key out to three different fields and use a composite index.

    Something along the lines of this...

    DECLARE @Input VARCHAR(32) = 'xvp20110525000000000000001601205'

    SELECT

    @Input AS [Original]

    ,CAST(SUBSTRING(@Input,1,3) AS CHAR(3))

    ,CASE WHEN ISDATE(SUBSTRING(@Input,4,8)) = 1 THEN CAST(SUBSTRING(@Input,4,8) AS DATE) END

    ,CASE WHEN ISNUMERIC(SUBSTRING(@Input,13,LEN(@Input)))=1 THEN CAST(SUBSTRING(@Input,13,LEN(@Input)) AS INT) END

  • john 50727 (5/27/2011)


    Can you modify the table? Where I work, I have complete control of the database. I would separate the key out to three different fields and use a composite index.

    Something along the lines of this...

    DECLARE @Input VARCHAR(32) = 'xvp20110525000000000000001601205'

    SELECT

    @Input AS [Original]

    ,CAST(SUBSTRING(@Input,1,3) AS CHAR(3))

    ,CASE WHEN ISDATE(SUBSTRING(@Input,4,8)) = 1 THEN CAST(SUBSTRING(@Input,4,8) AS DATE) END

    ,CASE WHEN ISNUMERIC(SUBSTRING(@Input,13,LEN(@Input)))=1 THEN CAST(SUBSTRING(@Input,13,LEN(@Input)) AS INT) END

    That's a nice idea in terms of being able to retrieve the pieces of key but it's not going to help him solve the fragmentation issues he is seeing. Whether the key is a one-column-string with three parts or three separate columns with proper data types if it's in the same order the fragmentation will still occur as he is seeing it occur unless he fixes how it's being loaded to be inserted in key-order.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • john 50727 (5/27/2011)


    Can you modify the table? Where I work, I have complete control of the database. I would separate the key out to three different fields and use a composite index.

    The key is a design to be used in a number of databases. Interesting or even more confusing when combined with a separate date field as PK on another table, same number of records, bigger data, we have no fragmentation. The date is added on that table to accomplish partitioning. In a single batch of 1.5m rows on that table all rows are loaded in the same partition.

    john 50727 (5/27/2011)

    Something along the lines of this...

    DECLARE @Input VARCHAR(32) = 'xvp20110525000000000000001601205'

    SELECT

    @Input AS [Original]

    ,CAST(SUBSTRING(@Input,1,3) AS CHAR(3))

    ,CASE WHEN ISDATE(SUBSTRING(@Input,4,8)) = 1 THEN CAST(SUBSTRING(@Input,4,8) AS DATE) END

    ,CASE WHEN ISNUMERIC(SUBSTRING(@Input,13,LEN(@Input)))=1 THEN CAST(SUBSTRING(@Input,13,LEN(@Input)) AS INT) END


    John Zacharkan

  • Ever get to the bottom of it?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (6/2/2011)


    Ever get to the bottom of it?

    I've been working with the client to cleanup the schema. Short answer is no, but I've reduced my work load so the reindexing won't have the impact. I will have more time to test this shortly.


    John Zacharkan

  • Sorry to keep pinging...just REALLY interested in whether you found a root cause :Whistling:

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 8 posts - 31 through 37 (of 37 total)

You must be logged in to reply to this topic. Login to reply