Fragmented index

  • Hello all,

    I have a table with 1.5m rows see schema the PK is varchar(32) with data loaded in ID as below

    'arw20110519000000000000000000001'

    'arw20110519000000000000000000002'

    ..

    ....

    ........

    'arw20110519000000000000001487001'

    There are no gaps in the data, it's loaded sequentially so why would this table be fragment?

    Thanks,

    John

    CREATE TABLE [dbo].[person](

    [id] [varchar](32) NOT NULL,

    [submission_date] [datetime] NOT NULL,

    [source] [int] NOT NULL,

    [parent_guid] [varchar](32) NULL,

    [guid_source] [int] NOT NULL,

    [fname] [varchar](25) NOT NULL,

    [lname] [varchar](50) NOT NULL,

    [mname] [varchar](25) NULL,

    [role] [smallint] NOT NULL,

    [cell_phone] [varchar](15) NULL,

    [dob] [date] NULL,

    [gender] [tinyint] NOT NULL,

    [status] [smallint] NOT NULL,

    [race] [tinyint] NOT NULL,

    [ethnicity] [tinyint] NOT NULL,

    [language_code] [tinyint] NULL,

    [ssn] [varchar](11) NULL,

    [email_address] [varchar](75) NULL,

    [xsa_file_id] [int] NULL,

    [creation_date] [date] NULL,

    CONSTRAINT [person_PK] PRIMARY KEY CLUSTERED

    (

    [guid] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [SECONDARY]

    ) ON [SECONDARY]


    John Zacharkan

  • How fragmented is it?

    Also, what's the fill factor?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • How is the data inserted / updated?

    If the application generates the next PK (insert) and then does an update and fills up all the rest of the columns you're going to get a lot of age splits and cause heavy frangmentation.

  • Thanks for replying back.

    The index showed 75% fragmentation and indexdepth of 4

    The server default fillfactor is still set 100% as is the index.

    Data is loaded using datastage and the next ID is provided by them and is always the next sequential number.

    ALTER TABLE [dbo].[vehicle_person] ADD CONSTRAINT [vehicle_person_PK] PRIMARY KEY CLUSTERED

    ([id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [SECONDARY]

    GO


    John Zacharkan

  • There is some weirdness going on. Your initial post says the PK consists of one column named guid but there is no column in the DDL.

    Then your last post is adding an index to a table named vehicle_person and the original post is for a table named person.

    Me thinks I know what's going on but am not getting the whole story...can you please post the complete DDL for both tables including all the constraints and indexes?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • My initial take, based on the column name, was that you have clustered GUIDs, which are absolutely prone to fragmentation, but based on what you're telling us so far, it's still unclear.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • This also stands out as a point of interest:

    ...the next ID is provided by them and is always the next sequential number.

    Who is "them" and what programming mechanism are they using to get "the next sequential number"?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/25/2011)


    This also stands out as a point of interest:

    ...the next ID is provided by them and is always the next sequential number.

    Who is "them" and what programming mechanism are they using to get "the next sequential number"?

    More importantly... how do they fill the data (only 1 insert or 1 insert and n updates?) The later will DEFINITELY cause massive fragmentation.

  • Agreed, and it will be exacerbated due to the 100% fill factor.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (5/25/2011)


    Agreed, and it will be exacerbated due to the 100% fill factor.

    The one case I had in mind the guy was still having problem at 40% fill factor, maybe even 20%. Don't remember where I read that but it was in the last few weeks on this site.

  • Apologize for the confusion. I needed to clean up the schema for viewing and didn't follow through on pk in the first post.

    GUID and ID are the same field.

    The mechanism generating the IDS regardless is generating them sequentially in order there's nothing random about the ID.


    John Zacharkan

  • The idea of having an app generate a sequential GUID outside the database and provide it to SQL Server as the clustering key for a table is great on paper...but I have never had the privilege of seeing where the technique was implemented properly.

    Can't find the post, but not long ago someone on SSC brought to the light that Microsoft CRM uses the technique. It never came out whether they did it sequentially in the parent table, but IIRC they were having problems with performance in one of the child tables.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The data commits after every 5K records each record populates 5 tables. I have other tables in using a similiar ID but along with a date as a PK and they didn't fragment.

    My data is PHI/HIPPA so I'm unable to post it here. The madeup sample data was insufficient in vetting out this issue.

    John


    John Zacharkan

  • opc.three (5/25/2011)


    The idea of having an app generate a sequential GUID outside the database and provide it to SQL Server as the clustering key for a table is great on paper...but I have never had the privilege of seeing where the technique was implemented properly.

    Can't find the post, but not long ago someone on SSC brought to the light that Microsoft CRM uses the technique. It never came out whether they did it sequentially in the parent table, but IIRC they were having problems with performance in one of the child tables.

    We were both talking about the same thread... mine ironically enough :w00t:

    http://www.sqlservercentral.com/Forums/FindPost1107898.aspx

  • zach_john (5/25/2011)


    The data commits after every 5K records each record populates 5 tables. I have other tables in using a similiar ID but along with a date as a PK and they didn't fragment.

    My data is PHI/HIPPA so I'm unable to post it here. The madeup sample data was insufficient in vetting out this issue.

    John

    So which index gets fragmented?

Viewing 15 posts - 1 through 15 (of 37 total)

You must be logged in to reply to this topic. Login to reply