Insert is Slow

  • I am having 10 Master table with 10 million records each.

    For unique(primary key) i have use uniqueidentifier as column data type.

    While inserting more records on this table insert is slow.

    Is this because of uniqueidentifier column?

    if so then what can be the alternative.

    My table consists of 5 float 10 varchar,3 bit, 8 int,2 uniqueidentifier,4 char, 1 bigint.

    uniqueidentifier column is fk for other tables , while joining the columns have used uniqueidentifier column as a primary key.

    say

    TableA C1 uniqueidentifier (PK), C2 uniqueidentifier, C3 ......

    TableB C1 uniqueidentifier , C2 uniqueidentifier (FK_TableA_uniqueidentifier ), C3 ......

  • yuvipoy (3/4/2014)


    I am having 10 Master table with 10 million records each.

    For unique(primary key) i have use uniqueidentifier as column data type.

    While inserting more records on this table insert is slow.

    Is this because of uniqueidentifier column?

    if so then what can be the alternative.

    My table consists of 5 float 10 varchar,3 bit, 8 int,2 uniqueidentifier,4 char, 1 bigint.

    uniqueidentifier column is fk for other tables , while joining the columns have used uniqueidentifier column as a primary key.

    say

    TableA C1 uniqueidentifier (PK), C2 uniqueidentifier, C3 ......

    TableB C1 uniqueidentifier , C2 uniqueidentifier (FK_TableA_uniqueidentifier ), C3 ......

    You said your uniqueidentifier is your primary key. Is it also you clustered index? I am not a fan of using uniqueidentifiers as a primary key for almost every situation. But if there is a different column used as the clustered index they aren't horrible. A clustered index on a uniqueidentifier will exceed 90% fragmentation in as few as 1000 rows. I am guessing this is what you are running into. Find another column(s) to contain your clustered index and your performance will improve greatly.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for your quick replay.

    uniqueidentifier column(with primary key (now)) has been used in table joining in current design.

    so if i remove the current primary key on uniqueidentifier, will the join perfomace degrade or grade(improve).

  • yuvipoy (3/4/2014)


    Thanks for your quick replay.

    uniqueidentifier column(with primary key (now)) has been used in table joining in current design.

    so if i remove the current primary key on uniqueidentifier, will the join perfomace degrade or grade(improve).

    I realize it is your primary key. That as a primary key is another topic. Is it also the clustered index? That is the problem here.

    I would recommend NOT removing your primary key on several 10 million row tables. Your performance will be even worse.

    What does this query return?

    select *

    from sys.indexes

    where OBJECT_NAME(object_id) = 'YourBaseTable' --whatever the name of your table is

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • It is Non-Clusterindex

    object_id name index_id type type_desc is_unique data_space_id ignore_dup_key is_primary_key is_unique_constraint fill_factor is_padded is_disabled is_hypothetical allow_row_locks allow_page_locks has_filter filter_definition

    1890105774 PK_XXXXXX 2 2 NONCLUSTERED 1 1 0 1 0 80 0 0 0 1 1 0 NULL

  • yuvipoy (3/4/2014)


    It is Non-Clusterindex

    object_id name index_id type type_desc is_unique data_space_id ignore_dup_key is_primary_key is_unique_constraint fill_factor is_padded is_disabled is_hypothetical allow_row_locks allow_page_locks has_filter filter_definition

    1890105774 PK_XXXXXX 2 2 NONCLUSTERED 1 1 0 1 0 80 0 0 0 1 1 0 NULL

    You aren't giving me a lot to go on here. Is this table a heap or is there a clustered index? How many rows are you trying to insert? What is the source of the data for the insert?

    You have to remember that I can't see your screen and have no idea what your system is like. I am shooting blind in the dark.

    Maybe you should take a look at this article that explains how to post questions with performance problems. http://www.sqlservercentral.com/articles/SQLServerCentral/66909/[/url]

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • As i said

    TableA C1 uniqueidentifier (PK), C2 uniqueidentifier, C3 ......

    PK is a cluster index

    TableB C1 uniqueidentifier(PK) , C2 uniqueidentifier (FK_TableA_uniqueidentifier ), C3 ......

    Column c1 PK is a cluster index and C2 is non clusterindex

    Trying to insert some million of rows

    C++ is the used to insert the data.

    In C++ the "sql statements" are used to insert the data.

    Since there where no stored procedure used , unique identifer is kept as source for uniquness in the table.

  • yuvipoy (3/6/2014)


    As i said

    TableA C1 uniqueidentifier (PK), C2 uniqueidentifier, C3 ......

    PK is a cluster index

    TableB C1 uniqueidentifier(PK) , C2 uniqueidentifier (FK_TableA_uniqueidentifier ), C3 ......

    Column c1 PK is a cluster index and C2 is non clusterindex

    Trying to insert some million of rows

    C++ is the used to insert the data.

    In C++ the "sql statements" are used to insert the data.

    Since there where no stored procedure used , unique identifer is kept as source for uniquness in the table.

    I am trying to help you here but you just aren't providing details. I asked several questions previously none of which you answered.

    I think you are inserting data into TableA which has a uniqueidentifier as the clustered index? That is source of your problem. The index fragmentation is going to be through the roof.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I think you are inserting data into TableA which has a uniqueidentifier as the clustered index? That is source of your problem. The index fragmentation is going to be through the roof.

    How it can be index fragmentation ? it is uniqueidentifier right , we should not put cluster index to uniqueidentifier?

    for child tables it is non cluster-index only.

  • Thanks Sean!

    I got it we should not go with cluster index for guid since this will cause physical ordering of the data on the disk which is pointless one.

    One more thing

    how about Selecting the statement

    Select <set of columns> from table where Col_uniqueidentifier={uniqueidentifier}

    say now if the column is with py key non cluster index with 10 million record will the query will be faster to return data ?

    or the query with col_bigint column instead of col_uniqueidentifier

    Select <set of columns> from table where col_bigint ={bigint }

    say now if the column is with py key (non) cluster index with 10 million record will the query will be faster to return data ?

    Can you suggest on this

  • yuvipoy (3/6/2014)


    Thanks Sean!

    I got it we should not go with cluster index for guid since this will cause physical ordering of the data on the disk which is pointless one.

    It is the logical order not the physical storage order. Using a guid for a clustered index produces a very high frequency of page splits which is why the index gets so fragmented.

    One more thing

    how about Selecting the statement

    Select <set of columns> from table where Col_uniqueidentifier={uniqueidentifier}

    say now if the column is with py key non cluster index with 10 million record will the query will be faster to return data ?

    or the query with col_bigint column instead of col_uniqueidentifier

    Select <set of columns> from table where col_bigint ={bigint }

    say now if the column is with py key (non) cluster index with 10 million record will the query will be faster to return data ?

    Can you suggest on this

    In general it will be a bit faster because a guid is 16 bytes where a bigint is only 8 bytes. It is not likely to make a huge difference.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • In general it will be a bit faster because a guid is 16 bytes where a bigint is only 8 bytes. It is not likely to make a huge difference.

    You mean to say that guid with non cluster index and bigint with cluster index we behave in similar manner.

    how about if there is huge volume of data say about 10 million records will the query fetching time will be same

  • yuvipoy (3/6/2014)


    In general it will be a bit faster because a guid is 16 bytes where a bigint is only 8 bytes. It is not likely to make a huge difference.

    You mean to say that guid with non cluster index and bigint with cluster index we behave in similar manner.

    how about if there is huge volume of data say about 10 million records will the query fetching time will be same

    That depends. Are you returning the guid column in that 10 million rows? If so, it will be slower because the amount of data being returned is more with a guid. If not, then it would be about the same.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • At present one of my column is having uniqueidentifier with clustered, unique, primary key located on PRIMARY

    The data is inserted in the below manner

    66B72949-658B-4D77-A813-01AC3278AD53

    066C863E-1A14-4F0C-A4E6-053D02F98CB8

    63C1E1C9-FABF-444E-AE01-0752A8C9A7DD

    AE6CCB63-1764-42C6-A7E2-08F04A2CA4DB

    F2D58665-301B-467F-B0A4-08F46A4B71A3

    7CC0A364-C455-4FD0-A4F8-0951693312C0

    Why it is not in logical order as :

    066C863E-1A14-4F0C-A4E6-053D02F98CB8

    63C1E1C9-FABF-444E-AE01-0752A8C9A7DD

    66B72949-658B-4D77-A813-01AC3278AD53

    AE6CCB63-1764-42C6-A7E2-08F04A2CA4DB

    F2D58665-301B-467F-B0A4-08F46A4B71A3

    7CC0A364-C455-4FD0-A4F8-0951693312C0

    here logical order i mean to say as order by alpha characters 0-9 and a-z

    or the inserted data is in correct formate only if so how ?

  • GUID ordering is not alphanumeric. There's articles on the web which describe how GUIDs are ordered, not that it usually matters much for users.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply