Clustered Index

  • Clustered Index

    I recently added a clustered index based on 5 columns to a fact table in SQL2k. This table gets approx 120k rows added every day over a 12 hour period. The DB slowed down to where it is unusable. Inserts that took hours to complete now do not complete in the 12 hour period.

    The DB is used as only for reporting.

    Some ?

    1. When a clustered index is removed is the effect immediate (I deleted the clustered index and the performace is still terrible)

    2. If I order my insert data using the same order as the clustered index will this improve performance?

    3. What is the best way to get the data into the fact table given that it arrives in dribbles over a 12 hour period (I clense the data in a staging DB on the same server).

    TIA

    Mark Firth

  • 1. When a clustered index is removed is the effect immediate (I deleted the clustered index and the performace is still terrible)

    While SQL Server is no longer required to physically sort according to the way you had your clustered index, the data already there isn't automatically unsorted.

    2. If I order my insert data using the same order as the clustered index will this improve performance?

    SQL Server should be smart enough to figure out how to insert the data, after all, the clustered index is a physical sort. Out of curiousity, can you give us some idea of the data types you were sorting on?

    3. What is the best way to get the data into the fact table given that it arrives in dribbles over a 12 hour period (I clense the data in a staging DB on the same server).

    I hate to give this answer, but it's the only one appropriate without knowing how the data looks: it depends.

    K. Brian Kelley
    @kbriankelley

  • Brian

    Thanks for the response.

    I am trying to understand why my server suddenly died and the most likely culprit is the clustered indexes I added recently. I have removed them, restarted SQL and got back only a small performace increase.

    The index was made up of datetine, varchar by 3 then int by 2 (6 cols all up)

    "it depends" is the expected answer - this is a typical DW/reporting system for a bank.

    Look like I will have to get hold of a real DBA.

  • Generally it's a really bad idea to do indexing against a varchar field. Out of curiousity, when selecting the clustered index, did you do so based on what's getting inserted, based on queries being run, etc.?

    K. Brian Kelley
    @kbriankelley

  • How large is the clusered index (average size of one row of the index). How many non clustered indexes do you have (you know that they keep a refference to the clustered index). That can mean a LOT of maintenance for the server when you had more rows.

  • As the DB is in UAT we can guage the report requirements so the indexes were based in the queries.

    I intend to defrag and reindex all the tables and see if that helps as well. The scheduled integrity check failed and I can't find the details of why.

  • Sorry Remi, I deleted the clustered index so I have no data on the size. And yes there were a number of additional indexes but they were not related to the clustered index.

  • All I need is the average width of the columns in the index.

    BTW, all indexes keep a refference to the clustered index so even if you don't include it explicitly, it's there... so the larger the index is, the costlier it is to maintain.

    Also, I'd be curious to see how many more indexes we're talking about here. It would be nice to see the DDL of the table and indexes.

  • Yes, complete DDL and some sample data would be nice. In any case having a clustered index based on 6 columns, including 3 varchars, is never a good idea, even without any nonclustered indexes. With some nonclustered indexes the performance will be horrendous.

  • Remi

    DDL - thanks

    CREATE TABLE [dbo].[FactSensi] (

    [FileMapID] [numeric](10, 0) NOT NULL ,

    [SourceID] [numeric](10, 0) NOT NULL ,

    [ProductFamily] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [RiskType] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [ProcessDate] [datetime] NOT NULL ,

    [DataDate] [datetime] NOT NULL ,

    [Portfolio] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Currency] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Tenor] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [USD] [numeric](25, 10) NULL ,

    [LocalCurrency] [numeric](25, 10) NULL ,

    [USDView] [numeric](25, 10) NULL ,

    [ViewCurrency] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Product] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [CurveType] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [CurrencyPair] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Instrument] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [Rate] [numeric](22, 10) NULL ,

    [BaseOffice] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [BaseValue] [numeric](22, 10) NULL ,

    [CurrencyMap] [varchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [PortfolioID] [int] NULL ,

    [ProductID] [int] NULL ,

    [SGD] [numeric](22, 10) NULL ,

    [InstrumentID] [int] NULL

    ) ON [PRIMARY]

    GO

    CREATE INDEX [ixFileMapID] ON [dbo].[FactSensi]([FileMapID]) WITH FILLFACTOR = 90 ON [PRIMARY]

    GO

    CREATE INDEX [ix_Portfolio] ON [dbo].[FactSensi]([PortfolioID]) ON [PRIMARY]

    GO

    CREATE INDEX [FactSensi_Index_1] ON [dbo].[FactSensi]([DataDate]) WITH FILLFACTOR = 90 ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[FactSensi] ADD

    CONSTRAINT [FK_FactSensi_FileMap] FOREIGN KEY

    (

    [FileMapID]

    ) REFERENCES [dbo].[FileMap] (

    [FileMapID]

    ),

    CONSTRAINT [FK_FactSensi_ImportSource] FOREIGN KEY

    (

    [SourceID]

    ) REFERENCES [dbo].[ImportSource] (

    [SourceID]

    ),

    CONSTRAINT [FK_FactSensi_Portfolio] FOREIGN KEY

    (

    [PortfolioID]

    ) REFERENCES [dbo].[Portfolio] (

    [PortfolioID]

    ),

    CONSTRAINT [FK_FactSensi_Product] FOREIGN KEY

    (

    [ProductID]

    ) REFERENCES [dbo].[Product] (

    [ProductID]

    ),

    CONSTRAINT [FK_FactSensi_ProductFamily] FOREIGN KEY

    (

    [ProductFamily]

    ) REFERENCES [dbo].[ProductFamily] (

    [ProductFamily]

    ),

    CONSTRAINT [FK_FactSensi_RiskType] FOREIGN KEY

    (

    [RiskType]

    ) REFERENCES [dbo].[RiskType] (

    [RiskType]

    ),

    CONSTRAINT [FK_FactSensi_Tenor] FOREIGN KEY

    (

    [Tenor]

    ) REFERENCES [dbo].[Tenor] (

    [Tenor]

    )

    GO

  • Which columns were in the primary key (in the same order)?

  • Remi

    The index was not unique, only clustered. I have since removed the index and replaced it with the following clustered index

    Datadate DESC, PortfolioID

    These being the 2 most used fields for reporting purposes.

    What should I do about varchar indexes - I never use char (I'm a developer after all) but will I get better performance from char fields (currency for instance is always either 3 or 4 char and I could safely create it as a 4 char field).

  • Indexing strategy is impossible to do without knowing what queries are run on the server and their frequncy/execution plan... you should really look into hireing a dba or consultant if it's a big concern for your company. I'm afraid we can only skim the surface of this problem under the current circumstances.

  • I was only after a general rule, most developers will use varchar because it is easier to handle in UI and business layers, but if it is going to give me better performace on indexing I am willing to change to char on the key fields and manage the trailing spaces in the business layer. 

    The indexing itself is not a real problem, the only reason it is on the radar is that the server went from processing 400-600 files per hour to 30-60 files last Thursday. I am groping around attempting to find the culprit while waiting for the outsourced support team to get involved.

    I suspect it is a problem with the server rather than SQL but I am alway interested in scrounging more info on gaining performace, without reading every whitepaper on the net .

    The clustered index Datadate, PortfolioID has worked well.

    Thanks for your support

  • When you say you updated the statistics, do you have auto-update on or do you do it manually, It could be that the table has reached the limit where the default value isn't good enough anymore for the faster query plan and you need to start doing it with a larger sample rate.

    Just a word of warning, if you do an update stats with a higher sample rate but leave the autoupdate setting 'on' it will go back to doing the default when it next runs.

    We hit something similar to this last year and now update stats manually each evening rather than let SQL do it as it thinks is needed.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply