Performance of a View - Rows as Columns

  • Hi,

    We have the following view (SQL Server 2005):

    SELECT E.id

    , E.postedOn

    , E.code

    , E.title

    , X1.value AS field1

    , X2.value AS field2

    , X3.value AS field3

    , X4.value AS field4

    , X5.value AS field5

    , X6.value AS field6

    , X7.value AS field7

    , X8.value AS field8

    , X9.value AS field9

    , X10.value AS field10

    FROM EntitySharedValue AS E

    LEFT OUTER JOIN dbo.EntityCustomValue AS X1 ON E.id = X1.instanceId AND X1.extensionRef = 'field1'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X2 ON E.id = X2.instanceId AND X2.extensionRef = 'field2'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X3 ON E.id = X3.instanceId AND X3.extensionRef = 'field3'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X4 ON E.id = X4.instanceId AND X4.extensionRef = 'field4'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X5 ON E.id = X5.instanceId AND X5.extensionRef = 'field5'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X6 ON E.id = X6.instanceId AND X6.extensionRef = 'field6'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X7 ON E.id = X7.instanceId AND X7.extensionRef = 'field7'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X8 ON E.id = X8.instanceId AND X8.extensionRef = 'field8'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X9 ON E.id = X9.instanceId AND X9.extensionRef = 'field9'

    LEFT OUTER JOIN dbo.EntityCustomValue AS X10 ON E.id = X10.instanceId AND X10.extensionRef = 'field10'

    WHERE E.tenantId = '8fc19a93-e95c-4b1a-9dd5-c3bd7af19193'

    Context: The view is recreated each time a new field is added. The number of joins varies with the number of extension fields (1 ext. field = 1 join, 10 ext. field = 10 joins)

    This performs really bad with more and more extension fields.

    Questions:

    1)How do we improve the perf of the view?

    Possible answers:

    * Create a clustered index on EntitySharedValue(tenantd, id). Adding id to the index, combined with an index on instanceId makes SQL work easier.

    * Create a Clustered index on EntityCustomValue(instanceId, extensionRef). You need that all the values with the same instance Id be placed in the same Page (SQL storage unit) so all records for a give EntitySharedValue.id need just a single Page read.

    * Create the “relation” (Foreign key) between EntitySharedValue.id <- EntityCustomValue instanceId will improve performance

    2)Can you think of a different approach assuming the schema won't change? (e.g. flatten the data on a temp table and select from temp, a new feature in sql 2005 that transpose rows)

    [The database schema is attached]

    Thanks in advance

    Matias

  • You want to take a look at the fancy new PIVOT feature that "flattens and turns" your data. All in one shot. There are several options as to how to write this either dynamically or make some occasional minor changes to the T-SQL, to add in your new columns, but the performance really shouldn't change all that much.

    PIVOT fun starts here:

    http://msdn2.microsoft.com/en-us/library/ms177410.aspx

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks Matt. So what you are saying is that the perf of this query as it is cannot be improved unless we move to something like PIVOT?

    Thanks

    Matias

  • Try :

    SELECT E.id

    , E.postedOn

    , E.code

    , E.title

    , MIN(CASE WHEN X1.extensionRef = 'field1' THEN X1.value END ) AS field1

    , MIN(CASE WHEN X1.extensionRef = 'field2' THEN X1.value END ) AS field2

    .....

    FROM EntitySharedValue AS E

    LEFT OUTER JOIN dbo.EntityCustomValue AS X1 ON E.id = X1.instanceId

    WHERE E.tenantId = '8fc19a93-e95c-4b1a-9dd5-c3bd7af19193'

    GROUP BY E.id

    , E.postedOn

    , E.code

    , E.title


    * Noel

  • What I'm saying is that each time you're adding a new column, you're adding a new join to the mix, which adds a lot of effort for not a lot of gain (for one thing - you're reading the same table 12 times over, then 13 times over, etc....). PIVOT, or the older syntax described by noeld, keeps the # of tables and joins constant, just adds new data points. Much lower effort to the server.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hi

    I think Pivot will help in your case. Pivot has its own performace implications and is generally found to be slower than conventional way of doing the same thing.

    Since your fields can keep increasing yuo can give Pivot a try.

    Did u try putting clustered index on the view... i would be interested to know the results.

    "Keep Trying"

  • Apparently the OP abandoned this thread.


    * Noel

  • Sorry for the delay on getting back to you.

    First of all thanks for your replies. They were really helpful.

    The approach suggested by Noeld is the most performant.

    We did a performance test with 25 concurrent users. The test queries the view retrieving a random set of 50 rows. EntitySharedValue had 100,000 records and EntityCustomValue had 500,000 records (for 5 extension fields), 900,000 (for 9 extension fields) and 2,000,000 (for 20 extension fields).

    The results are the following:

    Noeld syntax:

    Ext. Fields | Tx/Sec | Avg Response Time

    5 fields 90.8 0.27

    9 fields 51 0.48

    20 fields 15.2 1.62

    PIVOT syntax

    Ext. Fields | Tx/Sec | Avg Response Time

    5 fields 43.6 0.56

    9 fields 23.1 1.05

    20 fields 8.09 3.03

    PIVOT basically performs 2x slower.

    Here are the queries

    1)

    SELECT

    E.id, E.postedOn, E.code, E.title,

    MIN(CASE WHEN X.extensionRef = 'field1' THEN CAST(X.value as VARCHAR(MAX)) END) AS field1,

    MIN(CASE WHEN X.extensionRef = 'field2' THEN CAST(X.value as VARCHAR(MAX)) END) AS field2,

    ...

    FROM

    EntitySharedValue as E

    LEFT OUTER JOIN

    dbo.EntityCustomValue AS X ON E.id = X.instanceId

    GROUP BY

    E.id, E.postedOn, E.code, E.title

    2)

    SELECT

    id, postedOn, code, title,

    [field1], [field2], [field3],

    ...

    FROM

    (SELECT

    ESV.id AS id,

    ESV.tenantId AS tenantId,

    ESV.code AS code,

    ESV.postedOn AS postedOn,

    ESV.title as title,

    ESV.extensionRef AS extensionRef,

    ESV.value as extensionFieldValue

    FROM

    dbo.EntitySharedValue ESV

    LEFT OUTER JOIN

    dbo.EntityCustomValue ECV ON ESV.id = ECV.instanceId) v

    PIVOT

    (MIN(extensionFieldValue) FOR extensionRef

    IN ([field1], [field2], [field3])) AS PT

    Is there anything that I could be missing (functionally) to discard PIVOT?

    Thanks

    Matias

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply