Better method for computed column?

  • Hi all.

    Below I am creating 3 tables. Order_hdr (OH), Order_dtl (OD) and Product_Stat (PS). I had a static field in PS to hold the quantity of the product that was on any open orders as defined by type = order, not shipped and not cancelled. I had triggers on the OD table to update onorder when the quantity field changed on an order. Every so often, the trigger failed to run. The number (onorder) was not correct.

    My idea was to change it to a computed column calling a scalar function to sum the quantity for a specific product #. When I run the ufn function below - for one product - it is very fast. When I run it for all products it takes 12 seconds to get 9150 rows.

    Would a persisted computed column be faster? Is there a better way to do this?

    Thanks,

    Mike

    CREATE TABLE [dbo].[order_hdr](

    [order_hdr_id] INT NOT NULL,

    [customer_id] INT NOT NULL,

    [date] DATE NULL,

    [order_type] CHAR(20) NULL,

    [ship] DATE NULL,

    [cancel_fl] BIT NULL

    CONSTRAINT [order_hdr_pkey] PRIMARY KEY CLUSTERED

    (

    [order_hdr_id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    CREATE NONCLUSTERED INDEX [order_hdr_type_ship_cancel] ON [dbo].[order_hdr]

    (

    [order_type] ASC,

    [ship] ASC,

    [cancel_fl] ASC

    )

    INCLUDE ( [date],

    [customer_id]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    CREATE TABLE [dbo].[order_dtl](

    [order_dtl_id] INT NOT NULL,

    [order_hdr_id] INT,

    [product_id] INT,

    [quan] [decimal](8, 2) NULL,

    [p_nmbr] varCHAR(20) NULL,

    [ship_qty] [decimal](8, 2) NULL,

    [price] MONEY

    CONSTRAINT [order_dtl_pkey] PRIMARY KEY NONCLUSTERED

    (

    [order_dtl_id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    /****** Object: Index [order_dtl_product_id] Script Date: 7/25/2016 10:09:28 AM ******/

    CREATE NONCLUSTERED INDEX [order_dtl_product_id] ON [dbo].[order_dtl]

    (

    [product_id] ASC

    )

    INCLUDE ( [order_hdr_id],

    [quan],

    [ship_qty],

    [price]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 40) ON [PRIMARY]

    GO

    CREATE FUNCTION [dbo].[ufn_product_onorder]

    (

    -- Add the parameters for the function here

    @product_id int

    )

    RETURNS int

    AS

    BEGIN

    -- Declare the return variable here

    DECLARE @onOrder int

    -- Add the T-SQL statements to compute the return value here

    SELECT @onOrder = SUM( quan)

    FROM Order_Hdr oh

    INNER JOIN Order_Dtl od ON oh.order_hdr_id = od.order_hdr_Id

    WHERE oh.order_Type = 'order'

    AND oh.ship is NULL

    AND oh.cancel_fl = 0

    AND product_id = @product_id

    -- Return the result of the function

    SET @onOrder = ISNULL( @onOrder, 0)

    RETURN @onOrder

    END

    GO

    CREATE TABLE [dbo].[product_stat](

    [product_stat_id] INT NOT NULL,

    [product_id] INT NULL,

    [onhand] AS (dbo.ufn_product_onorder( product_id))

    CONSTRAINT [product_stat_pkey] PRIMARY KEY NONCLUSTERED

    (

    [product_stat_id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

  • Scalar functions for computed columns or constraints will hurt performance too. For one, queries that reference the table that uses a scalar udf for the computed column will likely be restricted to a serial plan.

    Have you considered an indexed view against your product_stat table that incorporates the logic of ufn_product_onorder for your computed column (onhand)? I think that would be the way to go.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Unfortunately, your suggestion is greek to me. Can you explain a bit more? And yes, it is killing performance.

    Thank you.

  • I agree overall with Alan's comments. If you're willing to convert existing code/queries, also look into an itvf to do the computation.

    However, for now, let's tune what you have. Here's my recommendations:

    1) Filter the order_hdr table index so that it contains only the rows you need to use for this computation.

    2) Cluster the order_dtl table on order_hdr_id first, then on order_dtl_id to make it unique, although clustering on (order_hdr_id, product_id ) would be even better, even if not 100% unique (as a bonus, this should help performance of many other queries as well). [Btw, completely get away from the notion of (automatically) clustering every table on an identity column, that's a terrible idea!]

    3) Get rid of the unnecessary local variable in the function itself, as a minor performance enhancement (and add table aliases to every column, just for clarity).

    --1)

    CREATE NONCLUSTERED INDEX [order_hdr_type_ship_cancel] ON [dbo].[order_hdr]

    ( [order_type] ASC, [ship] ASC, [cancel_fl] ASC )

    INCLUDE ( [customer_id], [date], [order_hrd_id] )

    WHERE order_Type = 'order'

    AND oh.ship is NULL

    AND oh.cancel_fl = 0

    WITH ( /*DATA_COMPRESSION = ROW, */ FILLFACTOR = 99, SORT_IN_TEMPDB = ON ) ON [PRIMARY]

    --2)

    CREATE TABLE [dbo].[order_dtl]( ...,

    CONSTRAINT [order_dtl_pkey] PRIMARY KEY NONCLUSTERED

    ( order_hdr_id, order_dtl_id ) /* ( order_hdr_id, product_id ) */

    WITH ( /*DATA_COMPRESSION = ROW, */ FILLFACTOR = 99, SORT_IN_TEMPDB = ON ) ON [PRIMARY]

    --3)

    CREATE FUNCTION [dbo].[ufn_product_onorder]

    (

    @product_id int

    )

    RETURNS int

    AS

    BEGIN

    RETURN (

    SELECT SUM( od.quan )

    FROM Order_Hdr oh

    INNER JOIN Order_Dtl od ON oh.order_hdr_id = od.order_hdr_Id

    WHERE oh.order_Type = 'order'

    AND oh.ship is NULL

    AND oh.cancel_fl = 0

    AND od.product_id = @product_id

    )

    END

    Edit: Minor editing of typos.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I use the index for other queries as well. Can I create a new filtered index? How do I make the function use the new index? To change the clustered index for order detail - do I use an alter command?

    Thank you for the help.

    Mike

  • What are you using the product_stat table for? Will you have more than a row per product? Will it keep history? If it'll always show the products in orders, maybe a view is more suitable for this.

    CREATE VIEW product_stat

    AS

    SELECT product_id,

    SUM( quan) AS onhand

    FROM Products p --Change accordingly and use only if all products are needed.

    LEFT JOIN ( SELECT product_id, quan

    FROM Order_Dtl od

    INNER JOIN Order_Hdr oh ON oh.order_hdr_id = od.order_hdr_Id

    WHERE oh.order_Type = 'order'

    AND oh.ship is NULL

    AND oh.cancel_fl = 0 ) o ON p.product_id = o.product_id;

    GO

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (7/25/2016)


    What are you using the product_stat table for? Will you have more than a row per product? Will it keep history? If it'll always show the products in orders, maybe a view is more suitable for this.

    CREATE VIEW product_stat

    AS

    SELECT product_id,

    SUM( quan) AS onhand

    FROM Products p --Change accordingly and use only if all products are needed.

    LEFT JOIN ( SELECT product_id, quan

    FROM Order_Dtl od

    INNER JOIN Order_Hdr oh ON oh.order_hdr_id = od.order_hdr_Id

    WHERE oh.order_Type = 'order'

    AND oh.ship is NULL

    AND oh.cancel_fl = 0 ) o ON p.product_id = o.product_id;

    GO

    Dang - Luis beat me to it... After looking at this more I was thinking the same thing. Forget what I said about the indexed view (SUM will kill it).

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • mike 57299 (7/25/2016)


    Unfortunately, your suggestion is greek to me. Can you explain a bit more? And yes, it is killing performance.

    Thank you.

    I was going to suggest that you create a view that looks like what Luis put together. Then add an index to the view. That would not work, however, because you can't create an index on a view that uses SUM. That said, the view that Luis suggested will utilize the indexes on your tables and should perform quite nicely.

    The reason you want to avoid scalar functions at all costs (even for constraints and computed columns) is that they force SQL Server to generate a poorer execution plan than it would otherwise.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • mike 57299 (7/25/2016)


    I use the index for other queries as well. Can I create a new filtered index? How do I make the function use the new index? To change the clustered index for order detail - do I use an alter command?

    Thank you for the help.

    Mike

    Yes, you could create a new filtered index, but another index = more overhead. If this index can be filtered for all its uses, such as ship IS NULL, then filter on only that. If not, perhaps put ship first, as that should still reduce rows enough to make the query in q more efficient (presumably the vast majority of orders have already shipped).

    For order detail, since it's a PK constraint, you'd have to drop the constraint and create a new clustered index or constraint. There's no real need to make it a PK, but making it unique is best if it is guaranteed unique. If product_id could appear multiple times for the same order, make it a nonunique clus index. If product_id can appear only once for each order, you could make it a PK or a unique clus index.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply