Netezza Query takes forever and eventually times out, can this be re-written to run quicker?

  • Eirikur Eiriksson - Wednesday, March 20, 2019 5:32 AM

    Jeff Moden - Wednesday, March 20, 2019 5:21 AM

    Jonathan AC Roberts - Tuesday, March 19, 2019 4:02 PM

    cubangt - Tuesday, March 19, 2019 3:26 PM

    Jonathan AC Roberts - Monday, March 18, 2019 8:28 PM

    It's not going to make much difference to the performance but and SS.QTY_RECEIVED is not null is not required because
    and SS.QTY_RECEIVED <> 0 will filter out the nulls.

    This index will increase the speed a lot:
    CREATE NONCLUSTERED INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_INC_ITEM_PROCESS_DATE
         ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED)
    INCLUDE (ITEM,PROCESS_DATE)

    When running the create index we get this error
    I get the error that indexes are not available in this release.
    After a little research found that Netezza does not use indexes like regular SQL.

    Yes, I don't think Netezza uses SQL Server.
    They have a CREATE INDEX statement that doesn't have an INCLUDE. So you would need to create an index like this:

    CREATE INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_ITEM_PROCESS_DATE
        ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED, ITEM, PROCESS_DATE)

    Again, I'm no expert on it but I believe they refer to it as "Netezza SQL"... it appears to have some commonality with ANSI SQL but a whole lot that's apparently necessarily proprietary.

    Netezza SQL is very similar to the T-SQL on Azure Data Warehouse for the DML, only a handful of NTSQL functions that do not exist in T-SQL, slight difference on the DDL but not much.
    😎
    Currently migrating a large Netezza estate to Azure so I had to look a bit into this😉

    Since you have an opportunity that most will never have, it would be interesting to see what the performance differences will be.  Rumor has it that unless you're into premium packages, the performance of Azure leaves quite a bit to be desired.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jonathan AC Roberts - Wednesday, March 20, 2019 5:34 AM

    Jeff Moden - Wednesday, March 20, 2019 5:21 AM

    Jonathan AC Roberts - Tuesday, March 19, 2019 4:02 PM

    cubangt - Tuesday, March 19, 2019 3:26 PM

    Jonathan AC Roberts - Monday, March 18, 2019 8:28 PM

    It's not going to make much difference to the performance but and SS.QTY_RECEIVED is not null is not required because
    and SS.QTY_RECEIVED <> 0 will filter out the nulls.

    This index will increase the speed a lot:
    CREATE NONCLUSTERED INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_INC_ITEM_PROCESS_DATE
         ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED)
    INCLUDE (ITEM,PROCESS_DATE)

    When running the create index we get this error
    I get the error that indexes are not available in this release.
    After a little research found that Netezza does not use indexes like regular SQL.

    Yes, I don't think Netezza uses SQL Server.
    They have a CREATE INDEX statement that doesn't have an INCLUDE. So you would need to create an index like this:

    CREATE INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_ITEM_PROCESS_DATE
        ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED, ITEM, PROCESS_DATE)

    Again, I'm no expert on it but I believe they refer to it as "Netezza SQL"... it appears to have some commonality with ANSI SQL but a whole lot that's apparently necessarily proprietary.

    I'd never actually heard of Netezza until I saw this question but apparently it's based on PostgreSQL but has forked off since then. 
    https://en.wikipedia.org/wiki/Netezza

    Interesting... thanks, Jonathan.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, March 20, 2019 5:37 AM

    Eirikur Eiriksson - Wednesday, March 20, 2019 5:32 AM

    Jeff Moden - Wednesday, March 20, 2019 5:21 AM

    Jonathan AC Roberts - Tuesday, March 19, 2019 4:02 PM

    cubangt - Tuesday, March 19, 2019 3:26 PM

    Jonathan AC Roberts - Monday, March 18, 2019 8:28 PM

    It's not going to make much difference to the performance but and SS.QTY_RECEIVED is not null is not required because
    and SS.QTY_RECEIVED <> 0 will filter out the nulls.

    This index will increase the speed a lot:
    CREATE NONCLUSTERED INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_INC_ITEM_PROCESS_DATE
         ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED)
    INCLUDE (ITEM,PROCESS_DATE)

    When running the create index we get this error
    I get the error that indexes are not available in this release.
    After a little research found that Netezza does not use indexes like regular SQL.

    Yes, I don't think Netezza uses SQL Server.
    They have a CREATE INDEX statement that doesn't have an INCLUDE. So you would need to create an index like this:

    CREATE INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_ITEM_PROCESS_DATE
        ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED, ITEM, PROCESS_DATE)

    Again, I'm no expert on it but I believe they refer to it as "Netezza SQL"... it appears to have some commonality with ANSI SQL but a whole lot that's apparently necessarily proprietary.

    Netezza SQL is very similar to the T-SQL on Azure Data Warehouse for the DML, only a handful of NTSQL functions that do not exist in T-SQL, slight difference on the DDL but not much.
    😎
    Currently migrating a large Netezza estate to Azure so I had to look a bit into this😉

    Since you have an opportunity that most will never have, it would be interesting to see what the performance differences will be.  Rumor has it that unless you're into premium packages, the performance of Azure leaves quite a bit to be desired.

    Azure is like the Missuses hair colour, changes every day 😀
    😎
    The latest  generation of Azure SQL Data Warehouse performs very well and cost wise, it is very competitive.

  • Eirikur Eiriksson - Wednesday, March 20, 2019 6:14 AM

    Jeff Moden - Wednesday, March 20, 2019 5:37 AM

    Eirikur Eiriksson - Wednesday, March 20, 2019 5:32 AM

    Jeff Moden - Wednesday, March 20, 2019 5:21 AM

    Jonathan AC Roberts - Tuesday, March 19, 2019 4:02 PM

    cubangt - Tuesday, March 19, 2019 3:26 PM

    Jonathan AC Roberts - Monday, March 18, 2019 8:28 PM

    It's not going to make much difference to the performance but and SS.QTY_RECEIVED is not null is not required because
    and SS.QTY_RECEIVED <> 0 will filter out the nulls.

    This index will increase the speed a lot:
    CREATE NONCLUSTERED INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_INC_ITEM_PROCESS_DATE
         ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED)
    INCLUDE (ITEM,PROCESS_DATE)

    When running the create index we get this error
    I get the error that indexes are not available in this release.
    After a little research found that Netezza does not use indexes like regular SQL.

    Yes, I don't think Netezza uses SQL Server.
    They have a CREATE INDEX statement that doesn't have an INCLUDE. So you would need to create an index like this:

    CREATE INDEX IX_SHIPMENT_SHIPMENT_QTY_RECEIVED_ITEM_PROCESS_DATE
        ON [dbo].[SHIPMENT] (SHIPMENT, QTY_RECEIVED, ITEM, PROCESS_DATE)

    Again, I'm no expert on it but I believe they refer to it as "Netezza SQL"... it appears to have some commonality with ANSI SQL but a whole lot that's apparently necessarily proprietary.

    Netezza SQL is very similar to the T-SQL on Azure Data Warehouse for the DML, only a handful of NTSQL functions that do not exist in T-SQL, slight difference on the DDL but not much.
    😎
    Currently migrating a large Netezza estate to Azure so I had to look a bit into this😉

    Since you have an opportunity that most will never have, it would be interesting to see what the performance differences will be.  Rumor has it that unless you're into premium packages, the performance of Azure leaves quite a bit to be desired.

    Azure is like the Missuses hair colour, changes every day 😀
    😎
    The latest  generation of Azure SQL Data Warehouse performs very well and cost wise, it is very competitive.

    Cool feedback.  Thanks, Eirikur.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Your SQL query itself is straightforward. Other than a SQL interface, Netezza is very different architecturally from SQL Server, more similar to Azure SQL Warehouse. It uses a distributed file system similar to HDFS, and from what I've read, it doesn't support b-tree indexes but something like materialized views and statistics. Since what you're struggling with is performance, you'll get more useful advice from a Netezza specific user forum or the vendor.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I would ask a dB admin.

    You are picking above a certain shipment number for some business reason.

    It is not clear if there are multiple items on the same shipment.

    And you ignore negative qty although depending on how the table is structured and corrections are handled may not be giving you the expected results.

    It might be as simple as creating an inner query without the max date, where an outer query selects the max date for item shipment from the sub query.

    Be sure to have a clear business description of what the query is intended to do.

    And if the results of this would lead to some other questions, describe those.

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply