Performance issue causing application to fail to load

  • Good day,

    In the project I am working on, there is a database with only one table; the table has 100 million records.

    The performance issue is causing the front end to fail to load. Restricted by the application, I have to create a virtual table (like a staging table) to get the data from the original table. The staging table is created with a select query, and only a so-called standard select query is allowed (no CTE, no DECLARE, no ORDER BY...). Here is the query I am using for now; it works, but as I said, performance is poor and is most likely what causes the app to fail to load:

    select
        date_time as DateAndTime
        ,CASE
            WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) = 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN url
            WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) > 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN RIGHT(url,CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) - 1)
         END As NewURL
        ,SUM(bytes_sent) as TotalSent
        ,SUM(bytes_received) as TotalReceived
        ,SUM(CASE WHEN disposition <> 'Category1' AND disposition <> 'Category2' THEN 1 ELSE 0 END) AS TotalPermitted
        ,SUM(CASE WHEN disposition = 'Category1' OR disposition = 'Category2' THEN 1 ELSE 0 END) AS TotalBlocked
    From dbo.SHN_WSS_LogDetails

    Where category = 'Bandwidth - PNSB' and date_time is not null and url is not null

    Group By
        date_time
        ,CASE
            WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) = 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN url
            WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) > 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN RIGHT(url,CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) - 1)
         END

    Please note the CASE expression is needed to reduce URLs that arrive in different formats, such as:

    www.bbc.co.uk

    bbc.co.uk

    part.of.bbc.co.uk

    www.microsoft.com

    part.of.microsoft.com

    intel.com

    rai.it

    192.168.2.321

    m.mbl.is

    down to the two right-most parts of the domain only (the numeric IP address fails the ISNUMERIC test and is excluded):

    co.uk

    co.uk

    co.uk

    microsoft.com

    microsoft.com

    intel.com

    rai.it

    mbl.is
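
    As a side note, that mapping can be checked in isolation with a plain SELECT over a VALUES list. This is a minimal sketch reusing the same CASE expression against the sample URLs above; the IP-only row falls through both WHEN branches and comes back as NULL:

    -- Sample URLs from above; the IP-only row returns NULL because ISNUMERIC(REPLACE(url,'.','')) = 1
    select v.url,
           CASE
               WHEN CHARINDEX(CHAR(46),REVERSE(v.url),CHARINDEX(CHAR(46),REVERSE(v.url),1) + 1) = 0 and ISNUMERIC(REPLACE(v.url,CHAR(46),'')) = 0 THEN v.url
               WHEN CHARINDEX(CHAR(46),REVERSE(v.url),CHARINDEX(CHAR(46),REVERSE(v.url),1) + 1) > 0 and ISNUMERIC(REPLACE(v.url,CHAR(46),'')) = 0 THEN RIGHT(v.url,CHARINDEX(CHAR(46),REVERSE(v.url),CHARINDEX(CHAR(46),REVERSE(v.url),1) + 1) - 1)
           END as NewURL
    from (values ('www.bbc.co.uk'),('bbc.co.uk'),('part.of.bbc.co.uk'),('www.microsoft.com'),('part.of.microsoft.com'),('intel.com'),('rai.it'),('192.168.2.321'),('m.mbl.is')) as v(url);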

    But would it be more efficient to use a user-defined function for that CASE expression?

    I do not have the privileges to test it out on the original database/server, so I want to hear your input on this.

    Thanks to everyone for the discussion, any clue is appreciated.

  • No. A UDF in that CASE expression would not be better. You are grouping by something that needs to be "calculated."

    Instead, why not dump that information into a CTE or a temp table? Then select from the temp table or CTE. In that final select, you can do your grouping on the column name rather than the CASE expression.
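
    A minimal sketch of that temp-table approach, assuming the table and column names from the posted query (the #LogStaging name is illustrative only):

    -- Materialise the calculated domain once...
    select date_time,
           CASE
               WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) = 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN url
               WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) > 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN RIGHT(url,CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) - 1)
           END as NewURL,
           bytes_sent, bytes_received, disposition
    into #LogStaging
    from dbo.SHN_WSS_LogDetails
    where category = 'Bandwidth - PNSB' and url is not null;

    -- ...then aggregate on the stored column instead of repeating the expression in the GROUP BY.
    select date_time as DateAndTime, NewURL,
           SUM(bytes_sent) as TotalSent,
           SUM(bytes_received) as TotalReceived,
           SUM(CASE WHEN disposition <> 'Category1' AND disposition <> 'Category2' THEN 1 ELSE 0 END) as TotalPermitted,
           SUM(CASE WHEN disposition = 'Category1' OR disposition = 'Category2' THEN 1 ELSE 0 END) as TotalBlocked
    from #LogStaging
    group by date_time, NewURL;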

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Unfortunately a CTE is not allowed in this particular staging table creation, and even if it were allowed, 10 million records would be a lengthy dump. :crazy:

  • halifaxdal (8/18/2014)


    Unfortunately a CTE is not allowed in this particular staging table creation, and even if it were allowed, 10 million records would be a lengthy dump. :crazy:

    So use a temp table.

    Using that case statement (or a udf) in the group by is a recipe for causing your query to perform terribly.

    Dumping to a temp table will often solve that.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • SQLRNNR (8/18/2014)


    halifaxdal (8/18/2014)


    Unfortunately a CTE is not allowed in this particular staging table creation, and even if it were allowed, 10 million records would be a lengthy dump. :crazy:

    So use a temp table.

    Using that case statement (or a udf) in the group by is a recipe for causing your query to perform terribly.

    Dumping to a temp table will often solve that.

    Sorry, I should have mentioned that temp tables are also not allowed: only "select", with no DECLARE, no CTE, no temp tables, no variables, no ORDER BY.

  • halifaxdal (8/18/2014)


    SQLRNNR (8/18/2014)


    halifaxdal (8/18/2014)


    Unfortunately a CTE is not allowed in this particular staging table creation, and even if it were allowed, 10 million records would be a lengthy dump. :crazy:

    So use a temp table.

    Using that case statement (or a udf) in the group by is a recipe for causing your query to perform terribly.

    Dumping to a temp table will often solve that.

    Sorry, I should have mentioned that temp tables are also not allowed: only "select", with no DECLARE, no CTE, no temp tables, no variables, no ORDER BY.

    And why would you want to handcuff yourself like that?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • And why would you want to handcuff yourself like that?

    I like the word "handcuff". It's not by me, it's imposed by the platform itself; I am building a dashboard using Dundas Dashboard:

    When you do not have database privileges, you end up building the virtual table through a manual method, which means only a "pure" select query is allowed, with lots of restrictions. And yes, it is exactly like being "handcuffed". :w00t: :crying:

  • halifaxdal (8/18/2014)


    And why would you want to handcuff yourself like that?

    I like the word "handcuff". It's not by me, it's imposed by the platform itself; I am building a dashboard using Dundas Dashboard:

    When you do not have database privileges, you end up building the virtual table through a manual method, which means only a "pure" select query is allowed, with lots of restrictions. And yes, it is exactly like being "handcuffed". :w00t: :crying:

    A dashboard that needs to consume 10 million records? That seems like a (to be frank) useless venture. Nobody reviews 10 million records on a report or in a dashboard. If it is a dashboard, it should be pre-aggregated and quickly consumable into a high level style report.

    I'd recommend working with the DBAs who do have access. Show them the query and the table definition that would work. Dump your data into a staging table. Select from that staging table (with the re-defined urls) and group on that.

    Otherwise, try a subquery in a join. But to be fair on that one, you are still consuming 10 million records in a dashboard. People expect that to be fairly snappy. Since you are consuming 10 million records, there will be some wait involved.
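
    One way to read the subquery idea while staying inside the "plain SELECT only" restriction (a sketch, assuming the same table and column names as the posted query) is to push the CASE into a derived table and group on its alias:

    select d.date_time as DateAndTime,
           d.NewURL,
           SUM(d.bytes_sent) as TotalSent,
           SUM(d.bytes_received) as TotalReceived,
           SUM(CASE WHEN d.disposition <> 'Category1' AND d.disposition <> 'Category2' THEN 1 ELSE 0 END) as TotalPermitted,
           SUM(CASE WHEN d.disposition = 'Category1' OR d.disposition = 'Category2' THEN 1 ELSE 0 END) as TotalBlocked
    from (
        -- Derived table: compute NewURL once so the outer GROUP BY can use the alias
        select date_time, bytes_sent, bytes_received, disposition,
               CASE
                   WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) = 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN url
                   WHEN CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) > 0 and ISNUMERIC(REPLACE(url,CHAR(46),'')) = 0 THEN RIGHT(url,CHARINDEX(CHAR(46),REVERSE(url),CHARINDEX(CHAR(46),REVERSE(url),1) + 1) - 1)
               END as NewURL
        from dbo.SHN_WSS_LogDetails
        where category = 'Bandwidth - PNSB' and url is not null
    ) as d
    group by d.date_time, d.NewURL;

    Be aware that the optimizer may expand the derived table back into essentially the same plan, so this mainly tidies up the GROUP BY rather than guaranteeing a speed-up; the bulk of the cost is still scanning and aggregating 10 million rows.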

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • SQLRNNR (8/18/2014)


    halifaxdal (8/18/2014)


    And why would you want to handcuff yourself like that?

    I like the word "handcuff". It's not by me, it's imposed by the platform itself; I am building a dashboard using Dundas Dashboard:

    When you do not have database privileges, you end up building the virtual table through a manual method, which means only a "pure" select query is allowed, with lots of restrictions. And yes, it is exactly like being "handcuffed". :w00t: :crying:

    A dashboard that needs to consume 10 million records? That seems like a (to be frank) useless venture. Nobody reviews 10 million records on a report or in a dashboard. If it is a dashboard, it should be pre-aggregated and quickly consumable into a high level style report.

    I'd recommend working with the DBAs who do have access. Show them the query and the table definition that would work. Dump your data into a staging table. Select from that staging table (with the re-defined urls) and group on that.

    Otherwise, try a subquery in a join. But to be fair on that one, you are still consuming 10 million records in a dashboard. People expect that to be fairly snappy. Since you are consuming 10 million records, there will be some wait involved.

    In reality there are cases where you have to work with certain restrictions, and I am glad that at least I can seek help from the Internet when there is no DBA available beside me.

    And yes, no kidding, 10 million records. The dashboard is to report highly summarized info; I have other queries, not posted here, to calculate the numbers.

    I am more interested to see whether anyone can work some magic on my query to make it faster.

    Thank you.

  • halifaxdal (8/18/2014)


    SQLRNNR (8/18/2014)


    halifaxdal (8/18/2014)


    And why would you want to handcuff yourself like that?

    I like the word "handcuff". It's not by me, it's imposed by the platform itself; I am building a dashboard using Dundas Dashboard:

    When you do not have database privileges, you end up building the virtual table through a manual method, which means only a "pure" select query is allowed, with lots of restrictions. And yes, it is exactly like being "handcuffed". :w00t: :crying:

    A dashboard that needs to consume 10 million records? That seems like a (to be frank) useless venture. Nobody reviews 10 million records on a report or in a dashboard. If it is a dashboard, it should be pre-aggregated and quickly consumable into a high level style report.

    I'd recommend working with the DBAs who do have access. Show them the query and the table definition that would work. Dump your data into a staging table. Select from that staging table (with the re-defined urls) and group on that.

    Otherwise, try a subquery in a join. But to be fair on that one, you are still consuming 10 million records in a dashboard. People expect that to be fairly snappy. Since you are consuming 10 million records, there will be some wait involved.

    In reality there are cases where you have to work with certain restrictions, and I am glad that at least I can seek help from the Internet when there is no DBA available beside me.

    And yes, no kidding, 10 million records. The dashboard is to report highly summarized info; I have other queries, not posted here, to calculate the numbers.

    I am more interested to see whether anyone can work some magic on my query to make it faster.

    Thank you.

    So no DBA available, and nobody with access to create a table into which you could dump your results to then aggregate it in this query?

    Have you tried the subquery option that I proposed?

    Nobody can really do much more to help since you haven't provided sample table structures and consumable data to try a different solution.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • Is this still an ongoing issue? If so, I wouldn't mind DDL statements with some sample data. Are the columns in your WHERE clause properly indexed and the statistics current? Could TotalPermitted = COUNT(*) - TotalBlocked? That may help as well. Are you getting the data across a network? If so, SSIS is faster in these instances than a linked server connection, and you can build the transformation logic there as well. Otherwise, trying to push 100 million records into RAM is brutal.
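
    A sketch of the COUNT(*) idea against the posted column names, assuming disposition is never NULL (otherwise the subtraction and the original TotalPermitted can differ):

    -- In the real query the same substitution applies within each GROUP BY group.
    select COUNT(*) as TotalRows,
           SUM(CASE WHEN disposition = 'Category1' OR disposition = 'Category2' THEN 1 ELSE 0 END) as TotalBlocked,
           -- permitted = everything that is not blocked, so a second CASE test on disposition is avoided
           COUNT(*) - SUM(CASE WHEN disposition = 'Category1' OR disposition = 'Category2' THEN 1 ELSE 0 END) as TotalPermitted
    from dbo.SHN_WSS_LogDetails
    where category = 'Bandwidth - PNSB' and url is not null;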

    ----------------------------------------------------

  • MMartin1 (11/4/2014)


    Is this still an ongoing issue? If so, I wouldn't mind DDL statements with some sample data. Are the columns in your WHERE clause properly indexed and the statistics current? Could TotalPermitted = COUNT(*) - TotalBlocked? That may help as well. Are you getting the data across a network? If so, SSIS is faster in these instances than a linked server connection, and you can build the transformation logic there as well. Otherwise, trying to push 100 million records into RAM is brutal.

    Thanks for your reply, but the case is closed as the task was completed differently.
