improve the query performance

  • there is a kind of call log table(with 450k rows)

    I would like to select only the lastest call record for each phone number

    somehow the query processing time is too long, is there any way to improve it

    select *

    from tblCallLog r

    where DateCall=

    (select max(DateCall)

    from tblCallLog

    where phNu=r.phNu

    group by phNu) and logStatus in(@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)

    and DateCall >= @callFrom and DateCall < @callTo

    order by DateCall

    thx!

  • [font="Comic Sans MS"][/font]

    WITH S As (select max(DateCall) MaxDateCall from tblCallLog where phNu=r.phNu group by phNu)

    S1 As ((select @a as status1 union select @b-2 union select @b2 union select @C union select @d union select @d2 union select @d3 union select @n union select @n2 union select @n3 union select @n4 union select @r union select @R2 union select @s-2 union select @s2 union select @U union select @w union select @n5)

    select

    *

    from

    tblCallLog r

    inner join S1 on S1.status1 = r.logStatus

    where

    DateCall = S.MaxDateCall

    And DateCall in between @callFrom and @callTo

    order by DateCall

    Try this one......If you send the query to create the tables and fill data in those then it will be quite easier to analyse the problem.

    Regards

    Deeptiprasad Nayak

  • thank you for the reply

    but something are not understand

    like the union part, and why inner join with the logStatus.

    WITH S As (select max(DateCall) MaxDateCall from tblCallLog where phNu=r.phNu group by phNu)

    S1 As ((select @a as status1 union select @b-2 union select @b2 union select @C union

    select @d union select @d2 union select @d3 union select @n union select @n2 union

    select @n3 union select @n4 union select @r union select @R2 union select @s-2 union select @s2 union select @U union select @w union select @n5)

    select *

    from

    tblCallLog r

    inner join S1 on S1.status1 = r.logStatus

    where

    DateCall = S.MaxDateCall

    And DateCall in between @callFrom and @callTo

    order by DateCall

  • We need to minimize the records on which we are going to apply where conditions

    Like:

    Case-1: For max datecall, We are going to get only one record. But in ur query u r checking the same condition with almost all records.

    Same as for logstatus. Evenif the status required is not there we are doing a check on that. If we filter it out before applying where clause then it will be faster.

    one more thing u have to check....which query part is taking too much time.....

    1. retrieving max datecall

    2. logstatus

    3. Date in between condition.

    U need to check each part independently for analysis.

    Can u provide the table with data???

  • Please post ddl and sqlplans.

    Please see this article http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



    Clear Sky SQL
    My Blog[/url]

  • Please provide the Table structure and what exact output you want.

  • Your query is using hidden RBAR which is why it is inefficient. You're better off using the Row_Number() function with the Partition By option to find the last call for each number. It wasn't clear whether you wanted to first apply the date filter and then find the max or vice versa.

    WITH Calls AS (

    SELECT *, Row_Number() OVER( PARTITION BY phNu ORDER BY DateCall DESC ) AS Sequence

    FROM tblCallLog

    WHERE logStatus IN (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)

    AND DateCall >= @callFrom

    AND DateCall < @callTo

    )

    SELECT *

    FROM Calls

    WHERE Sequence = 1

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • below is the table and some data

    tblCallLog

    id (int, not null)PK

    DateCall(datetime, null)

    logStatus(nchar(10), null)

    phNu(int, not null)

    pplName(nchar(100), null)

    id, DataCall, logStatus, phNu, pplName

    400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    400125, 2011-01-10 15:17:00.000, b, 38654926, k may

    .........

    .........

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    output

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    400125, 2011-01-10 15:17:00.000, b, 38654926, k may

    .........

    .........

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    one phone number may appear one or few times in the log table

    we are always want to know the lastest record for each phone number only

    and we want to know the lastest records in the rang of date

    also we're only interesting in some of log status

    so i need to know all the leastest records first, then choose the rang of date and log status

    it will be great, if anyone can tell me how to check which query part is taking too much time.....

    -_-'' i just know the server i am working on is 2000, not 2005, so i can't use WITH in the query

  • Is it safe to say that the latest call for any phone number will also correspond to the highest "id" for that phone number?

    SELECT

    log.id,

    log.DateCall,

    log.logStatus,

    log.phNu,

    log.pplName

    FROM

    tblCallLog AS log

    JOIN

    (

    SELECT

    MAX(grp.id) AS id

    FROM

    tblCallLog AS grp

    GROUP BY grp.phNu

    ) gri

    ON

    log.id = gri.id

    WHERE

    log.logStatus in(@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)

    AND

    log.DateCall >= @callFrom and log.DateCall < @callTo

    ORDER BY log.DateCall

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • mister.magoo (1/19/2011)


    Is it safe to say that the latest call for any phone number will also correspond to the highest "id" for that phone number?

    quote]

    No, the high 'id' doesn't mean the lastest call, in this case.

    it named callLog, but the log is not added automatically, the only way to know is by dateCall

    what is the different to use 'join' and 'where' in this?

    thx!

  • dlam 18073 (1/19/2011)


    below is the table and some data

    tblCallLog

    id (int, not null)PK

    DateCall(datetime, null)

    logStatus(nchar(10), null)

    phNu(int, not null)

    pplName(nchar(100), null)

    id, DataCall, logStatus, phNu, pplName

    400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    400125, 2011-01-10 15:17:00.000, b, 38654926, k may

    .........

    .........

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    output

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    400125, 2011-01-10 15:17:00.000, b, 38654926, k may

    .........

    .........

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    one phone number may appear one or few times in the log table

    we are always want to know the lastest record for each phone number only

    and we want to know the lastest records in the rang of date

    also we're only interesting in some of log status

    so i need to know all the leastest records first, then choose the rang of date and log status

    it will be great, if anyone can tell me how to check which query part is taking too much time.....

    -_-'' i just know the server i am working on is 2000, not 2005, so i can't use WITH in the query

    Since you are only interested in the latest calls, per phone number, and only if the latest calls are within a certain range and have certain statuses, then any number whose latest call is later than the top of your range isn't even worth considering -- in fact any number with any call falling later than you upper bound isn't worth considering.

    Given that, there is no need to find the latest call for every phone number (nor even, any phone number). You just need to find the latest calls per phone number from within the subset of calls which meet your criteria. So, try this:

    select *

    from tblCallLog r

    Where DateCall =

    (

    select max(DateCall)

    from tblCall r1

    where r1.phNu=r.phNu

    and logStatus in (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)

    and DateCall >=@callFrom and DateCall < @callTo

    and not exists

    (

    select 1

    FROM tblCallLog

    WHERE phNu=r1.phNu

    AND DateCall > @callTo

    )

    group by phNu

    )

  • yes, ur right i only want the phone number within the rang,

    so DateCall >=@callFrom and DateCall < @callTo

    is alreadly set the rang,

    one thing here iam not understand, do we need to use

    not exists

    (

    select 1

    FROM tblCallLog

    WHERE phNu=r1.phNu

    AND DateCall > @callTo

    )

    to set the upbound again?, or they are doing different things?

  • dlam 18073 (1/20/2011)


    yes, ur right i only want the phone number within the rang,

    so DateCall >=@callFrom and DateCall < @callTo

    is alreadly set the rang,

    one thing here iam not understand, do we need to use

    not exists

    (

    select 1

    FROM tblCallLog

    WHERE phNu=r1.phNu

    AND DateCall > @callTo

    )

    to set the upbound again?, or they are doing different things?

    They are doing different things. without that clause, you will find all of the latest calls within that range for each phone number which has a call within that range, but that will include phone numbers which have calls which are later than that range -- using the clause ensures that you only get phone numbers whose latest calls are within the range.

    For example, say there are only three phone calls in your database -- two from one number and one from another number:

    400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    Now, let's say that the date range you are interested in is: 2011-01-10 00:00:00 to 2011-01-11 00:00:00

    With that clause in there, your query should return only one call entry:

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    No entry would be returned for phone number 31695555 because, though it had a call within your specified range, it's latest call is outside your range.

    If you remove that clause, you should get two entries:

    400123, 2011-01-10 15:15:00.000, n, 31695555, b johnson

    400124, 2011-01-10 15:16:00.000, b, 46555501, c white

    Both calls are within the range you specified and both are, for their respective phone numbers, the latest calls within that range. But you indicated that you were only interested in the latest calls for the phone numbers, and then only if they met the other criteria. The latest call for 31695555 is:

    401100, 2011-01-12 10:00:00.000, s, 31695555, b johnson

    which falls outside of the range -- as I understood your posts, you would not be interested in seeing any entries for that number because its latest call is not within the range.

    - Les

  • DateCall >=@callFrom and DateCall < @callTo <<< does this part of code already set the range of dateCall

    without

    not exists

    (

    select 1

    FROM tblCallLog

    WHERE phNu=r1.phNu

    AND DateCall > @callTo

    )

    also it select all of the max date call with in the range over all of the max date call

    for example:

    test data

    402111, 2010-01-10 15:15:00.000, n, 32226001, b smith

    402222, 2010-03-21 15:16:00.000, n, 32226001, b smith

    402333, 2010-10-19 15:17:00.000, n, 32226001, b smith

    402444, 2010-11-21 16:01:00.000, n, 32226001, b smith

    with this query

    select *

    from tblCallLog r

    Where DateCall =

    (

    select max(DateCall)

    from tblCall r1

    where r1.phNu=r.phNu

    and logStatus in (@a, @b-2, @b2,@c,@d,@d2,@d3,@n,@n2,@n3,@n4,@r,@r2,@s,@s2,@u,@w,@n5)

    and DateCall >=@callFrom and DateCall < @callTo

    group by phNu

    )

    call date range

    DateCall >='2010-10-01' and DateCall < '2010-11-30 23:59'

    the outcome is >> 402444, 2010-11-21 16:01:00.000, n, 32226001, b smith

    but is there any different between puting the where clause in the sub-query and out side the sub-query

  • dlam 18073 (1/20/2011)


    but is there any different between puting the where clause in the sub-query and out side the sub-query

    Absolutely. The WHERE clause in the main query would be equivalent to a HAVING clause in the subquery. In both cases, the filter is applied on the aggregate, but the WHERE clause in the subquery the filter is applied before aggregating.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Viewing 15 posts - 1 through 15 (of 19 total)

    You must be logged in to reply to this topic. Login to reply