enhancing sql command

  • table have 10 million records

    the following my statement :

    (SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount, RecordedCalls.Channel AS [Name],

    RecordedCalls.ServerName FROM RecordedCalls INNER JOIN Servers ON RecordedCalls.ServerName = Servers.Name LEFT OUTER JOIN

    Tags INNER JOIN RecordedCallsTags ON Tags.ID = RecordedCallsTags.TagID ON RecordedCalls.ID = RecordedCallsTags.CallID

    WHERE RecordedCalls.ID <= '9369907' AND

    (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime ) AND

    cast ( '01 Jan 2210 00:00:00:000' as datetime )) AND

    (RecordedCalls.Duration BETWEEN 0 AND 1000000) AND RecordedCalls.AgentID NOT IN('1000010000') AND RecordedCalls.IsDeleted='false'

    GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName

    ORDER BY RecordedCalls.Channel)

    Thanks in advance

  • I'm curious about some of the conditions, like

    RecordedCalls.CallDate BETWEEN cast('01 Jan 1910 00:00:00:000' as datetime)

    AND cast('01 Jan 2210 00:00:00:000' as datetime)

    Do you really need this? Or is this part of your schema just as an example?

    Similarly the

    RecordedCalls.Duration BETWEEN 0 AND 1000000

    What is the purpose of these checks?

    Also, you have an inner join with RecordedCallsTags. Are you using this to narrow the return row set? If not, then why is it there? (you are not referencing it)

    What is the datatype of RecordedCalls.ID? You are comparing it with a string literal. Is it numeric?

    Regards,

    Andras


    Andras Belokosztolszki, MCPD, PhD
    GoldenGate Software

  • -the condition date, Duration and another conditions based on User criteria i got it from executing another SP inside my SP then i concat the result inside my statement put i sent the result statement (where conditions from executing another sp)

    - i am using distinct RecordedCalls.ID FK with RecorededCallsTags.CallID

    - RecordedCalls.ID [int]

    * I need each relation in the statement in case i got another conditions

  • anyone!!

  • You're not going to get anywhere by shouting. I don't even see a question in your original post, so what is it that you're asking?

    Please start off by posting the DDL of your table(s) in the form of CREATE TABLE statements, and some sample data in the form of INSERT statements. If your query is producing the wrong result set, then please show what you would like it to do instead.

    John

  • first things to check:

    - this query comes within brackets ... is it a kind of subquery or nested table expression ?

    SELECT COUNT(DISTINCT ( RecordedCalls.ID )) AS CallsCount

    , RecordedCalls.Channel AS [Name]

    , RecordedCalls.ServerName

    FROM RecordedCalls

    INNER JOIN Servers

    ON RecordedCalls.ServerName = Servers.Name

    LEFT OUTER JOIN Tags

    INNER JOIN RecordedCallsTags

    ON Tags.ID = RecordedCallsTags.TagID

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    WHERE RecordedCalls.ID <= '9369907'

    AND ( RecordedCalls.CallDate BETWEEN cast('01 Jan 1910 00:00:00:000' as datetime)

    AND cast('01 Jan 2210 00:00:00:000' as datetime) )

    AND ( RecordedCalls.Duration BETWEEN 0 AND 1000000 )

    AND RecordedCalls.AgentID NOT IN ( '1000010000' )

    AND RecordedCalls.IsDeleted = 'false'

    GROUP BY RecordedCalls.Channel

    , RecordedCalls.ServerName

    ORDER BY RecordedCalls.Channel

    - Get rid of unneeded predicates / joins

    - provide indexes for all join predicates where appropriate

    - within a sproc... use parameters of the correct datatype (match your tables /views) ( avoid casting in your query ! )

    - maybe indexes for some extra columns are needed:

    .... RecordedCalls.CallDate

    .... RecordedCalls.AgentID

    .... RecordedCalls.ServerName + RecordedCalls.Channel

    - try to replace the 'not in' with <> or !=

    if the in-list is long, maybe you'd be better of replacing it with a

    @tmpTb using a split-function (examples at SSC ) using a "not exists"

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • 1) Assuming RecordedCalls is the table with 10 million records, and given this table has a DateTime column (CallDate) that you're filtering on in your WHERE clause, why don't you convert RecordedCalls to a partitioned table with the partition based on CallDate's year?

    2) I noticed you're doing a LEFT OUTER JOIN on the Tags table without using an ON clause. This effectively performs a cross join between RecordedCalls and Tags. Is this really what you want to do?

    3) Good luck! You'll need it!

  • His joins all have an ON clause. He has something like join-on-join-join-on-on.

  • what purpose does this serve?

    LEFT OUTER JOIN Tags

    INNER JOIN RecordedCallsTags

    ON Tags.ID = RecordedCallsTags.TagID

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    You are not selecting anything from it, and the DISTINCT might not be needed if you remove the extra joins.

  • All (especially Ralph):

    There are three JOIN clauses and only two ON clauses. Each JOIN clause should have a corresponding ON clause.

    Again, good luck all!!!!

  • Thanks for response

    The table RecordedCalls contains 1000,000,000 Records, the other tables used in the join will not contain more than 15 Records for each one.

    The table lookups contains like 200 Records so that’s why I put the records I want from it in a temp table (#tempLookUps) before joining it with the table RecordedCalls.

    there is a clustered index (Primary key) on the RecordedCalls.ID and non-clustered indexes on the columns that are used in the Where statement and the group by field (CallType), I can’t remove any join with other tables or any condition on the where statements hence it is very dynamic and concatenated from other strored procedure and i can't remove the DISTINCT Word.

    I found that when using temp tables to put the results in then imply joining on then is more efficient than using Derived Tables.

    and i tired to partition the recordedcalls table based on CallDate's year but i coud'nt coz i am using sql a standard edition, i got error msg tell that the partition should be on enterprise edition is there any help here!!

    the following SP takes around 1:15 minutes to finish execution against SQL Server 2005.

    DECLARE @max-2 int

    SELECT @max-2 = MAX(RecordedCalls.ID) FROM RecordedCalls

    CREATE TABLE #tempLookups (ID int identity(0,1),Code NVARCHAR(100),NameE NVARCHAR(500),NameA NVARCHAR(500))

    CREATE TABLE #tempTable (ID int identity(0,1),TypesCount INT,CallsType NVARCHAR(50))

    INSERT INTO #tempLookups SELECT Code, NameE, NameA FROM lookups WHERE [Type] = 'CALLTYPES' ORDER BY Ordering ASC

    INSERT INTO #tempTable SELECT COUNT(DISTINCT(RecordedCalls.ID)) As TypesCount,RecordedCalls.CallType as CallsType

    FROM Servers INNER JOIN RecordedCalls ON Servers.Name = RecordedCalls.ServerName

    LEFT OUTER JOIN Tags INNER JOIN RecordedCallsTags ON Tags.ID = RecordedCallsTags.TagID

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    WHERE RecordedCalls.ID <= @max-2

    AND (RecordedCalls.CallDate BETWEEN CAST ('01 Jan 1910 00:00:00:000' AS DATETIME ) AND CAST('01 Jan 2210 00:00:00:000' AS DATETIME ))

    AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)

    AND RecordedCalls.AgentID NOT IN('1000010000')

    AND RecordedCalls.IsDeleted='FALSE'

    GROUP BY RecordedCalls.CallType

    SELECT IsNull(#tempTable.TypesCount, 0) AS TypesCount, CASE('English')

    WHEN 'Arabic' THEN #tempLookups.NameA

    ELSE #tempLookups.NameE

    END AS CallsType FROM

    #tempTable RIGHT OUTER JOIN #tempLookups ON #tempTable.CallsType = #tempLookups.Code

    DROP TABLE #tempLookups

    DROP TABLE #tempTable

  • JLSSCH (3/4/2008)


    All (especially Ralph):

    There are three JOIN clauses and only two ON clauses. Each JOIN clause should have a corresponding ON clause.

    Again, good luck all!!!!

    Nope, the joins are correctly specified, Ralph is spot on...

    SELECT

    COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,

    RecordedCalls.Channel AS [Name],

    RecordedCalls.ServerName

    FROM RecordedCalls

    INNER JOIN Servers

    ON RecordedCalls.ServerName = Servers.[Name]

    INNER JOIN RecordedCallsTags

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    LEFT OUTER JOIN Tags

    ON Tags.ID = RecordedCallsTags.TagID

    WHERE RecordedCalls.ID <= '9369907'

    AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime ) AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))

    AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)

    AND RecordedCalls.AgentID NOT IN('1000010000')

    AND RecordedCalls.IsDeleted='false'

    GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName

    ORDER BY RecordedCalls.Channel

    ... and he's also correct in suggesting that the TAGS table could be omitted since it's not referenced in the SELECT, and it's a left join.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • asmaa.sbaihi, you should run some experiments to establish exactly what you need in your query.

    Something along the lines of...

    This is your baseline, run it and record the results:

    SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,

    RecordedCalls.Channel AS [Name],

    RecordedCalls.ServerName

    FROM RecordedCalls

    INNER JOIN Servers

    ON RecordedCalls.ServerName = Servers.[Name]

    INNER JOIN RecordedCallsTags

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    LEFT OUTER JOIN Tags

    ON Tags.ID = RecordedCallsTags.TagID

    WHERE RecordedCalls.ID <= '9369907'

    AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )

    AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))

    AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)

    AND RecordedCalls.AgentID NOT IN('1000010000')

    AND RecordedCalls.IsDeleted='false'

    GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName

    ORDER BY RecordedCalls.Channel

    Then comment out the Tags table and run again...

    SELECT COUNT(DISTINCT(RecordedCalls.ID)) AS CallsCount,

    RecordedCalls.Channel AS [Name],

    RecordedCalls.ServerName

    FROM RecordedCalls

    INNER JOIN Servers

    ON RecordedCalls.ServerName = Servers.[Name]

    INNER JOIN RecordedCallsTags

    ON RecordedCalls.ID = RecordedCallsTags.CallID

    --LEFT OUTER JOIN Tags

    --ON Tags.ID = RecordedCallsTags.TagID

    WHERE RecordedCalls.ID <= '9369907'

    AND (RecordedCalls.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )

    AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))

    AND (RecordedCalls.Duration BETWEEN 0 AND 1000000)

    AND RecordedCalls.AgentID NOT IN('1000010000')

    AND RecordedCalls.IsDeleted='false'

    GROUP BY RecordedCalls.Channel, RecordedCalls.ServerName

    ORDER BY RecordedCalls.Channel

    Experiment with the transaction isolation level, it may not affect your results, and if not, your query will run a little faster...bring in table aliases too, it usually makes the code more readable

    SELECT COUNT(DISTINCT(rc.ID)) AS CallsCount,

    rc.Channel AS [Name],

    rc.ServerName

    FROM RecordedCalls rc (NOLOCK)

    INNER JOIN Servers s (NOLOCK)

    ON rc.ServerName = s.[Name]

    INNER JOIN RecordedCallsTags rct (NOLOCK)

    ON rc.ID = rct.CallID

    --LEFT OUTER JOIN Tags

    --ON Tags.ID = RecordedCallsTags.TagID

    WHERE RecordedCalls.ID <= '9369907'

    AND (rc.CallDate BETWEEN cast ('01 Jan 1910 00:00:00:000' as datetime )

    AND cast ( '01 Jan 2210 00:00:00:000' as datetime ))

    AND (rc.Duration BETWEEN 0 AND 1000000)

    AND rc.AgentID NOT IN('1000010000')

    AND rc.IsDeleted='false'

    GROUP BY rc.Channel, rc.ServerName

    ORDER BY rc.Channel

    Comment out the other two joins one at a time and again compare results with the original query. Run the original query again because your result may have changed while you've been working.

    This is just a start but hopefully it will point you in the right direction, as Andras and others have shown, there are several issues with this query and each needs to be addressed individually.

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Have you looked at the explain plan, statistics, Memory usage, etc.

    Actually have you looked at the performance of the machine? This isn't the most complex query I have ever seen, but you are talking about a fair amount of data.

    You mentioned that it was standard edition, but CPU's, Memory, Disks? How much. How is the system configured.

    In general, this query isn't really written poorly assuming that the underlying system supports it.

    the DDL of the table was requested. 100,000,000 rows in a clustered table could be 10g of data, or it could be 500G of data. 500G grouped sorted etc without indexing in 1:15 is not that unreasonable.

  • The more I look at this, there is almost nothing that you are going to be able to do to help this query. The only suggestion I have for you is

    an index

    Create index xxx on RecordedCalls(CallType,ID)

    SELECT COUNT(DISTINCT(RecordedCalls.ID)) As TypesCount,RecordedCalls.CallType as CallsType

    ...

    group by RecordedCalls.CallType

    That's about the only thing that you can count on being useful since this is obviously a dynamically generated query.

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply