Alternative to using "IN" or table data types?

  • I have a stored proc "spEmailList_Search" that takes two parameters:

    --If CampaignID is passed, we only search for those email addresses that were part

    --of a certain email campaign

    DECLARE @CampaignID INT

    --If email is passed, we want information about only the one email

    DECLARE @Email VARCHAR(100)

    Originally I had this:

    SELECT*

    FROM EmailList el WITH (NOLOCK)

    WHERE (@CampaignID IS NULL OR el.EmailID IN (

    --emails from a campaign

    SELECT DISTINCT(eic.EmailID)

    FROM emailsInCampaigns eic

    WHERE eic.CampaignID = @CampaignID))

    AND (@Email IS NULL OR el.Email = @Email)

    But with a large email list and several campaigns, it's pretty slow. I tried to use a table var with improved results:

    DECLARE @emailsFilteredByCampaigns TABLE(EmailID INT PRIMARY KEY)

    --All emails OR those emailed from a certain campaign

    IF (@CampaignID IS NULL)

    BEGIN

    INSERT INTO @emailsFilteredByCampaigns

    SELECT EmailID FROM EmailList

    END

    ELSE

    BEGIN

    INSERT INTO @emailsFilteredByCampaigns

    SELECT DISTINCT(eic.EmailID)

    FROM emailsInCampaigns eic WITH (NOLOCK)

    WHERE eic.CampaignID = @CampaignID

    END

    SELECT *

    FROM

    (--Email List is a list of email addresses

    SELECT *

    FROM EmailList WITH (NOLOCK)

    WHERE (@Email IS NULL OR Email = @Email)

    )

    AS emails

    INNER JOIN @emailsFilteredByCampaigns efbc ON emails.EmailID = efbc.EmailID

    But does anyone know of any other way I can achieve better performance without using table vars? I've already put some much needed indexes in place.

  • It's really two different queries. I'd split them into two procedures, even if you call them from this single procedure, have it call the other two procedures depending on the parameters passed in:

    IF @CampaignId IS NOT NULL

    BEGIN

    EXEC @err = GetCamaign @CampaignId

    END

    ELSE IF @Email IS NOT NULL

    BEGIN

    EXEC @err = GetEmail @Email

    END

    But if you have to keep them together (and I'm not sure why you would, but let's assume for a second) I'd try this:

    SELECT...

    FROM...

    WHERE CampaignId = @CampaignId

    UNION

    SELECT ....

    FROM...

    WHERE Email = @Email

    This will kind of stink since it's going to run both queries every time, but it will return the right values and you should get a good execution plan. I'd go with the idea of splitting the queries. You'll get to distinct execution plans and only one of them will be run at a time.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • You could code up something like this (i assume exactly one of the params needs to be null):

    IF (@CampaignID IS NOT NULL AND @Email IS NOT NULL)

    OR @CampaignID IS NULL AND @Email IS NULL) RAISERROR('One argument needs to be null',16,1)

    IF @CampaignID IS NULL AND @Email IS NOT NULL

    SELECT EL.* FROM dbo.EmailList EL WHERE EL.EmailID = @Email

    ELSE

    SELECT EL.* FROM dbo.EmailListEL WHERE EXISTS(

    SELECT * FROM dbo.emailInCampaigns EIC

    WHEREEIC.campaignID = @Campaign AND EIC.EmailID = EL.EmailID

    )

    Best Regards,

    Chris Büttner

  • There are two ways I would suggest testing out. One is Grant's suggestion of two separate procs, called by a master proc. The other is a Union All statement.

    SELECT *

    FROM EmailList el WITH (NOLOCK)

    WHERE (@Email IS NULL

    and el.EmailID IN (

    --emails from a campaign

    SELECT DISTINCT(eic.EmailID)

    FROM emailsInCampaigns eic

    WHERE eic.CampaignID = @CampaignID))

    union all

    SELECT *

    FROM EmailList el WITH (NOLOCK)

    where el.Email = @Email

    and @CampaignID is null

    That will use indexes, but might end up with a poor execution plan, depending on statistics and which query is used first. Test both, see which works better for various inputs.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • For what it's worth - using DISTINCT inside of a IN statement is just wasting cycles. You'll get the same results, and faster if you simply skip Distinct from what I've seen.

    I don't know why - but EXISTS seems to work even faster than IN more often than not. Might be worth trying out....

    SELECT *

    FROM EmailList el WITH (NOLOCK)

    WHERE (@Email IS NULL

    and EXISTS (

    --emails from a campaign

    SELECT null FROM emailsInCampaigns eic

    WHERE eic.CampaignID = @CampaignID and eic.emailid=el.emailID))

    union all

    etc....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I think I may be approaching this the wrong way by using a stored proc. I wanted it to handle all types of search parameters (possibly more in the future), but now I'm thinking maybe dynamic in-line sql in the code-behind is best suited for what I'm doing.

    Sorry for not mentioning the context before, but I have a search page where a user can get a list of emails and their stats (without having to specify any parameters), and they can filter on the results more if they want to. So the parameters are not required.

    Want I really want is a dreamy conditional join (which I know doesn't exist) like so:

    SELECT * FROM EmailList

    IF @CampaignID NOT IS NULL

    INNER JOIN (subquery...)

  • Well, optional parameters are a good thing to have, but they should usually be applied next to mandatory parameters, unless you are working with a small data set.

    Dynamic SQL doesnt really give you any performance gain. What is the issue with hard coding the various parameter sets?

    Best Regards,

    Chris Büttner

  • Btw, congrats, Matt to your 4444th point 🙂

    Best Regards,

    Chris Büttner

  • Christian Buettner (6/2/2008)


    Btw, congrats, Matt to your 4444th point 🙂

    Thanks! but it's going to be 4445 once I post this:)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Ok, back to work then 😀

    Best Regards,

    Chris Büttner

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply