Email broadcast optimization

  • Hi,

    I really need to optimize the email broadcast system on our admin website. We send out email newsletters once or twice a week.

    This is what happens on the front end. Admin users do a search on our clients from the admin website. The search result has all the clients to whom the email should be sent. Usually, the row size is between 40,000 to 50,000. And when they click submit, I get all the userids in a comma seperated string. I split this string into smaller strings of size 8000 characters. I insert these (40 to 50) small strings into a database table - broadcastemails {id INT, useridstring VARCHAR(8000)}.

    I have a windows service which runs every 10 minutes that sends out the emails. It checks the above table to see if there are any rows. If there is, it retrieves one row. I have a function that splits the useridstring and returns the userids in a table. I join this table with the users table and return the results - userid, userfname, userlname, usertoemail. I loop through the resultset and for every row I send out an email.

    I have been receiving tons of timeout error emails from this above windows service. If I run the sproc from the query analyzer, the run time varies a lot. It takes anywhere from close less than 1 minute to more than 8 minutes.

    Does this above method to send emails seem ok? Should I just work on optimizing the queries? Or is there any other method to send out emails more effectively?

    Thanks.

  • At first glance, that shouldn't be too bad. Have you checked the query plans and the indexing? Perhaps your function should create an index on the temp table or maybe use a "staging" table and put the data in there with an index on the fields you join with the user table.

  •  

    here's my suggestions, and they might be calling  for a design change:

    create a table something like what is below; i  would probably suggest putting their actual emails AND the newsletter text in this table instead, but you get the idea:

    CREATE TABLE EmailCampaign(UserId int ,NewsLetterID int ,SentSuccessfully bit default 0)

    so whenever the admin user selects users for a specific newsletter, it goes straight into this table. no squishing userid's into a varchar and then re-extracting them.

    then have a scheduled job go thru this table and attempt to email WHERE SentSuccessfully =0; if the job is successful, update the SentSuccessfully column to 1. I'm not sure if you are BCC'ing 50 at a time, or row by row...that's up to you.

    for performance reasons, you might have  multiple jobs running at the same time, which doing the same thing, but different groups of  Userid's like  odd or even ID's , or some other criteria...

    with this design, you would not need to worry about timeout errors, because it runs to completion, instead of an app timing out. an application might timeout, but a scheduled job will not.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • You can definitely make this significantly faster.

    Is your read query being blocked by your write query?

    How often is your broadcastemails table being updated?

    What locks are being acqired?

    Are you using wrapping your insert loop in a transaction?

    Your queue approach to sending emails is definitely sound. SQL server is not great at string manipulation but because the email is asynchronous, it shouldn't matter.

    At this point, there are a number of possible causes for your problems. The quickest way to get your problem resolved would be to supply us with more information.

    Could you please post your code to retrieve, process and store the email addresses as well as full DDL for your tables (including indexes etc)

    Thanks!

    SQL guy and Houston Magician

  • This is the sproc that runs from the windows service, from where I get the timeout errors.

    CREATE PROCEDURE [dbo].[proc_broadcastemailsget]

    AS

    DECLARE @clientstable TABLE (userid INT PRIMARY KEY)

    DECLARE @bid INT

    DECLARE @useridstring VARCHAR(8000)

    IF EXISTS (SELECT 1 FROM broadcastemails)

    BEGIN

     BEGIN TRAN

     

     SELECT TOP 1 @bid = broadcastid FROM broadcastemails

     SELECT @useridstring = clientids FROM broadcastemails WHERE broadcastid = @bid

     

     INSERT INTO @clientstable SELECT intnumber FROM dbo.fn_splitstringtoint(@useridstring)

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     DELETE FROM broadcastemails WHERE broadcastid = @bid

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     SELECT userid, fname, lname, toemail FROM users, @clientstable CT WHERE users.userid = CT.userid

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     COMMIT TRAN

    END

  • Hi Lowell,

    "no squishing userid's into a varchar and then re-extracting them." I am not sure if that will work. Please let me know if I am wrong or is there a work around.

    If the email is being sent to every user in our database, it will work. But most of the time, they perform a particular search (location, income, ...). And the client search results are displayed in an ASP.Net datagrid. And there is a check box for every row, that the admin user can select or deselect if the email is to be/not to be sent to that particular user. So when the submit button is clicked, I loop through each row and get all the userid values from the rows with check box selected.

    "an application might timeout, but a scheduled job will not." Actually the timeout error is from inside the windows service, the sproc I pasted above.

  • i guess what i was suggesting is that when you are looping thru the datagrid, I would have put the userid's that had checkboxes checked in their associated data row, I personally would do an .ExecuteNonQuery("insert into EmailCampaign....") that way the web service or job could go thru the table on demand, and re-run anything that raised email errors when sending.

    but Robert Cary had an excellent point, that since it's async, there's not as much of a load on the database. no real need to redesign just to have the userid's separated.

    remember the web service will create a connection to the database...usually with a default of .CommandTimeout of 30 seconds, and .ConnectionTimeout of 30 seconds. you could simply increase them so that it is not an issue.

     

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • minor, maybe do to a copy/paste/edit, but i think one variable is not declared, and another is never initailized:

    CREATE PROCEDURE [dbo].[proc_broadcastemailsget]

    AS

    DECLARE @clientstable TABLE (userid INT PRIMARY KEY)

    DECLARE @id INT

    DECLARE @useridstring VARCHAR(8000)

    IF EXISTS (SELECT 1 FROM broadcastemails)

    BEGIN

     BEGIN TRAN

     

     SELECT TOP 1 @id = id FROM broadcastemails

     SELECT @stringids = clientids FROM broadcastemails WHERE id = @id

     

     INSERT INTO @clientstable SELECT id FROM dbo.fn_splitstringtoint(@useridstring)

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     DELETE FROM broadcastemails WHERE id = @id

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     SELECT userid, fname, lname, toemail FROM users, @clientstable CT WHERE users.userid = CT.userid

     IF @@error <> 0

     BEGIN

      ROLLBACK TRAN

      RETURN

     END

     COMMIT TRAN

    END

    my rewrite suggestion, very very minor:

    CREATE PROCEDURE [dbo].[proc_broadcastemailsget]

    AS

    DECLARE @clientstable TABLE (userid INT PRIMARY KEY)

    DECLARE @id INT

    DECLARE @useridstring VARCHAR(8000)

    IF EXISTS (SELECT 1 FROM broadcastemails)

    BEGIN

     SET XACT_ABORT ON

    --if any error occurs, XACT_ABORT rolls back automatically

     BEGIN TRAN

     

     SELECT TOP 1 @id = id FROM broadcastemails

     SELECT @useridstring = clientids FROM broadcastemails WHERE id = @id

     

     INSERT INTO @clientstable SELECT id FROM dbo.fn_splitstringtoint(@useridstring)

     

     DELETE FROM broadcastemails WHERE id = @id

     

     SELECT userid, fname, lname, toemail FROM users, @clientstable CT WHERE users.userid = CT.userid

     

     COMMIT TRAN

    END

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Its a copy paste error. The actual var is @stringids, I thought @useridstring would be easier to understand for others. There were couple of other copy/paste errors as well. I have corrected all of them.

    The users table has a primary key index on userid and a clustered index on username/email. The broadcastemails table has a primary key index on broadcastid.

    I can increase the timeout period. Just curious to find out why would the sproc execute in less than 1min, and some times it takes more than 8 minutes.

    Thanks.

  • i think it might have to do with the table variable;

     SELECT userid, fname, lname, toemail FROM users, @clientstable CT WHERE users.userid = CT.userid

    i would look at the execution plan, but since a table variable would not have an index, the plan might be reverting to a table scan on the user table to get all values before it joins against the clientstable table variable.

    if it was a #table, you could throw an index on it prior to the select above, and maybe improve performance.

     

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply