delete a large amount of rows in a table

  • Please help me make this query more efficient because i been waiting for more then 1 hour for query to complete and it did not finish and i got like 25 large tables to delete from:

    this is what i wrote it like but it is very slow ..

    SELECT * INTO WORKTBL FROM [CHARACTER_01_DBF].[dbo].SKILLINFLUENCE_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL] WHERE End_Time > '20130123015613')

    DROP TABLE [CHARACTER_01_DBF].[dbo].SKILLINFLUENCE_TBL

    SELECT * INTO [CHARACTER_01_DBF].[dbo].SKILLINFLUENCE_TBL from WORKTBL

    DROP TABLE WORKTBL

    Do you know any better method ? much faster then hours and hours of waiting for it to complete ?

  • What percentage of the rows are you keeping and how many rows are there in the tables?

    At some point its easier just to loop a delete statement depending on how much data you are getting rid of.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • 133.723 rows total and i wanna keep 77.646

    i wanna delete the custom amount according to the query u see above but i tried to run a count for the remaining rows it also take a lot to do :S

    i am really confused on the best solution for this...

    please help me how ever you can

    my main problem is i got 10 tables with 133k from what i wanna keep 77k from each so this will take ages unless i find a solution or u guys tell me one

  • insanityflyff (2/27/2013)


    133.723 rows total and i wanna keep 77.646

    i wanna delete the custom amount according to the query u see above but i tried to run a count for the remaining rows it also take a lot to do :S

    i am really confused on the best solution for this...

    please help me how ever you can

    my main problem is i got 10 tables with 133k from what i wanna keep 77k from each so this will take ages unless i find a solution or u guys tell me one

    133k rows, is that the size of [CHARACTER_01_DBF].[dbo].SKILLINFLUENCE_TBL?

    at that size it may just be easier to run a simple delete. Correct me if i am wrong but it looks like your initial query is selecting rows that are in SKILLINFLUENCE_TBL but not in the logging table? So if an m_idplayer exists in the logging table we can delete it from the main table?


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].CHARACTER_TBL WHERE m_idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    this query should delete all the m_idplayers that dont appear in the LOG_LOGIN_TBL right ?

  • it should.

    How many rows are in [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL]?

    Your sub query could be causing the issue if there are only 133k rows in [CHARACTER_01_DBF].[dbo].SKILLINFLUENCE_TBL and its taking hours.

    if the sub query is the issue any of the deletes using the sub query will take a while as well.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • by sub query u mean WHERE End_Time > '20130123015613' right ?

    this means it will delete all data that is below 2013/23/01/01:56:03

    year day month hour minute second format

    and that sub query makes the issue? 🙁

  • thats what i mean by sub query.

    However how many rows are in the logging table and what indexes there are will affect this a lot. Also is End_Time a DateTime column or Varchar? might be having an issue with your date format as well.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • CapnHector (2/27/2013)


    thats what i mean by sub query.

    However how many rows are in the logging table and what indexes there are will affect this a lot. Also is End_Time a DateTime column or Varchar? might be having an issue with your date format as well.

    the End_Time is a char(14) date type

    the log_login_tbl has 981.057 rows

  • In general, when I have a lot of rows to delete, I do it in a loop with a "wait". The advantages for me are:

    Small chunks so I can see it progress.

    Does not hog the system completely

    Can be stopped & restarted without starting over

    It may not be the fastest method, but in my case I needed to run it on a production server with minimal impact on others. Experiment with delay time & delete size

    basically i set it up like this:

    DeleteMore:

    waitfor delay 5 seconds (to give others some cpu)

    delete top 10000 from tableA where ......

    if @@rowcount > 0 goto DeleteMore

  • homebrew01 (2/27/2013)


    In general, when I have a lot of rows to delete, I do it in a loop with a "wait". The advantages for me are:

    Small chunks so I can see it progress.

    Does not hog the system completely

    Can be stopped & restarted without starting over

    It may not be the fastest method, but in my case I needed to run it on a production server with minimal impact on others. Experiment with delay time & delete size

    basically i set it up like this:

    DeleteMore:

    waitfor delay 5 seconds (to give others some cpu)

    delete top 10000 from tableA where ......

    if @@rowcount > 0 got DeleteMore

    can you write me a good one for this query ?

    DELETE FROM [CHARACTER_01_DBF].[dbo].CHARACTER_TBL WHERE m_idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    i gotta wipe from all this tables :

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].BANK_EXT_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].BANK_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].BILING_ITEM_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].CHARACTER_TBL WHERE m_idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].GUILD_MEMBER_TBL WHERE m_idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].INVENTORY_EXT_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].INVENTORY_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].ITEM_SEND_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].PARTY_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].RAINBOWRACE_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].SECRET_ROOM_MEMBER_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].SKILLINFLUENCE_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].TAG_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].TASKBAR_ITEM_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].TASKBAR_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblCampusMember WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblFactionMember WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblHousing WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblHousing_Visit WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblMaster_all WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblMultiServerInfo WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblPocket WHERE idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblPocketExt WHERE idPlayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblMessenger WHERE idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblSkillPoint WHERE PlayerID NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    DELETE FROM [CHARACTER_01_DBF_OLD].[dbo].tblRestPoint WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    so i would appreciate a formula u are using to make it faster and better for my server not to stay at 100% processor usage

  • Since you have multiple tables using the same exclusion data set i would create a work table with that data set and index it, then use a LEFT JOIN tbl b WHERE b IS NULL.

    That may be a little faster and would work in a loop as well.

    As far as writing the loop for the delete, homebrew01 has all ready provided the way to do it.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • CapnHector (2/27/2013)


    Since you have multiple tables using the same exclusion data set i would create a work table with that data set and index it, then use a LEFT JOIN tbl b WHERE b IS NULL.

    That may be a little faster and would work in a loop as well.

    As far as writing the loop for the delete, homebrew01 has all ready provided the way to do it.

    still i dont know sql query so please tell me if this would work :

    if @@rowcount > 0

    WAITFOR DELAY '00:00:05';

    delete top (10000) FROM [CHARACTER_01_DBF_OLD].[dbo].CHARACTER_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

  • insanityflyff (2/27/2013)


    CapnHector (2/27/2013)


    Since you have multiple tables using the same exclusion data set i would create a work table with that data set and index it, then use a LEFT JOIN tbl b WHERE b IS NULL.

    That may be a little faster and would work in a loop as well.

    As far as writing the loop for the delete, homebrew01 has all ready provided the way to do it.

    still i dont know sql query so please tell me if this would work :

    if @@rowcount > 0

    WAITFOR DELAY '00:00:05';

    delete top (10000) FROM [CHARACTER_01_DBF_OLD].[dbo].CHARACTER_TBL WHERE m_idplayer NOT IN (SELECT DISTINCT(m_idPlayer) COLLATE DATABASE_DEFAULT FROM [LOGGING_01_DBF].[dbo].[LOG_LOGIN_TBL])

    first you dont need the distinct in the NOT IN, its redundant. http://sqlinthewild.co.za/index.php/2011/01/18/distincting-an-in-subquery/

    Second for something like this you may get better performance using NOT EXISTS. Here is a blog that goes into detail on the Subject http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/

    Second i bolded a section that says danger to me. The query you wrote will only delete the top 10000 records and not loop. Re read homebrew01's post.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • You may also want to read this article: http://www.sqlservercentral.com/articles/T-SQL/67898/.

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply