|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, February 13, 2013 2:51 PM
Points: 8,
Visits: 20
|
|
I'm new to the TSQL world (coming from visual basic) and am working with large tables (125+ million rows) and having perfomance issues. This sp takes over an hour to run. Is there a better way to optimize this code? Table2 has 40,525,850 rows and Table1 ends up with 125,350,605 rows.
I'm creating two indexes within the code: CREATE NONCLUSTERED INDEX IDX_Table2 ON Table2 (Batch,Flag,Num,Date1 ) INCLUDE (Id); CREATE NONCLUSTERED INDEX IDX_Table1 ON Table1 (Batch,Flag,Num);
INSERT INTO Table5 SELECT DISTINCT CD.Desc AS Desc, SD.Num AS Num, SD.Batch AS Batch, CD.CodeId AS CodeId, SD.Type AS Type, CH.Id AS Id, FROM Table1 SD JOIN Table2 CH ON CH.Num=SD.Num AND CH.Batch= @Batch AND CH.Flag= 0 AND SD.Batch = @Batch AND SD.Flag= 0 JOIN Table3 M ON M.Code=SD.Code AND (Map=1 OR Map=@Map) AND CH.Date1 BETWEEN M.Date2 AND M.Date3 JOIN Table4 CD ON M.CodeId=CD.CodeId WHERE SD.Code IS NOT NULL AND RTRIM(LTRIM(SD.Code ))<>'';
Also, will the following code produce the same result? Or will moving the filters to the where clause make the join take longer? How does SQL process the code?
INSERT INTO Table5 SELECT DISTINCT CD.Desc AS Desc, SD.Num AS Num, SD.Batch AS Batch, CD.CodeId AS CodeId, SD.Type AS Type, CH.Id AS Id, FROM Table1 SD JOIN Table2 CH ON CH.Num=SD.Num AND CH.Batch= SD.Batch JOIN Table3 M ON M.Code=SD.Code AND (Map=1 OR Map=@Map) AND CH.Date1 BETWEEN M.Date2 AND M.Date3 JOIN Table4 CD ON M.CodeId=CD.CodeId WHERE SD.Code IS NOT NULL AND RTRIM(LTRIM(SD.Code ))<>'' AND CH.Flag= 0 AND SD.Batch = @Batch AND SD.Flag= 0
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Today @ 5:05 AM
Points: 274,
Visits: 785
|
|
1. Using
RTRIM(LTRIM(SD.Code ))<>'' will be very slow - if you've got an index on SD.Code it probably won't be used. Can you tidy the data in the table so you don't need RTRIM(LTRIM() & add an index?
2. You could consider using
FROM Table1 SD WITH (NOLOCK) JOIN Table2 WITH (NOLOCK) etc. as long as the table isn't being updated - this saves time as no read locks are issued.
3. Not sure about the pros & cons of including the filters in the WHERE clause...
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 5:43 AM
Points: 37,678,
Visits: 29,933
|
|
For inner joins there is no difference whatsoever. For outer joins, moving the filter from where to join changes the logic of the query. Hence it's not a matter of performance, it's a matter of which gives you the correct results.
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 5:43 AM
Points: 37,678,
Visits: 29,933
|
|
laurie-789651 (9/24/2012)
2. You could consider using FROM Table1 SD WITH (NOLOCK) JOIN Table2 WITH (NOLOCK) etc. as long as the table isn't being updated - this saves time as no read locks are issued.
And incorrect results are possible, not just dirty reads, duplicate or missing rows. If there are no changes being made, there's little gain from nolock, the overhead of taking locks is not that high, and it introduces the potential for incorrect results when changes are being made.
See - http://blogs.msdn.com/b/davidlean/archive/2009/04/06/sql-server-nolock-hint-other-poor-ideas.aspx
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, February 13, 2013 2:51 PM
Points: 8,
Visits: 20
|
|
So your saying that this code still produces the same result in with the same efficiency? (I took out another join and moved it to the where clause) INSERT INTO Table5 SELECT DISTINCT CD.Desc AS Desc, SD.Num AS Num, SD.Batch AS Batch, CD.CodeId AS CodeId, SD.Type AS Type, CH.Id AS Id, FROM Table1 SD JOIN Table2 CH ON CH.Num=SD.Num JOIN Table3 M ON M.Code=SD.Code AND (Map=1 OR Map=@Map) AND CH.Date1 BETWEEN M.Date2 AND M.Date3 JOIN Table4 CD ON M.CodeId=CD.CodeId WHERE SD.Code IS NOT NULL AND RTRIM(LTRIM(SD.Code ))<>'' AND CH.Batch = @Batch AND CH.Flag= 0 AND SD.Batch = @Batch AND SD.Flag= 0
|
|
|
|
|
SSC Eights!
      
Group: General Forum Members
Last Login: Yesterday @ 7:51 PM
Points: 958,
Visits: 1,919
|
|
You'll have better performance if you change the following statement
WHERE SD.Code IS NOT NULL AND RTRIM(LTRIM(SD.Code ))<>'' with
WHERE SD.Code <> '' The results are the same and you can test them if you want. Here's a script.
DECLARE @Table table( mystring char(15))
INSERT @Table VALUES( ''),( ' '),( ' '), ( 'a'),( ' b'),( 'c '), ( ' '), (NULL)
SELECT * FROM @Table WHERE mystring <> ''
Luis C. Please don't trust me, test the solutions I give you before using them. Forum Etiquette: How to post data/code on a forum to get the best help
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 5:32 AM
Points: 485,
Visits: 2,130
|
|
Here is a quick example of how moving the filter from the join to the WHERE clause can have an effect on the result of your queries (if you're using OUTER JOINS):
Although there is a filter in the first query on the Forename, SQL still includes the other records. This is because SQL server does the inner join internally and applies the filter then adds all missing rows from the left table.
Hope this makes sense.
CREATE TABLE #Employee (ID INT IDENTITY(1, 1), Forename VARCHAR(20))
INSERT INTO #Employee(Forename) SELECT 'Abu Dina' UNION ALL SELECT 'SQL4n00bs' UNION ALL SELECT 'Test' UNION ALL SELECT '1337'
CREATE TABLE #Location (ID INT IDENTITY(1, 1), Employee_ID INT, Location VARCHAR(50))
INSERT INTO #Location(Employee_ID, Location) SELECT 1, 'UK' UNION ALL SELECT 2, 'Manchester'
SELECT a.* , b.* FROM #Employee AS a LEFT JOIN #Location AS b ON a.id = b.employee_id and a.forename = 'Abu Dina' SELECT a.* , b.* FROM #Employee as a LEFT JOIN #Location as b ON a.id = b.employee_id WHERE a.forename = 'Abu Dina' DROP TABLE #Employeea DROP TABLE #Location
----------------------------------- http://www.SQL4n00bs.com
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 5:43 AM
Points: 37,678,
Visits: 29,933
|
|
newbie2 (9/24/2012) AND RTRIM(LTRIM(SD.Code ))<>'';
The RTRIM and LTRIM are unnecessary here. SQL ignores trailing spaces when checking string equality, hence '' = ' ', true no matter how many spaces you have.
That predicate can be reduced to AND SD.Code != ''. That != will also eliminate nulls, so you can remove the SD.Code IS NOT NULL as well.
Neither of the nonclustered indexes that you've created look optimal. Rather consider
Table1: Index key (Code, Batch, Flag, Num) Include (Type) Table2: Index key (Flag, Batch, Num, Date1) Include (ID)
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: Yesterday @ 2:45 PM
Points: 8,567,
Visits: 8,218
|
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Wednesday, February 13, 2013 2:51 PM
Points: 8,
Visits: 20
|
|
| Thank you all for your help!
|
|
|
|