Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

SSIS performance issue Expand / Collapse
Author
Message
Posted Tuesday, August 12, 2014 4:15 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 7:21 AM
Points: 79, Visits: 91
Thanks I will have a look at what you said, check the settings etc.

The cursors we use look for a table table and field name, and for example if a certain table was two data fields that are both nulls we delete the entry(line). We are doing it in batches of checking through a million rows at a a time. The code is working perfectly though in SQL Mngmnt Studio, when we use the same code, pull it in through from a file location and run it as a SQL Query, it takes 4 time longer to run, that is our main concern, there are probably better ways to do it but as we are working with about 500 tables in total, we found it better to keep it generic code as far as possible, rather than writing separate code for each table. I will so what the settings are in SSMS and SSIS, but I do not think that is the issue, the normal queries, that also run from a gile location, example that set up primary and foreign keys, create indexes etc are all running within what we are expecting.

At this stage it looks like getting away from the cursors seems to be the way to go, but that is a lot of extra work to put in, seeing as I am also trying to get the ETL more acceptable through SSIS than running code manually as it has been done before. We would ideally alos like to keep the code the same as we might have clients where the ETL's rather need to be done through SSMS because of server capacity.
Post #1602208
Posted Tuesday, August 12, 2014 4:46 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 9:50 AM
Points: 5,369, Visits: 9,913
Andre 425568 (8/12/2014)
I will so what the settings are in SSMS and SSIS, but I do not think that is the issue, the normal queries, that also run from a gile location, example that set up primary and foreign keys, create indexes etc are all running within what we are expecting.

Yes, but DDL operations don't (usually) return result sets and so network latency is much less likely to be a factor. Also, DML is more likely to be sensitive to options and regional settings. Would you be able to post a snippet of your code (or all of it), please?

John
Post #1602214
Posted Tuesday, August 12, 2014 4:57 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 7:21 AM
Points: 79, Visits: 91
This is the cursor specifically where I picked up the problem

DECLARE @object_id INT
DECLARE @table_name VARCHAR(128)
DECLARE @col_name VARCHAR(128)
DECLARE @exec_sql_string VARCHAR(500)
DECLARE @exec_sql NVARCHAR(500)
DECLARE @max_id NVARCHAR(4000)
DECLARE @OutputParameter NVARCHAR(4000)
DECLARE @id_count NUMERIC(38, 0)
DECLARE @id_count_to NUMERIC(38, 0)

SET NOCOUNT ON

DECLARE recs CURSOR DYNAMIC
FOR
SELECT object_id
,name
FROM sys.tables ts
WHERE object_id IN (
SELECT object_id
FROM sys.all_columns
WHERE name = 'INDEX0'
)
ORDER BY name

OPEN recs

FETCH
FROM recs
INTO @object_id
,@table_name

WHILE @@FETCH_STATUS = 0
BEGIN
SET @exec_sql_string = ''

DECLARE recs1 CURSOR DYNAMIC
FOR
SELECT name
FROM sys.all_columns
WHERE object_id = @object_id
AND name != 'ID'
AND name != 'INDEX0'
AND (
name NOT LIKE 'MAS%'
AND name NOT LIKE 'Z%'
)

OPEN recs1

FETCH
FROM recs1
INTO @col_name

WHILE @@FETCH_STATUS = 0
BEGIN
IF @exec_sql_string = ''
BEGIN
SET @exec_sql_string = 'DELETE FROM ' + @table_name + ' WHERE ' + @col_name + ' IS NULL '
END
ELSE
BEGIN
SET @exec_sql_string = @exec_sql_string + 'AND ' + @col_name + ' IS NULL '
END

FETCH
FROM recs1
INTO @col_name
END

SET @exec_sql = 'SELECT @OutputParameter = MAX(ID) FROM ' + @table_name

EXEC @max_id = sp_executesql @exec_sql
,N'@OutputParameter nvarchar(4000) output'
,@OutputParameter OUTPUT

IF LTRIM(RTRIM(@OutputParameter)) = ''
OR @OutputParameter IS NULL
BEGIN
SET @OutputParameter = '0'
END

SET @id_count = 1

IF @OutputParameter <> '0'
BEGIN
WHILE @id_count < @OutputParameter
BEGIN
SET @id_count_to = @id_count + 1000000

EXEC ('BEGIN TRANSACTION ' + @exec_sql_string + ' AND ID >= ' + @id_count + ' AND ID <= ' + @id_count_to + ' COMMIT TRANSACTION')

PRINT('FROM: ' + CAST(@id_count as varchar(150)) + ' TO: ' + CAST(@id_count as varchar(150)))

SET @id_count = @id_count + 1000000
END

SET @id_count_to = @id_count + 1000000

EXEC ('BEGIN TRANSACTION ' + @exec_sql_string + ' AND ID >= ' + @id_count + ' AND ID <= ' + @id_count_to + ' COMMIT TRANSACTION')

PRINT('FROM: ' + CAST(@id_count as varchar(150)) + ' TO: ' + CAST(@id_count as varchar(150)))
END

CLOSE recs1

DEALLOCATE recs1

FETCH
FROM recs
INTO @object_id
,@table_name
END

CLOSE recs

DEALLOCATE recs
GO
Post #1602218
Posted Tuesday, August 12, 2014 5:14 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 9:50 AM
Points: 5,369, Visits: 9,913
I can't see anything there that would obviously run differently under different settings. What I advise you to do is to capture each event with a trace, and capture the execution plans as well. You can then compare the slow one to the faster one. Are you confident that they are doing exactly the same thing and it's just the speed that's different?

One thing I did notice - I think your DELETEs would run quicker if you use the TOP clause instead of searching for ranges in your data.

John
Post #1602224
Posted Tuesday, August 12, 2014 5:31 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 7:21 AM
Points: 79, Visits: 91
I will try that with a trace, we have looked at using top, but it has not been very effective. The only difference between the two is one gets copied and runs in a query window in ssms and the one in SSIS uses the file with the code from Execute SQL Task, then in General -SQL Statement we use file connection and the option just above that we link to the file with the exact same code run through SSMS Query.

I will test profiler in a few days only as we are in the middle of a TEST ETL for one Client, and just after that another Client is going live, for now I will use scripts directly where we have cursors and for the rest of the code, I will use SSIS.
Post #1602229
Posted Tuesday, August 12, 2014 6:32 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Yesterday @ 4:13 AM
Points: 747, Visits: 1,315
Its strange really because whether its runs on SSMS or SSIS, its will operate on the SQL SERVER level. so time should not be the different. if the query you mentioned above is causing the performance issue this should definitely have performance issue when you run it SSMS as well. Cursor do not seems to be a big issue as you are using is to generate dynamic sql.

if you really want to use the SSIS, then implement the logic inside SSIS. But 1st let Adjust your query from the cursors

SELECT ts.object_id,
'Delete From ' + ts.name + ' Where ' + x.colQuery AS DeleteQuery
FROM sys.tables ts
CROSS APPLY
(
select STUFF((SELECT 'AND ' + QuoteName(c.name) + ' is null '
FROM sys.all_columns c
WHERE c.object_id = ts.object_id
AND c.name != 'ID'
AND c.name != 'INDEX0'
AND (c.name NOT LIKE 'MAS%' AND c.name NOT LIKE 'Z%')
For xml path ('')
),1,3,'') AS colquery
) X
WHERE ts.object_id IN (
SELECT ac.object_id
FROM sys.all_columns ac
WHERE ac.name = 'INDEX0'
)
ORDER BY ts.name

2nd part, in which you delete from the table by using While loop (Row by row). not good its will take ages to finish. to avoid this
delete data in chunks. so if merge both logic the final query will look like this

Declare @vTop as int = 25000				----------- Need to determine which value will perform better
Declare @1stQuery as varchar(500) = ''
Declare @LastQuery as varchar(500) = ''

select
@1stQuery =
' DECLARE @r INT;
SET @r = 1;
WHILE @r > 0
BEGIN
BEGIN TRANSACTION;
DELETE TOP ('+ CAST(@vTop as varchar(10)) +')
FROM ' ------------- Divided the query in parts

, @LastQuery =
' SET @r = @@ROWCOUNT;
COMMIT TRANSACTION;
-- CHECKPOINT; -- if Recovert Model simple
-- BACKUP LOG ... -- if Recovert Model full
END'

SELECT ts.object_id,
@1stQuery + ts.name + ' Where ' + x.colQuery + Char(10) + @LastQuery AS DeleteQuery
FROM sys.tables ts
CROSS APPLY
(
SELECT STUFF((SELECT 'AND ' + QuoteName(c.name) + ' is null '
FROM sys.all_columns c
WHERE c.object_id = ts.object_id
AND c.name != 'ID'
AND c.name != 'INDEX0'
AND (c.name NOT LIKE 'MAS%' AND c.name NOT LIKE 'Z%')
For xml path ('')
),1,3,'') AS colquery
) X
WHERE ts.object_id IN (
SELECT ac.object_id
FROM sys.all_columns ac
WHERE ac.name = 'INDEX0'
)
ORDER BY ts.name

its your choice now whether you want to run in a complete batch or you want to run that query one by one. you can do it in SSIS easy.
Post #1602246
Posted Tuesday, August 12, 2014 6:49 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 7:21 AM
Points: 79, Visits: 91
Hi

Thanks so much, I will start testing it and see what it does exactly, and how it fits on our tables. I will get back asap to let you know, as I said previously it may be a few days though.

Andre
Post #1602252
Posted Tuesday, August 12, 2014 6:55 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Yesterday @ 4:13 AM
Points: 747, Visits: 1,315
Take your time. if you need any query do lets us know.
Post #1602259
Posted Wednesday, August 27, 2014 4:06 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Thursday, August 28, 2014 7:21 AM
Points: 79, Visits: 91
I have had a look at these and nothing really solved the issues in SSIS, I think I need to build this into SSIS, just not sure exactly what to use, I have downloaded some SSIS books but all the samples they work with are for importing multiple files into SQL. Do you have some links to videos or guides that will help with this specific case. It looks like I need to use The For each loop, a lot has changed from SQL 2008 where I last used SSIS, and it looks like there is a lot of really great things to use that will help me, I just need the basic logic that I should be using.

Post #1607759
Posted Wednesday, August 27, 2014 4:28 AM
Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Yesterday @ 4:13 AM
Points: 747, Visits: 1,315
I have had a look at these and nothing really solved the issues in SSIS, I think I need to build this into SSIS, just not sure exactly what to use, I have downloaded some SSIS books but all the samples they work with are for importing multiple files into SQL. Do you have some links to videos or guides that will help with this specific case. It looks like I need to use The For each loop, a lot has changed from SQL 2008 where I last used SSIS, and it looks like there is a lot of really great things to use that will help me, I just need the basic logic that I should be using.


What exactly the issue you are facing write now? If you want to delete the data in bulk with performance then you SHOULD BE doing it on SQL SERVER even if you are using SSIS (e.g. Execute SQL Task). The other option is the row by row deletion from SSIS same like cursor.

Help me understand what issue you are facing against the solution shared earlier, Share some details?

Regarding SSIS basic understand you can look into the following:
Stairway to Integration Services Andy Leonard

that will get you on track of many things.




Post #1607767
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse