July 1, 2008 at 8:05 am
I have an OLE DB source that is pulled off a view. The ISSortedProperty has been set to FALSE (I thought this might be the problem, but I guess it's not) and the Source is taking forever to pull any data at all. It's confusing me.
When I run the Select * from MyView in SSMS, it completes in less than 2 minutes. SSIS, though, can sit there and run for a looooong time and not do anything. I don't get any row count or anything. Just the yellow "glow" of the "executing" status.
When I was using the IsSortedProperty, I let this thing run all night long (more than 8 hours) and it still hadn't pulled anything. So here I am re-writing and re-writing the source view and I've gotten it down to 1 minute 10 seconds. And still the OLE DB Source can run for an hour plus without pulling any data rows. So I've finally turned off the IsSorted property. Now I'm running and 10 minutes into the stupid thing I still have no records to show for my trouble.
Does anyone have any troubleshooting thoughts? Everything other Source works fine, so it has to be this particular data set, but like I said, I can run the view just fine in SSMS in a short period of time without any trouble at all.
July 1, 2008 at 8:17 am
Y, I have no idea. I've been pulling from views for yrs and I just did a test with a relatively complicated view and it pulled back just fine. I don't know where I'd start on this one either.
Perhaps pull the view code and put it in the package. Maybe that'll tell you something. If it runs the base sql just fine maybe you can use that to go to PSS. Maybe do an sp_refreshview on it as well. Try it from another box.
Just some ideas to try to force it to do something other than just sit there.
Watch my free SQL Server Tutorials at:
http://MidnightDBA.com
Blog Author of:
DBA Rant – http://www.MidnightDBA.com/DBARant
July 1, 2008 at 8:43 am
Hmm. Interestingly enough, the solution seems to be to change the Source from "Table or View" to "SQL Command" and to use an actual SELECT statement with defined column list instead of just choosing the view from the list associated with "Table or View".
It's working much faster, even with IsSorted set to TRUE, now that I've done that. I got the first several thousand records within seconds...
July 1, 2008 at 8:47 am
Well, the IsSorted flag just tells downstream components that your data is sorted. It doesn't actually sort it. It will have no bearing on performance.
Watch my free SQL Server Tutorials at:
http://MidnightDBA.com
Blog Author of:
DBA Rant – http://www.MidnightDBA.com/DBARant
July 2, 2008 at 6:00 am
Yup IsSorted will have no impact on performance.
How many columns are in the view? Remember that the OLEDB Source has to take the data and create a recordset out of it. Recordset objects can be pretty large.
Also, click an the data flow component that your OLEDB source is in and you will find a "DefaultBufferMaxRows" and "DefaultBufferSize". You may be able to tweak these settings a bit to get the components in the data flow to buffer data in smaller or larger pieces. Depending on the amount of available memory this can have a big impact on how quickly the data flow component will get through the pre-execute and begin moving data.
Finally, what is the component immediately after your OLEDB Source? Sometimes the UI will not update well and you may find that the data flow is actually waiting on a component that has not turned yellow.
Oh - one more thing. You can't use the "Table or View" option and manually turn on the "IsSorted" property of an OLEDB Source. "Table or View" will return results in the order that the OLEDB provider returns them in - for SQL 2005 this is essentially random. You need to specify an ORDER BY clause and then match the order you are indicating in the output to the order you have returned records.
July 2, 2008 at 10:50 am
Michael,
Actually, you can manually turn on the IsSorted property with the "Table or View" option. In fact, it's required in order to use a MERGE JOIN component right after your OLE DB Source. You simply go into Advanced Editor, Input & Output Properties. Highlight OLE DB Source Output and the property is there to be set to TRUE or FALSE. Of course, after you do this, you have to set up at least one Output column's SortKeyPosition to something other than 0 to actually use the IsSorted property.
This wasn't a case of the step after being a problem either. I had two other views with the same exact transformations as this particular view had (and in the same order) which worked fine with the "Table or View" option. All views had the exact same # of columns. However, the view giving me the trouble did have a LOT more returned records. Over 200,000 compared to less than 5,000 on the other two views.
Once I did "SQL Command" and a Select statement on this particular view, it worked in under a minute without me having to tweak anything else (like the buffers, etc.). That all tells me the problem had something to do with the data.
So it might indeed have been a RecordSet issue, like you suggested. It's annoying, though, that it took me several days to find the problem. You would think the package progress screen would tell you that it's trying to create something instead of "Execute process has started".
July 2, 2008 at 11:02 am
Sorry, I was unclear. Yes, you can turn the IsSorted property on manually, but since that does not actually sort the data, you have told SSIS that the data is sorted but not actually sorted it. So, this clearly does not make sense. With the Table or View option you have no way to actually specify the sort order, you can only tell SSIS that it is sorted - but you are in fact wrong, the data is not sorted unless you happen to be extremely lucky.
July 10, 2008 at 11:23 am
Michael,
Your last post actually made sense. Until of course, I created another data flow task.
Of course, I can't be sure until I've finished running this step, but it appears setting the ISSORTED property does have some small effect. I had this task built somewhat differently, but it started out with the same 4 transformations:
2 OLE DB Sources,
Derived Column (coming from Source A)
Merge Join (Left input is Derived Column, Right Input is Source B)
Both Sources were set to ISSORTED = True with 1 column set with a SortKeyPosition. For 8 million plus records (and the task wasn't through yet when I stopped it), it took over 3 hours.
I changed my structure slightly. Removed the SortKeyPosition (set it to 0) and changed ISSORTED to false.
2 OLE DB Sources, Source A sorted in query now
Derived Column (coming from Source A)
Sort Transformation (coming from Source B)
'nother Sort Transformation (coming from Source A) (EDIT-Added to Post)
Merge Join (Left input is Source A Sort, Right Input is Source B Sort)
[EDIT] -- For those who read this post before this latest edit, I apologize for the skipped line about the Sort for Source A. I really did have a brain once upon a time. Then I put it someplace I wouldn't lose it and now I can't find it! @=Pbbbttt.
I restarted the Task and now, just 45 minutes later, I'm 5 million + records into the pull. That's quite a bit better than the original 3 hours.
I didn't change any of the actual Data Flow properties. Buffer is set to 100000, but always has been since I created it. Engine Threads set to 5 (again, set before the first test) and no Temporary Buffer Storage has been set.
Now, this could be a network issue (I'm running debug on my local machine which is connecting out to the servers). Or a SQL Server load issue (if other developers were pulling from the same server earlier and aren't pulling right now).
I don't think it's a PC issue because I've got all the same windows & programs opened as I did earlier. However, I can tell you the first time I ran it, my PC got quite a bit slowed down. Once I made the changes, though, I'm not having any trouble flipping between windows on my local box. So, maybe it is a local resource that is consumed by having ISSORTED set.????
Anyway, I'm not taking any particular position in this debate. I'm just letting everyone know what I've noticed on this one particular task, this one particular time regarding this whole thread. You know, just in case people are interested in my experiences.
Thanks everyone for all the good advice in this (and previous SSIS) thread(s). It's helped out a lot!
EDIT: I should note that in both Sources (and both runs), I'm using SQL Command instead of "Table or View"...
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply