Row_Number() in SSIS?

  • Hi Friends,

    I have a select statement having Row_number() over(partition by coulmn order by column) ....

    i see no option in any of SSIS flow tasks doing this Row number partitioning...

    if we can do this by script component, could you guys give me sample codings so that it would be very useful to me...

    i am asking this since the T-SQL statement has 4 join statements and to process around 5 million records...

    so i use this statement in OLEDB SQL Command(since i got no other option) ...so it is taking too much time to process.....

    what i am wondering is , instead of using in the SQL command, if i just selec the records and process the partition in script component, it can help in speed up the performance....

    So give me your valuable suggestions friends...

    any suggestions would be really appreciated..

    Thanks,
    Charmer

  • And the question is, what?

  • Lynn Pettis (8/14/2012)


    And the question is, what?

    sorry, Lynn...i acciedntly pressed the enter button before completing hte question....i am sorry...:-)

    Thanks,
    Charmer

  • Have you tried tuning your SQL query?

  • Lynn Pettis (8/14/2012)


    Have you tried tuning your SQL query?

    Yes, Lynn...my SQL query has no other way....this is it...:(

    Thanks,
    Charmer

  • Charmer (8/14/2012)


    Lynn Pettis (8/14/2012)


    Have you tried tuning your SQL query?

    Yes, Lynn...my SQL query has no other way....this is it...:(

    Not sure what you mean by "has no other way."

  • Lynn Pettis (8/14/2012)


    Charmer (8/14/2012)


    Lynn Pettis (8/14/2012)


    Have you tried tuning your SQL query?

    Yes, Lynn...my SQL query has no other way....this is it...:(

    Not sure what you mean by "has no other way."

    i mean....SQL query is fine...i meant no other option to tune sQL query...

    Thanks,
    Charmer

  • Sounds like you want to do the sort in SSIS rather than T-SQL - bad idea. One way or another, you should make the SQL engine do that work for you - it's at least an order of magnitude faster.

    Just how were you proposing to do a sort in a script component, by the way?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin (8/14/2012)


    Sounds like you want to do the sort in SSIS rather than T-SQL - bad idea. One way or another, you should make the SQL engine do that work for you - it's at least an order of magnitude faster.

    Just how were you proposing to do a sort in a script component, by the way?

    sorry Phil, i thought there could be a way using script component....i generated sequence number using script component....but i was not sure that we cannot do sort using script component....but Phil, after you said...it is clear that sort in ssis is not possible unless using sql query doing in sql command....

    Thanks,
    Charmer

  • Charmer (8/14/2012)


    Lynn Pettis (8/14/2012)


    Charmer (8/14/2012)


    Lynn Pettis (8/14/2012)


    Have you tried tuning your SQL query?

    Yes, Lynn...my SQL query has no other way....this is it...:(

    Not sure what you mean by "has no other way."

    i mean....SQL query is fine...i meant no other option to tune sQL query...

    How much of a drag does the ROW_NUMBER() aspect put on query execution? It sounds like you already have a fairly hefty process.

    As far as doing it in a script - you might care to use the auto-increment script as your basis:

    http://www.ssistalk.com/2007/02/20/generating-surrogate-keys/[/url]

    In your case - doing the lookup for a max value is not needed since you're cranking it out internally (so simply initialize the value to 1).

    You would just have to add some condition to reset the increment back to 1 when you move out of the partition.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • The other option is to use a SQL Task to load the results of your query into a staging table (permanent or temporary, but probably permanent for ease of coding) and then pull the data from that table.

  • Matt Miller (#4) (8/14/2012)[/b

    How much of a drag does the ROW_NUMBER() aspect put on query execution? It sounds like you already have a fairly hefty process.

    i am not able to get what exactly you are asking Matt....i mean this sentence "How much of a drag does the ROW_NUMBER() aspect put on query execution?"...could you brief me , please?

    Thanks,
    Charmer

  • I'd like to see the query, tables and indexes, and the actual execution plan of the query. Never know, there may be some tuning that can be done.

  • Lynn Pettis (8/14/2012)


    I'd like to see the query, tables and indexes, and the actual execution plan of the query. Never know, there may be some tuning that can be done.

    with t as (

    select

    C.IncidentIDRef AS incidentidref

    ,A.Location,C.INADR#,A.GASNAM

    , ROW_NUMBER() over (PARTITION by sc.Code, C.IncidentIDRef order by ltrim(rtrim(A.GAINCD)) desc) AS ADDRSEQNO

    , GASTR# AS GASTR#

    , b.Code AS GASDIR

    , D.Code AS GASSUF

    , H.Code AS GASDR2

    , GASQLF AS GASQLF

    , E.Code AS GACDIR

    , GACNAM AS GACNAM

    , G.Code AS GACSUF

    , sc.Code

    from PLINCD C

    JOIN SrcCodeMap sc

    on LTRIM(RTRIM(INORI#)) = sc.SrcCodeValueRef and sc.CodeID = 100

    left outer JOIN PSGADR A

    ON INADR# = A.GAADR# and (ltrim(rtrim(GAINCD)) = 'y' or LTRIM(rtrim(gacinc)) = 'y')

    join GloucMidPolice.dbo.SrcIncident si

    on C.ININC# = si.IncidentIDRef and sc.Code = si.JID)

    Select

    IncidentIDRef

    , ADDRSEQNO

    , isnull(Location, 'Address: ' + cast(INADR# as varchar) + ' does not exist in Database') Location

    , LTRIM(RTRIM(GASTR#))

    , LTRIM(RTRIM(GASDIR))

    , LEFT(LTRIM(RTRIM(COALESCE(GASNAM,LOCATION,'No Address Available'))),30) as GASNAM

    , LTRIM(RTRIM(GASSUF))

    , LTRIM(RTRIM(GASDR2))

    , LTRIM(RTRIM(GASQLF))

    , LTRIM(RTRIM(GACDIR))

    , LTRIM(RTRIM(GACNAM))

    , LTRIM(RTRIM(GACSUF))

    , t.Code

    from t where addrseqno = 1

    No index has been created so far...

    Thanks,
    Charmer

  • Charmer (8/14/2012)


    Lynn Pettis (8/14/2012)


    I'd like to see the query, tables and indexes, and the actual execution plan of the query. Never know, there may be some tuning that can be done.

    with t as (

    select

    C.IncidentIDRef AS incidentidref

    ,A.Location,C.INADR#,A.GASNAM

    , ROW_NUMBER() over (PARTITION by sc.Code, C.IncidentIDRef order by ltrim(rtrim(A.GAINCD)) desc) AS ADDRSEQNO

    , GASTR# AS GASTR#

    , b.Code AS GASDIR

    , D.Code AS GASSUF

    , H.Code AS GASDR2

    , GASQLF AS GASQLF

    , E.Code AS GACDIR

    , GACNAM AS GACNAM

    , G.Code AS GACSUF

    , sc.Code

    from PLINCD C

    JOIN SrcCodeMap sc

    on LTRIM(RTRIM(INORI#)) = sc.SrcCodeValueRef and sc.CodeID = 100

    left outer JOIN PSGADR A

    ON INADR# = A.GAADR# and (ltrim(rtrim(GAINCD)) = 'y' or LTRIM(rtrim(gacinc)) = 'y')

    join GloucMidPolice.dbo.SrcIncident si

    on C.ININC# = si.IncidentIDRef and sc.Code = si.JID)

    Select

    IncidentIDRef

    , ADDRSEQNO

    , isnull(Location, 'Address: ' + cast(INADR# as varchar) + ' does not exist in Database') Location

    , LTRIM(RTRIM(GASTR#))

    , LTRIM(RTRIM(GASDIR))

    , LEFT(LTRIM(RTRIM(COALESCE(GASNAM,LOCATION,'No Address Available'))),30) as GASNAM

    , LTRIM(RTRIM(GASSUF))

    , LTRIM(RTRIM(GASDR2))

    , LTRIM(RTRIM(GASQLF))

    , LTRIM(RTRIM(GACDIR))

    , LTRIM(RTRIM(GACNAM))

    , LTRIM(RTRIM(GACSUF))

    , t.Code

    from t where addrseqno = 1

    No index has been created so far...

    Indexes won't help your query as it is. You are using ltrim(rtrim()) function pairs on the columns in your join criteria which is forcing SQL Server to do table/clustered index scans on those tables.

    The options here are to 1) clean up the source data so that the ltrim(rtrim()) of the columns is not needed, or 2) load the data from the source tables into temporary (staging) tables cleaning up the data as you do that so that proper indexing of the staging tables can benefit the query.

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply