Filtering on multiple input pairs

  • So I have a job today that processes all "new" transactions posted against account numbers. It's been going well processing the transaction delta one account number at a time

    select * from data where acct_no = 1241 and tran_dt >= '01-01-2015'

    I used to have 1000 accounts, now I got 10000, and it is taking lot of time. I am thinking instead of making 10000 trips to the db, I can make 1000 trips asking for transaction delta 10 accounts at a time. So I can experiment I'm also thinking I will write a SP and use XML input so I can control how many accounts at a time I try to fetch delta for. Like so.

    <input>

    <pair>

    <acct_no>23423</acct_no>

    <tran_dt>9-11-2015</tran_dt>

    </pair>

    <pair>

    <acct_no>13223423</acct_no>

    <tran_dt>7-12-2015</tran_dt>

    </pair>

    ... and so on and so forth

    </input>

    I need to figure out how to parse out the acct_no and tran_dt pair out of the xml in a way I can plug it inside my query. I think I can parse out the pair as a 'rowset' with two columns acct_no, tran_dt. My problem is I'm not able to figure out how I can get all transaction delta beyond specified date for each account number, since the tran_dt as baseline for the delta differs for each account number.

    I would like to think this can be done in a single query, but I'm racking my brain right now. Any pointers deeply appreciated. Otherwise I will have to do it separately for each account number and then union all the results.

    PS One additional issue is "today" I can get a tran_dt of 03-03-2015, i.e. in the past and that might be my latest transaction for given account number. "tomorrow" I could get a new transaction with same date of 03-03-2015. "today" the delta would be the first transaction and "tomorrow" the delta would be the second transaction. ">=" will not solve this problem since "tomorrow" I would get 2 transactions in my delta. Not trying to be greedy here, but I also have this problem to solve, but right now I would be happy if can figure out how to solve my main problem. Will post anything I try here. One option I have is to explore maintaining a PK id on the table which is ordered and also including it to make sure "new" rows have higher PK ids.

    Thanks much.

  • Eirkur would no doubt be able to do this in a nanosecond, but in a nutshell, you would have to use XQuery to do this. Here's the article that gets into XQuery syntax, but here's one of the articles in the series on here:

    http://www.sqlservercentral.com/articles/Stairway+Series/+Querying+XML+Data/92784/

  • Quick suggestion on parsing the XML

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    DECLARE @INPUT XML = '<input>

    <pair>

    <acct_no>23423</acct_no>

    <tran_dt>9-11-2015</tran_dt>

    </pair>

    <pair>

    <acct_no>13223423</acct_no>

    <tran_dt>7-12-2015</tran_dt>

    </pair>

    </input>';

    SELECT

    PAIR.DATA.value('(acct_no/text())[1]','INT') AS acct_no

    ,PAIR.DATA.value('(tran_dt/text())[1]','DATE') AS tran_dt

    FROM @INPUT.nodes('input/pair') AS PAIR(DATA);

    Results

    acct_no tran_dt

    ----------- ----------

    23423 2015-09-11

    13223423 2015-07-12

  • (See, told ya Eirikur could do it... even if I can't remember how to spell his name!)

  • pietlinden (8/12/2015)


    (See, told ya Eirikur could do it... even if I can't remember how to spell his name!)

    He he, if I had a penny for every mis-spelling of my name I would be ....:-D

    😎

  • Yeah, we silly Anglos can't read that well...

  • Many thanks guys! Now I know my decision for using XML string to provide input was not haphazard.

    So back to my main question then.

    If I have the inputs in a rowset extracted as (A, D) for (Account, Date) with values (A1, D1), (A2, D2), etc. etc. Can I then do a JOIN as follows :

    JOIN ON MainTable.A = My.A and MainTable.D > My.D

    then,

    ORDER BY A, D.

    Will above give me all rows for each A such that each row for a given A has a date later than corresponding D?

    I'm going to try it, but didn't want to be wasting my time if it was never going to work.

  • Pagan DBA (8/12/2015)


    It's been going well processing the transaction delta[font="Arial Black"] one account number at a time

    [/font]

    That would be the real problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I would probably start by reading this article[/url] by Pinal Dave on Simulating LAG in SQL Server 2008.

    If possible I would persist the shredded XML to a table and then you could index it and this query would be pretty easy... well, given the code at SQL Authority (link above).

  • Thanks guys, but like I said in my original post I already know processing account transaction delta one at a time is slow. Hence this post.

    And, I can look into making things efficient like storing in table and indexing (while I am not seeing how this is appropriate given my input changes on demand for only that set of transactions I need to get deltas for say 5 at a time)

    I will look at link to article from Pinal Dave. However appreciate if someone could answer my last question. Basically, will JOIN as I have stated work? FWIW, I'm using SQL Server 2008.

    Thanks in advance.

  • Pagan DBA (8/18/2015)


    Thanks guys, but like I said in my original post I already know processing account transaction delta one at a time is slow. Hence this post.

    And, I can look into making things efficient like storing in table and indexing (while I am not seeing how this is appropriate given my input changes on demand for only that set of transactions I need to get deltas for say 5 at a time)

    I will look at link to article from Pinal Dave. However appreciate if someone could answer my last question. Basically, will JOIN as I have stated work? FWIW, I'm using SQL Server 2008.

    Thanks in advance.

    At this point and if it were me, I'd just try it. One test is worth a thousand expert opinions. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply