• sgmunson - Monday, June 26, 2017 12:28 PM

    Jeff,

    Thanks for the test rig.   Yes, the triangular join isn't pretty.   I chose to take a look at modifying it to have just a single character visitor so that I could at least easily see what the results should be, as the number of different visitors gets so high that repeats are almost non-existent, and in the world of the real web, that's not all that representative of real traffic.   Having done that, the solution runs in 1 minute, 9 seconds.  Not necessarily a problem in a report, but still a lot more resources than I would prefer to spend.   What other technique, however, can deliver the number of removals?   It could be that I just haven't put on my thinking cap quite straight enough, but I'm all ears for technique on this.

    Thanks to the PK on both the DT and Visitor columns in the originally defined problem, it's not actually like real life at all because there's nothing to stop a Customer (or whatever) from visiting more than once a day in real life. 😉

    Heh... I do appreciate the code but a report that takes a minute 9 to solve just 10,000 rows would be a real problem for me even if I only had to run it once a month because I know that 10,000 rows will be just the tip of the iceberg in the future.

    I'll take a look and see what else can be done.  Just as a bit of info, I can tell you that it won't involve an rCTE that increments a date by 1 day.  A well written WHILE loop can beat those pretty easily.  Again, if you haven't already read it, please see the following article for why you shouldn't use rCTEs that "count by one" even for low row counts.
    Hidden RBAR: Counting with Recursive CTE's

    As a teaser from that article, here's a performance chart of an rCTE vs. 3 other methods of counting.  The rCTE is the Red skyrocket on the left.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)