I recently blogged about a solution I had decided to use in order to solve a problem related to PayPeriod Matching. The result needed to meet a few requirements. One of those requirements was to arrive at the results without the use of a Table. I did not want to create a table that may need to be maintained down the road. I also wanted to give myself a little more time to make sure the database being used in the warehouse was not involved in one of the ETL processes that actually restores a the database from a different system. Another requirement was that the current payperiod create an aggregate as well as the prior pay period create an aggregate. This information was to be consumed by Reporting Services for a report. At the time, I now realize that, I had insufficient data for this to work as desired. Thus, I needed to revisit the solution and make a couple of adjustments.
The first adjustment to be made was the creation of a table and the elimination of the CTE. Though the CTE performed very rapidly in every single test I threw at it, it bogged down during our month-end processing. That is another process that will be revised shortly and will not be so resource intensive nor will it be so time intensive. Anyway, that is a topic for another discussion. Present circumstances required an update to the proc that I created in order to make it perform better and regain, on a more long-term basis, the performance the CTE showed during testing and the first couple of weeks it lived in production prior to month-end processing. I went ahead and created the table for the payperiods.
Not only did I create the table due to performance reasons, but it also simplified my query later in the proc. The aggregates for the previous payperiod needed an easier way to be retrieved. There was also an inaccuracy in my query. With more data, I was able to spot it. All previous payperiods were being lumped into the previous payperiod – though I only wanted the immediate previous payperiod and nothing more. This caused the aggregates to be incorrect. The solution could have still been achieved through the use of the CTE, however I wanted to simplify it a little and produce a faster result.
My final solution does not eliminate all CTE’s – merely the PayPeriod Table population CTE. Now I use a CTE to retrieve the current payperiod and then recursively pull in the previous payperiod. The table was created exactly like the CTE with an ID field, PeriodStart and PeriodEnd. I decided the simplest method to ensure I would only aggregate on the two payperiods in question was to only pull those two payperiods into the query. I wanted to be certain that I could only have two periods in play at any time.
; WITH twoperiods AS (SELECT top 2 p1.PeriodID,p1.startdate,p1.enddate FROM PayPeriods p1 INNER JOIN LCComm_PayPeriods p2 ON p2.PeriodID = p1.PrevID WHERE (@CurrentDate BETWEEN p1.Startdate AND p1.Enddate) UNION ALL SELECT p2.PeriodID,p2.startdate,p2.enddate FROM PayPeriods p1 INNER JOIN PayPeriods p2 ON p2.PeriodID = p1.PrevID WHERE (@CurrentDate BETWEEN p1.Startdate AND p1.Enddate) )
With this method, you can see that I peform a top 2 operation in the base query from the PayPeriods table. Without the recursive definition on this query, the base query will only return 1 record. With the recursion, it will only return two records. In addition to that change, I changed the Left Joins later in the query to the following:
LEFT OUTER JOIN twoperiods PS ON MIS.RecordDate BETWEEN PS.StartDate AND PS.EndDate AND @CurrentDate BETWEEN PS.StartDate AND PS.EndDate LEFT OUTER JOIN twoperiods Prev ON MIS.RecordDate BETWEEN Prev.StartDate AND Prev.EndDate AND IsNull(PS.StartDate,0) = 0
And then one final change of note. I changed the aggregation on the Previous PayPeriod to the following:
, Case When IsNull(Prev.StartDate,0) = 0 Then 0 Else 1 End AS PreviousPayPeriod
This was much simpler than what I was trying to use previously. I also found a nice side effect of using the top clause in the base query of the CTE. When using the top in a recursive query, it appears that
OPTION (maxrecursion n)
is no longer necessary. I tested this and retested to verify results. Just another way of controlling a recursive CTE in SQL server.
I was happy with the first query that I came up to meet this requirement. I am much more satisfied with this revision. Query times are <= 1 sec and other performance indicators are positive. Of course, using the table, I can now use indexes on the date ranges which should help query performance somewhat as well.
Despite meeting the requirements in the last article, and the query being pretty cool in performing what it did – sometimes it really is better to test other methods. Even with the need to maintain this table (maybe), the consistent performance gains and accuracy outweigh the desire to not create that table. It is a good idea to test multiple methods in some cases to ensure best path decision is made. It was a good exercise to come back to this one and redo the query – I learned at least one new trick (really a few). Learning something new made it worthwhile.