Need my SQL to go retro (from 2005 to 2000)

  • I have the following SQL which works in SQL Server 2005, but I need to replicate the functionality for SQL Server 2000. Unfortunately, the "row_number()" and "with partitioned as" syntaxes don't exist in the older version of SQL Server, so I can't build my tables or run my reports. I tried using the identity function in the first step (building the avg_days table) with the order by clause, but the row numbers are out of order (I want to order by customer_key asc, calendar_date asc, transaction_id asc). Any ideas would be welcomed as I am out of them and need an answer asap.

    Thanks!

    -------------------------------------------------

    -------------------- AVG_DAYS -------------------

    -------------------------------------------------

    Select distinct A.customer_key, b.transaction_id, c.calendar_dt

    into avg_days

    From crdw_customer A, crdw_txn_header B, crdw_date_time C

    Where A.customer_key = B.customer_key

    And B.time_key = C.time_key

    -------------------------------------------------

    -------------------- AVG_DAYS_1 -----------------

    -------------------------------------------------

    WITH Partitioned AS (

    SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_key, calendar_dt ORDER BY customer_key, calendar_Dt) AS RowNumber

    FROM avg_days)

    SELECT *,

    CASE

    WHEN RowNumber > 1 THEN 0

    ELSE COALESCE(DATEDIFF(DAY, (SELECT MAX(calendar_dt) FROM avg_days WHERE calendar_dt < a.calendar_dt AND customer_key = a.customer_key), a.calendar_dt), 0)

    END AS Days_between

    into avg_days_1

    FROM Partitioned a

  • If you replace your "SELECT INTO" that included an identity function into a "CREATE TABLE" with the identity column, and then did an "INSERT/SELECT... ORDER BY", the identity column will be in the correct order, and you can move forward with your original plan.

    Edit: I should probably elaborate on this a bit. While many people find that the SELECT INTO... ORDER BY works for them in this situation, it is not guaranteed, and depends primarily on where the optimizer decides to place its identity generating function. With the other method, you're not dependent on playing optimizer roulette.

  • The problem is simpler when you realize you don't need a common table expression, ROW_NUMBER(), or a correlated subquery.  Once you put an identity field on avg_days and get the data sorted by customer, date, and transaction, you just need a self-join to compare each record to the one before it.  If the customer is the same and the date has changed, calculate Days_between.

    -- Fill the avg_days table with CREATE & INSERT/SELECT

    CREATE TABLE avg_days (

          id int identity not null primary key clustered,

          customer_key int,

          transaction_id int,

          calendar_dt smalldatetime)

    GO

    INSERT INTO avg_days (customer_key, calendar_dt, transaction_id)

    SELECT A.customer_key, B.transaction_id, C.calendar_dt

    FROM crdw_customer AS A

    INNER JOIN crdw_txn_header AS B ON A.customer_key = B.customer_key

    INNER JOIN crdw_date_time AS C ON c.time_key = b.time_key

    ORDER BY A.customer_key, C.calendar_dt, B.transaction_id

    -- Compare each record to the one before it. If same customer, different day then calculate Days_between

    SELECT a.customer_key, a.transaction_id, a.calendar_dt,

          COALESCE(DATEDIFF(DAY, b.calendar_dt, a.calendar_dt), 0) as Days_between

    INTO avg_days_1

    FROM avg_days AS a

    LEFT JOIN avg_days AS b ON a.id = b.id + 1 AND a.customer_key = b.customer_key AND a.calendar_dt > b.calendar_dt

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply