No PK on any tables - Need that to enable Change Tracking on SQL Server

  • We have a vendor application that is now out of our support. We need to extract the data from this source and move it to Snowflake via Fivetran (ETL Tool). Fivetran needs Change Tracking enabled on the source so that it can pick up only the changes since the last cycle. The issue we are facing is there are no PKs on the source.

    That vendor database has a clustered indexes (not unique not designated as PK) on some of the tables and when I check they are unique. So I can make them PKs. But there are some tables that don't have a unique index nor a clustered key, forget about PK :-(. One option is for me to see all the indexes on the table if any are unique.

    The other option I was thinking was to add a Identity column to all the tables and make that the PK. But if they wrote the SQL the following way then we have an issue.

    declare @mytest table (col1 int identity, col2 varchar(10))

    declare @mytest2 table (rowid int identity, name varchar(10))

    insert into @mytest select * from @mytest2

    I am lost. Any ideas or thoughts? I greatly appreciate your help and time.

    • This topic was modified 1 week, 5 days ago by  cbarus.
  • I'm not sure I understand the example that's giving you problems. Yeah, if the column names are different, you have to explicitly state them on an INSERT ... SELECT. That's how T-SQL works. No big deal. Further, if you're migrating from @mytest to @mytest2, why even bother giving @mytest an IDENTITY. Inserting into @mytest2 will generate the identity values, no need to do it twice.

    And yeah, if you need a PK, you'll have to create one. No choice there.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2022 Query Performance Tuning, 6th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Thanks very much for your time Grant.  The reason why I had Identity on the second table also is because, if I add a surrogate key column to all the tables to make that column as PK and an INSERT from table1 into Table2 (with lot more joins of course) and the SQL is written as INSERT INTO table1 SELECT * from Table2, then we are in trouble.

    Unfortunately, as this is a Vendor application, we don't have visibility to the Code.  I started a trace on the DB to see what kind of statements come in.

    BTW, I forgot to mention, the size of the DB is 1.5 GB if that gives any other ideas.

    Thanks again.

    • This reply was modified 1 week, 4 days ago by  cbarus.
  • May be this is a better example of the approach to add a Identity column to all the tables to make them the PKs.  Just thinking through.

    INSERT INTO Table 1

    SELECT B.*, C.Col1

    FROM Table2 AS B

    INNER JOIN Table3 AS C

    ON B.<some column> = C.<some column>

    WHERE <some condition>

  • The "application" is out of support and yet the database is separate and updatable?  If at all possible I wouldn't touch a single thing.   If something breaks... better to treat the whole thing like a static artifact.  Companies like Red Gate (and their competitors) have "data diff" products which compare tables and generate change scripts.  These tools work best with unique integer primary keys in all tables.  However, afaik they're alleged to work without that too.  These tools have CLI's which means scripts are automatable.  Keep staging tables and run the comparisons on a powerful vm or pc.

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thanks Steve.  Let me explore the diff tools option.  Issue is that Fivetran (ETL tool) goes after SQL Server CT system tables to get the changes.  I am sure there must be another way for them to get to the changes, but may be we are unable to find the right technical resource, I guess.  Will keep you all posted.

    Thanks for your time Steve

  • Steve Collins wrote:

    However, afaik they're alleged to work without that too.

    For Redgate Compare, you don't have to have a primary key. You do have to have a column, or set of columns, that uniquely identify the row. It won't work without that.

    ----------------------------------------------------
    The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2022 Query Performance Tuning, 6th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • BTW and just to be sure, if you're enabling change tracking for some legal reason, it won't hold up in court.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I'm thinking that with only 1.5 GB of data,  you could create a series of triggers that insert or update the data into a second database that has primary keys and change tracking defined.  You could then run the ETL tool against the second database.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply