Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Does Change Data Capture require a table have a primary key Expand / Collapse
Author
Message
Posted Sunday, January 13, 2013 7:21 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 11:01 AM
Points: 346, Visits: 595
or can it record before and after column changes based on the LSN only?

An extract from a file based legacy accounting system is performed every night. The system does not have a primary key because transactions are managed through program code. (the more things change...). The extract is copied to text in Unix and FTP'd to Windows, where the file is loaded into SQL Server by kill & fill. Because of the expense of modifying the source system, there is enormous inertia/resistance to injecting a primary key at the source, so kill & fill it stays.

In reading about Change Data Capture, it seemed to me that column level insert update and delete are stored in helper tables that remember the before and after content of each column tracked. In my reading I have seen many references to the LSN to decide when and what to record as changed, but I have not seen any refereference to the necessity of a primary key for Change Data Capture to work. This is in contrast to replication, where the requirement for the existence of a primary key is made plain.

Is it possible to use Change Data Capture against a table without a primary key?
I would love to use it to change the extract from kill and fill to incremental.

thanks very much for your help
drew
Post #1406539
Posted Monday, January 14, 2013 12:31 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: 2 days ago @ 7:37 AM
Points: 860, Visits: 2,323
To my knowledge the inbuilt CDC component in SQL server doesnt require a Primary Key column.

The reason is this is simply a log of the activity, so if you insert, update then delete a record you will get 4 rows in the CDC table, the Original Insert, the Update Before, the Update After and finally the Delete.

From a data gathering perspecive you also have the opyion of either retrieving all the different changes or just the last change.

Hope this clarifies things.


_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1406578
Posted Monday, January 14, 2013 3:39 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 4:53 PM
Points: 1,746, Visits: 2,553
CDC requires a unique index. That makes sense given what it has to be able to do.

I think by default it uses the PK as the unique index, but you can specify a different unique index when you enable CDC on the table and SQL will use that instead of the PK. So a PK should not be required, although a unique index will be.


SQL DBA,SQL Server MVP('07, '08, '09)
I'm not fat, I'm gravity challenged.
Post #1406968
Posted Tuesday, January 15, 2013 1:44 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: 2 days ago @ 7:37 AM
Points: 860, Visits: 2,323
Scott,

Do you have a link to that, as I didnt think that either a unique or PK were required for CDC as I cant find anything on Technet or in the CDC documentation that states one of these must be present.


_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1407081
Posted Tuesday, January 15, 2013 2:20 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Yesterday @ 12:20 PM
Points: 41,529, Visits: 34,445
A unique index is required if the table is to support net changes as well as all changes

http://msdn.microsoft.com/en-us/library/cc627369%28v=sql.105%29.aspx
A capture instance will always include a table valued function for returning all change table entries that occurred within a defined interval. This function is named by appending the capture instance name to "cdc.fn_cdc_get_all_changes_".

If the parameter @supports_net_changes is set to 1, a net changes function is also generated for the capture instance. This function returns only one change for each distinct row changed in the interval specified in the call.

To support net changes queries, the source table must have a primary key or unique index to uniquely identify rows. If a unique index is used, the name of the index must be specified using the @index_name parameter. The columns defined in the primary key or unique index must be included in the list of source columns to be captured.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1407101
Posted Tuesday, January 15, 2013 6:12 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 11:01 AM
Points: 346, Visits: 595
First, thank you Jason, Scott and Gail for your attention, time and advice.
Second, NUTS! the requirement of a pk or unique index torpedo's my intention, as the source does not create uniqueness and attempts to create the minimum requirement, a unique index, consistently fail on the existence of duplicates.
The stewards of the accounting system (Finance and Operations) assert that it is not possible to have either constraint on the table in the accounting system.
When we get the source data on the SQL side, trying to uniqueify the table using the most important columns (claim number, date of service, date submitted, member, provider, diagnosis, procedure) consistently fails on duplicates because a provider can decide to rebill for the same exact event. Adjustments and reversals also complicate the uniqueification.
At the risk of the lash, I could try to create a 'covering' index to uniqueify the table, but the table is twelve million rows deep and 138 columns wide.
Third, I must be dumber than dirt because in tinkering with CDC I came to realize that the kill and fill method leaves me no where...CDC does not permit truncation and deletion indeed marks the target rows as deleted, but that means the source data has nothing in the target to compare it to and that the new source data is all there is!
This makes me to think we must use two tables for source and target, target holding yesterday's extract, and source holding today's, but since there is no key and no index we'd need to compare each source row and column to each target row and column with something like a Merge When Not Matched, Outer Join or Not In but against 138 columns?...sounds like a non-starter.
I do not want to give up replacing kill and fill with something more rational, but my pea brain is not big enough to wrap itself around the problem.
thanks again
drew
Post #1407216
Posted Tuesday, January 15, 2013 6:20 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, December 03, 2013 11:33 PM
Points: 1,789, Visits: 1,013
add a new column int identity and add the key on it , or create a checksum value for each roe

Jayanth Kurup
Post #1407222
Posted Tuesday, January 15, 2013 6:25 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 11:01 AM
Points: 346, Visits: 595
My initial reaction is while that would create uniqueness based on the id, the content of the row is still duplicate.
I will try it out and see if that gets me home.
thanks very much
Post #1407226
Posted Tuesday, January 15, 2013 6:27 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Yesterday @ 12:20 PM
Points: 41,529, Visits: 34,445
drew.georgopulos (1/15/2013)
Second, NUTS! the requirement of a pk or unique index torpedo's my intention, as the source does not create uniqueness and attempts to create the minimum requirement, a unique index, consistently fail on the existence of duplicates.


Maybe read what I said again.

A unique index is needed if you want to enable the table for the net change function as well as the all changes function. If all you want is all changes, then the requirement for a unique index (for the net changes) falls away



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1407227
Posted Tuesday, January 15, 2013 6:33 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 11:01 AM
Points: 346, Visits: 595
i missed that...sorry to have been so dense.
i'll try again at work
thanks very much
Post #1407234
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse