SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Auto-suggesting foreign keys and data model archaeology


Auto-suggesting foreign keys and data model archaeology

Author
Message
Dave Poole
Dave Poole
SSC Guru
SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)

Group: General Forum Members
Points: 56968 Visits: 3932
Comments posted to this topic are about the item Auto-suggesting foreign keys and data model archaeology

LinkedIn Profile
www.simple-talk.com
Bill Talada
Bill Talada
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10134 Visits: 2296
Wow, lots of work to try to programatically figure out keys!

Thank heavens other companies don't use any constraints or unique indexes or normalization; they eventually provide customers to me.

Having done hundreds of imports and conversions, I don't assume matching column names nor matching data types in commercial software. I do assume orphaned data is present that will prevent me from applying suggested primary and alternate and foreign keys. The process to figure out a non-normalized database is even more painful. Being human I have the ability to spot patterns that computers can't quite do yet. For now I'll stick to doing this manually.
Eric M Russell
Eric M Russell
SSC Guru
SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)SSC Guru (101K reputation)

Group: General Forum Members
Points: 101212 Visits: 14095
Given a database with no defined foreign key constraints, analyzing schema information is essentially inferring relationships by static analysis. Another approach is looking at join operators within stored procedures or even cached sql execution plans.


"The universe is complicated and for the most part beyond your control, but your life is only as complicated as you choose it to be."
Bill Talada
Bill Talada
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10134 Visits: 2296
My worst conversion/import was from a database using a couple generic tables; most other tables were defined as data in the generic tables. This metadata style design had to come from application developers. The performance must have been horrendous.

Functions, views, stored procedures, and even printed reports sometimes need to be used to try to reverse engineer a mess. It is a shame. A properly designed database should be self evident in how it is to be used.
cyp901
cyp901
SSC-Addicted
SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)

Group: General Forum Members
Points: 433 Visits: 548
The comment about using query plans to determine the columns used in joins is definitely a worthwhile approach.


This article was great and I wish developers especially would think more about foreign keys.
This may have been hinted at in the article but use caution when 'refactoring' an existing database that is used for a front end application. This may be obvious to some but often times the data integrity responsibility (incorrectly) gets placed in the application itself. So keep that in mind and don't rush out to reinvent the wheel. Although, definitely try to reinvent it where necessary.
Bill Talada
Bill Talada
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10134 Visits: 2296
Rarely if ever will query plans be available on a database given to a data warehouse or for importing. Otherwise running profiler or looking at DMVs might help.

The worst case is where most every table contains PK and FK columns called ID or similar that potentially join everywhere.

The best case is when GUIDs have been used. It is so much easier to believe a join than when identities are used and every table has the value 1 through 100 in several key columns.
robert.sterbal 56890
robert.sterbal 56890
SSCrazy Eights
SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)

Group: General Forum Members
Points: 8304 Visits: 1580
https://msdn.microsoft.com/en-us/library/ms186778.aspx

The information schema views included in SQL Server comply with the ISO standard definition for the INFORMATION_SCHEMA.

Seems like it might be worth the effort to understand this better.
Dave Poole
Dave Poole
SSC Guru
SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)SSC Guru (56K reputation)

Group: General Forum Members
Points: 56968 Visits: 3932
The trick here was to try to encapsulate what a human would do but in code.

If you do this sort of thing once in a blue moon then manual approaches are as good as any. As to using execution plans, what would be your recommendation for doing so with minimal impact on production systems?

LinkedIn Profile
www.simple-talk.com
robert.sterbal 56890
robert.sterbal 56890
SSCrazy Eights
SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)

Group: General Forum Members
Points: 8304 Visits: 1580
Wouldn't it make more sense to capture a trace and replay it on a non production device?
cyp901
cyp901
SSC-Addicted
SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)SSC-Addicted (433 reputation)

Group: General Forum Members
Points: 433 Visits: 548
Bill Talada (1/26/2016)
Rarely if ever will query plans be available on a database given to a data warehouse or for importing. Otherwise running profiler or looking at DMVs might help.

The worst case is where most every table contains PK and FK columns called ID or similar that potentially join everywhere.

The best case is when GUIDs have been used. It is so much easier to believe a join than when identities are used and every table has the value 1 through 100 in several key columns.


I'm not sure I agree with the 'Rarely if ever' part of your statement.
This may have been true in the past...
However, since the management data warehouse was added to SQL Server many years ago, it's far more common to have a pool of plans to draw upon. This is only going to be more true moving forward with SQL Server 2016 and the Query Data Store.
While I understand not having plans for a database if you do not have access to the server with the production workload.
I do not think that poor planning and administration practices is a reason to ignore a powerful approach to a problem.

This is also one of the first times I have seen "The best case is when GUIDs have been used", usually for performance reasons I've found the opposite to be true. I do think you make a valid point here though for identifying uniqueness.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search