"ID" primary key from two separate tables as a foreign key in a new table

  • jcelko212 32090 wrote:

    >> The professorID is likely both the natural ID for the professor and the identity of the row. <<

    The professor_id is probably a tax identification number of some kind. ... Validation is done either by a trusted authority, or a process. An example of a trusted authority would be the taxing authority, to whom we are paying part of this professor's salary. His tax identification would be done by table lookup or a combination of table lookup and a process to see that his tax id is correctly assigned to this guy.

    The use of an identity table property is wrong because it's not an attribute of the entity being modeled. Again, this is straight Dr. Codd.

    And after all of this messing around, did you notice that you still have to have the real key in the database? TheID-iots are usually using identity or GUID to mimic a pointer chain in a non-RDBMS data model.

    >> Suppose you do use a string for the ID: how, then, do you determine the next value to be used?<<

    You said something that makes no sense in a relational model. Remember that this is a set oriented language, not one based on sequential file structures. The concept of "next" doesn't exist. If you want to do a sequence of some sort, you should be using a CREAT SEQUENCE statement and have designed that particular data element properly, instead of depending on the current hardware to do it for you.

    I'd say ZERO chance that the professor id is some type of tax number.  The security implications of that alone would prevent it.  For many, their tax number would be their Soc Sec Number -- only an IDiot would want to use that a key that the entire IT staff would be privy to.  Unless they were issued a fed tax number (EIN), but I don't see why most university profs would need those.

    If you imagine that there is some magic group issuing some global number for university professors, I guess you could claim to use that as a key.  But in the real world, that just doesn't exist.

    No one is trying to mimic a pointer chain.  No modern database person even thinks of those any more.  You need to stop making that ancient claim.  It's just silly.

    The professor id is almost certainly just a sequential number assigned by the university.  It's issued at the state level, at highest, since most universities are controlled at the state level.  I can't imagine there's any U.S.-wide authority issuing numbers (other than a fed tax number, which was dealt with above).

    Again, it's the theoretical vs. the practical.  If you waited until there was a universal, verified id / number for all professors at all universities, you'd never be able to create a db, since such an id simply does not exist.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • >> I agree that you should only store attributes as numeric types if you plan on doing math on them, .. <<

    I agree and would add a caveat. Make sure you use the right kind of numeric data type for the data. Decades ago, when I was a FORTRAN programmer, we spent the better part of two weeks learning about floating-point arithmetic. It's a lot trickier than younger programmers think. (a + b) + c <> a + (b + c), the library functions were weird from compiler to compiler, and so forth. IBM and Univac compilers could have some really different results. There is a series of calculations known as the Gibson mix that was used to test floating-point math.

    Today, I tell people to use DECIMAL(s,p) in SQL because you have better control from product to product and a lot fewer surprises. I have one job where the idiot had used floating-point for all of his numerics, including things like ZIP Codes! perhaps this was like the train station platform numbers in the Harry Potter novels.

    >> .. but if you are using only surrogate keys as you should be, no one should ever see the primary key. <<

    Hidden from you? Like pointer chains in non-relational databases, like Total, Image, IDMS, IMS, etc.? Did you ever work with C? One of the first rules was never do pointer arithmetic. It never ended well. The same principle applies to network databases, like the ones you're trying to build with SQL

    Since the key is actually data, having datatypes, constraints, and references and it is part of your basic data model, makes a little hard to do validation and verification on it, or even to think about a join.

    You need to have that data to have a valid data model, and what you seem to have  called a "surrogate" is what I would call "needless unverifiable redundancy"

    >> I guess the question is whether the professorID is the ID number of the professor or if it is only the identity of the record [sic] in the table. <<

    I have a mania that got put on T-shirts at one point, which says, "tables are not files, rows are not records, columns are not fields" to stress the differences between a file system and SQL. Since the professors are entities in the data model, setting up pointers to their physical storage is just not RDBMS. Again, this is straight from Dr. Codd.

    >> there is a reason to use integer, it is 8 bits smaller than char(10), will query faster, and consume less memory. Granted char(10) has more possible values, but as a DBA do you really want to have to determine whether an index value is a tab or a space, or figure out if your key is .. <<

    Storage has not really been a problem for most applications. Since the 1970s, disk and other bits of sand are cheap and plentiful. The real problem is the access time, and we've gotten really good at avoiding indexes and other linear storage methods in favor of hashing, parallel processing, etc. for most of us, frankly, the traditional indexes (ISAM, B-Tree, B+Tree Red-Black. etc) that we inherited from filesystems are good enough.

    >> Again, it's just impractical to use a string rather than a numeric, whether you do math on the value or not. <<

    Wrong.

    >>Suppose you do use a string: how, then, do you determine the next value to be used? <<

    Again in RDBMS. There is no concept of linear ordering, so the concept of next doesn't apply.

    >> Also, the overall performance of an int is vastly better. An int is at least 6 bytes smaller than a char(10) (with no compression, even less with compression). <<

    Vastly? No it isn't. In modern hardware, I'm grabbing 64 to 128 bits at a time. I'm also buffering data and doing all kinds of hardware tricks to push lots of bits through my machine. I see you still use "int" instead of saying "INTEGER NOT NULL" which usually means your last procedural programming language was in the C family. He asked

    >> And, to prevent non-numeric chars, you'll have to add CHECK conditions [sic: a constraint is not a condition] for a string ID that will have even more complexity and overhead. <<

    Don't you try to put as many CHECK() constraints into your DDL as you possibly can? This means you have to do those checks one way, one time and have a common single point of validation for the hundreds of routines that use your data. The constraint will be validated when data enters the database; that's one time for each transaction that is involved in at most.

    The most complex constraint I've ever had to write was a dihedral five check digit.. It's an ugly thing that involves its own table lookup, but it catches virtually every possible input error. I'm usually happy with just a Luhn check digit, but if you're working with data that absolutely, positively has to be right, or it will kill people, then it's worth going the extra nanoseconds for a little complexity.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • <<

    Hidden from you? Like pointer chains in non-relational databases, like Total, Image, IDMS, IMS, etc.? Did you ever work with C? One of the first rules was never do pointer arithmetic. It never ended well. The same principle applies to network databases, like the ones you're trying to build with SQL

    Since the key is actually data, having datatypes, constraints, and references and it is part of your basic data model, makes a little hard to do validation and verification on it, or even to think about a join.

    You need to have that data to have a valid data model, and what you seem to have  called a "surrogate" is what I would call "needless unverifiable redundancy"

    >

    No one as in, the application or user. The value of the key doesn't matter because it is not a property of the entity. Only SQL needs to use it to relate, and perhaps occasionally a DBA doing one-off maintenance.

    <<

    >> Also, the overall performance of an int is vastly better. An int is at least 6 bytes smaller than a char(10) (with no compression, even less with compression). <<

    Vastly? No it isn't. In modern hardware, I'm grabbing 64 to 128 bits at a time. I'm also buffering data and doing all kinds of hardware tricks to push lots of bits through my machine. I see you still use "int" instead of saying "INTEGER NOT NULL" which usually means your last procedural programming language was in the C family.

    >

    It can be. Especially joining hundreds of millions or billions of rows, dealing with big cartesian products or when involving things like column store indexes. I've experienced it myself extracting data from a database that used GUIDs for primary keys, artificially generating an integer to replace the guid keys, then extracted into another table greatly sped up queries with many joins.

     

     

  • I used the script you provided and it was very helpful, thank you. One issue I am running into is that the professor_id and student_id values won't appear in the contacts table despite them existing in their respective tables. Also, I will try to run the alter/ foreign key statements and this error appears:

    Msg 1785, Level 16, State 0, Line 22

    Introducing FOREIGN KEY constraint 'FK__ContactsN__profe__14270015' on table 'ContactsNew' may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints.

    Msg 1750, Level 16, State 1, Line 22

    Could not create constraint or index. See previous errors.

    What can I do to navigate this error message and ultimately have both professor and student ID's appear when I select * from Contacts?

  • The use of the prefixed "TBL_" is a design error so bad it has a name; it's called the tibble.

    This is not a design error.  It does not violate the normal forms.  I might have preferred tblNameOfTable rather than TBL_NameOfTable.  So-called tibbling helps avoid reserved words and keeps from having to identify a table as a lookup or some such to avoid another table that would otherwise have the same name but a different function.

     

  • RonKyle wrote:

    tblNameOfTable

    While not really a problem, it does look at bit odd when tblNameOfTable turns out to be a view.

  • I would not prefix a view with tbl.  In any case I rarely use views in an OLTP environment.  I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.

  • RonKyle wrote:

    I would not prefix a view with tbl.  In any case I rarely use views in an OLTP environment.  I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.

    No, of course you don't deliberately prefix a view with "tbl".  You create a table, wrongly prefix it with "tbl", and then later that table needs to become a view instead.  No way you're going to rename everything, so now the name is misleading from then on.

    Or vice versa: a "view_whatever" needs to become a table instead.

    It's a terrible idea to put "tbl" or "view" in the table name (ditto for "proc" on general principles).  It is a very poor naming practice.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    RonKyle wrote:

    I would not prefix a view with tbl.  In any case I rarely use views in an OLTP environment.  I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.

    No, of course you don't deliberately prefix a view with "tbl".  You create a table, wrongly prefix it with "tbl", and then later that table needs to become a view instead.  No way you're going to rename everything, so now the name is misleading from then on.

    Or vice versa: a "view_whatever" needs to become a table instead.

    It's a terrible idea to put "tbl" or "view" in the table name (ditto for "proc" on general principles).  It is a very poor naming practice.

    I agree about not prefixing table names with anything but sometimes it could be useful to prefix views with something, maybe "v_". Then if you are looking at a complex query someone else has written you would know where the data is coming from. Microsoft prefix most of their system stored procedures with "sp_".

  • Jonathan AC Roberts wrote:

    ScottPletcher wrote:

    RonKyle wrote:

    I would not prefix a view with tbl.  In any case I rarely use views in an OLTP environment.  I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.

    No, of course you don't deliberately prefix a view with "tbl".  You create a table, wrongly prefix it with "tbl", and then later that table needs to become a view instead.  No way you're going to rename everything, so now the name is misleading from then on.

    Or vice versa: a "view_whatever" needs to become a table instead.

    It's a terrible idea to put "tbl" or "view" in the table name (ditto for "proc" on general principles).  It is a very poor naming practice.

    I agree about not prefixing table names with anything but sometimes it could be useful to prefix views with something, maybe "v_". Then if you are looking at a complex query someone else has written you would know where the data is coming from. Microsoft prefix most of their system stored procedures with "sp_".

    "V_" still a terrible idea.

    I prefix procs I need to put in the master db with "sp_", because otherwise they don't have the "special" property that the "sp_" stands for.  "sp_" does not stand for "stored procedure" as everyone assumes.  It allows procs in the master db to by default run in the current db context, for example, sp_help.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • I've been doing this a long time and I haven't found anything "terrible" about it.  When I'm working with other naming conventions it can sometimes be hard to otherwise avoid a wordy table name.  I don't prefix views, nor have I had a case of turning a table into a view.  Not sure how that would come about.  I also don't prefix stored procedures.  When I've been in environments that prefix them "sp_" I let them know that they should leave that prefix to Microsoft.  If stored procs must be prefixed then use usp_.

  • RonKyle wrote:

    nor have I had a case of turning a table into a view.  Not sure how that would come about.

    We have a few old third party systems that use the tbl convention. All of them have at least one view which starts with tbl. ie When a system evolves over a thirty year period it is quite likely that a table will be turned into a view.

     

  • ScottPletcher wrote:

    Jonathan AC Roberts wrote:

    ScottPletcher wrote:

    RonKyle wrote:

    I would not prefix a view with tbl.  In any case I rarely use views in an OLTP environment.  I do make extensive use of them in my OLAP designs per Kimball's suggestion from his 3rd edition book.

    No, of course you don't deliberately prefix a view with "tbl".  You create a table, wrongly prefix it with "tbl", and then later that table needs to become a view instead.  No way you're going to rename everything, so now the name is misleading from then on.

    Or vice versa: a "view_whatever" needs to become a table instead.

    It's a terrible idea to put "tbl" or "view" in the table name (ditto for "proc" on general principles).  It is a very poor naming practice.

    I agree about not prefixing table names with anything but sometimes it could be useful to prefix views with something, maybe "v_". Then if you are looking at a complex query someone else has written you would know where the data is coming from. Microsoft prefix most of their system stored procedures with "sp_".

    "V_" still a terrible idea.

    I prefix procs I need to put in the master db with "sp_", otherwise they don't the "special" property that the "sp_" stands for.  "sp_" does not stand for "stored procedure" as everyone assumes.  It allows procs in the master db to by default run in the current db context, for example, sp_help.

    I never prefix a view with anything when I design it but I have worked on systems where all views have the same prefixed and I've never found it to be much of an issue.

    Thanks, I didn't know that about the sp_ procedures.

  • >> I'd say ZERO chance that the professor id is some type of tax number. The security implications of that alone would prevent it. For many, their tax number would be their Soc Sec Number -- only an IDiot would want to use that a key that the entire IT staff would be privy to. Unless they were issued a fed tax number (EIN), but I don't see why most university profs would need those. <<

    Having actually been a college professor (several times at several different schools) , I know that we do use the SSN in personnel systems. The issue of security is handled in a different level. The entire IT staff is not privy to personnel records. We create views, masking, other levels of access. Nobody spends any learning DCL; a lot of SQL programmers don't even know that it's the third sub language! In fact, the first draft of X3H2's SQL allowed you to GRANT, but not to REVOKE privileges from users. We were just trying to get out a standard as fast as possible, to keep SQL from mutating into dialects like BASIC had done.

    One of my favorites from my AIRMICS days was an approach created by the U.S. Navy. It actively lies to users depending on their clearance. Jimmy Olsen is shown information that leads him to believe that "Clark Kent is a mild-mannered reporter for a great metropolitan newspaper." However, Lois Lane can find out that "Clark Kent is secretly Superman!" For financial systems, the security system allowed only aggregations to be reported and it would also mess them up. Thus, nobody can get to their own payroll information and give themselves an unexpected raise.

    There was an old Doonesbury cartoon strip where one of their characters was working for a company was very proud of the fact that when they embezzled a certain amount of money, the system would automatically move it to a Swiss bank account and additionally book a flight to Zürich without human intervention. AI at its finest!

     

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • >> ... joining hundreds of millions or billions of rows, dealing with big Cartesian products or when involving things like column store indexes. I've experienced it myself extracting data from a database that used GUIDs for primary keys, artificially generating an integer to replace the guid keys, then extracted into another table greatly sped up queries with many joins. <<

    People seem to forget that GUID or UUID stand for "global" or "universal" identifiers. We never intended them to be used inside the database. They were supposed to reference objects outside of the database. But when your design stinks and you need a kludge, you can use one of them without any thought, planning or design. When I first moved to Austin, I worked for a company that used GUID in some of their software products instead of a true relational key. All they wanted was a "unique-ifier", like a pointer was in the old databases., The results quickly became unworkable because you're always having to do garbage collection. Before you knew it you had exceeded disk space. Your trick of replacing the GUID with some integer just adds another layer of complexity – – how did you tie each integer back to a particular entity? How did you make sure you never use the same integer twice?

    The IDMS and IMS database systems that you're imitating this way had utility programs to rebuild the pointer chains, and do garbage collection on them. If you worked with the C programming language, you probably have seen a simple version of this with malloc. In a multiuser database, it can get very complicated.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply