I am trying to normalize this table but I am not sure if I'm on the right track

  • I'm working on normalizing the following table:

    Correspondance_Received_with_Types

    I'm completely new to data normalization. Although I've read up on 1NF, 2NF and 3NF I'm not that confident in doing it completely on my own yet. I did however stumble across Table Analyzer in Access and I thought I'd give it a go. Here is the outcome after using 'ticket number' as a primary key.

    Table Analyzer

    How accurate is this? and is there a similar tool in SSMS (for example) or elsewhere that can perform this task better?.

    Thanks in advance.

    • This topic was modified 2 months ago by  Mr_X.
  • I've had decades of experience doing data modeling, but I have no clue for your tables.  You haven't described any of the data.

    Normalization deals with a logical design, containing "entities" and "attributes" (not "tables" and "columns").

    "Tables" and "columns" only come later when you convert the logical data model to a physical data model.

    How are we supposed to know what CPZ means in your model?

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • You're right I should have given more detailed. Below please find attached some more information I've included. Please note, its a struggle getting all the information as I need to go through so many middle people. But this is what I have so far:

    Correspondence able

    Please let me know if there is any additional information required. Also when normalizing tables to up to 3NF is there multiple variants that tables could be arranged or is there usually only one correct outcome?.

    Thanks in advance.

    • This reply was modified 2 months ago by  Mr_X.
  • CPZ Name & CPZ Code appear to be related. Is it one-to-one? (in which case they should be in a lookup table, and only reference CPZ Code in other tables)

    Or are only certain CPZ names valid with certain CPZ Codes? (in which case you might want a CPZ Code table, CPZ table, and an association table).

    Not a part of normalization, but I'd recommend you not use spaces in object names -- it will force the use of brackets or double quotes around those names.  Not critical, but it makes life easier -- Quoted/bracketed names are less readable & make writing/editing code more tedious. I recommend using proper casing (e.g., CpzCode, CpzName) or underscore between words in names instead.

  • Yes CPZ and CPZCode have a one to one relationship. Regarding the naming method you suggested, it is good practise and I will implement that in future. Thanks for the tip.

    So far I believe we have broken it down to two tables. One with the CPZCode and CPZ and the remaining columns in another table including CPZCode. Am I correct in saying this?. Could it be broken down further, or is this more than sufficient.

    Also when normalizing tables up to 3NF is there multiple variants that tables could be arranged or is there usually only one correct outcome?.

  • As Scott already suggested, think about entities & attributes.

    What are the entities (actors/objects/categories) involved? What attributes do the entities have? How do they relate? One-to-one? Many-to-many? (e.g., can you have more than one response to a ticket?)

    What are the entities represented by tables 1-5? Name them.

    It looks to me like maybe you have a Ticket or ServiceRequest entity.  It appears to have TicketNumber, ReceivedDate, Type.

    Can there be more than one response? If so, that's another entity.

    CPZ is an entity.

  • ratbak wrote:

    Not a part of normalization, but I'd recommend you not use spaces in object names -- it will force the use of brackets or double quotes around those names.  Not critical, but it makes life easier -- Quoted/bracketed names are less readable & make writing/editing code more tedious.

    Names at the logical level should be standard business names, NOT "computerized" names.  The logical model should also not be biased toward any specific dbms, such as SQL Server or Oracle.  Therefore, for example, "IDENTITY" should never appear in a data model.  Instead it should say something like "unique integer assigned as an identifier".

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Rules for normalization:

    (1) Write names in normal language, no "camel case" naming, etc.

    (2) Decide whether you want to use singular or plural names for Entities (there is lots of disagreement on this one among modelers).  But be consistent with it!  All singular or all plural.  (Btw, there is very little disagreement that physical table names should be plural.)

    (3) Make Attribute names clear, even if that makes them long.  A name can be shortened later if needed.

    (4) One of the single most important things to come out of the logical data modeling (normalization) is the attribute ("column") definitions.  This info remains incredibly valuable later.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Good point. I got tangled  in the jumping between tables and logical design too.

    My intent (not clearly stated) was for table naming.

  • OK, with that out of the way, what Entities do we have for this model?  We must be able to assign each Attribute to an Entity.  I'll make the Entity names bold so they can be easily recognized.

    Ticket is one, obviously (if we have a "Ticket Number" we must have a ticket).

    Service seems to be another.

    Employee / Person is another (to match "Response By").

    Controlled Parking Zone / CPZ   (If "CPZ" is standard business language for the modelers,  you can use that as the Entity name (and the table name).  Normally you don't use abbreviations, but one that is already known is acceptable.)

    Next step, we must assign each Attribute to an Entity.  If we have Attributes we can't assign, that's a good hint that we have another Entity that we haven't recognized yet.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Nice. I really like the way you broke it down. I've tried assigning the attributes so far. I've not thought about the keys yet though. Here it is:

    Ticket

    TicketNumber, Received, Type, ServiceCategory (This code is usually repeated several times and there are several of them. I can’t see a pattern)

    Service

    Serviceable

    Action

    ActionTaken

    Employee/Person

    Response By, Response On

    Cancellation

    Cancellation Group, Cancellation Reason

    Controlled Parking Zone/CPZ

    CPZCode, CPZ

    I hope I'm not too off.

  • Employee/Person: Response By indicates a relationship in another table to Employee/Person. It is not an attribute of Employee/Person.  Even less so is Response On datetime. Attributes of an Employee/Person might be things like Name (First Name, Last Name, and/or Full Name), Employee ID, Supervisor, Department, Start Date, etc.

  • Here's my initial cut at assigning Attributes to Entities.  I tend to fully normalize, but you can back off of that later if you want to (in technical terms, we can "denormalize" when we design physical tables to reduce the number of tables).  That is, in Ticket, "Type Code" could just be "Type abc" again and "Service Code" could be "Service abc".  Status Code should STAY.  Use an encoded number for Status, NOT an abc format.

    Btw, I could find no relationship between CPZ and Ticket.  CPZ description says "the area that the vehicle was parked".  What vehicle??

    Entity Name
    Attribute Name / Format / "References Entity"
    ------------------------------------------------

    Ticket
    Ticket Number integer
    Received Date date
    Type Code integer "Ticket Type"
    Serviceable abc
    Service Code integer "Service"
    Status Code integer "Ticket Status" /* this should remain a code,
    do NOT use a "status" attribute in format abc in Ticket entity */

    Ticket Status
    Status Code integer
    Description /* 'Initiated'? / 'Assigned'? / 'Cancelled' / 'Completed' / ... */

    Ticket Response (this must be a separate entity since there can be multiple responses)
    Ticket Number integer "Ticket"
    Ticket Response Id integer
    Response Date datetime
    Responder Id integer "Person"
    Action Taken abc

    Ticket Cancellation (I separated since it is very rare)
    Ticket Number integer "Ticket"
    Ticket Cancellation Id integer
    Cancellation group abc
    Cancellation Reason abc

    Controlled Parking Zone
    CPZ code abc
    Controlled Parking Zone Name abc

    Service
    Service Code integer
    Service abc
    Service Category abc

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • More general notes:

    (11) The attributes within an entity should apply only to that entity.  For example, CPZ should be in the Ticket entity only if it CPZ relates directly to Ticket somehow.

    (12) Keys can be assigned later, but generally "Id" is the name used to imply that a number (not character/abc code) will be used to uniquely identity one instance of an Entity .  For example, a "Customer Id" (equivalent to "Customer Number").

    (13) "Code" means that a code, either numeric or character (abc), will be used to represent a value.  Same as in plain English and in your original description, where "CPZ code" refers to a 1 to 3 char code for CPZ name.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Thank you for your detailed reply. I've only just noticed some of the things you mentioned from a previous reply, I totally missed it earlier. You wrote:

    "Names at the logical level should be standard business names, NOT "computerized" names. "

    What is meant by the logical level? is normalization part of the logical level?

    Regarding the rules for normalization. I'd kindly like to ask why the entity names need to be either singular or plural (not challenging what you said I'd just like to improve my understanding of it). I've never actually heard of camel case until you mentioned it. Kudos. Off topic but is there like an exercise book on database normalization you recommend?.

    Thanks for the table normalization you provided. Using what you provided, I attempted to sketch it using 'app.diagrams.net' (just so that I could picture it better).

    draft

    Here is the .drawio file if you would like to have a look: https://www.mediafire.com/file/1iqlvsnsbdsng6v/draft.drawio/file

    I added CPZ Code in ticket so that I could create a relationship with Controlled Parking Zone. I hope that's okay. Regarding:

    "Btw, I could find no relationship between CPZ and Ticket. CPZ description says "the area that the vehicle was parked". What vehicle??"

    Allow me to retract what I said. CPZ would be the area that the ticket was issued.

    Thank you

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply