In the previous levels of this Stairway, I provided you with information about the basic SELECT statement and the history of SQL. Those levels gave you a basis for understanding how to retrieve data and how the SQL environment has evolved as technology and technical solutions have changed overtime. In this level I will be exploring how to implement a simple SQL Server database that is based off of a relational model. Before jumping into creating a database, first let me share a little history concerning the creator of the relational model.
The Father of Relational Data Modeling
The concept of relational database design was first introduced by Edgar F. Codd in 1970, with a paper titled “A Relational Model of Data for Large Shared Data Banks”. Codd developed this modeling theory while working at IBM. IBM didn’t jump fast enough on Codd’s data modeling concept and was therefore not the first vendor to supply a relational database engine that exploited Codd’s new relational data modeling theory. Codd’s relational modeling concept is now the framework that is used to create relational databases within SQL Server and other relational database engines.
Codd was born in the Isle of Portland in England and studied mathematics and chemistry before joining the Royal Air Force to become a World War II pilot. He moved to New York in 1948 and started working for IBM, where he was a mathematical programmer. He floated around for a number of years then eventually moved to California to work at the IBM San Jose Research Laboratory. Codd continued working to refine and justify the relational data model until the 1990s when his failing health forced him to retire. Edgar F. Codd died at the age of 79 on April 18, 2003.
Implementing a Relational Model in SQL Server
This Stairway is not intended to teach you about relational data modeling, or database design, but instead is to only show you how to create a SQL Server database from a relational model. But before I can provide you the code blocks for creating a SQL Server database we first need to explore a relational data model that will be implemented. My simple model will contain a few entities (data tables) with primary key definitions and some relationships (foreign key constraints) between the different entities.
My simple relational model will be for a simple hotel reservation system. This reservation system will need to track customer reservations information. Figure 1 illustrates this simple relational model that I will be implementing using T-SQL:
By reviewing this model you can see it contains a number of entities (represented by boxes) to track reservation related information. Each entity is made up of a number of attributes (columns) where one or more attributes are identified as the primary key (bold and underlined names). Also represented are some relationships between the entities (represented by the arrows) that show how the different entities are related to each other. I will be taking this model of entities, attributes, primary keys and relationships and then developing a physical SQL Server database that represents the design of this relational model.
To build a physical database from this model we need to identify the different objects in SQL Server that we are going to define based on this model. For each entity, or box in Figure 1, I will be creating a table in SQL Server. For each attribute of each entity, I will be creating a column in the associated table. For each primary key, I will be creating a unique clustered index (note a primary key could also be created using a unique non-clustered index. For more information on indexing refer to the Indexes Stairway at http://www.sqlservercentral.com/stairway/72399/). Lastly, for each relationship I will be creating a foreign key constraint.
To start building my database I first need to create a SQL Server database to hold all the new database objects I plan to create. My database will be called RoomReservation. I will be creating my database by using the following T-SQL code:
CREATE DATABASE RoomReservation;
To start building my RoomReservation database objects from my model, I will then create the table objects. To create a table in SQL Server I need to use the CREATE TABLE statement. With the CREATE TABLE statement I will be able to define each table and all of the columns in each table. Here is the simple syntax for creating a SQL Server table:
CREATE TABLE <table_name> ( <column_definition> [,…N]);
<table_name> = Name of table
<column_definition> = column_name data_type,[NULL | NOT NULL]
For complete syntax of the CREATE TABLE statement refer to SQL Server Books Online.
The first table I create will be the Customer table, created using the code in Listing 1.
USE RoomReservation; GO CREATE TABLE dbo.Customer ( CustomerId INT NOT NULL, FirstName NVARCHAR(50) NOT NULL, LastName NVARCHAR(50) NOT NULL, Address1 NVARCHAR(100) NOT NULL, Address2 NVARCHAR(100) NULL, PhoneNumber NVARCHAR(22) NOT NULL, EmailAddress VARCHAR(100) NULL, City VARCHAR(100) NOT NULL, StateProvince NVARCHAR(100) NOT NULL, PostalCode NVARCHAR(100) NOT NULL);
In this code, when I created my Customer table I created all the columns I need, but I also specified whether or not the column needs a value when a record is inserted or updated into this table. I implemented that by specifying NOT NULL on some columns, while other columns I specified NULL.
If a column is defined to be NOT NULL that means you are not able to create a record unless you populate this column with an actual value. Whereas defining a column using the NULL specification means you can create a row without specifying a value for this column, or another way to put it is that the column allows a NULL value. In my CREATE TABLE statement above I allowed the columns Address2 and EmailAddress to support NULLs, whereas all the rest of the columns require a value to be supplied when creating a row.
This CREATE TABLE statement did not totally define my Customer table as it is represented in my relational database model above. I still need to create a primary key constraint on the column CustomerID. This primary key constraint will ensure that no two records in this table have the same CustomerID value.. The code to create that primary key is in Listing 2.
USE RoomReservation; GO ALTER TABLE dbo.Customer ADD CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerId);
This ALTER TABLE statement added a primary key constraint to my Customer table. That primary key will be created in the form of a clustered index named PK_Customer.
In the Transact-SQL language, there is usually more than one way to do the same thing. Alternatively, I could have created my Customer table and primary key all at one time by running the CREATE TABLE statement in Listing 3.
USE RoomReservation; GO CREATE TABLE dbo.Customer ( CustomerId INT NOT NULL CONSTRAINT PK_Customer PRIMARY KEY, FirstName NVARCHAR(50) NOT NULL, LastName NVARCHAR(50) NOT NULL, Address1 NVARCHAR(100) NOT NULL, Address2 NVARCHAR(100) NULL, PhoneNumber NVARCHAR(22) NOT NULL, EmailAddress NVARCHAR(100) NULL, City VARCHAR(100) NOT NULL, StateProvince NVARCHAR(100) NOT NULL, PostalCode NVARCHAR(100) NOT NULL);
At this point I have showed you how to create a table with a defined PRIMARY KEY. The only thing left to show you is how to create a FOREIGN KEY constraint. But before I can do that let me first provide you the script to create the rest of the tables and PRIMARY KEYS in my relational database model above. You can find it in Listing 4.
USE RoomReservation; GO CREATE TABLE dbo.Reservation ( ReservationId INT NOT NULL, ArrivalDate DATETIME NOT NULL, DepartureDate DATETIME NOT NULL, DailyRate SMALLMONEY NOT NULL, ReservationStatusID INT NOT NULL, CustomerId INT NOT NULL, RoomTypeID INT NOT NULL); ALTER TABLE dbo.Reservation ADD CONSTRAINT PK_Reservation PRIMARY KEY CLUSTERED (ReservationId); CREATE TABLE dbo.RoomType ( RoomTypeId INT NOT NULL, RoomDesc NVARCHAR(1000) NOT NULL); ALTER TABLE dbo.RoomType ADD CONSTRAINT PK_RoomType PRIMARY KEY CLUSTERED (RoomTypeId); CREATE TABLE dbo.ReservationStatus ( ReservationStatusId INT NOT NULL, ReservationStatusDesc NVARCHAR(50) NOT NULL); ALTER TABLE dbo.ReservationStatus ADD CONSTRAINT PK_ReservationStatus PRIMARY KEY CLUSTERED (ReservationStatusId); CREATE TABLE dbo.PaymentType ( PaymentTypeId INT NOT NULL, PaymentTypeDesc NVARCHAR(50) NOT NULL); ALTER TABLE dbo.PaymentType ADD CONSTRAINT PK_PaymentType PRIMARY KEY CLUSTERED (PaymentTypeId); CREATE TABLE dbo.CustomerPaymentType ( PaymentTypeId INT NOT NULL, CustomerId INT NOT NULL, PaymentNotes NVARCHAR(2000) NULL); ALTER TABLE dbo.CustomerPaymentType ADD CONSTRAINT PK_CustomerPaymentType PRIMARY KEY CLUSTERED (PaymentTypeId, CustomerId);
A FOREIGN KEY constraint enforces referential integrity between two tables that are related to each other. The table that the foreign key constraint is defined on is the ‘referencing table’ and is required to have a related record in another table, known as the ‘referenced’ table, any time a row is inserted or updated in the table. . In my relational model in Figure 1 these foreign key relationships are represented by the arrows. FOREIGN KEY constraints are only defined on one of the tables in the relationship. In my diagram the FOREIGN KEY constraints will be defined on those tables that have the tail of the arrow (non-pointed end) attached.
To define these FOREIGN KEY contraints in my relational model I will need to alter each referencing table to add the constraint. Listing 5 is the T-SQL code I can use to create a FOREIGN KEY constraint on the Reservation table. This constraint ensures that a record does not get inserted or updated in the Reservation table unless a matching record is found in the Customer table, based on CustomerId.
USE RoomReservation; GO ALTER TABLE dbo.Reservation ADD CONSTRAINT FK_Reservation_CustomerPaymentType FOREIGN KEY (CustomerId) REFERENCES dbo.Customer (CustomerID);
In order to complete my design I need to implement all the other foreign key constaints identified in my model in Figure 1. Listing 6 contains the ALTER TABLE statements to create the additional foreign key constraints in my data model.
USE RoomReservation; GO ALTER TABLE dbo.Reservation ADD CONSTRAINT FK_Reservation_RoomType FOREIGN KEY (RoomTypeId) REFERENCES dbo.RoomType (RoomTypeId); ALTER TABLE dbo.Reservation ADD CONSTRAINT FK_Reservation_ReservationStatus FOREIGN KEY (ReservationStatusId) REFERENCES dbo.ReservationStatus (ReservationStatusId); ALTER TABLE dbo.CustomerPaymentType ADD CONSTRAINT FK_CustomerPaymentType_PaymentType FOREIGN KEY (PaymentTypeId) REFERENCES dbo.PaymentType (PaymentTypeId); ALTER TABLE dbo.CustomerPaymentType ADD CONSTRAINT FK_CustomerPaymentType_Customer FOREIGN KEY (CustomerId) REFERENCES dbo.Customer (CustomerId);
Validating Database Design
Once I am are done building a database from a data model I should validate the implemented design to make sure it is correct. This validation process is to ensure all the data integrity rules I built into my physical database are implemented correctly. In my design here are the rules I need to validate
- All rows inserted or updated must have a specific value defined for any column defined as NOT NULL.
- Columns that are PRIMARY KEYs do not allow duplicate values
- Columns that have foreign key constaints do not allow data that does not have a matching record in the referenced table
Before I can validate the data integrity rules I first need to populate the referenced tables with some valid data. I will use the code in Listing 7 to populate those tables with some valid data:
USE RoomReservation; GO SET NOCOUNT ON; -- Create PaymentType records INSERT INTO PaymentType VALUES(1,'Visa'); INSERT INTO PaymentType VALUES(2,'MasterCard'); INSERT INTO PaymentType VALUES(3,'American Express'); -- Create Customer INSERT INTO Customer VALUES (1,'Greg','Larsen','123 Some Place' ,NULL,'123-456-7890',Null,'MyCity','MA','12345'); -- Create Reservation Status INSERT INTO ReservationStatus VALUES (1,'Booked'); INSERT INTO ReservationStatus VALUES (2,'Cancelled'); -- Create Room Type INSERT INTO RoomType VALUES (1,'Kingsize'); INSERT INTO RoomType VALUES (2,'Queen'); INSERT INTO RoomType VALUES (3,'Double');
To validate my the data integrity rules that I built into my database I will be running the INSERT statements in Listing 8.
USE RoomReservation; GO -- Violates NOT NULL constraint INSERT INTO Reservation VALUES(1,'2011-8-1 5:00 PM' ,'2011-8-2 9:00 AM' , 150.99,NULL,1,1); -- Violates Primary Key Constraint INSERT INTO RoomType VALUES (3,'Suite'); -- Violates Foreign Key Constraint INSERT INTO CustomerPaymentType VALUES (1,2,'Will need an internet connection');
Each one of these INSERT statements should fail because they violate a data integrity rule that was built into the RoomReservation database. The first INSERT statement violates the NOT NULL validation check of the ReservationStatusID column.
The second INSERT statement violates the PRIMARY KEY constraint that was placed on the RoomType table. This INSERT statement is trying to insert the value of 3 for the RoomTypeID column. The problem is there is already a record in the RoomType table with RoomTypeID value of 3.
The last INSERT statement violates the FOREIGN KEY constraint of the CustomerPaymentType table. In this particular INSERT statement there is no CustomerID with a value of 2 in the Customer table.
To correctly insert these records, the inserted data values will need to be cleaned up. Once the data has been cleaned up I will be able to be insert this new data into the appropriate tables. Listing 9 contains the cleaned up INSERT statements that will pass all data integrity checks and be successfully inserted into the appropriate tables in the RoomReservation database:
USE RoomReservation; GO -- Violates NOT NULL constraint INSERT INTO Reservation VALUES(1,'2011-8-1 5:00 PM' ,'2011-8-2 9:00 AM' , 150.99,1,1,1); -- Violates Primary Key Constraint INSERT INTO RoomType VALUES (4,'Suite'); -- Violates Foreign Key Constraint INSERT INTO CustomerPaymentType VALUES (1,1,'Will need an internet connection');
Relational Database Design
My reservation example demonstrates how to take a relational model and use it to implement a SQL Server database. By using the NOT NULL, PRIMARY KEY, and FOREIGN KEY constaints, I built data integrity rules right into my database design. This allowed me to enforce these rules in the underlying database definition, instead of having to write code in my business processing layer to validate these data rules. By doing this I allowed the SQL Server database engine to perform these data integrity checks for me.
By understanding and creating your database design around the relational database model, you will be building a robust and efficient database implementation, where you can build data integrity checking right into the database.