How to Define a schema for the fact table, and the dimensional tables in SQL from a relational schema?

phamkhanhtung1989
Hi, here is what I have done. Can you check it over and tell me if anything is wrong?


-- Create dimension tables

create table [dbo].[Dim_Course](
[CrsCode] [char](8) not null,
[CrsName] [nvarchar](50) not null,
[Descr] [nvarchar](200),
Constraint [PK_Dim_Course] primary key
(
[CrsCode]
)
)

create table [dbo].[Dim_Dept](
[DeptId] [char](4) not null,
[Name] [nvarchar](50) not null,
[FacultyName] [nvarchar](50) not null,
Constraint [PK_Dim_Dept] primary key clustered
(
[DeptId]
)
)

Create table [dbo].[Dim_Professor](
[Id] [char](6) not null,
[Name][nvarchar](50) not null,
[Address] [nvarchar](50) not null,
[Status] [bit] not null,
Constraint [PK_Dim_Professor] primary key clustered
(
[Id]
)
)

Create table [dbo].[Dim_Semester](
[Data_key] [int] not null,
[Semester] [nvarchar](20) not null,
[Year] [int] not null,
Constraint [PK_Dim_Semester] primary key
(
[Data_key]
)
)

-- Create Fact_Table

Create table [dbo].[Fact_Table](
[DeptId] [char](4) not null,
[CrsCode] [char](8) not null,
[Id] [char](6) not null,
[Data_key] [int] not null,
[Enrollment] [int] not null,
Constraint [PK_Fact_Table] primary key
(
[DeptId],
[CrsCode],
[Id],
[Data_key]
),
foreign key ([DeptId]) references Dim_Dept([DeptId]),
foreign key (CrsCode) references Dim_Course(CrsCode),
foreign key (Id) references Dim_Professor(Id),
foreign key (Data_key) references Dim_Semester(Data_key)
)




I'm a little confused about the address attribute in Dim_Professor. Normally, for a dimension table, every attribute has to be organized as a hierarchy. So it would be:
Number > Street > City > Province (territory) > Country. However, if I organize it like that, how can I insert data into that table from a relational database in which the address is stored as a single attribute?
Koen Verbeeck
Regarding the address: typically you would store the street and number in one field, but zipcode, town and country in other fields. Maybe in a different dimension altogether: the geography dimension.
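
As a rough sketch of that split (not a definitive design), the professor dimension could look something like the table below. The table name Dim_Professor_Split and the columns StreetAddress, ZipCode, Town and Country are assumed names for illustration; the original table has a single Address column.

```sql
-- Sketch only: Dim_Professor_Split and the address column names are assumptions.
CREATE TABLE [dbo].[Dim_Professor_Split] (
    [Id]            char(6)       NOT NULL,
    [Name]          nvarchar(50)  NOT NULL,
    [StreetAddress] nvarchar(100) NOT NULL,  -- street and number kept together
    [ZipCode]       nvarchar(10)  NULL,      -- nullable until the source address is parsed
    [Town]          nvarchar(50)  NULL,
    [Country]       nvarchar(50)  NULL,
    [Status]        bit           NOT NULL,
    CONSTRAINT [PK_Dim_Professor_Split] PRIMARY KEY CLUSTERED ([Id])
);
```

If the source only has one address string, one option is to load it into StreetAddress as-is and fill the other columns later, once the address can be split reliably. The zip code, town and country columns are also the ones that would move out if you create a separate geography dimension.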

Regarding your dimension tables: in correct dimensional modelling, you add a surrogate key, which is a meaningless integer key. The fact table will hold the surrogate keys because they are smaller than your business keys. It also makes your joins quicker.
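
A minimal sketch of what that could look like for the course dimension, assuming a new column named CourseKey and the table names Dim_Course_SK and Fact_Enrollment_SK (all three are hypothetical and not part of the posted schema):

```sql
-- Course dimension with a surrogate key. CourseKey is an assumed name.
CREATE TABLE [dbo].[Dim_Course_SK] (
    [CourseKey] int IDENTITY(1,1) NOT NULL,  -- surrogate key, no business meaning
    [CrsCode]   char(8)       NOT NULL,      -- business key from the source system
    [CrsName]   nvarchar(50)  NOT NULL,
    [Descr]     nvarchar(200) NULL,
    CONSTRAINT [PK_Dim_Course_SK] PRIMARY KEY CLUSTERED ([CourseKey])
);

-- The fact table stores the 4-byte int key instead of the 8-byte char(8) code.
CREATE TABLE [dbo].[Fact_Enrollment_SK] (
    [CourseKey]  int NOT NULL,
    [Enrollment] int NOT NULL,
    CONSTRAINT [FK_Fact_Enrollment_SK_Course]
        FOREIGN KEY ([CourseKey]) REFERENCES [dbo].[Dim_Course_SK] ([CourseKey])
);
```

The business key CrsCode stays in the dimension as an ordinary attribute, so you can still look courses up by their code when loading the fact table.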



MCSE Business Intelligence - Microsoft Data Platform MVP
phamkhanhtung1989
Is it necessary to define the address as a hierarchy?

Also, do you mean these primary keys in those tables are redundant?
Koen Verbeeck
It is not necessary; it depends on the requirements. If you are never going to analyze by location, you don't need it.
However, if you would like to answer questions like "which regions give us the most students for this course", I would implement the hierarchy.
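
To illustrate the kind of query such a hierarchy makes possible, here is a small sketch. Everything in it is an assumption for illustration: the table variables, their columns, and the course code 'CS101' are made up, and the batch returns no rows until you load data.

```sql
-- Hypothetical structures, declared as table variables so the batch runs on its own.
DECLARE @Dim_Geography TABLE (
    GeographyKey int PRIMARY KEY,
    Town         nvarchar(50),
    Province     nvarchar(50),
    Country      nvarchar(50)
);
DECLARE @Dim_Course TABLE (CourseKey int PRIMARY KEY, CrsCode char(8));
DECLARE @Fact_Enrollment TABLE (CourseKey int, GeographyKey int, Enrollment int);

-- Students per region for one course, rolled up to the Province level.
SELECT g.Country, g.Province, SUM(f.Enrollment) AS Students
FROM @Fact_Enrollment AS f
JOIN @Dim_Geography AS g ON g.GeographyKey = f.GeographyKey
JOIN @Dim_Course AS c ON c.CourseKey = f.CourseKey
WHERE c.CrsCode = 'CS101'          -- made-up course code
GROUP BY g.Country, g.Province
ORDER BY Students DESC;
```

Adding g.Town to the SELECT and GROUP BY lists drills down one level; dropping g.Province rolls up to country.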

The surrogate key is not redundant, it has its purpose. It is independent from application data, so it protects your data model from changes.
For example, what if they later decide to change the department IDs and they are no longer unique? Your surrogate key is still unique, so you don't have any problems in your department dimension or your fact table.
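
A small sketch of that scenario, using table variables and made-up sample values (DeptKey, the department codes and the enrollment figure are all assumptions for illustration):

```sql
DECLARE @Dim_Dept TABLE (
    DeptKey int IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    DeptId  char(4)      NOT NULL,          -- business key, may change in the source
    Name    nvarchar(50) NOT NULL
);
DECLARE @Fact TABLE (DeptKey int NOT NULL, Enrollment int NOT NULL);

INSERT INTO @Dim_Dept (DeptId, Name) VALUES ('CS01', N'Computer Science');

-- The fact row stores the surrogate key, looked up through the business key.
INSERT INTO @Fact (DeptKey, Enrollment)
SELECT DeptKey, 120 FROM @Dim_Dept WHERE DeptId = 'CS01';

-- The source system renumbers the department: only the dimension row changes.
UPDATE @Dim_Dept SET DeptId = 'D100' WHERE DeptId = 'CS01';

-- Existing fact rows still join, because they reference DeptKey, not DeptId.
SELECT d.DeptId, d.Name, f.Enrollment
FROM @Fact AS f
JOIN @Dim_Dept AS d ON d.DeptKey = f.DeptKey;
```

With the business key in the fact table instead, the same change would mean updating every affected fact row.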

I would suggest you read The Data Warehouse Toolkit by Ralph Kimball; it explains dimensional modelling theory pretty well.


phamkhanhtung1989
So overall, is my schema OK?

I also wrote a query to insert a large amount of data from the relational schema into the fact table.

-- The column list uses Data_key, the semester key column defined in Fact_Table.
-- This assumes Transcript.Semester holds the same values as Dim_Semester.Data_key.
insert into Fact_Table(DeptId, CrsCode, Id, Data_key, Enrollment)
select Department.DeptId, Course.CrsCode, Professor.Id, Transcript.Semester,
(Select count(*) from Transcript t where t.CrsCode=Course.CrsCode and t.Semester=Transcript.Semester)
from Department inner join Course on Department.DeptId=Course.DeptId
inner join Transcript on Course.CrsCode=Transcript.CrsCode
inner join Teaching on Course.CrsCode=Teaching.CrsCode
inner join Professor on Teaching.ProfId=Professor.Id
group by Department.DeptId, Course.CrsCode, Professor.Id, Transcript.Semester



Could you take a quick look at it and tell me whether it's OK or not?
I cannot check it myself, as my SQL Server 2008 installation has an error with SQL Server Analysis Services.