SQLServerCentral is supported by Red Gate Software Ltd.
 
How to define a schema for the fact table and the dimension tables in SQL from a relational schema?
Posted Tuesday, July 30, 2013 6:11 PM
Grasshopper

Hi, here is what I have done. Can you check whether anything is wrong?

-- Create dimension tables

create table [dbo].[Dim_Course](
[CrsCode] [char](8) not null,
[CrsName] [nvarchar](50) not null,
[Descr] [nvarchar](200),
Constraint [PK_Dim_Course] primary key
(
[CrsCode]
)
)

create table [dbo].[Dim_Dept](
[DeptId] [char](4) not null,
[Name] [nvarchar](50) not null,
[FacultyName] [nvarchar](50) not null,
Constraint [PK_Dim_Dept] primary key clustered
(
[DeptId]
)
)

Create table [dbo].[Dim_Professor](
[Id] [char](6) not null,
[Name] [nvarchar](50) not null,
[Address] [nvarchar](50) not null,
[Status] [bit] not null,
Constraint [PK_Dim_Professor] primary key clustered
(
[Id]
)
)
Create table [dbo].[Dim_Semester](
[Data_key] [int] not null,
[Semester] [nvarchar](20) not null,
[Year] [int] not null,
Constraint [PK_Dim_Semester] primary key
(
[Data_key]
)
)

-- Create the fact table

Create table [dbo].[Fact_Table](
[DeptId] [char](4) not null,
[CrsCode] [char](8) not null,
[Id] [char](6) not null,
[Data_key] [int] not null,
[Enrollment] [int] not null,
Constraint [PK_Fact_Table] primary key
(
[DeptId],
[CrsCode],
[Id],
[Data_key]
),
foreign key ([DeptId]) references Dim_Dept([DeptId]),
foreign key (CrsCode) references Dim_Course(CrsCode),
foreign key (Id) references Dim_Professor(Id),
foreign key (Data_key) references Dim_Semester(Data_key)
)


I'm a little confused about the Address attribute in Dim_Professor. As I understand it, attributes in a dimension table are normally organized as a hierarchy, so the address would be:
Number > Street > City > Province (territory) > Country. However, if I organize it like that, how can I insert data into that table from a relational database in which the address is just a single attribute?
Post #1479237
Posted Wednesday, July 31, 2013 2:48 AM


SSChampion

Regarding the address: typically you would store the street and number in one field, and the zip code, town and country in separate fields, possibly in a different dimension altogether: a geography dimension.
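
As a rough sketch of the first option (separate fields inside the professor dimension), the table might look like the following. The table name Dim_Professor_Split, the new column names and their lengths are all assumptions made for illustration; none of them come from the original schema.

```sql
-- Hypothetical variant of Dim_Professor with the address split into fields.
-- Column names and lengths are illustrative assumptions, not a prescribed design.
create table [dbo].[Dim_Professor_Split](
    [Id] [char](6) not null,
    [Name] [nvarchar](50) not null,
    [StreetAndNumber] [nvarchar](50) not null,  -- street and house number kept together
    [ZipCode] [nvarchar](10) not null,
    [Town] [nvarchar](50) not null,
    [Country] [nvarchar](50) not null,
    [Status] [bit] not null,
    constraint [PK_Dim_Professor_Split] primary key clustered ([Id])
);
```

If location is shared by several dimensions or analyzed on its own, the zip code, town and country columns could instead move to a separate geography dimension.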

Regarding your dimension tables: in proper dimensional modelling, you add a surrogate key, which is a meaningless integer key. The fact table then holds the surrogate keys, which are usually smaller than your business keys and make your joins quicker.
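
A minimal sketch of what that could look like for the department dimension, assuming an identity column as the surrogate key. The names Dim_Dept_SK, DeptKey and Fact_Enrollment_SK are hypothetical and only used here for illustration.

```sql
-- Hypothetical department dimension with a surrogate key.
create table [dbo].[Dim_Dept_SK](
    [DeptKey] [int] identity(1,1) not null,  -- surrogate key: meaningless integer
    [DeptId] [char](4) not null,             -- business key from the source system
    [Name] [nvarchar](50) not null,
    [FacultyName] [nvarchar](50) not null,
    constraint [PK_Dim_Dept_SK] primary key clustered ([DeptKey])
);

-- Hypothetical fact table that stores the surrogate key, not the business key.
create table [dbo].[Fact_Enrollment_SK](
    [DeptKey] [int] not null,
    [Enrollment] [int] not null,
    constraint [FK_Fact_Enrollment_SK_Dept] foreign key ([DeptKey])
        references [dbo].[Dim_Dept_SK]([DeptKey])
);
```

The business key stays in the dimension as an ordinary attribute, so it can still be used to look up the surrogate key when loading the fact table.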




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1479350
Posted Wednesday, July 31, 2013 3:11 AM
Grasshopper

Is it necessary to define the address as a hierarchy?

Also, do you mean that the primary keys in those tables are redundant?
Post #1479358
Posted Wednesday, July 31, 2013 3:20 AM


SSChampion

It is not necessary; it depends on the requirements. If you are never going to analyze by location, you don't need it.
However, if you would like to answer questions like "which regions give us the most students for this course", I would implement the hierarchy.

The surrogate key is not redundant; it has its purpose. It is independent of the application data, so it protects your data model from changes.
For example, what if they later decide to change the department IDs and they are no longer unique? Your surrogate key is still unique, so you don't have any problems in your department dimension and your fact table.
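
To make that concrete, here is a small self-contained sketch using table variables. The names @Dim_Dept, @Fact_Enrollment and DeptKey, along with the sample values, are made up for illustration. It shows a department ID changing in the source while the fact row keeps pointing at the same dimension row.

```sql
-- Hypothetical dimension and fact, declared as table variables so the batch runs on its own.
declare @Dim_Dept table (
    DeptKey int identity(1,1) primary key,  -- surrogate key
    DeptId char(4) not null,                -- business key from the source
    Name nvarchar(50) not null
);
declare @Fact_Enrollment table (
    DeptKey int not null,
    Enrollment int not null
);

insert into @Dim_Dept (DeptId, Name) values ('CS01', N'Computer Science');

-- The fact row stores the surrogate key looked up via the business key.
insert into @Fact_Enrollment (DeptKey, Enrollment)
select DeptKey, 120 from @Dim_Dept where DeptId = 'CS01';

-- The source system renumbers the department: only the dimension row changes.
update @Dim_Dept set DeptId = 'C001' where DeptId = 'CS01';

-- The join on the surrogate key still finds the same department.
select d.DeptId, d.Name, f.Enrollment
from @Fact_Enrollment as f
inner join @Dim_Dept as d on d.DeptKey = f.DeptKey;
```

No fact rows had to be touched, which is the protection the surrogate key gives you.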

I would suggest you read The Data Warehouse Toolkit by Ralph Kimball; it explains dimensional theory pretty well.




Post #1479359
Posted Wednesday, July 31, 2013 3:32 AM
Grasshopper

So overall, is my schema OK?

I also wrote a query to insert a large amount of data from the relational schema into the fact table.

-- Assumes Transcript.Semester holds the same values as Dim_Semester.Data_key.
insert into Fact_Table(DeptId, CrsCode, Id, Data_key, Enrollment)
select Department.DeptId, Course.CrsCode, Professor.Id, Transcript.Semester,
-- Enrollment: number of transcript rows for this course in this semester
(select count(*) from Transcript t where t.CrsCode=Course.CrsCode and t.Semester=Transcript.Semester)
from Department inner join Course on Department.DeptId=Course.DeptId
inner join Transcript on Course.CrsCode=Transcript.CrsCode
inner join Teaching on Course.CrsCode=Teaching.CrsCode
inner join Professor on Teaching.ProfId=Professor.Id
group by Department.DeptId, Course.CrsCode, Professor.Id, Transcript.Semester

Could you take a quick look at it and tell me whether it's OK?
I cannot test it because my SQL Server 2008 installation has an error with SQL Server Analysis Services.
Post #1479363