Views and Horizontal Partitioning

In this article, I am writing about Views, starting with the simplest of views and moving to more powerful uses like horizontal partitioning. On the way we will also see SCHEMABINDING and Indexed Views.

Let us jump into code:

Step 1

We will use the following table for our examples:

drop table jk_t_source1
CREATE TABLE jk_t_source1(f1 int,f2 char (10),f3 varchar (40),f4 datetime)
insert into jk_t_source1 values(111,'aaa','bbbbb',getdate())
insert into jk_t_source1 values(222,'bbb','ccccc',getdate() -1)
insert into jk_t_source1 values(333,'ccc','ddddd',getdate() -2)
insert into jk_t_source1 values(444,'ddd','eeeee',getdate() -3)
insert into jk_t_source1 values(555,'eee','fffff',getdate() -4)

Item 2

Let us start with the simplest of views. Here is the code to create a simple view:

alter view jk_v_View1
as
select * from jk_t_source1

and this is how you use it, exactly similar to the way you would use a table.

select * from jk_v_View1

Item 3

You cannot use an 'ORDER BY' in a view

alter view jk_v_View1
as
select * from jk_t_source1
order by f4

You would get the following error

Server: Msg 1033, Level 15, State 1, Procedure jk_v_View1, Line 4

The ORDER BY clause is invalid in views, inline functions, derived tables,

and subqueries, unless TOP is also specified.

The error indicates that you can use 'ORDER BY' when 'TOP' is also used. So basically we can circumvent to achieve what we want. Here it is.

alter view jk_v_View1
as
select top 100 percent *  from jk_t_source1
order by f4
select * from jk_v_View1

Item 4

When you are making modifications to the view, and if you wish to see the current definition of the view, you can use:

sp_helptext jk_v_View1

and this one to see all the views, you own:

select * from information_schema.views

Item 5

The column names in the View and the underlying table/s need not be the same.

alter view jk_v_View1(fa, fb, fc, fd)
as
select * from jk_t_Source1

or you can chose only a subset of the columns

alter view jk_v_View1(fa, fc, fd)
as
select f1,f3,f4 from jk_t_Source1

and if you need the same column more than once, you could do that as well.

alter view jk_v_View1(fa1, fb1, fa2, fb2)
as
select f1, f2, f1, f2 from jk_t_Source1

you can return a computed column, like this

alter view jk_v_View1(f1)
as
select f1 + 10 from jk_t_Source1

You can modify the data types if you wish

alter view jk_v_View1
as
select f1, f2, f3, cast(f4 as varchar(19)) f4 from jk_t_Source1

and you can use the following statements/SPs to verify that the data types are indeed different.

sp_columns jk_t_source1
sp_columns jk_v_View1
select column_name, data_type from information_schema.columns
where table_name = 'jk_t_source1'
select column_name, data_type from information_schema.columns
where table_name = 'jk_v_View1'

The select statement we used inside the view is a very simple one. It could be any complex SQL, involving joins etc. You could also call functions like this:

alter view jk_v_View1
as
select dbo.jk_f_funcScalar() col1,*  from jk_f_funcNonScalar()

Here are the two functions for the above example:

create function jk_f_funcNonScalar()
returns @t1 table (f1 int,f2 char (10),f3 varchar (40),f4 datetime)
as
begin
insert into @t1 select * from jk_t_source1
return
end
create function jk_f_funcScalar()
returns int
as
begin
declare @t1 int
set @t1 = 5
return @t1
end

Non-scalar functions can be used only in the from clause of a select statement, not in the field list. Vice-versa, scalar functions can be used only in the field list and not in the from clause.

Item 6

Now with fundamentals out of our way, we can start looking at some advanced features. The first thing I would like to clear off our plate, is SCHEMABINDING because it is going to come in our way in many places.

Schemabinding

Schemabinding basically takes care of dependencies. Let us see with an example. Use the Non Scalar version we used earlier. Schemabinding is not applicable for Scalar functions.

create function jk_f_funcNonScalar()
returns @t1 table (f1 int,f2 char (10),f3 varchar (40),f4 datetime)
as
begin
insert into @t1 select * from jk_t_source1
return
end

Use it in a view, like this

alter view jk_v_View1
as
select * from dbo.jk_f_funcNonScalar()

Now drop the function

drop function jk_f_funcNonScalar

And try to use the view

select * from jk_v_View1

And you would get the following error

Server: Msg 208, Level 16, State 1, Procedure jk_v_View1, Line 3

Invalid object name 'dbo.jk_f_funcNonScalar'.

Server: Msg 4413, Level 16, State 1, Line 1

Could not use view or function 'jk_v_View1' because of binding errors.

Now do this. We dropped jk_f_funcNonScalar, so recreate it for proceeding further. Let us use Schemabinding now. You have to use schemabinding for the 'function the view calls' to make the view a schemabinding object. So let us first use schemabinding for the function.

alter function jk_f_funcNonScalar()
returns @t1 table (f1 int,f2 char (10),f3 varchar (40),f4 datetime)
WITH SCHEMABINDING
as
begin
insert into @t1 select f1,f2,f3,f4 from dbo.jk_t_source1
return
end
alter view jk_v_View1 WITH SCHEMABINDING
as
select f1,f2,f3,f4 from dbo.jk_f_funcNonScalar()

Now drop the function again

drop function jk_f_funcNonScalar

And you will see this message (this is because the VIEW is now SCHEMA bound)

Server: Msg 3729, Level 16, State 1, Line 1

Cannot DROP PROCEDURE 'jk_f_funcNonScalar' because it is being referenced by object 'jk_v_View1'.

When the View is SCHEMA bound, and any dependant objects are affected, SQL Server will give an error.

here is another example of a problem when a view is not schema bound. drop and recreate the view and the function. While doing so, remove the SCHEMABINDING option from the view and the function. Then modify the function as follows. Note that the function now returns only three fields.

alter function jk_f_funcNonScalar()
returns @t1 table (f1 int,f2 char (10),f3 varchar (40))
--WITH SCHEMABINDING
as
begin
insert into @t1 select f1,f2,f3 from dbo.jk_t_source1
return
end

and try to use the view

select * from jk_v_View1

and you should get the following error:

Server: Msg 207, Level 16, State 3, Procedure jk_v_View1, Line 3

Invalid column name 'f4'.

Server: Msg 4413, Level 16, State 1, Line 1

Could not use view or function 'jk_v_View1' because of binding errors.

So much for schemabinding option. We touched upon Schemabinding because, the next topic we are going to discuss, Indexed Views, requires that Views be schema bound to be indexed.

So just to wind up schemabinding, re-create the function and the view with the SCHEMABINDING option. (here is the code)

alter function jk_f_funcNonScalar()
returns @t1 table (f1 int,f2 char (10),f3 varchar (40),f4 datetime)
WITH SCHEMABINDING
as
begin
insert into @t1 select f1,f2,f3,f4 from dbo.jk_t_source1
return
end
alter view jk_v_View1 WITH SCHEMABINDING
as
select f1,f2,f3,f4 from dbo.jk_f_funcNonScalar()

We cannot use the "*" in select statements when SCHEMABINDING is used. You would get the following error:

Server: Msg 1054, Level 15, State 6, Procedure jk_v_View1, Line 3

Syntax '*' is not allowed in schema-bound objects.

Item 7 - Indexed Views

The first index you create on a view has to be a 'unique clustered index'. The view cannot have any nondeterministic expressions. Our view is not a qualified candidate to be indexed, as its state is currently. You would get the following error if you tried to index it.

Server: Msg 1943, Level 16, State 1, Line 1

Index cannot be created on view 'MYDB.dbo.jk_v_View1' because the view has

one or more nondeterministic expressions.

From MS documentation: getdate() and rand() (non-deterministic functions) are allowed in views.

A clustered index cannot be created on a view if the view references any nondeterministic functions.

One of the properties SQL Server records for user-defined functions is whether the function is deterministic. A nondeterministic user-defined function cannot be invoked by either a view or computed column if you want to create an index on the view or computed column.

Let us change the view definition, like this

alter view jk_v_View1 WITH SCHEMABINDING
as
select f1,f2,f3,f4 from dbo.jk_t_source1

and create various indexes.

create unique clustered index jk_i_View1 on jk_v_View1(f1)
create unique index jk_i_View1_2 on jk_v_View1(f2)
create index jk_i_View1_3_4 on jk_v_View1(f3,f4)

If you drop a clustered index on a view, all the other indexes on the view are automatically dropped by SQL Server. My suggestion will be to not create indexes on views, unless you visualize a perceivable difference in performance, compared to the hassles of maintaining indexed views. I am not sure if any other feature in SQL Server is so restrictive to use/implement and maintain.

Item 8 - Partitioned Views

Data can be horizontally partitioned for performance reasons and load balancing. Horizontal partitioning refers to splitting a single table into multiple tables. Example, if a table had five billion rows, and almost all transactions in your system are accessing only the one billion records pertaining to the current year and very rarely, some parts of the application access the older data.

Obviously, it makes sense for any DML to work only with the 1 billion records in question rather than the entire set. At the same time, it may not be possible to eliminate the 4 billion records entirely into an archive, because some parts of the application still needs that old data as well.

SQL Server has a very simple way of achieving horizontal partitioning. We will first see how you can benefit from using such a facility for data retrieval. We will look at other DML operations at the end.

Let us start by creating the tables required for our example. Note the check constraint on each one of these tables. We will discuss about these a little later.

Create the following four tables.

drop table jk_horiz_300000
drop table jk_horiz_1
drop table jk_horiz_2
drop table jk_horiz_3
drop view jk_v_horizpart
create table jk_horiz_300000(f1 int primary key check(f1 between     1 and 
300000), f2 varchar(10))
create table jk_horiz_1(f1 int primary key 
    check(f1 between 1 and 100000), f2 varchar(10))
create table jk_horiz_2(f1 int primary key 
    check(f1 between 100001 and 200000), f2 varchar(10))
create table jk_horiz_3(f1 int primary key 
    check(f1 between 200001 and 300000), f2 varchar(10))

Then create the following view.

create view jk_v_horizpart
as
select * from jk_horiz_1
union all
select * from jk_horiz_2
union all
select * from jk_horiz_3

You can use the following insert scripts to populate the tables. If possible build it with a billion records each. With small amounts of data, you will not be able to see any visible difference in the time taken. However, I will demonstrate two other visual clues to prove that this feature makes a real big difference. If you change the amount of data, than what I am using, then you got to change the constraints on the tables too accordingly

declare @temp int
set @temp = 1
while @temp <= 300000
begin
insert into jk_horiz_300000 values(@temp,'aaa')
set @temp = @temp + 1
end
declare @temp int
set @temp = 1
while @temp <= 100000
begin
insert into jk_horiz_1 values(@temp,'aaa')
set @temp = @temp + 1
end
declare @temp int
set @temp = 100001
while @temp <= 200000
begin
insert into jk_horiz_2 values(@temp,'aaa')
set @temp = @temp + 1
end
declare @temp int
set @temp = 200001
while @temp <= 300000
begin
insert into jk_horiz_3 values(@temp,'aaa')
set @temp = @temp + 1
end

Now in Query Analyzer, activate the Query->Show Execution Plan option and execute the following queries in a batch. (I mean execute both the queries at the same time, by highlighting them) This will enable you to see the execution plan for both the queries together.

select * from jk_horiz_300000 where f1 = 200700
select * from jk_v_horizpart where f1 = 200700

Please observe the Execution plan for the second query, and it would access only one of the tables. Try another query and you can observe the same.

select a.f1, b.f1
from jk_horiz_300000 a 
    inner join jk_v_horizpart b 
    on a.f1 = b.f1 and a.f1 = 200700

You can try some more combinations. SQL Server uses the optimizer based on the check constraint on the table and the where clause in the query when you use the view. Using these two, SQL Server can decide as to which table to go to, to get the data.

Distributed partitioned Views

Now if you have access to two physical servers, you can place one of the partitioned tables on a different server and modify the view to access the table by using a four part naming convention, like

select * from [servername].[dbname].[owner].[tablename]

You will notice that, if your query on the view does not pertain to the data residing on the second server, there will be no network traffic to that server. Also multiple requests on these tables with varying where clauses can be distributed across physical servers, thus achieving a fair amount of load balancing. With this horizontal partitioning will improve performance drastically in high data volume systems.

Now a look at the other DML statements that you can use. While we can use insert, update and delete statements on partitioned views, unfortunately all restrictions that apply to normal views, while using these DML statements, apply to partitioned views as well. As I always keep saying, I doubt whether there is any other feature in SQL Server which is as restrictive in nature as the applicability of these DML statements on views.

Take, for example, a restriction like 'INSERT statements are not allowed if a member table contains a column with an identity property.' This is just one. In my system almost all tables have a identity field. The only way I can overcome this restriction is by using the IDENTITY_INSERT ON/OFF feature, in which case it does not make any sense to have the identity field in the table in the first place.

Data warehouse applications can benefit obviously, if designed with all these restrictions in mind. Just a note before closing, Vertical partitioning is achieved by splitting a single table into multiple tables based on the usage of columns. All the tables will have the same number of rows but different set of columns.

Conclusion

In this write up, we started with the simplest of views and went on to see some real advanced features. Hope it gives an idea of how powerful views are in SQL Server 2000

Thankz for reading!!!

Views and Horizontal Partitioning

Conclusion

Rate

Share

Categories

Share

Rate

Views and Horizontal Partitioning

Conclusion

Rate

Share

Categories

Share

Rate

Related content

Our Very Own SQL Server Data Compression

A surrogate key architecture to perform powerful database operations

Database Design and Reference Tables

SQL Building Blocks and Server Settings

Setting Matters Straight