SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


New Column Updates


New Column Updates

Author
Message
Vic Kirkpatrick-173212
Vic Kirkpatrick-173212
Right there with Babe
Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)Right there with Babe (769 reputation)

Group: General Forum Members
Points: 769 Visits: 76
Why would anyone have a clustered index on an NVARCHAR(341) column? Just curious because this is a main part of the problem to begin with. Hopefully this does not get inserted by a user on a table this size.
Leo Peysakhovich
Leo Peysakhovich
SSC Eights!
SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)SSC Eights! (956 reputation)

Group: General Forum Members
Points: 956 Visits: 367
If DBA is not in total control of table indexes then ....... I can understand that dba is not always in control with database design but nobody tells dba where and how to use indexes but dba. It means that smart way to do the task was 1. analyze indexes 2. use standard way for the update.



Bimal Fernando
Bimal Fernando
Mr or Mrs. 500
Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)

Group: General Forum Members
Points: 515 Visits: 87

Remeber very carefully. Don't ever set a default value when you add a new column to a table with such a large number of rows. Becasue it is no different to addding the column as null and then issue one transaction to update all the 35 million records with the default value. It's the worst way of doing this. That'll load the whole server and block the table for a very long time as well.


Bimal Fernando
Bimal Fernando
Mr or Mrs. 500
Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)

Group: General Forum Members
Points: 515 Visits: 87

Yes, I agree with you guys. the Design is crap here, but my solution is for people who cannot change the whole design to add a new column to a table like this and to explain how the best way to traverse through a table to do this type of an update.

I am sure there are lot of people who were trying to do this type of thing and waited so long it to finish saying to themlesves "This is a very large table, no wonder it takes so long". This will help them to understand the underline operation bit better and approach it more efficiently next time.

And others who are designing the tables, now you know the troubles ahead if you didn't design it properly at the first time.


Bimal Fernando
Bimal Fernando
Mr or Mrs. 500
Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)

Group: General Forum Members
Points: 515 Visits: 87

OK, firstly sorry if my writing is bit confusing. i'll try to clear it up with this explanation.

My first example which updates the newzip column, is actually to demonstrate the method widely suggested to follow when you add a new column to a very large table. That's just an example, and i first used that method to do the update on my own table called [LargeTable] and that's the one took 20+ hours. And i haven't posted it in the article because the provided example (with NewZip column) is good enough to understand what it does.

But for the solution i am suggesting, i have given the exact code i ran on my table [LargeTable].

Hope this clears the confusion.


SDM
SDM
Mr or Mrs. 500
Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)Mr or Mrs. 500 (566 reputation)

Group: General Forum Members
Points: 566 Visits: 401

I was going to keep out of this until the statement about the design being crap.

The clustered index is there - on that specific column - because it was designed that way for the critical operations on that DB. The performance-critical activities are searches on data, where the search criteria (usually) translate quite nicely into range searches on the column implemented by the clustered index. There is no other column on that table which would give the slightest benefit in a range search.

As to length... yes, that does look a bit silly. It is in fact accommodating the largest value that could be constructed as the search key without truncation. Does it really need to be that big? No. Nearly 98% of values are 25 characters or less, and the maximum is 56.

As to inserting / page splits. Yep, the effect would be rather worse than if (say) the identity column was used, but (a) the column is never updated, (b) inserts are never done online and (c) the usual insert rate is very low (well, there is one occasional batch process that's a bit heavy, and a bit of a dog, but that's a separate thread entirely).

What are my motives in responding? Well, yes, I had some involvement in this design, but more important is the principle. Don't automatically assume character columns are bad in clustered indexes. A proper design needs to understand ALL the intended uses of the table, and make indexing decisions accordingly.

An interesting article from Chris Hedgate is at: http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp





GSquared
GSquared
SSC Guru
SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)

Group: General Forum Members
Points: 57369 Visits: 9730
I have to agree with SDM. Just because, in general, a clustered index on an alpha field (char, nchar, varchar, nvarchar) is usually a bad idea, doesn't mean it's always a bad idea. Keep in mind that these things are guidelines, not laws.

A simple example is a list of US states. I have the clustered index (and primary key) on the 2-letter abbreviation in my databases. Why wouldn't I?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
sqlJunkies2
sqlJunkies2
Valued Member
Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)

Group: General Forum Members
Points: 55 Visits: 41
I saw the range table creation script and think that we can create it with another twist by utilizing Izik-Ben-Gan "Sequential Number table function" without using loop.

Result expected

RangeStart RangeEnd
AA AB
AB AC
....

The final solution be like one line of query.

WITH
L0 AS (
SELECT CHAR(n) AS Alphabet FROM dbo.[fn_nums](90) WHERE n >=65)
SELECT
A.Alphabet + B.Alphabet AS RangeStart,
A.Alphabet + char(ascii(B.Alphabet)+1) AS RangeEnd
FROM L0 AS A, L0 AS B
WHERE
A.Alphabet + B.Alphabet <> 'ZZ'
ORDER BY
A.Alphabet + B.Alphabet


Basically the first CTE

L0 AS (SELECT CHAR(n) AS Alphabet FROM dbo.[fn_nums](90) WHERE n >=65)

Will give us 'A' to 'Z' and then we cross join itself with the second column is the rangeEnd with second alphabet is the next alphabet of the first column. (It's easier just try to match the result to the query)


But before that can run we'll need to create THE function

CREATE FUNCTION [dbo].[fn_nums](@n AS BIGINT) RETURNS TABLE
AS
RETURN
WITH
L0 AS(SELECT 1 AS c UNION ALL SELECT 1),
L1 AS(SELECT 1 AS c FROM L0 AS A, L0 AS B),
L2 AS(SELECT 1 AS c FROM L1 AS A, L1 AS B),
L3 AS(SELECT 1 AS c FROM L2 AS A, L2 AS B),
L4 AS(SELECT 1 AS c FROM L3 AS A, L3 AS B),
L5 AS(SELECT 1 AS c FROM L4 AS A, L4 AS B),
Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS n FROM L5)
SELECT n FROM Nums WHERE n <= @n;

This function will return a table result with a single column n and start from 1 to the parameter @n super fast. Very useful function check out Inside TSQL querying book by Izik Ben Gan.

James
Bimal Fernando
Bimal Fernando
Mr or Mrs. 500
Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)Mr or Mrs. 500 (515 reputation)

Group: General Forum Members
Points: 515 Visits: 87
GSquared (4/7/2008)
I have to agree with SDM. Just because, in general, a clustered index on an alpha field (char, nchar, varchar, nvarchar) is usually a bad idea, doesn't mean it's always a bad idea. Keep in mind that these things are guidelines, not laws.

A simple example is a list of US states. I have the clustered index (and primary key) on the 2-letter abbreviation in my databases. Why wouldn't I?


I agree. But please note that I meant to say NVARCHAR(341) is not a good choice for a clustered key and I was talking about very large table having around 30-40 million rows where defragmentation and page splits (reorganizing of pages) can be a real issue.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search