New Column Updates
Posted Tuesday, May 22, 2007 6:26 AM


SSC-Addicted


Group: General Forum Members
Last Login: Thursday, November 13, 2008 9:13 AM
Points: 499, Visits: 76
Why would anyone have a clustered index on an NVARCHAR(341) column? Just curious, because that is a major part of the problem to begin with. Hopefully rows aren't being inserted by users into a table this size on that key.
Post #367896
Posted Tuesday, May 22, 2007 6:46 AM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Wednesday, August 13, 2014 7:25 AM
Points: 138, Visits: 272
If the DBA is not in total control of the table indexes then.......   I can understand that the DBA is not always in control of the database design, but nobody decides where and how to use indexes except the DBA. That means the smart way to do this task was: 1. analyze the indexes, 2. use the standard approach for the update.
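For step 1, a minimal sketch of how one might review a table's indexes before planning a big update; dbo.LargeTable is just a stand-in name, not anything from the article.

-- List the indexes and their key columns on the target table.
SELECT  i.name          AS index_name,
        i.type_desc,                        -- CLUSTERED / NONCLUSTERED / HEAP
        c.name          AS column_name,
        ic.key_ordinal
FROM    sys.indexes AS i
JOIN    sys.index_columns AS ic
        ON ic.[object_id] = i.[object_id] AND ic.index_id = i.index_id
JOIN    sys.columns AS c
        ON c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
WHERE   i.[object_id] = OBJECT_ID('dbo.LargeTable')
ORDER BY i.index_id, ic.key_ordinal;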


Post #367909
Posted Wednesday, May 23, 2007 5:19 PM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Monday, October 7, 2013 6:12 PM
Points: 135, Visits: 87

Remember very carefully: don't ever set a default value when you add a new column to a table with such a large number of rows, because it is no different from adding the column as NULL and then issuing one transaction to update all 35 million records with the default value. It's the worst way of doing this. That will load the whole server and block the table for a very long time as well.
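A rough sketch of the contrast being described; dbo.LargeTable and NewZip are stand-in names, and this reflects SQL Server 2005-era behaviour.

-- The painful way: the default forces every existing row to be written in one transaction.
-- ALTER TABLE dbo.LargeTable
--     ADD NewZip CHAR(10) NOT NULL CONSTRAINT DF_LargeTable_NewZip DEFAULT '00000';

-- The kinder way: add the column as NULL (a metadata-only change), then backfill in small batches.
ALTER TABLE dbo.LargeTable ADD NewZip CHAR(10) NULL;
GO

WHILE 1 = 1
BEGIN
    UPDATE TOP (10000) dbo.LargeTable
    SET NewZip = '00000'
    WHERE NewZip IS NULL;

    IF @@ROWCOUNT = 0 BREAK;    -- nothing left to backfill
END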

Post #368468
Posted Wednesday, May 23, 2007 5:31 PM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Monday, October 7, 2013 6:12 PM
Points: 135, Visits: 87

Yes, I agree with you guys. The design is crap here, but my solution is for people who cannot change the whole design and still have to add a new column to a table like this, and it explains the best way to traverse the table to do this type of update.

I am sure there are a lot of people who tried to do this type of thing and waited ages for it to finish, saying to themselves, "This is a very large table, no wonder it takes so long." This will help them understand the underlying operation a bit better and approach it more efficiently next time.

And for those who are designing the tables: now you know the trouble ahead if you don't design them properly the first time.

Post #368470
Posted Wednesday, May 23, 2007 5:46 PM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Monday, October 7, 2013 6:12 PM
Points: 135, Visits: 87

OK, firstly, sorry if my writing is a bit confusing. I'll try to clear it up with this explanation.

My first example, which updates the NewZip column, is there to demonstrate the method widely suggested for adding a new column to a very large table. That's just an example; I first used that method for the update on my own table, called [LargeTable], and that's the one that took 20+ hours. I haven't posted that code in the article because the provided example (with the NewZip column) is good enough to show what it does.

But for the solution I am suggesting, I have given the exact code I ran on my table [LargeTable].

Hope this clears up the confusion.

Post #368471
Posted Wednesday, May 23, 2007 10:50 PM
Old Hand


Group: General Forum Members
Last Login: Tuesday, August 12, 2014 1:23 AM
Points: 363, Visits: 362

I was going to keep out of this until the statement about the design being crap. 

The clustered index is there - on that specific column - because it was designed that way for the critical operations on that DB.  The performance-critical activities are searches on data, where the search criteria (usually) translate quite nicely into range searches on the column implemented by the clustered index.  There is no other column on that table which would give the slightest benefit in a range search. 

As to length... yes, that does look a bit silly.  It is in fact accommodating the largest value that could be constructed as the search key without truncation.  Does it really need to be that big?  No.  Nearly 98% of values are  25 characters or less, and the maximum is 56. 

As to inserting / page splits.  Yep, the effect would be rather worse than if (say) the identity column was used, but (a) the column is never updated, (b) inserts are never done online and (c) the usual insert rate is very low (well, there is one occasional batch process that's a bit heavy, and a bit of a dog, but that's a separate thread entirely).

What are my motives in responding?  Well, yes, I had some involvement in this design, but more important is the principle.  Don't automatically assume character columns are bad in clustered indexes.  A proper design needs to understand ALL the intended uses of the table, and make indexing decisions accordingly.
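To illustrate the kind of query being described, here is a hypothetical range search; dbo.LargeTable and SearchKey are made-up names, not the real schema.

-- A prefix/range predicate like this resolves to a single clustered index seek over a
-- contiguous key range, which is exactly what the design above is optimising for.
SELECT COUNT(*)
FROM dbo.LargeTable
WHERE SearchKey >= N'AB'
  AND SearchKey <  N'AC';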

An interesting article from Chris Hedgate is at: http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp




Post #368525
Posted Monday, April 7, 2008 7:29 AM


SSChampion


Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 13,872, Visits: 9,596
I have to agree with SDM. Just because, in general, a clustered index on an alpha field (char, nchar, varchar, nvarchar) is usually a bad idea, doesn't mean it's always a bad idea. Keep in mind that these things are guidelines, not laws.

A simple example is a list of US states. I have the clustered index (and primary key) on the 2-letter abbreviation in my databases. Why wouldn't I?
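For instance, such a lookup table might look like this sketch (the names are illustrative, not taken from anyone's actual schema):

CREATE TABLE dbo.State
(
    StateCode CHAR(2)     NOT NULL
        CONSTRAINT PK_State PRIMARY KEY CLUSTERED,   -- clustered PK on the 2-letter code
    StateName VARCHAR(50) NOT NULL
);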


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #480788
Posted Tuesday, January 6, 2009 10:33 PM
Grasshopper


Group: General Forum Members
Last Login: Saturday, November 14, 2009 5:07 PM
Points: 15, Visits: 41
I saw the range table creation script and think we can create it with another twist by utilizing Itzik Ben-Gan's sequential-numbers table function, without using a loop.

Expected result:

RangeStart  RangeEnd
AA          AB
AB          AC
....

The final solution is a single query.

WITH
L0 AS (
    -- The letters 'A' (ASCII 65) through 'Z' (ASCII 90)
    SELECT CHAR(n) AS Alphabet FROM dbo.[fn_nums](90) WHERE n >= 65
)
SELECT
    A.Alphabet + B.Alphabet AS RangeStart,
    A.Alphabet + CHAR(ASCII(B.Alphabet) + 1) AS RangeEnd
FROM L0 AS A, L0 AS B                      -- cross join: every two-letter combination
WHERE A.Alphabet + B.Alphabet <> 'ZZ'
ORDER BY A.Alphabet + B.Alphabet


Basically, the first CTE

L0 AS (SELECT CHAR(n) AS Alphabet FROM dbo.[fn_nums](90) WHERE n >= 65)

gives us 'A' through 'Z', and then we cross join it with itself. The second column, RangeEnd, is the same pair with its second letter advanced to the next letter of the alphabet. (It's easiest to just match the result against the query.)


But before that can run, we'll need to create the function:

CREATE FUNCTION [dbo].[fn_nums](@n AS BIGINT) RETURNS TABLE
AS
RETURN
WITH
L0   AS (SELECT 1 AS c UNION ALL SELECT 1),               -- 2 rows
L1   AS (SELECT 1 AS c FROM L0 AS A, L0 AS B),            -- 4 rows
L2   AS (SELECT 1 AS c FROM L1 AS A, L1 AS B),            -- 16 rows
L3   AS (SELECT 1 AS c FROM L2 AS A, L2 AS B),            -- 256 rows
L4   AS (SELECT 1 AS c FROM L3 AS A, L3 AS B),            -- 65,536 rows
L5   AS (SELECT 1 AS c FROM L4 AS A, L4 AS B),            -- ~4.3 billion rows
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY c) AS n FROM L5)
SELECT n FROM Nums WHERE n <= @n;

This function returns a table with a single column n containing the numbers 1 through @n, and it is very fast. It's a very useful function; check out Itzik Ben-Gan's Inside Microsoft SQL Server: T-SQL Querying book.
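For context, a hedged sketch of how a range table like this is typically consumed to drive batched updates; dbo.Ranges, dbo.LargeTable, SearchKey and NewZip are assumed names, not code from the article.

-- Assuming the ranges query above was saved into dbo.Ranges(RangeStart, RangeEnd),
-- each pair bounds one small, index-friendly batch instead of a single huge UPDATE.
DECLARE @start NCHAR(2), @end NCHAR(2);

DECLARE range_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT RangeStart, RangeEnd FROM dbo.Ranges ORDER BY RangeStart;

OPEN range_cursor;
FETCH NEXT FROM range_cursor INTO @start, @end;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.LargeTable
    SET NewZip = '00000'
    WHERE SearchKey >= @start
      AND SearchKey <  @end;   -- each batch is a cheap seek on the clustered key

    FETCH NEXT FROM range_cursor INTO @start, @end;
END
CLOSE range_cursor;
DEALLOCATE range_cursor;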

James
Post #631196
Posted Wednesday, January 7, 2009 5:08 PM
SSC-Enthusiastic


Group: General Forum Members
Last Login: Monday, October 7, 2013 6:12 PM
Points: 135, Visits: 87
GSquared (4/7/2008)
I have to agree with SDM. Just because, in general, a clustered index on an alpha field (char, nchar, varchar, nvarchar) is usually a bad idea, doesn't mean it's always a bad idea. Keep in mind that these things are guidelines, not laws.

A simple example is a list of US states. I have the clustered index (and primary key) on the 2-letter abbreviation in my databases. Why wouldn't I?


I agree. But please note that I meant to say NVARCHAR(341) is not a good choice for a clustered key, and I was talking about a very large table with around 30-40 million rows, where fragmentation and page splits (page reorganization) can be a real issue.
Post #632012