SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Spring Clean Your Database Schema


Spring Clean Your Database Schema

Author
Message
ianstirk
ianstirk
SSC Veteran
SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)SSC Veteran (216 reputation)

Group: General Forum Members
Points: 216 Visits: 1037
Comments posted to this topic are about the item Spring Clean Your Database Schema
richardd
richardd
Hall of Fame
Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)Hall of Fame (3.2K reputation)

Group: General Forum Members
Points: 3176 Visits: 648
Since this only works in SQL 2005 (due to the "OVER" clause on the "COUNT(*)" statement), you could replace the temporary table with a CTE.

Alternatively, for SQL 2000, you could use a nested query, and order by "COUNT(COLUMN_NAME) DESC" instead.



Phil Morris-454316
Phil Morris-454316
SSC Rookie
SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)SSC Rookie (38 reputation)

Group: General Forum Members
Points: 38 Visits: 36
Half the problem here is that SQL isn't very good at relational 'domains' i.e. a standard definition of a 'type' of column, with predefinied rules (constraints), datatype, valid values etc.

Also missing is the ability to abstract types i.e. when defining parameters in t/sql you can't define them based on a domain, or even an existing column (in ORACLE this is done with the '%' operator e.g.
title                   book.title%TYPE 

)

Finally, the lack of a 'record' datatype also makes life difficult, as you can't easily define a full set of local variables that map the types of a row in a table, resulting ineven MORE duplicate definitions e.g:

--define a PL/SQL table containing entries of type book_rec:
Type book_rec_tab IS TABLE OF book_rec%TYPE
INDEX BY BINARY_INTEGER;



Personally it's about time this stuff arrived! It would do more for consistency in typing than any amount of tools.

However, this is a useful stopgap tool - many thanks to the author!
James Goodwin
James Goodwin
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1473 Visits: 1107
To make this work in SQL 2000
Change the first Query to:


-- Calculate prevalence of column name
SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.total) as [%]
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
CROSS JOIN
(SELECT Count(*) as total FROM INFORMATION_SCHEMA.COLUMNS) t
GROUP BY COLUMN_NAME, t.total
-- Do the columns differ on datatype across the schemas and tables?



--
JimFive
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2055 Visits: 1950
Ha! Just last Friday, I discovered I could not define a foreign key because the datatype was different between two tables. The first thing on my To-Do list for today was to check all the tables to see what other tables might be using the incorrect datatype.
Thanks for writing the query I need for me!
Carla Wilson-484785
Carla Wilson-484785
SSCrazy
SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)SSCrazy (2.1K reputation)

Group: General Forum Members
Points: 2055 Visits: 1950
I decided I wanted the Prevalence to be the percentage of all tables not of all columns, so I used James Goodwin's code and modified it to get the total table count.

SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.totalTables) as [%]
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
CROSS JOIN
(SELECT Count(*) as totalTables FROM INFORMATION_SCHEMA.TABLES ) t
GROUP BY COLUMN_NAME, t.totalTables

Select * from #Prevalence gives me useful information, all by itself.
herman-991596
herman-991596
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 22
In SQL Server 2008 in the first statement (SET TRANSACTION)

ISOLATIONLEVEL should be

ISOLATION LEVEL

Herman
henrik staun poulsen
henrik staun poulsen
SSCrazy
SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)SSCrazy (2.2K reputation)

Group: General Forum Members
Points: 2197 Visits: 1225
Ian,

I only wanted to see the column names that appear more than once, so that I exclude all the columns that are ok.

So I re-wrote your query to:
-- From http://www.sqlservercentral.com/articles/Admin/65138/


/*----------------------------------------------------------------------
Purpose: Identify columns having different datatypes, for the same column name.
Sorted by the prevalence of the mismatched column.
------------------------------------------------------------------------
Revision History:
06/01/2008 Ian_Stirk@yahoo.com Initial version.
2009-01-28 hsp@stovi.com; only show the column names that appear more than once
-----------------------------------------------------------------------*/
-- Do not lock anything, and do not get held up by any locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

BEGIN TRY
DROP TABLE #Prevalence
END TRY
BEGIN CATCH
END CATCH
BEGIN TRY
DROP TABLE #FieldList
END TRY
BEGIN CATCH
END CATCH

-- Calculate prevalence of column name
SELECT
COLUMN_NAME
,[%] = CONVERT(DECIMAL(12,2),COUNT(COLUMN_NAME)* 100.0 / COUNT(*)OVER())
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY COLUMN_NAME
-- Do the columns differ on datatype across the schemas and tables?
SELECT DISTINCT
C1.COLUMN_NAME
, C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME AS TableName
, C1.DATA_TYPE
, C1.CHARACTER_MAXIMUM_LENGTH
, C1.NUMERIC_PRECISION
, C1.NUMERIC_SCALE
, [%]
INTO #FieldList
FROM INFORMATION_SCHEMA.COLUMNS C1
INNER JOIN INFORMATION_SCHEMA.COLUMNS C2 ON C1.COLUMN_NAME = C2.COLUMN_NAME
INNER JOIN #Prevalence p ON p.COLUMN_NAME = C1.COLUMN_NAME
WHERE ((C1.DATA_TYPE != C2.DATA_TYPE)
OR (C1.CHARACTER_MAXIMUM_LENGTH != C2.CHARACTER_MAXIMUM_LENGTH)
OR (C1.NUMERIC_PRECISION != C2.NUMERIC_PRECISION)
OR (C1.NUMERIC_SCALE != C2.NUMERIC_SCALE))
ORDER BY [%] DESC, C1.COLUMN_NAME, C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME
-- Tidy up.
DROP TABLE #Prevalence

--SELECT * FROM #fieldList ORDER BY column_name

;
WITH cte (column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale, UsedInTables, rownum) as
(
SELECT column_name
, MAX(data_type) AS Data_type, MAX(Character_maximum_length) AS Character_maximum_length, MAX(Numeric_precision) AS Numeric_precision
, MAX(Numeric_Scale) AS Numeric_Scale
, (SELECT TableName + ', ' FROM #fieldList F1
WHERE f1.Column_name= F.Column_name
AND (f1.Data_type = F.Data_Type OR F.Data_Type IS NULL)
AND (f1.Character_maximum_length = f.Character_maximum_length OR f.Character_maximum_length IS NULL)
AND (f1.Numeric_precision = f.Numeric_precision OR f.Numeric_precision IS NULL)
AND (f1.Numeric_Scale = f.Numeric_Scale OR f.Numeric_Scale IS NULL)
for xml path('')) AS UsedInTables
, ROW_NUMBER() OVER ( PARTITION BY column_name ORDER BY data_type, Character_maximum_length) AS rownum
FROM #fieldList F
WHERE
f.tablename NOT LIKE 'tmp%'
GROUP BY column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale
)

SELECT c1.* FROM cte c1
INNER JOIN cte c2 ON c1.column_name = c2.Column_name
WHERE c2.rownum=2

Thank you very much for your initial script.
Best regards,

Henrik Staun Poulsen
Stovi Software



m mcdonald
m mcdonald
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1948 Visits: 1688
A bit late but I made the following updates which were useful to me -- flag table versus view and identify if a field is an identity field. Maybe someone else will as well...

/*----------------------------------------------------------------------
Purpose: Identify columns having different datatypes, for the same column name.
Sorted by the prevalence of the mismatched column.
------------------------------------------------------------------------
Revision History:
06/01/2008 Ian_Stirk@yahoo.com Initial version.
03/11/2014 Martin McDonald -- Modified to add is_table and is_identity information
-----------------------------------------------------------------------*/
-- Do not lock anything, and do not get held up by any locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- Calculate prevalence of column name
SELECT
COLUMN_NAME
,[%] = CONVERT(DECIMAL(12,2),COUNT(COLUMN_NAME)* 100.0 / COUNT(*)OVER())
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY COLUMN_NAME
-- Do the columns differ on datatype across the schemas and tables?
SELECT DISTINCT
C1.COLUMN_NAME
, C1.TABLE_SCHEMA
, C1.TABLE_NAME
, case
when t.name is null then 'View'
else 'Table'
end as is_table
, c.is_identity
, C1.DATA_TYPE
, C1.CHARACTER_MAXIMUM_LENGTH
, C1.NUMERIC_PRECISION
, C1.NUMERIC_SCALE
, [%]
FROM INFORMATION_SCHEMA.COLUMNS C1
INNER JOIN INFORMATION_SCHEMA.COLUMNS C2 ON C1.COLUMN_NAME = C2.COLUMN_NAME
INNER JOIN #Prevalence p ON p.COLUMN_NAME = C1.COLUMN_NAME
left outer join sys.tables t on c1.TABLE_NAME = t.name
left outer join sys.columns c on t.object_id = c.object_id
and c1.COLUMN_NAME = c.name
WHERE ((C1.DATA_TYPE != C2.DATA_TYPE)
OR (C1.CHARACTER_MAXIMUM_LENGTH != C2.CHARACTER_MAXIMUM_LENGTH)
OR (C1.NUMERIC_PRECISION != C2.NUMERIC_PRECISION)
OR (C1.NUMERIC_SCALE != C2.NUMERIC_SCALE))
ORDER BY [%] DESC, C1.COLUMN_NAME, C1.TABLE_SCHEMA, C1.TABLE_NAME
-- Tidy up.
DROP TABLE #Prevalence
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search