Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Spring Clean Your Database Schema Expand / Collapse
Author
Message
Posted Saturday, January 17, 2009 12:33 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, June 18, 2014 10:03 PM
Points: 48, Visits: 836
Comments posted to this topic are about the item Spring Clean Your Database Schema
Post #638742
Posted Monday, January 19, 2009 6:07 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 9:31 AM
Points: 2,246, Visits: 554
Since this only works in SQL 2005 (due to the "OVER" clause on the "COUNT(*)" statement), you could replace the temporary table with a CTE.

Alternatively, for SQL 2000, you could use a nested query, and order by "COUNT(COLUMN_NAME) DESC" instead.



Post #639253
Posted Monday, January 19, 2009 6:37 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, April 27, 2012 8:54 AM
Points: 6, Visits: 36
Half the problem here is that SQL isn't very good at relational 'domains' i.e. a standard definition of a 'type' of column, with predefinied rules (constraints), datatype, valid values etc.

Also missing is the ability to abstract types i.e. when defining parameters in t/sql you can't define them based on a domain, or even an existing column (in ORACLE this is done with the '%' operator e.g.
title                   book.title%TYPE 

)

Finally, the lack of a 'record' datatype also makes life difficult, as you can't easily define a full set of local variables that map the types of a row in a table, resulting ineven MORE duplicate definitions e.g:
--define a PL/SQL table containing entries of type book_rec:
Type book_rec_tab IS TABLE OF book_rec%TYPE
INDEX BY BINARY_INTEGER;

Personally it's about time this stuff arrived! It would do more for consistency in typing than any amount of tools.

However, this is a useful stopgap tool - many thanks to the author!
Post #639278
Posted Monday, January 19, 2009 7:09 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, January 27, 2014 10:14 AM
Points: 1,322, Visits: 1,091
To make this work in SQL 2000
Change the first Query to:

-- Calculate prevalence of column name
SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.total) as [%]
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
CROSS JOIN
(SELECT Count(*) as total FROM INFORMATION_SCHEMA.COLUMNS) t
GROUP BY COLUMN_NAME, t.total
-- Do the columns differ on datatype across the schemas and tables?

--
JimFive
Post #639300
Posted Monday, January 19, 2009 8:06 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 7:34 AM
Points: 1,554, Visits: 1,848
Ha! Just last Friday, I discovered I could not define a foreign key because the datatype was different between two tables. The first thing on my To-Do list for today was to check all the tables to see what other tables might be using the incorrect datatype.
Thanks for writing the query I need for me!
Post #639350
Posted Monday, January 19, 2009 8:46 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 7:34 AM
Points: 1,554, Visits: 1,848
I decided I wanted the Prevalence to be the percentage of all tables not of all columns, so I used James Goodwin's code and modified it to get the total table count.

SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.totalTables) as [%]
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
CROSS JOIN
(SELECT Count(*) as totalTables FROM INFORMATION_SCHEMA.TABLES ) t
GROUP BY COLUMN_NAME, t.totalTables

Select * from #Prevalence gives me useful information, all by itself.
Post #639370
Posted Monday, January 19, 2009 3:00 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, October 1, 2010 5:55 PM
Points: 2, Visits: 21
In SQL Server 2008 in the first statement (SET TRANSACTION)

ISOLATIONLEVEL should be

ISOLATION LEVEL

Herman
Post #639538
Posted Wednesday, January 28, 2009 12:47 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 11:42 PM
Points: 1,352, Visits: 982
Ian,

I only wanted to see the column names that appear more than once, so that I exclude all the columns that are ok.

So I re-wrote your query to:
-- From http://www.sqlservercentral.com/articles/Admin/65138/


/*----------------------------------------------------------------------
Purpose: Identify columns having different datatypes, for the same column name.
Sorted by the prevalence of the mismatched column.
------------------------------------------------------------------------
Revision History:
06/01/2008 Ian_Stirk@yahoo.com Initial version.
2009-01-28 hsp@stovi.com; only show the column names that appear more than once
-----------------------------------------------------------------------*/
-- Do not lock anything, and do not get held up by any locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

BEGIN TRY
DROP TABLE #Prevalence
END TRY
BEGIN CATCH
END CATCH
BEGIN TRY
DROP TABLE #FieldList
END TRY
BEGIN CATCH
END CATCH

-- Calculate prevalence of column name
SELECT
COLUMN_NAME
,[%] = CONVERT(DECIMAL(12,2),COUNT(COLUMN_NAME)* 100.0 / COUNT(*)OVER())
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY COLUMN_NAME
-- Do the columns differ on datatype across the schemas and tables?
SELECT DISTINCT
C1.COLUMN_NAME
, C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME AS TableName
, C1.DATA_TYPE
, C1.CHARACTER_MAXIMUM_LENGTH
, C1.NUMERIC_PRECISION
, C1.NUMERIC_SCALE
, [%]
INTO #FieldList
FROM INFORMATION_SCHEMA.COLUMNS C1
INNER JOIN INFORMATION_SCHEMA.COLUMNS C2 ON C1.COLUMN_NAME = C2.COLUMN_NAME
INNER JOIN #Prevalence p ON p.COLUMN_NAME = C1.COLUMN_NAME
WHERE ((C1.DATA_TYPE != C2.DATA_TYPE)
OR (C1.CHARACTER_MAXIMUM_LENGTH != C2.CHARACTER_MAXIMUM_LENGTH)
OR (C1.NUMERIC_PRECISION != C2.NUMERIC_PRECISION)
OR (C1.NUMERIC_SCALE != C2.NUMERIC_SCALE))
ORDER BY [%] DESC, C1.COLUMN_NAME, C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME
-- Tidy up.
DROP TABLE #Prevalence

--SELECT * FROM #fieldList ORDER BY column_name

;
WITH cte (column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale, UsedInTables, rownum) as
(
SELECT column_name
, MAX(data_type) AS Data_type, MAX(Character_maximum_length) AS Character_maximum_length, MAX(Numeric_precision) AS Numeric_precision
, MAX(Numeric_Scale) AS Numeric_Scale
, (SELECT TableName + ', ' FROM #fieldList F1
WHERE f1.Column_name= F.Column_name
AND (f1.Data_type = F.Data_Type OR F.Data_Type IS NULL)
AND (f1.Character_maximum_length = f.Character_maximum_length OR f.Character_maximum_length IS NULL)
AND (f1.Numeric_precision = f.Numeric_precision OR f.Numeric_precision IS NULL)
AND (f1.Numeric_Scale = f.Numeric_Scale OR f.Numeric_Scale IS NULL)
for xml path('')) AS UsedInTables
, ROW_NUMBER() OVER ( PARTITION BY column_name ORDER BY data_type, Character_maximum_length) AS rownum
FROM #fieldList F
WHERE
f.tablename NOT LIKE 'tmp%'
GROUP BY column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale
)

SELECT c1.* FROM cte c1
INNER JOIN cte c2 ON c1.column_name = c2.Column_name
WHERE c2.rownum=2

Thank you very much for your initial script.
Best regards,

Henrik Staun Poulsen
Stovi Software



Post #644717
Posted Tuesday, March 11, 2014 7:11 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 12:35 PM
Points: 1,304, Visits: 1,287
A bit late but I made the following updates which were useful to me -- flag table versus view and identify if a field is an identity field. Maybe someone else will as well...

/*----------------------------------------------------------------------
Purpose: Identify columns having different datatypes, for the same column name.
Sorted by the prevalence of the mismatched column.
------------------------------------------------------------------------
Revision History:
06/01/2008 Ian_Stirk@yahoo.com Initial version.
03/11/2014 Martin McDonald -- Modified to add is_table and is_identity information
-----------------------------------------------------------------------*/
-- Do not lock anything, and do not get held up by any locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- Calculate prevalence of column name
SELECT
COLUMN_NAME
,[%] = CONVERT(DECIMAL(12,2),COUNT(COLUMN_NAME)* 100.0 / COUNT(*)OVER())
INTO #Prevalence
FROM INFORMATION_SCHEMA.COLUMNS
GROUP BY COLUMN_NAME
-- Do the columns differ on datatype across the schemas and tables?
SELECT DISTINCT
C1.COLUMN_NAME
, C1.TABLE_SCHEMA
, C1.TABLE_NAME
, case
when t.name is null then 'View'
else 'Table'
end as is_table
, c.is_identity
, C1.DATA_TYPE
, C1.CHARACTER_MAXIMUM_LENGTH
, C1.NUMERIC_PRECISION
, C1.NUMERIC_SCALE
, [%]
FROM INFORMATION_SCHEMA.COLUMNS C1
INNER JOIN INFORMATION_SCHEMA.COLUMNS C2 ON C1.COLUMN_NAME = C2.COLUMN_NAME
INNER JOIN #Prevalence p ON p.COLUMN_NAME = C1.COLUMN_NAME
left outer join sys.tables t on c1.TABLE_NAME = t.name
left outer join sys.columns c on t.object_id = c.object_id
and c1.COLUMN_NAME = c.name
WHERE ((C1.DATA_TYPE != C2.DATA_TYPE)
OR (C1.CHARACTER_MAXIMUM_LENGTH != C2.CHARACTER_MAXIMUM_LENGTH)
OR (C1.NUMERIC_PRECISION != C2.NUMERIC_PRECISION)
OR (C1.NUMERIC_SCALE != C2.NUMERIC_SCALE))
ORDER BY [%] DESC, C1.COLUMN_NAME, C1.TABLE_SCHEMA, C1.TABLE_NAME
-- Tidy up.
DROP TABLE #Prevalence
Post #1549720
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse