|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 6:22 AM
Points: 47,
Visits: 762
|
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Yesterday @ 11:03 AM
Points: 1,854,
Visits: 485
|
|
Since this only works in SQL 2005 (due to the "OVER" clause on the "COUNT(*)" statement), you could replace the temporary table with a CTE.
Alternatively, for SQL 2000, you could use a nested query, and order by "COUNT(COLUMN_NAME) DESC" instead.
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, April 27, 2012 8:54 AM
Points: 6,
Visits: 36
|
|
Half the problem here is that SQL isn't very good at relational 'domains' i.e. a standard definition of a 'type' of column, with predefinied rules (constraints), datatype, valid values etc.
Also missing is the ability to abstract types i.e. when defining parameters in t/sql you can't define them based on a domain, or even an existing column (in ORACLE this is done with the '%' operator e.g. title book.title%TYPE )
Finally, the lack of a 'record' datatype also makes life difficult, as you can't easily define a full set of local variables that map the types of a row in a table, resulting ineven MORE duplicate definitions e.g:
--define a PL/SQL table containing entries of type book_rec: Type book_rec_tab IS TABLE OF book_rec%TYPE INDEX BY BINARY_INTEGER;
Personally it's about time this stuff arrived! It would do more for consistency in typing than any amount of tools.
However, this is a useful stopgap tool - many thanks to the author!
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:52 PM
Points: 1,322,
Visits: 1,071
|
|
To make this work in SQL 2000 Change the first Query to:
-- Calculate prevalence of column name SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.total) as [%] INTO #Prevalence FROM INFORMATION_SCHEMA.COLUMNS CROSS JOIN (SELECT Count(*) as total FROM INFORMATION_SCHEMA.COLUMNS) t GROUP BY COLUMN_NAME, t.total -- Do the columns differ on datatype across the schemas and tables?
-- JimFive
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: 2 days ago @ 7:15 AM
Points: 1,352,
Visits: 1,738
|
|
Ha! Just last Friday, I discovered I could not define a foreign key because the datatype was different between two tables. The first thing on my To-Do list for today was to check all the tables to see what other tables might be using the incorrect datatype. Thanks for writing the query I need for me!
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: 2 days ago @ 7:15 AM
Points: 1,352,
Visits: 1,738
|
|
I decided I wanted the Prevalence to be the percentage of all tables not of all columns, so I used James Goodwin's code and modified it to get the total table count.
SELECT COLUMN_NAME, CONVERT(DECIMAL(12,2), Count(*)* 100.0/t.totalTables) as [%] INTO #Prevalence FROM INFORMATION_SCHEMA.COLUMNS CROSS JOIN (SELECT Count(*) as totalTables FROM INFORMATION_SCHEMA.TABLES ) t GROUP BY COLUMN_NAME, t.totalTables
Select * from #Prevalence gives me useful information, all by itself.
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, October 01, 2010 5:55 PM
Points: 2,
Visits: 21
|
|
In SQL Server 2008 in the first statement (SET TRANSACTION)
ISOLATIONLEVEL should be
ISOLATION LEVEL
Herman
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 1:49 AM
Points: 1,132,
Visits: 855
|
|
Ian,
I only wanted to see the column names that appear more than once, so that I exclude all the columns that are ok.
So I re-wrote your query to: -- From http://www.sqlservercentral.com/articles/Admin/65138/
/*---------------------------------------------------------------------- Purpose: Identify columns having different datatypes, for the same column name. Sorted by the prevalence of the mismatched column. ------------------------------------------------------------------------ Revision History: 06/01/2008 Ian_Stirk@yahoo.com Initial version. 2009-01-28 hsp@stovi.com; only show the column names that appear more than once -----------------------------------------------------------------------*/ -- Do not lock anything, and do not get held up by any locks. SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED BEGIN TRY DROP TABLE #Prevalence END TRY BEGIN CATCH END CATCH BEGIN TRY DROP TABLE #FieldList END TRY BEGIN CATCH END CATCH -- Calculate prevalence of column name SELECT COLUMN_NAME ,[%] = CONVERT(DECIMAL(12,2),COUNT(COLUMN_NAME)* 100.0 / COUNT(*)OVER()) INTO #Prevalence FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME -- Do the columns differ on datatype across the schemas and tables? SELECT DISTINCT C1.COLUMN_NAME , C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME AS TableName , C1.DATA_TYPE , C1.CHARACTER_MAXIMUM_LENGTH , C1.NUMERIC_PRECISION , C1.NUMERIC_SCALE , [%] INTO #FieldList FROM INFORMATION_SCHEMA.COLUMNS C1 INNER JOIN INFORMATION_SCHEMA.COLUMNS C2 ON C1.COLUMN_NAME = C2.COLUMN_NAME INNER JOIN #Prevalence p ON p.COLUMN_NAME = C1.COLUMN_NAME WHERE ((C1.DATA_TYPE != C2.DATA_TYPE) OR (C1.CHARACTER_MAXIMUM_LENGTH != C2.CHARACTER_MAXIMUM_LENGTH) OR (C1.NUMERIC_PRECISION != C2.NUMERIC_PRECISION) OR (C1.NUMERIC_SCALE != C2.NUMERIC_SCALE)) ORDER BY [%] DESC, C1.COLUMN_NAME, C1.TABLE_SCHEMA + '.' + C1.TABLE_NAME -- Tidy up. DROP TABLE #Prevalence --SELECT * FROM #fieldList ORDER BY column_name
; WITH cte (column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale, UsedInTables, rownum) as ( SELECT column_name , MAX(data_type) AS Data_type, MAX(Character_maximum_length) AS Character_maximum_length, MAX(Numeric_precision) AS Numeric_precision , MAX(Numeric_Scale) AS Numeric_Scale , (SELECT TableName + ', ' FROM #fieldList F1 WHERE f1.Column_name= F.Column_name AND (f1.Data_type = F.Data_Type OR F.Data_Type IS NULL) AND (f1.Character_maximum_length = f.Character_maximum_length OR f.Character_maximum_length IS NULL) AND (f1.Numeric_precision = f.Numeric_precision OR f.Numeric_precision IS NULL) AND (f1.Numeric_Scale = f.Numeric_Scale OR f.Numeric_Scale IS NULL) for xml path('')) AS UsedInTables , ROW_NUMBER() OVER ( PARTITION BY column_name ORDER BY data_type, Character_maximum_length) AS rownum FROM #fieldList F WHERE f.tablename NOT LIKE 'tmp%' GROUP BY column_name, Data_type, Character_maximum_length, Numeric_precision, Numeric_Scale )
SELECT c1.* FROM cte c1 INNER JOIN cte c2 ON c1.column_name = c2.Column_name WHERE c2.rownum=2
Thank you very much for your initial script. Best regards,
Henrik Staun Poulsen Stovi Software
|
|
|
|