Comments posted to this topic are about the item Data Profiling with T-SQL
Thanks for the excellent article, and the wonderfully useful script. I can already think of a number of instances where something like this could have been huge time savers.
One thing of note, there appears to be a character that got lost in translation along the way.
On Line 114: There appears to be a single quote (') that was incorrectly converted to value '#39;'
WHERE DATA_TYPE IN ('Date', 'DateTime', 'SmallDateTime', #39;DateTime2', 'time')
Thanks Ken, I will see if I can get this corrected.
Glad that you find this script useful.
Adam, excellent article and my favorite topic.
I have several articles and scripts also on Data Profiling via SSIS(DataFlow script component) and also TSQL.
[font="Comic Sans MS"]Ira Warren Whiteside[/font]
Nice script Adam, but I found another typo. About line 257, you'll find the following:
+ QUOTENAME(@COLUMN_NAME) + ')
NLL should be NULL, right?
Thanks for the script though. It works great after I cleaned up the 2 typos. 🙂
This is great stuff! Thanks very much!
It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens[/url]
Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
when I execute the attached script for my table and for all columns, I am getting the following error:
207 Invalid column name 'NLL'.
Here are my values:
DECLARE @TABLE_SCHEMA NVARCHAR(128) = 'dbo' -- Your schema here
DECLARE @TABLE_NAME NVARCHAR(128) = 'Customer' -- Your table here
DECLARE @ColumnListIN NVARCHAR(4000) = '' -- Enter a comma-separated list of specific columns
-- to profile, or leave blank for all
Please help me on this error.
Note to Author. This is a GREAT script but I also got the NLL error. Appears there's typo in line 257 ... just replace NLL with NULL and it works like a charm
This is a really useful script, thanks.
I was using it on a 160,000,000 row table and noticed that I got an arithmetic overflow error. Traced this down to the following part
, AVG(LEN(' + QUOTENAME(@COLUMN_NAME) + '))
I cast the result of the LEN as bigint before calculating the AVG to overcome this:
, AVG(CAST(LEN(' + QUOTENAME(@COLUMN_NAME) + ') AS BIGINT))
Thanks for the comment and I am glad that this was useful.
160 M records - wow!
All the best,
Excellent script... thank you.
I have two issues to report...
1 - DECLARE @ColumnList VARCHAR(4000) = ' ' - this is too short for tables with lots of columns... consider using a varchar(max) - or even a dynamic value like sum of the len() of all columns from a table etc...
2 - for columns which are BIT data type... I noticed that when all rows have value of "0", it doesn't get included in the results output... not sure why.
kkffjj - Saturday, April 15, 2017 3:08 PM
3 - you need to have a condition before using ISDATE()... as this function fails on values > 4000 characters in length.
The foreign key check does not work correctly.
On Line 404 there is an update with a join between the #ProfileData table and a CTE with the check for FK. The #ProfileData COLUMN_NAME field contains square brackets on the field names and the CTE does not.
I changed the UPDATE to this:
SET P.IsFK = 1
FROM #ProfileData P
INNER JOIN FK_CTE CTE
ON P.COLUMN_NAME = '[' + CTE.FKColumnName + ']'
The ISDATE function does not work for columns with long texts.
When I replaced
WHERE ISDATE(' + QUOTENAME(@COLUMN_NAME) + ') = 1)
WHERE TRY_CAST(' + QUOTENAME(@COLUMN_NAME) + ' AS Date) IS NOT NULL)
it worked fine.
Viewing 15 posts - 1 through 15 (of 17 total)