SSCrazy
      
Charles Kincaid (8/3/2010)
Great article. If you have to parse strings in SQL then having the tally table is great. I would have approached the problem with a different tool set: SELECT '|' + CountryName + '|' FROM Country. This would have shown me that there were hidden characters. Pasting a snippet into my favorite text editor and hovering over one of the bad apples would have gotten me to writing the replace statement. Yet since you have to show the value of Tally I think that you did it very well. I'm looking forward to the next installment. I'm still trying to sell certain management on the value of Tally and Dates.
You're looking forward to the next installment? I'd better come up with one then! :-) I'm sure I'll find more uses in what I'm doing currently and I'll write them up when I do.
-------------------------------------- When you encounter a problem, if the solution isn't readily evident go back to the start and check your assumptions. -------------------------------------- It’s unpleasantly like being drunk. What’s so unpleasant about being drunk? You ask a glass of water. -- Douglas Adams
SSC Rookie
      
Excel is not a friendly place to "clean things up". I have frequently found Excel to insert unfriendly characters into data, and often to truncate data strings.
Say Hey Kid
      
Charles Kincaid (8/3/2010)
Great article. If you have to parse strings in SQL then having the tally table is great. I would have approached the problem with a different tool set: SELECT '|' + CountryName + '|' FROM Country. This would have shown me that there were hidden characters. Pasting a snippet into my favorite text editor and hovering over one of the bad apples would have gotten me to writing the replace statement.
The above is a reasonably simple way of doing it; cutting and pasting from a cell in the grid into the editor in SSMS/Query Analyzer does the same thing, since you can see that there's something else there.
From the article, however, I spot what I consider to be the real issue: "I found the list for countries on Wikipedia. I copied the first 4 columns of the table, dropped them into Excel to clean them up and imported the result into a SQL Server table. I created the table and used the wizard to pull the data in."
While the tally table exercise was entertaining, what I see as the real issue is that the data that was imported into SQL Server was never actually inspected. In cases like this, I generally have two "standard" ways of doing things:
1) Generate a text file for a BULK INSERT/BCP, then pull it up in a hex editor (HxD or your favorite) to see what's what. This instantly shows you everything, right down to end of lines, end of pages, and so on.
2) Use Excel to generate INSERT statements, i.e. =CONCATENATE("INSERT INTO table VALUES ('",A1,"')"). Again, you'd see the extra characters instantly between the tick marks in your INSERT statement.
I try to always look at the source data first; looking at the end result generally takes longer.
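If the data has already landed in a table, a similar "look at the raw bytes" check can be approximated inside SQL Server. This is only a sketch, not the hex-editor method described above: the table variable @Country and its sample rows are hypothetical stand-ins for the imported Country table, and treating CHAR(160) as a non-breaking space assumes a Latin1-style code page.

```sql
-- Hypothetical sample rows standing in for the imported Country data.
DECLARE @Country TABLE (CountryName VARCHAR(100));
INSERT INTO @Country VALUES ('France');
INSERT INTO @Country VALUES ('Spain' + CHAR(160)); -- trailing non-breaking space (assumed code page)
INSERT INTO @Country VALUES ('Italy' + CHAR(9));   -- trailing tab

SELECT '|' + CountryName + '|'              AS Delimited, -- delimiters make stray characters easier to spot
       LEN(CountryName)                     AS CharLen,   -- LEN ignores trailing spaces only
       DATALENGTH(CountryName)              AS ByteLen,   -- bytes actually stored
       CONVERT(VARBINARY(100), CountryName) AS HexBytes   -- raw bytes, shown as hex in the results grid
FROM @Country;
```

Any byte in HexBytes that does not belong to a visible character of the name is a candidate for the REPLACE statement.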
That said, an interesting use of the tally table character splitter technique.
SSCrazy
      
Thanks for the article. I have actually used this technique myself and find it very useful. I do a little more filtering to get the unwanted characters. Here is something similar to what I use.

-- Sample data: some rows contain control characters, others only allowed punctuation
DECLARE @t TABLE(col VARCHAR(MAX));
INSERT INTO @t VALUES ('zzzzzzzzz' + CHAR(13));
INSERT INTO @t VALUES ('zzzzzzzzz' + CHAR(11) + CHAR(13));
INSERT INTO @t VALUES ('zzzz');
INSERT INTO @t VALUES ('zzzz-');
INSERT INTO @t VALUES ('zz.zz');
INSERT INTO @t VALUES ('zz?zz');

-- Return every character outside the allowed set, with its position and ASCII code
-- dbo.[Numbers] is the tally table; n is its integer column
SELECT col,
       n AS Pos,
       SUBSTRING(col,n,1) AS [ASCII_Char],
       ASCII(SUBSTRING(col,n,1)) AS [ASCII_Cd]
FROM @t t
INNER JOIN dbo.[Numbers] n ON n.n <= LEN(col)
WHERE SUBSTRING(col,n,1) LIKE '[^A-Za-z0-9/-.?]' ESCAPE '/' --use escape to build your exception list
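The query above needs a dbo.Numbers table to exist. For anyone who does not have one yet, here is a minimal sketch of one common way to build it. The row count of 10,000 and the constraint name PK_Numbers are assumptions for illustration; the only requirement is that n covers the longest string being inspected.

```sql
-- Build a tally table named to match the query above (dbo.Numbers, column n).
-- The cross join of a system view is used purely as a convenient row source.
SELECT TOP (10000)
       IDENTITY(INT, 1, 1) AS n
INTO   dbo.Numbers
FROM   sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

-- A clustered primary key keeps the range join (n <= LEN(col)) efficient.
ALTER TABLE dbo.Numbers
  ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (n);
```

With VARCHAR(MAX) columns, size the table for the longest value you actually expect rather than the theoretical maximum.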
My blog: http://jahaines.blogspot.com