
Bad data import
Posted Wednesday, April 28, 2010 9:26 PM


Hall of Fame

Comments posted to this topic are about the item Bad data import

____________________________________________
Space, the final frontier? not any more...
All limits henceforth are self-imposed.
“libera tute vulgaris ex”
Post #912500
Posted Thursday, April 29, 2010 12:49 AM


SSChampion

Very nice question. I work a lot with SSIS, but I hadn't heard of this function yet.



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #912578
Posted Thursday, April 29, 2010 12:57 AM


SSCertifiable

I got it right, because of elimination, but I don't understand the entire scenario.

Since the destination column is NVARCHAR(26), how can a value be outside the codepage range? I thought codepages were relevant for non-Unicode data only?



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #912583
Posted Thursday, April 29, 2010 4:42 AM


Hall of Fame

Hugo Kornelis (4/29/2010)
I got it right, because of elimination, but I don't understand the entire scenario.

Since the destination column is NVARCHAR(26), how can a value be outside the codepage range? I thought codepages were relevant for non-Unicode data only?


In the situation that gave rise to this question, CODEPOINT identified the value of the character as 65533.
As far as I am aware, there is no character (in any character set, whether Unicode or non-Unicode) with that value.



Post #912681
Posted Thursday, April 29, 2010 4:46 AM


SSCertifiable

stewartc-708166 (4/29/2010)
Hugo Kornelis (4/29/2010)
I got it right, because of elimination, but I don't understand the entire scenario.

Since the destination column is NVARCHAR(26), how can a value be outside the codepage range? I thought codepages were relevant for non-Unicode data only?


In the situation that gave rise to this question, CODEPOINT identified the value of the character as 65533.
As far as I am aware, there is no character (in any character set, whether Unicode or non-Unicode) with that value.


So that was a bug in the mainframe program that exported the file, then?



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #912684
Posted Thursday, April 29, 2010 4:53 AM


Hall of Fame

Hugo Kornelis (4/29/2010)
stewartc-708166 (4/29/2010)
Hugo Kornelis (4/29/2010)
I got it right, because of elimination, but I don't understand the entire scenario.

Since the destination column is NVARCHAR(26), how can a value be outside the codepage range? I thought codepages were relevant for non-Unicode data only?


In the situation that gave rise to this question, CODEPOINT identified the value of the character as 65533.
As far as I am aware, there is no character (in any character set, whether Unicode or non-Unicode) with that value.


So that was a bug in the mainframe program that exported the file, then?


That is correct.
Identifying the record and referring it to the data custodian allowed the extract to be rectified.
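
A minimal sketch of that kind of check outside SSIS, written in Python: scan each record for the code point 65533 and report where it occurs, so the row can be referred back to the source. The sample rows and the function name `find_suspect_records` are hypothetical and are not taken from the original package.

```python
REPLACEMENT_CODE_POINT = 65533  # U+FFFD, the value CODEPOINT reported in this thread


def find_suspect_records(rows):
    """Return (row_index, char_index) for every character equal to U+FFFD."""
    hits = []
    for row_index, text in enumerate(rows):
        for char_index, ch in enumerate(text):
            if ord(ch) == REPLACEMENT_CODE_POINT:
                hits.append((row_index, char_index))
    return hits


# Hypothetical sample data: the second row contains a replacement character.
sample_rows = ["SMITH", "JON\ufffdES", "BROWN"]
suspects = find_suspect_records(sample_rows)
print(suspects)
```

Reporting the position as well as the row makes it easier to point the owner of the extract at the exact field that went wrong.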


Post #912687
Posted Thursday, April 29, 2010 7:34 AM
SSCrazy

stewartc-708166 (4/29/2010)
Hugo Kornelis (4/29/2010)
I got it right, because of elimination, but I don't understand the entire scenario.

Since the destination column is NVARCHAR(26), how can a value be outside the codepage range? I thought codepages were relevant for non-Unicode data only?


In the situation that gave rise to this question, CODEPOINT identified the value of the character as 65533.
As far as I am aware, there is no character (in any character set, whether Unicode or non-Unicode) with that value.



This is very interesting. I am wondering where, in practice, to draw the line: above what CODEPOINT value should a character no longer be treated as an alphabetic or numeric character from some language, but as some other symbol? Looking at this reference:
http://www.ssec.wisc.edu/~tomw/java/unicode.html I see there is a Unicode character(?) for the value 65533. Additional searching shows that Unicode goes up to 10FFFF (hex), which is 1114111 in decimal. I suspect the place to draw the line is either 65518 or 65276. I would certainly appreciate a link or a bit of feedback on how to slice the Unicode range for the data flow in this question.
Thanks.
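
One possible way to approach this, sketched in Python rather than in an SSIS expression: instead of a single numeric cut-off, classify each character by its Unicode general category. The standard `unicodedata` module reports categories such as `Lu` (uppercase letter) or `Nd` (decimal digit). The helper name `is_alphanumeric` and the decision to accept only the letter and number categories are assumptions made for illustration.

```python
import unicodedata


def is_alphanumeric(ch):
    """True when the character's general category is a letter (L*) or number (N*)."""
    return unicodedata.category(ch)[0] in ("L", "N")


# A few sample characters: Latin, digit, accented Latin, Cyrillic,
# the replacement character (65533) and the byte order mark (65279).
for ch in ["A", "7", "\u00e9", "\u0416", "\ufffd", "\ufeff"]:
    print(ord(ch), unicodedata.category(ch), is_alphanumeric(ch))
```

A range test on the code point alone cannot separate letters from symbols, because the two are interleaved throughout the Unicode code space.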
Post #912833
Posted Thursday, April 29, 2010 8:03 AM
Mr or Mrs. 500

Nice question.

Converting oxygen into carbon dioxide, since 1955.

Post #912874
Posted Thursday, April 29, 2010 8:11 AM
SSCrazy

65533 is a valid Unicode code point, but it represents the special replacement character (U+FFFD). It sits near the top of the 16-bit range, which ends at 65535. dbowlin makes a good point about what values to check for. You would have to know the source and the target databases to know what's reasonable. I doubt that 65533 is reasonable in 99.9% of cases, but there may be rare instances.

Good question, though.
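
To illustrate how this value typically turns up: U+FFFD is what many decoders substitute when they meet bytes they cannot interpret. The sketch below uses Python's built-in `bytes.decode` with `errors="replace"`. The input bytes are made up for the example and are not taken from the original file, and decoding as UTF-8 is an assumption about how such a file might be read.

```python
# 0xFF can never appear in well-formed UTF-8, so the decoder substitutes U+FFFD.
raw = b"JON\xffES"  # hypothetical bytes, not from the original extract
text = raw.decode("utf-8", errors="replace")

# List the code point of each decoded character, as CODEPOINT would report it.
code_points = [ord(ch) for ch in text]
print(code_points)
print(0xFFFD in code_points)
```

Once the substitution has happened the original byte is lost, which is why the fix usually has to be made in the source extract.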
Post #912884
Posted Thursday, April 29, 2010 8:37 AM
SSC-Enthusiastic

Yeah, sounds like a HIGH-VALUE declaration in the COBOL (or whatever) code.
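
If that guess is right, the problem could in principle be caught at the byte level before any decoding takes place. The Python sketch below assumes that HIGH-VALUES is stored as 0xFF bytes, which is the usual case for the default collating sequence but is not confirmed anywhere in this thread. The record layout and the helper name `is_high_values` are hypothetical.

```python
def is_high_values(field: bytes) -> bool:
    """True when a non-empty field consists only of 0xFF bytes."""
    return len(field) > 0 and all(b == 0xFF for b in field)


# Hypothetical fixed-width record: a 5-byte name followed by a 3-byte code.
record = b"SMITH" + b"\xff\xff\xff"
name, code = record[:5], record[5:]
print(is_high_values(name), is_high_values(code))
```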
Post #912923