SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


collation with OPENROWSET BULK, file in different collation than server default


collation with OPENROWSET BULK, file in different collation than server default

Author
Message
e1785
e1785
SSC-Enthusiastic
SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)SSC-Enthusiastic (184 reputation)

Group: General Forum Members
Points: 184 Visits: 109
I am having a difficulty with SQL Server collations (2008 R2) that is driving me up the wall. I've learned everything I can about MSSQL and Windows collations and character sets (lots of crazy details!), but I still can't tell what's going wrong. Hoping somebody here can save me...

My goal is simple:
-- I have a program that takes text files (all in Windows Code Page 1252) and bulk inserts them into database tables.
-- Every CHAR and VARCHAR field in every table has collation SQL_Latin1_General_CP1_CI_AS.
-- The format files mark these fields as SQLCHAR, likewise with collation SQL_Latin1_General_CP1_CI_AS.
-- The database and table default collations are Latin1_General_CI_AS.
-- The server default collation is Japanese_CI_AS
-- The result of sp_helpsort is Japanese, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive

I'm using MERGE commands, with the file as a source table. The relevant part:

SELECT mycol1, mycol2 FROM OPENROWSET
(BULK 'C:\my\file.txt', FORMATFILE='C:\my\format.fmt', FIRSTROW = 2)
AS Z



I receive the error:

Msg 7339, Level 16, State 1, Line 2
OLE DB provider 'BULK' for linked server '(null)' returned invalid data for column '[BULK].title'.



I've narrowed the problem down to accented "é" characters. When I remove those from the file, it loads properly. Since the file itself, the format file, and the table all understand the single-byte é, I don't see why it should cause problems.

The only idea I have -- and I hope this isn't right -- is if MSSQL is trying to convert the data to Unicode even though it's coming from and going to Windows-1252, due to the Server default of Japanese_CI_AS. Unfortunately, this page suggests that might be the case:

http://msdn.microsoft.com/en-us/library/ms191145(v=sql.90).aspx
"The use remote collation option specifies whether the collation of a remote column or of a local server will be used... When false, distributed queries always use the default collation of the local server instance, and collation name and the collation of remote columns are ignored.
...
"Using the linked server options is the only way to enable using remote collations. Therefore, queries that are constructed that use ad hoc names provided by OPENROWSET and OPENDATASOURCE cannot use the collation information of remote character data. Additionally, all linked servers in SQL Server version 7.0 that are upgraded to SQL Server 2000 or later are set to use remote collation = false."


I interpret this as meaning that it will read my Windows-1252 file as Unicode instead (Japanese_CI_AS). That would explain the error, since "é" is a two-byte character in Unicode and the one-byte equivalent wouldn't be understood. But why are my settings in the format file and table structure being ignored? More importantly, is there a way around this? This is an automated process and never processes the same file twice, so setting up the every new file as a "remote database" isn't an option unless that can also be automated.

Is there anything I can change to fix this? I can change the format file, the table settings, or the database settings. I can change the query, but only somewhat since I have to use MERGE as these are UPSERT operations. I can't change the server settings or the data files themselves.
Erland Sommarskog
Erland Sommarskog
SSCertifiable
SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)

Group: General Forum Members
Points: 5500 Visits: 875
How does the format file look like? Any chance you can attach the problematic datafile?

Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search