Log in
::
Register
::
Not logged in
Search:
Home
Articles
Editorials
Forums
Scripts
Blogs
QotD
Books
SQL Jobs
Training
Authors
About us
Contact us
Newsletters
Advertise
Write for us
Recent Posts
Recent Posts
Popular Topics
Popular Topics
Home
Search
Members
Calendar
Who's On
Home
»
SQL Server 7,2000
»
T-SQL
»
BULK INSERT & Text Qualifier
21 posts, Page 1 of 3
1
2
3
»
»»
BULK INSERT & Text Qualifier
Rate Topic
Display Mode
Topic Options
Author
Message
mdjtlj
mdjtlj
Posted Monday, November 17, 2003 10:06 AM
Forum Newbie
Group: General Forum Members
Last Login: Wednesday, May 27, 2009 12:00 PM
Points: 4,
Visits: 19
Greetings,
I have a st*&id vendor that provides files that we need to load in the following format:
"datahere","data with a comma, more","xxx", etc...
Bastically it is comma-delimited with a text qualifier of '"'
This is giving some heartache with respect to the use of BULK INSERT, as the fieldterminator and row terminator all work correctly, the only problem is that the first field is hosed. Here's the example T-SQL:
BULK INSERT [InterPay Office].dbo.[address] FROM
"c:\data\interlake\interpay office\tempfiles\address.csv"
WITH
(DATAFILETYPE='CHAR',
FIELDTERMINATOR='","',
FIRSTROW=2,
ROWTERMINATOR='"')
A sample of the output would be like this:
ID Name Address etc....
"1 Michael 656 whereever
"2 Joe 12345 whatever
Anybody have any suggestions about how to fix this. The sp is going to be run nightly, where we have about 15 tables to import.
I guess that I could do the following as alternatives:
1. Take the data as is and place another sp that does the replace of '"' with '' via a cursor
2. Use VB to redo their files into a "decent" format.
DTS allows you to specify the delimiter, text qualifier, and row terminator, but all of this functionality is not available in this method.
Anyone have more suggestions?
Post #18289
Jonathan
Jonathan
Posted Monday, November 17, 2003 10:31 AM
SSC Eights!
Group: General Forum Members
Last Login: Monday, November 12, 2007 6:02 AM
Points: 925,
Visits: 9
Not sure I understand your question, but did you try SET QUOTED_IDENTIFIER OFF at the beginning of the script?
--Jonathan
--Jonathan
Post #87412
mdjtlj
mdjtlj
Posted Monday, November 17, 2003 10:45 AM
Forum Newbie
Group: General Forum Members
Last Login: Wednesday, May 27, 2009 12:00 PM
Points: 4,
Visits: 19
Yes,
SET QUOTED_IDENTIFIER OFF was set at the top of the script. The basic problem is that I've got a file with data fields such as:
"3637","","","","1505 London Road",......
and using the BULK INSERT command (as shown) it is giving
"3637 <NULL> <NULL> <NULL> 1505 London Road
Everything is right, except for the "3637 which keeps on picking up the ".
Hopefully that is clearer.
Post #87413
Jonathan
Jonathan
Posted Monday, November 17, 2003 11:27 AM
SSC Eights!
Group: General Forum Members
Last Login: Monday, November 12, 2007 6:02 AM
Points: 925,
Visits: 9
See if you can get it working using bcp interactively and then use the generated format file with BULK INSERT.
--Jonathan
--Jonathan
Post #87414
ThomasH
ThomasH
Posted Monday, November 17, 2003 12:57 PM
Old Hand
Group: General Forum Members
Last Login: Tuesday, March 29, 2005 2:31 PM
Points: 393,
Visits: 1
You can try to set your rowterminator to '"/n'. This way, SQL will "eat" the doublequote that starts each record after the 1st record. It might be cleaner in using a format file since each column delimiter may be specified individually.
Once you understand the BITs, all the pieces come together
Post #87415
Mark Harr
Mark Harr
Posted Tuesday, November 18, 2003 5:54 AM
Ten Centuries
Group: General Forum Members
Last Login: Thursday, June 25, 2009 11:37 PM
Points: 1,060,
Visits: 83
quote:
You can try to set your rowterminator to '"/n'.
Nice idea, Thomas. Although I think you meant using
'"'
, where the doublequote is after the newline tag.
I had a similar situation, but solved it with using a format file to skip the first character, using a single doublequote as field terminator, and skipping the 2 char field ',"' between each field. I certainly like the combination of using the 3 char field terminator and 2 char row terminator, it would have saved some time developing and testing the format file.
Mark
Post #87416
Jeremy Kemp
Jeremy Kemp
Posted Tuesday, November 18, 2003 6:58 AM
Old Hand
Group: General Forum Members
Last Login: Thursday, March 23, 2006 6:53 AM
Points: 348,
Visits: 1
On a similar vein, I have used a field terminator or "," (double quote comman double quote).
The downside to this is that the first double quote and the last double quote in the line are not removed so I had to remove them with string manipulation.
select substring(col1,1,len(col1)-1),...., left(collast,len(collast)-1)
etc.
Jeremy
Post #87417
Mike Mortensen
Mike Mortensen
Posted Tuesday, November 18, 2003 11:53 AM
Grasshopper
Group: General Forum Members
Last Login: Tuesday, May 05, 2009 3:29 PM
Points: 13,
Visits: 10
I've had to figure this out in the past. Here's what you need to do, using your example.
"datahere","data with a comma, more","xxx"
FMT file:
7.0
3
1 SQLCHAR 0 10 "\",\"" 1 DataHereColumn
2 SQLCHAR 0 30 "\",\"" 2 DataWithCommaColumn
3 SQLCHAR 0 3 "\"\r" 3 xxxColumn
Using \",\" will treat the "," as the field terminator. You need to include the forward slash before the double quote mark so the format treats it as a real character.
BULK INSERT query:
BULK INSERT database.owner.targettable FROM 'drive:\path\file.txt' WITH (FIRSTROW = 2, FORMATFILE = 'drive:\path\formatfile.fmt')
The first record should look like this when you're done:
Field1: "datahere
Field2: data with a comma, more
Field3: xxx
Next, you need to remove the leading double quote from Field1 for ALL records. If you do not have double quotes in the data, just use:
UPDATE database.owner.table
SET Field1 = REPLACE(Field1, '"', '')
Let me know if this doesn't work for you!
Edited by - mmortensen on 11/18/2003 12:01:58 PM
Post #87418
sainswor99
sainswor99
Posted Tuesday, November 18, 2003 9:38 PM
Forum Newbie
Group: General Forum Members
Last Login: Wednesday, June 13, 2007 6:00 AM
Points: 8,
Visits: 1
To be honest, it sounds more like a business relationship problem rather than a programming problem; how good is your relationship with this vendor? All of the solutions below are creative and may solve the immediate problem, but does it cause more problems in the long run? Does the vendor understand that you are relying on a daily batch of this data, and are they committed to maintaining this non-standard format?
On the other hand, have you considered using DTS with an ActiveX script to transform the first column? Not sure if it will provide you with the speed you need, but it might be comparable. How much data per night are you talking; couple thousand rows or a couple million?
HTH,
Stu
Post #87419
mdjtlj
mdjtlj
Posted Thursday, November 20, 2003 8:32 AM
Forum Newbie
Group: General Forum Members
Last Login: Wednesday, May 27, 2009 12:00 PM
Points: 4,
Visits: 19
Greetings,
Thanks for you all of your support and thoughts on the matter. I've spent quite a few hours putting together format files, testing them and getting really pi#$ed.
Using the bcp to write the format file using the -n switch does not give the required prefix, so I ended up using a combination of the automatically generated as well as the format file that is created when you try to do a load where the data types are not specified (this gives a prefix). Took both of these fmt files and merged them to get what I'd consider a good format file.
Out of the 26 tables that I have to do this on, it worked on 24 tables. The only common thread that I can find is that these two tables have over 170 columns. The BULK INSERT and bcp both blow up, saying that the field length is too long for column 1 (which is a SQLINT 4). Rather than try to figure out how to cut down on the column load (not practical anyway) to find the breaking point, I decided on an alternate loading mechanism.
Ultimately, I've ended up putting together a generic CSV to XML translator and then used SQLXML 3.0 bulk load to force the data in. Decided against using .NET datasets, due to the slow nature of the load.
Would be interested in hearing any suggestions on the XML load speed issue (50,000 rows usually for each table)
Relationship with the vendor? Let's put it this way, they are using MDBS Titanium 6.1f (www.mdbs.com), navigational model, as the datastore and generally are not very responsive to requests such as this. The only saving grace is that they will be moving to mySQL in an upcoming release (please insert bashing here), so I'll be able to reach in and get what I need when needed.
Many thanks for all your help.
Post #87420
« Prev Topic
|
Next Topic »
21 posts, Page 1 of 3
1
2
3
»
»»
Permissions
You
cannot
post new topics.
You
cannot
post topic replies.
You
cannot
post new polls.
You
cannot
post replies to polls.
You
cannot
edit your own topics.
You
cannot
delete your own topics.
You
cannot
edit other topics.
You
cannot
delete other topics.
You
cannot
edit your own posts.
You
cannot
edit other posts.
You
cannot
delete your own posts.
You
cannot
delete other posts.
You
cannot
post events.
You
cannot
edit your own events.
You
cannot
edit other events.
You
cannot
delete your own events.
You
cannot
delete other events.
You
may
send private messages.
You
cannot
send emails.
You
may
read topics.
You
cannot
rate topics.
You
cannot
vote within polls.
You
cannot
upload attachments.
You
may
download attachments.
You
cannot
post HTML code.
You
cannot
edit HTML code.
You
cannot
post IFCode.
You
cannot
post JavaScript.
You
cannot
post EmotIcons.
You
cannot
post or upload images.
Copyright © 2002-2009 Simple Talk Publishing. All Rights Reserved.
Privacy Policy.
Terms of Use