SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Importing CSV files without knowing destination schema


Importing CSV files without knowing destination schema

Author
Message
thomas.lemesurier
thomas.lemesurier
Valued Member
Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)

Group: General Forum Members
Points: 67 Visits: 91
I have 4 very large CSV files (large number of records and large number of fields) and need to script the process of importing the data from these CSVs into SQL. I gather that the Bulk Insert command should do the job here, however, it would appear that this command requires that a suitable table is created first. I don't have the exact schema of the database that the CSVs were exported from and would like to avoid having to go through the CSVs manually creating a table with the right number and size fields.

So I thought I would use the Import Data wizard to manually import the CSVs (thus creating a table with the correct fields). Then I'd be able to script the creation of the table, and include that part in my Bulk Insert script (just before I call the actual Bulk Insert command itself). I hope that part makes sense. Please let me know if there's a better way of doing this.

Problem is that although I've tried using the "Suggest Types" option, when I actually go to import the file I get errors warning of truncation, even when I go through and manually increase the size of the fields.... Is there an easier way to do this?
Carolyn Richardson
Carolyn Richardson
SSCommitted
SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)SSCommitted (1.9K reputation)

Group: General Forum Members
Points: 1870 Visits: 3486
Do the files have to be CSV? as this might be easier if they were XML.

Facts are stubborn things, but statistics are more pliable - Mark Twain
Carolyn
SQLServerSpecialists
Lowell
Lowell
One Orange Chip
One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)

Group: General Forum Members
Points: 28093 Visits: 39934
here's my suggestion:
add a folder which contains all of your .csv files as a linked server.
a text linked server will show evert .txt file and every.csv file as a table...
so you can do something really simple like
SELECT *
INTO NEWTABLE
FROM TxtSvr...sample#csv

which would create the table on sql server with the same structure.
i think csv files assume the first row in the file is the name of the columns.

here's the syntax:

--===== Create a linked server to the drive and path you desire.
EXEC dbo.sp_AddLinkedServer TxtSvr,
'Jet 4.0',
'Microsoft.Jet.OLEDB.4.0',
'C:\',
NULL,
'Text'
GO
--===== Set up login mappings.
EXEC dbo.sp_AddLinkedSrvLogin TxtSvr, FALSE, NULL, Admin, NULL
GO
--===== List the tables in the linked server which is really a list of
-- file names in the directory. Note that the "#" sign in the
-- Table_Name is where the period in the filename actually goes.
EXEC dbo.sp_Tables_Ex TxtSvr
GO
--===== Query one of the files by using a four-part name.
SELECT *
FROM TxtSvr...sample#csv


--===== Drop the text server
EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'
GO



Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85567 Visits: 41082
Lowell (8/27/2008)
here's my suggestion:
add a folder which contains all of your .csv files as a linked server.
a text linked server will show evert .txt file and every.csv file as a table...
so you can do something really simple like
SELECT *
INTO NEWTABLE
FROM TxtSvr...sample#csv

which would create the table on sql server with the same structure.
i think csv files assume the first row in the file is the name of the columns.

here's the syntax:

--===== Create a linked server to the drive and path you desire.
EXEC dbo.sp_AddLinkedServer TxtSvr,
'Jet 4.0',
'Microsoft.Jet.OLEDB.4.0',
'C:\',
NULL,
'Text'
GO
--===== Set up login mappings.
EXEC dbo.sp_AddLinkedSrvLogin TxtSvr, FALSE, NULL, Admin, NULL
GO
--===== List the tables in the linked server which is really a list of
-- file names in the directory. Note that the "#" sign in the
-- Table_Name is where the period in the filename actually goes.
EXEC dbo.sp_Tables_Ex TxtSvr
GO
--===== Query one of the files by using a four-part name.
SELECT *
FROM TxtSvr...sample#csv


--===== Drop the text server
EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'
GO



Heh... looks real familiar... Smile

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
thomas.lemesurier
thomas.lemesurier
Valued Member
Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)

Group: General Forum Members
Points: 67 Visits: 91
Hi Lowell, Hi Jeff,

Thanks for your suggestion - sounds just right for my situation. Everything works fine until I get to this bit:
SELECT * 
FROM TxtSvr...sample#csv

It says the table isn't there. In fact the text server contains just three tables and they are debuglog#txt, services#txt and test#txt. None of my CSV files have been picked up. Any ideas on what I need to change?
thomas.lemesurier
thomas.lemesurier
Valued Member
Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)

Group: General Forum Members
Points: 67 Visits: 91
Ok, apologies for that guys - I needed to put the csv files on the sql server, not my local drive Blush
thomas.lemesurier
thomas.lemesurier
Valued Member
Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)

Group: General Forum Members
Points: 67 Visits: 91
Hmmm, sorry that this is dragging on everyone, now that I have the files in the right place (Wink) the tables are being picked up. However, when I open the tables, there are twice as many rows in the table than in the CSV file. And every value in every row/column is NULL.

I should say that my CSV files are actually TAB separated. However, I've tried replacing the tabs with commas and although I then get the correct number of columns, I'm still getting all NULLs. Not sure if it makes any difference, but I've got tons of columns (>50)...
Lowell
Lowell
One Orange Chip
One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)

Group: General Forum Members
Points: 28093 Visits: 39934
thomas.lemesurier (8/28/2008)
Hmmm, sorry that this is dragging on everyone, now that I have the files in the right place (Wink) the tables are being picked up. However, when I open the tables, there are twice as many rows in the table than in the CSV file. And every value in every row/column is NULL.

I should say that my CSV files are actually TAB separated. However, I've tried replacing the tabs with commas and although I then get the correct number of columns, I'm still getting all NULLs. Not sure if it makes any difference, but I've got tons of columns (>50)...


I'm pretty sure the first row determines the number of columns...if there's anything preceeding the data, that might be an issue.

I could understand getting some rows that are null for all fields at the end of the file; every CrLf(carriage Return Line Feed...Char(10) + Char(13) is used to determine whether a row exists; if there are a bunch of CrLf at the end of the document, I'd expect that,and would use a WHERE clause to ignore then (WHERE COL1 IS NOT NULL or something)

sometimes a better text editor can help; I use EditPlus, which has find and replace that allows regular expressions and more; if i were to open a file like this in a linked text server, you can see that rows 9 thru 14 would exist and be null.

Note how this text editor displays CrLf witha paragraph symbol, Tabs As a Dbl Arrow character,and spaces as a floating period...it helps visualize the data much better.


If ALL of your data is null, it might be that the file is unix formatted...it has CHAR(10) as the row limiter, and doesn't the additional have Char(13) that is expected;

you could find and replace, or for example with editplus, you can open the file, and save it as a PC file with document...Fileformat..PC/UNIX/Mac setting and re-save the text file.
Lots of other text editors have the same ability.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85567 Visits: 41082
thomas.lemesurier (8/28/2008)
Hmmm, sorry that this is dragging on everyone, now that I have the files in the right place (Wink) the tables are being picked up. However, when I open the tables, there are twice as many rows in the table than in the CSV file. And every value in every row/column is NULL.

I should say that my CSV files are actually TAB separated. However, I've tried replacing the tabs with commas and although I then get the correct number of columns, I'm still getting all NULLs. Not sure if it makes any difference, but I've got tons of columns (>50)...


Time for you to attach a file so we can see what it actually looks like.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
thomas.lemesurier
thomas.lemesurier
Valued Member
Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)Valued Member (67 reputation)

Group: General Forum Members
Points: 67 Visits: 91
Hmmm, Lowell hit the nail on the head - it's a unicode file. But having said that, I've tried saving it out as ANSI (just using notepad) just it's still giving me nothing but NULLS.

OK, I've uploaded a cut down version of the file and stuck a .txt extension on the end so it can be uploaded. Thanks,

Tom
Attachments
Artists_mini.csv.txt (35 views, 980 bytes)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search