Importing CSV files without knowing destination schema
Posted Wednesday, August 27, 2008 10:49 AM
SSC Rookie

I have 4 very large CSV files (a large number of records and a large number of fields) and need to script the process of importing the data from these CSVs into SQL Server. I gather that the BULK INSERT command should do the job here; however, it appears that this command requires a suitable table to be created first. I don't have the exact schema of the database that the CSVs were exported from, and I'd like to avoid going through the CSVs manually to create a table with the right number and size of fields.

So I thought I would use the Import Data wizard to import the CSVs manually (thus creating a table with the correct fields). Then I'd be able to script the creation of the table and include that part in my BULK INSERT script (just before I call the actual BULK INSERT command itself). I hope that part makes sense. Please let me know if there's a better way of doing this.

The problem is that although I've tried using the "Suggest Types" option, when I actually go to import the file I get errors warning of truncation, even when I go through and manually increase the size of the fields. Is there an easier way to do this?
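The flow described above (script the table creation, then bulk insert into it) can be sketched as follows. This is a minimal illustration only: the table name, column names and sizes, and the file path are hypothetical placeholders, not taken from the actual CSVs.

```sql
--===== Hypothetical destination table; the real column names and
--      sizes would come from scripting the wizard-created table.
CREATE TABLE dbo.ImportedData
(
    Col1 VARCHAR(255) NULL,
    Col2 VARCHAR(255) NULL,
    Col3 VARCHAR(255) NULL
);

--===== Load the CSV into the pre-created table.
BULK INSERT dbo.ImportedData
FROM 'C:\Import\file1.csv'
WITH
(
    FIELDTERMINATOR = ',',   -- use '\t' for tab-separated files
    ROWTERMINATOR   = '\n',
    FIRSTROW        = 2      -- skip a header row, if present
);
```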
Post #559811
Posted Wednesday, August 27, 2008 11:40 AM
UDP Broadcaster

Do the files have to be CSV? This might be easier if they were XML.

Facts are stubborn things, but statistics are more pliable - Mark Twain
Carolyn
SQLServerSpecialists
Post #559838
Posted Wednesday, August 27, 2008 12:02 PM


SSChampion

Here's my suggestion: add a folder which contains all of your .csv files as a linked server. A text linked server will show every .txt file and every .csv file as a table, so you can do something really simple like
SELECT *
INTO NEWTABLE
FROM TxtSvr...sample#csv

which would create the table on SQL Server with the same structure. I think the driver assumes the first row in a CSV file holds the column names.

Here's the syntax:
--===== Create a linked server to the drive and path you desire.
EXEC dbo.sp_AddLinkedServer 'TxtSvr',
     'Jet 4.0',
     'Microsoft.Jet.OLEDB.4.0',
     'C:\',
     NULL,
     'Text'
GO
--===== Set up login mappings.
EXEC dbo.sp_AddLinkedSrvLogin 'TxtSvr', 'FALSE', NULL, 'Admin', NULL
GO
--===== List the tables in the linked server which is really a list of
-- file names in the directory. Note that the "#" sign in the
-- Table_Name is where the period in the filename actually goes.
EXEC dbo.sp_Tables_Ex TxtSvr
GO
--===== Query one of the files by using a four-part name.
SELECT *
FROM TxtSvr...sample#csv


--===== Drop the text server
EXEC dbo.sp_DropServer 'TxtSvr', 'DropLogins'
GO
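If a file's header row isn't being picked up (or a header-less file is being treated as if it had one), the Jet text driver's HDR setting can reportedly be passed in the provider string when the linked server is created. A sketch, assuming the provider honors extended properties in that position; 'TxtSvr2' and 'C:\CsvFiles' are illustrative names, and this is worth testing rather than taking as given:

```sql
--===== Same pattern as above, but with the header behavior stated
--      explicitly in the provider string. Server name and folder
--      are illustrative placeholders.
EXEC dbo.sp_AddLinkedServer 'TxtSvr2',
     'Jet 4.0',
     'Microsoft.Jet.OLEDB.4.0',
     'C:\CsvFiles',
     NULL,
     'Text;HDR=YES'
GO
```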



Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #559856
Posted Wednesday, August 27, 2008 8:38 PM


SSC-Dedicated

Lowell (8/27/2008)

Heh... looks real familiar... :)


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #560088
Posted Thursday, August 28, 2008 3:16 AM
SSC Rookie

Hi Lowell, Hi Jeff,

Thanks for your suggestion - sounds just right for my situation. Everything works fine until I get to this bit:
SELECT * 
FROM TxtSvr...sample#csv

It says the table isn't there. In fact the text server contains just three tables and they are debuglog#txt, services#txt and test#txt. None of my CSV files have been picked up. Any ideas on what I need to change?
Post #560226
Posted Thursday, August 28, 2008 3:37 AM
SSC Rookie

OK, apologies for that, guys - I needed to put the CSV files on the SQL server, not my local drive.
Post #560248
Posted Thursday, August 28, 2008 5:08 AM
SSC Rookie

Hmmm, sorry that this is dragging on, everyone - now that I have the files in the right place (;)) the tables are being picked up. However, when I open the tables, there are twice as many rows in the table as in the CSV file, and every value in every row/column is NULL.

I should say that my CSV files are actually TAB-separated. However, I've tried replacing the tabs with commas, and although I then get the correct number of columns, I'm still getting all NULLs. Not sure if it makes any difference, but I've got tons of columns (>50)...
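For tab-separated files read through the Jet text driver, the delimiter (and header behavior) can be declared per file in a schema.ini placed in the same folder the linked server points at, rather than editing the data. A sketch; the file name is illustrative:

```ini
; schema.ini, in the same folder as the data files
[sample.csv]
Format=TabDelimited
ColNameHeader=True
```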
Post #560292
Posted Thursday, August 28, 2008 5:29 AM


SSChampion

thomas.lemesurier (8/28/2008)


I'm pretty sure the first row determines the number of columns... if there's anything preceding the data, that might be an issue.

I could understand getting some rows that are NULL for all fields at the end of the file; every CrLf (Carriage Return Line Feed, i.e. Char(13) + Char(10)) is used to determine whether a row exists. If there are a bunch of CrLfs at the end of the document, I'd expect that, and I would use a WHERE clause to ignore them (WHERE COL1 IS NOT NULL or something).
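That filter can be sketched like so (the table and column names are illustrative):

```sql
--===== Ignore the all-NULL rows produced by trailing line breaks.
--      NEWTABLE and COL1 are illustrative names.
SELECT *
INTO NEWTABLE
FROM TxtSvr...sample#csv
WHERE COL1 IS NOT NULL;
```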

Sometimes a better text editor can help; I use EditPlus, which has find-and-replace that allows regular expressions and more. If I were to open a file like this in a linked text server, you can see that rows 9 thru 14 would exist and be NULL.

Note how this text editor displays CrLf with a paragraph symbol, tabs as a double-arrow character, and spaces as a floating period... it helps visualize the data much better.

If ALL of your data is NULL, it might be that the file is Unix-formatted... it has Char(10) as the row terminator and lacks the additional Char(13) that is expected.

You could find and replace, or, for example with EditPlus, open the file, save it as a PC file with the Document > File Format > PC/UNIX/Mac setting, and re-save the text file. Lots of other text editors have the same ability.


Lowell

Post #560303
Posted Thursday, August 28, 2008 6:43 AM


SSC-Dedicated

thomas.lemesurier (8/28/2008)


Time for you to attach a file so we can see what it actually looks like.


--Jeff Moden
Post #560345
Posted Thursday, August 28, 2008 8:24 AM
SSC Rookie

Hmmm, Lowell hit the nail on the head - it's a Unicode file. But having said that, I've tried saving it out as ANSI (just using Notepad) but it's still giving me nothing but NULLs.

OK, I've uploaded a cut-down version of the file and stuck a .txt extension on the end so it can be uploaded. Thanks,

Tom


  Post Attachments 
Artists_mini.csv.txt (20 views, 980 bytes)
Post #560506