Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


BombProof Data Import from Excel ?


BombProof Data Import from Excel ?

Author
Message
homebrew01
homebrew01
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2980 Visits: 9071
I have a very large spreadsheet of 125,000 rows, 200 columns, some of which contain a lot of text. I have tried every method I can think of to import the data into SQL such as converting to .csv, tab delimited text, xls with just 60000 rows ... I keep getting hung up with data truncation or something, even when I tell the Import wizard to ignore truncation. and my SQL table has all varchar(max) columns.

Is there some foolproof way force the data into a sql table, either predfined or created on the fly, then I can worry about what the data looks like after it's in SQL ? I tried BULK INSERT 2 and get non-useful errors and just get the column headers.

I installed the Office 12.0 OLE DB Provider, but get errors when trying to use it with a .XLSX file .... very frustrating 6 hours.

I have a SQL 2008 (R0) server and SQL 2005 servers to work with.

Suggestions ?



Ed Wagner
Ed Wagner
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10307 Visits: 9591
Have you tried importing into an "inbox" or "incoming data" table that the Import Wizard creates itself? I just did this recently to get data into a table and then I split things up from there. It likes to create everything as an nvarchar and I let it just to get the data into the database.

And here I thought I was alone in having problems importing data from Excel into SQL Server...


Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14965 Visits: 38985
how does the data actually come? is it a real csv file, and not an excel doc?

I've imported millions of rows via bulk insert or SSIS with delimited files, which would be my preference;

excel 2007 doesn't handle lots of rows too nicely, i believe , and can manipulate the data to what it THINKS the data should be, instead of what it really is.

I've also automated excel in .NET ,and it's frustratingly slow when it's processing a Range line by line.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Evil Kraig F
Evil Kraig F
SSCertifiable
SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)SSCertifiable (5.7K reputation)

Group: General Forum Members
Points: 5701 Visits: 7660
Unfortunately, I agree with Lowell here. When you start trying to bombproof (I like that term) Excel importations, you're looking at VBA code which will open its own connection to the OLEDB and purposely IGNORE the drivers for excel data access.

This may sound silly but start with 100 rows (enough to fill the data sampler) and work up from there, see if you're finding an arbitrary data limit in memory that's goofing you up.


- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
homebrew01
homebrew01
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2980 Visits: 9071
I think SQL doesn't take a big enough sample of data, so it makes the output columns too small. I found a registry setting for how many rows to sample and changed it from 8 to 10,000 or so, but it didn't seem to help.



homebrew01
homebrew01
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2980 Visits: 9071
Would SQL 2012 work any better ? I could install it on my laptop just for this.



Lowell
Lowell
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14965 Visits: 38985
i think it has nothing to do with SQL, but rather the ACE drivers that are used to access via Excel;
the ACE driver does the peek at the first 100 rows, and does a best-guess datatype casting of the data base on that;
that's the registry setting you can tweak to make it peek at more rows.

so it sounds like your issues' have more to do with that.


if the actual raw data is CSV, then i would take the ACE /excel drivers out of the equasion and use a text driver , whether via SSIS bcp or bulk insert to get the data instead.

Lowell

--
help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

homebrew01
homebrew01
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2980 Visits: 9071
Ed Wagner (3/18/2013)
Have you tried importing into an "inbox" or "incoming data" table that the Import Wizard creates itself? I just did this recently to get data into a table and then I split things up from there. It likes to create everything as an nvarchar and I let it just to get the data into the database.

And here I thought I was alone in having problems importing data from Excel into SQL Server...

You are not alone !!

After many failed attemps, I finally found the right combination by saving the .csv as Tab delimited text. Since SQL could not determine the correct length for the columns, I manually defined most of the columns as DT_STR length of 8000 (max allowed), but found 1 needed to be DT_TEXT because it must have been more than 8000. After much frustration, the import finally worked.



Ed Wagner
Ed Wagner
SSChampion
SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)SSChampion (10K reputation)

Group: General Forum Members
Points: 10307 Visits: 9591
Well done. When I receive data from external sources, I find it to be less than spectacular quality. Fields that are supposed to contain dates contain strings, numeric fields contain alphanumeric data and string data doesn't conform to the agreed-to maximum length, etc. This is why I try to bring it into an empty table of large varchar and nvarchar fields. From there I can clean and put it into a real table that I'll actually use. Sad but necessary.

Thanks for posting the notes on your resolution to the problem. Since it's a problem I encounter frequently, I was interested in the final solution you implemented.


Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Koen Verbeeck
Koen Verbeeck
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16461 Visits: 13207
Lowell (3/19/2013)
i think it has nothing to do with SQL, but rather the ACE drivers that are used to access via Excel;
the ACE driver does the peek at the first 100 rows, and does a best-guess datatype casting of the data base on that;
that's the registry setting you can tweak to make it peek at more rows.

so it sounds like your issues' have more to do with that.


if the actual raw data is CSV, then i would take the ACE /excel drivers out of the equasion and use a text driver , whether via SSIS bcp or bulk insert to get the data instead.


To put the dot on the i:

The ACE driver looks by default to the first 8 rows. You can set the TypeGuessRows registry setting to any number between 0 and 16. The value of 0 makes the driver scan the first 16384 rows. However, this is useless until you add IMEX=1 to the end of the connection string, otherwise you'll end up with NULL data if your column has mixed data types.

Shameless plug with more info:
What’s the deal with Excel & SSIS?

Bottomline:
convert to a flat file :-D



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at SQLKover.

MCSA SQL Server 2012 - MCSE Business Intelligence
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search