Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

BombProof Data Import from Excel ? Expand / Collapse
Author
Message
Posted Monday, March 18, 2013 1:22 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 11:52 AM
Points: 2,832, Visits: 8,513
I have a very large spreadsheet of 125,000 rows, 200 columns, some of which contain a lot of text. I have tried every method I can think of to import the data into SQL such as converting to .csv, tab delimited text, xls with just 60000 rows ... I keep getting hung up with data truncation or something, even when I tell the Import wizard to ignore truncation. and my SQL table has all varchar(max) columns.

Is there some foolproof way force the data into a sql table, either predfined or created on the fly, then I can worry about what the data looks like after it's in SQL ? I tried BULK INSERT 2 and get non-useful errors and just get the column headers.

I installed the Office 12.0 OLE DB Provider, but get errors when trying to use it with a .XLSX file .... very frustrating 6 hours.

I have a SQL 2008 (R0) server and SQL 2005 servers to work with.

Suggestions ?



Post #1432321
Posted Monday, March 18, 2013 1:45 PM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 11:54 AM
Points: 4,246, Visits: 3,686
Have you tried importing into an "inbox" or "incoming data" table that the Import Wizard creates itself? I just did this recently to get data into a table and then I split things up from there. It likes to create everything as an nvarchar and I let it just to get the data into the database.

And here I thought I was alone in having problems importing data from Excel into SQL Server...




Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Post #1432325
Posted Monday, March 18, 2013 2:10 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 2:09 PM
Points: 12,905, Visits: 32,166
how does the data actually come? is it a real csv file, and not an excel doc?

I've imported millions of rows via bulk insert or SSIS with delimited files, which would be my preference;

excel 2007 doesn't handle lots of rows too nicely, i believe , and can manipulate the data to what it THINKS the data should be, instead of what it really is.

I've also automated excel in .NET ,and it's frustratingly slow when it's processing a Range line by line.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1432331
Posted Monday, March 18, 2013 2:23 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: 2 days ago @ 6:08 PM
Points: 5,401, Visits: 7,514
Unfortunately, I agree with Lowell here. When you start trying to bombproof (I like that term) Excel importations, you're looking at VBA code which will open its own connection to the OLEDB and purposely IGNORE the drivers for excel data access.

This may sound silly but start with 100 rows (enough to fill the data sampler) and work up from there, see if you're finding an arbitrary data limit in memory that's goofing you up.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1432338
Posted Monday, March 18, 2013 5:44 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 11:52 AM
Points: 2,832, Visits: 8,513
I think SQL doesn't take a big enough sample of data, so it makes the output columns too small. I found a registry setting for how many rows to sample and changed it from 8 to 10,000 or so, but it didn't seem to help.


Post #1432410
Posted Tuesday, March 19, 2013 12:05 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 11:52 AM
Points: 2,832, Visits: 8,513
Would SQL 2012 work any better ? I could install it on my laptop just for this.


Post #1432837
Posted Tuesday, March 19, 2013 12:10 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 2:09 PM
Points: 12,905, Visits: 32,166
i think it has nothing to do with SQL, but rather the ACE drivers that are used to access via Excel;
the ACE driver does the peek at the first 100 rows, and does a best-guess datatype casting of the data base on that;
that's the registry setting you can tweak to make it peek at more rows.

so it sounds like your issues' have more to do with that.


if the actual raw data is CSV, then i would take the ACE /excel drivers out of the equasion and use a text driver , whether via SSIS bcp or bulk insert to get the data instead.


Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1432841
Posted Tuesday, March 19, 2013 7:06 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 11:52 AM
Points: 2,832, Visits: 8,513
Ed Wagner (3/18/2013)
Have you tried importing into an "inbox" or "incoming data" table that the Import Wizard creates itself? I just did this recently to get data into a table and then I split things up from there. It likes to create everything as an nvarchar and I let it just to get the data into the database.

And here I thought I was alone in having problems importing data from Excel into SQL Server...

You are not alone !!

After many failed attemps, I finally found the right combination by saving the .csv as Tab delimited text. Since SQL could not determine the correct length for the columns, I manually defined most of the columns as DT_STR length of 8000 (max allowed), but found 1 needed to be DT_TEXT because it must have been more than 8000. After much frustration, the import finally worked.



Post #1432968
Posted Wednesday, March 20, 2013 5:28 AM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 11:54 AM
Points: 4,246, Visits: 3,686
Well done. When I receive data from external sources, I find it to be less than spectacular quality. Fields that are supposed to contain dates contain strings, numeric fields contain alphanumeric data and string data doesn't conform to the agreed-to maximum length, etc. This is why I try to bring it into an empty table of large varchar and nvarchar fields. From there I can clean and put it into a real table that I'll actually use. Sad but necessary.

Thanks for posting the notes on your resolution to the problem. Since it's a problem I encounter frequently, I was interested in the final solution you implemented.



Tally Tables - Performance Personified
String Splitting with True Performance
Best practices on how to ask questions
Post #1433139
Posted Wednesday, March 20, 2013 4:05 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Yesterday @ 11:42 AM
Points: 13,295, Visits: 11,086
Lowell (3/19/2013)
i think it has nothing to do with SQL, but rather the ACE drivers that are used to access via Excel;
the ACE driver does the peek at the first 100 rows, and does a best-guess datatype casting of the data base on that;
that's the registry setting you can tweak to make it peek at more rows.

so it sounds like your issues' have more to do with that.


if the actual raw data is CSV, then i would take the ACE /excel drivers out of the equasion and use a text driver , whether via SSIS bcp or bulk insert to get the data instead.


To put the dot on the i:

The ACE driver looks by default to the first 8 rows. You can set the TypeGuessRows registry setting to any number between 0 and 16. The value of 0 makes the driver scan the first 16384 rows. However, this is useless until you add IMEX=1 to the end of the connection string, otherwise you'll end up with NULL data if your column has mixed data types.

Shameless plug with more info:
What’s the deal with Excel & SSIS?

Bottomline:
convert to a flat file




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1433534
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse