RE: Excel Source DT_NTEXT Problem

April 21, 2013 at 10:47 pm

When you read data from Excel using SSIS, the OLEDB provider reads the first few rows of the worksheet and tries to guess the data type of each column from the contents of those rows. So if those rows contain only data that would fit into a "STRING", that is the data type it uses. If one of those rows has a cell with a lot of data in it (more than 255 characters), the provider may guess that the data type should be "NTEXT". The same sort of logic applies for numeric fields and (I think) dates.
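To make the sampling behaviour concrete, here is a toy model in Python. It is not the provider's actual algorithm, only a sketch of the idea that the guess is made from the first few rows. The function name, the type labels, the 255-character cutoff and the handling of empty or mixed columns are all simplifying assumptions for illustration.

```python
def guess_column_type(values, type_guess_rows=8):
    """Guess a column type from the first `type_guess_rows` values only.

    Toy model: rows after the sample are never inspected, which is why
    a long value further down the sheet does not affect the guess.
    """
    # Only the sampled rows take part in the guess; empty cells are skipped.
    sample = [v for v in values[:type_guess_rows] if v is not None]
    if not sample:
        return "STRING"  # assumption: fall back to a plain string column
    if all(isinstance(v, (int, float)) for v in sample):
        return "NUMERIC"
    # Assumption: a sampled text value over 255 characters means NTEXT.
    if any(isinstance(v, str) and len(v) > 255 for v in sample):
        return "NTEXT"
    return "STRING"


# Ten short values followed by one 300-character value in row 11.
column = ["short"] * 10 + ["x" * 300]
print(guess_column_type(column))      # samples rows 1-8 only
print(guess_column_type(column, 20))  # sample now includes row 11
```

With the default sample of 8 rows the long value in row 11 is never seen, so the column is guessed as a string; widening the sample to 20 rows changes the guess to NTEXT.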

You can't change this behaviour, but you can change the number of rows that the OLEDB provider reads in order to make that guess. This value is stored in the registry. Try searching the registry on your machine for "TypeGuessRows" - you will find at least 2 instances of it for Excel (in a key something like HKLM\Software\Microsoft\Office\15.0\Access Connectivity Engine\Engines\Excel). By default the value is 8, which means that the OLEDB provider will read the first 8 rows of the spreadsheet and use the values in those rows when it tries to guess the data type. If you increase this value, it will use more rows to make its guess.
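If you would rather script the lookup than search by hand, something along these lines should work using Python's standard winreg module. Treat it as a sketch: the Office version ("15.0"), the Wow6432Node variant for 32-bit providers on 64-bit Windows, and the function names are assumptions, and the keys on your machine may differ. Writing the value needs administrator rights, and you should back up the key before changing it.

```python
try:
    import winreg  # standard library, Windows only
except ImportError:
    winreg = None


def candidate_keys(office_version="15.0"):
    """Return HKLM subkeys where TypeGuessRows might live (assumed paths)."""
    tail = "\\".join(["Microsoft", "Office", office_version,
                      "Access Connectivity Engine", "Engines", "Excel"])
    # Second entry is the 32-bit view of the registry on 64-bit Windows.
    return ["SOFTWARE\\" + tail, "SOFTWARE\\Wow6432Node\\" + tail]


def read_type_guess_rows(office_version="15.0"):
    """Map each candidate key to its TypeGuessRows value, or None if absent."""
    if winreg is None:
        raise RuntimeError("winreg is only available on Windows")
    found = {}
    for subkey in candidate_keys(office_version):
        try:
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
                value, _value_type = winreg.QueryValueEx(key, "TypeGuessRows")
                found[subkey] = value
        except OSError:
            found[subkey] = None  # key or value does not exist here
    return found


def set_type_guess_rows(subkey, rows):
    """Write TypeGuessRows as a DWORD under HKLM (needs admin rights)."""
    if winreg is None:
        raise RuntimeError("winreg is only available on Windows")
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey, 0,
                        winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "TypeGuessRows", 0, winreg.REG_DWORD, rows)
```

Calling read_type_guess_rows() first shows which of the candidate keys actually exist, so you only pass those to set_type_guess_rows().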

This may not work for you all the time - that really depends on the nature of the data you are importing and whether the data that truly represents your data type is in those rows, but it is the only way I know to influence the choice of data type.

This process occurs whenever the spreadsheet is opened by the OLEDB provider, i.e. every time the SSIS package runs. You will need to make the same registry change on every machine where the package will be executed.