Import Excel Spreadsheet to Database Tables

  • peterhe

    SSChampion

    Points: 11362

    mgkirsch,

    I used xquery orignally and changed to use OPENXML and PIVOT due to the following two issues:

    1) It was slow for big xml;

    2) If any cell in the spreadsheet does not have value, you will not get valid data imported.

    I still keep the xquery code there as commented out. If you need to use xquery, you need to resorgnaize the stored procedure a little bit, and make sure all cells in the spreadsheet have values.

  • JGay 2041

    SSC Enthusiast

    Points: 142

    Peter,

    My xml spreadsheet lacks two columns that exist in OutputTable. Per business rules I'm unable to add those columns to the spreadsheet. I need to add to OutputTable during the Insert process.

    Is this the section of code that I should add the columns to?:

    SET @InsertSql=N'';

    IF @OutputTableName<>N''

    BEGIN

    -- Build INSERT OutputTable ... SELECT ... statement

    SELECT @InsertSql=@InsertSql+ColumnName+N','

    FROM #Columns

    ORDER BY ExcelColNum;

    SET @InsertSql=LEFT(@InsertSql,LEN(@InsertSql)-1)

    SET @InsertSql=N'INSERT '+@OutputTableName+'('+@InsertSql+N')'+nchar(13)+nchar(10)+N'SELECT '+@InsertSql+N' FROM ('+nchar(13)+nchar(10);

    END

    In this section of code - there exists an open parenthesis but not a closed one. Is it necessary to have a closed parenthesis?

    Coding question: what's the purpose of nchar(13) and nchar(10)?

    Recap: 1) Is this the right location to add additional columns, 2) do we need a closing parenthesis, 3) what's the deal with nchar(13) and nchar(10)

    Thanks!

    JG

  • peterhe

    SSChampion

    Points: 11362

    JG,

    In your case, it's better to pre-create your table with all necessary columns, allow null for those columns that do not exist in the worksheet. Then call the SP to import data to the table. The columns that do not exist in the worksheet will have NULL value, e.g.

    IF object_id('tempdb..#Data') IS NOT NULL

    DROP TABLE #Data;

    CREATE TABLE #Data (

    TextCol nvarchar(30) NULL,

    NumCol nvarchar(50) NULL,

    NotExist_C1 int null,

    NotExist_C2 int null,

    NotExist_C3 int null,

    DateCol nvarchar(50) Null,

    Text2Col nvarchar(30) NULL

    )

    EXEC dbo.[uspImportExcelSheet]

    @ExcelFileName=N'C:\TestBook.xml',

    @WorkSheetName=N'Sheet1',

    @OutputTableName=N'#Data',

    @FirstRowIsHeader=1

    SELECT * FROM #Data

    IF object_id('tempdb..#Data') IS NOT NULL

    DROP TABLE #Data;

  • JGay 2041

    SSC Enthusiast

    Points: 142

    Peter,

    Yes. I have a table already created. I had to look again to be sure of the fields. I was mistaken from before. The table has 1 NOT NULL field - it's the Identity column. You gave code to ignore that column in the -- Physical Table section - and that works perfectly.

    The next two columns are not in the xml spreadsheet. The xml spreadsheet's 1st column is the pre-created table's 4th column.

    as in:

    identity column, not exist in ss, not exist in ss, text4 col.

    When I run the SP, the identity column increments as it should, the two not exist columns contain NULL, as expected, and the text4 col plus the remaining 117 columns have data from the xml spreadsheet.

    What I'd like to have is the ability to pass in the data for the two 'not exist' columns so that all the data goes into the 'pre-created' table.

    I have this so far...

    EXEC dbo.uspImportExcelSheet

    @ExcelFileName=N'C:\TestTable.xml',

    @WorkSheetName=N'Sheet1',

    @OutputTableName=N'Staging.Sales',

    @FirstRowIsHeader=1,

    @ClientAccountNum=N'R555',

    @ClientAccountID=N'232',

    @Debug=0

    In the SP above @Debug tinyint=0, I have...

    @ClientAccountNum nvarchar(4)=N'',

    @ClientAccountID nvarchar(255),

    In the SELECT after BEGIN TRY, I have...

    SELECT @OutputTableName=ISNULL(LTRIM(RTRIM(@OutputTableName)),N''),

    @WorkSheetName=ISNULL(LTRIM(RTRIM(@WorkSheetName)),N''),

    @ExcelFileName=ISNULL(LTRIM(RTRIM(@ExcelFileName)),N''),

    @ClientAccountNum=ISNULL(LTRIM(RTRIM(@ClientAccountNum)),N''),

    @ClientAccountID=ISNULL(LTRIM(RTRIM(@ClientAccountID)),N''), @FirstRowIsHeader=ISNULL(@FirstRowIsHeader,0),

    @RowsToCheckDataType=ISNULL(@RowsToCheckDataType,0),

    @RowsToImport=ISNULL(@RowsToImport,0);

    -- Check valid excel file name is given

    And because these two fields are needed, I have added the appropriate RAISERROR code.

    What I need help on, is where do I insert these variables so that they will 'insert' into the 'pre-create' table in the appropriate columns.

    Thanks

    JG

  • peterhe

    SSChampion

    Points: 11362

    JG,

    The SP is a generic tool to import xml spreadsheet. After data is imported, you can apply any logic and processing to the data within your table. As that said, if you need any special handling of the data, it is better to do it after the data is imported into the table.

    In your scenario, after the data is imported, you can update the table like:

    EXEC the sp...

    UPDATE YourTable

    SET YourCol='555', YourCol2='fff'

    etc.

  • JGay 2041

    SSC Enthusiast

    Points: 142

    Peter,

    Understood. That's what I figured and have completed.

    Thanks - this is a great tool.

    JG

  • munawargani2001

    SSC Journeyman

    Points: 95

    This is a great tool.

    Only problem is, it asks for WorksheetName, which in my case is blank.

    Can u please suggest me what changes will it require to handle this situation?

  • peterhe

    SSChampion

    Points: 11362

    I am afraid sheet name is a necessary. It is equivalent to table name.

    By default, Excel put Sheet 1, Sheet 2, ... as sheet name. In office 10, it does not allow blank sheet name. it accepts a SPACE as sheet name though.

    Can you ask the Excel creator to keep the sheet name?

  • munawargani2001

    SSC Journeyman

    Points: 95

    Thanks for ur prompt response.

    I receive XML Spreadsheet 2003 with xls extension and the WorkSheetName is a Space.

    This spreadsheet is generated by another system, which is not under the control of my business user.

    Is there a way of just reading the first sheet, which in my case is the only sheet in the workbook without the WorksheetName ?

  • xyvyx@hotmail.com

    SSC-Addicted

    Points: 425

    I recently ran into a file exported by a company in this Excel 2003 XML format, although they were named as .xls files. I don't know about JET, but I was unable to read these files with the latest ACE.OLEDB drivers. I've used ACE with a number of XLS and XLSX files, but it refused to read this older, plain XML format. (If anyone uses them, I'm talking about IHS production workbook exports... so if you're in the Petro industry, you might be familiar with these)

    Anyway, so I modified this procedure to work with SQL 2008 R2... it appears that it's use of #temp tables isn't possible with the scope of dynamically executed SQL. So all I did was change that to use a global temp table. It also didn't handle strange column names, so I changed it to [box in] those as well.... so far, so good!

    Attached is my updated version of the sproc, if you have use for it, enjoy!

    -Chris

Viewing 10 posts - 61 through 70 (of 70 total)

You must be logged in to reply to this topic. Login to reply