-
SJanki
SSCrazy
Points: 2847
Hi All
Not sure if I am posting the question in the right forum or shall I find some excel sheet help!
I have an excel file with master data in the rows and child data below in in different columns.
for e.g (claim#, insuredname is in rows, followed by the child records -----)
Claim#1
InsuredName1
----------------
-----------------
Claim#2
InsuredName2
------------------
------------------
I have to import this into a single table , Do I need to fix the excel first??
There are thousands of claim# (records) ...
without fixing the excel is there a way I can import it into sql-server table using ssis.
Thanks!
SJanki
-
Orlando Colamatteo
SSC Guru
Points: 182293
For those not interested in opening the Excel file here it is:

Just a couple WAGs, you could try importing the entire sheet into one table and then using T-SQL to send the data in the initial table into properly formed tables that represent your data.
Or, you could process the Excel in a Script Component setup as a Data Source where you process the data line by line and push the rows onto an appropriate output per the data format/type.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
-
Phil Parkin
SSC Guru
Points: 247172
Another possibility would be to write some code in Excel which adds the two 'group' columns (claim number and insured) to the table data & then throw away all non-data rows.
I say 'some code' - it will require a fair bit - but once it's in that format you'll find it should import nicely.
-
Rock from VbCity
Hall of Fame
Points: 3991
Hi SJanki
I do not know if you resolved your issue already, if you haven't the following package handles the problem you described, the data flow picture below drives the explanation.

PROCEDURE
a. Create a brand new package.
b. Add a Data Flow Task into its Flow Control; switch to the Data Flow Task
c. Add an Excel Source component
- Configure its Connection Manager by adding a new one
(1) Select Microsoft Excel 2007 if it is available, otherwise pickup Microsoft Excel 97-2003, then
(2) click on the browse button to locate your Excel file

(3) un-check the option First row has columns headers; it is not visible on the picture.
(4) Now assign the columns names as used by the application, as shown by the picture.

d. Add a Conditional Split transform, its condition DataRow will exclude the heading and non-data rows with the following condition:
ISNULL(Status) == FALSE && ISNULL([Pay Ind]) == FALSE && Status != "Status" && [Pay Ind] != "Pay Ind"

- WARNING This is a hard-coded, data depending filter; if the source Excel file ever change its heading, the package will return unexpected results
e. Add a Derived Column transform; this transform introduces the Claim Number and Insured columns with their initial values.
- The Claim Number column expressions is:
Status == "Claim Number :" ? [Pay Ind] : ""
- The Insured column expression is:
Status == "Insured:" ? [Pay Ind] : ""
f. Add an Script transform; this is the critical step, the script will assign the Claim Number and Insured values to the children; the script is written in C# and listed below (you can replace the code below with the default code for the script, as long as you are using the same variable names I used):
public class ScriptMain : UserComponent
{
private string _ClaimNumber = string.Empty;
private string _Insured = string.Empty;
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
/*
Add your code here
*/
if (Row.ClaimNumber.Trim().Length > 0)
{
_ClaimNumber = Row.ClaimNumber;
}
else
{
Row.ClaimNumber = _ClaimNumber;
}
if (Row.Insured.Trim().Length > 0)
{
_Insured = Row.Insured;
}
else
{
Row.Insured = _Insured;
}
}
}
g. Now add another Conditional Split transformation, this one select the real data, as we no longer need those header rows containing the Claim Number and Insured information. Its filter name is: Real Row and its condition shown below:
Status != "Claim Number :" && Status != "Insured:"

- WARNING this is another condition driven by data found inside the Excel file, if the user or anybody change the 'labels' (or header) in these Excel files, the package will fails to select the data properly.
h. Finally a Row Count transform was added with a data viewer to validate the Data Flow was parsing the Excel file as expected; you need to add a package level variable for this Row Count mine was named RowCount
I tested with the Excel file you supplied and it seems to be doing the job.
Please let us know if you go for this solution.
Cheers,
Hope this helps,
Rock from VbCity