RE: Simple Method for Importing Ragged files

March 6, 2008 at 12:53 pm

paul.ibison (3/6/2008)
Brandon,
With all due respect, I think you're missing the point.
Judging by some of the questions, some others (Jeff apart) have also never really tried to solve this problem in practice.
Take a CSV file with 50 columns, use your conditional split starting with a single column, and see how long it takes. You'll have to define each of the 50 columns separately, and define the substring function separately for each one. The columns in the example CSV file have no fixed width, so your substring function will have to look for the commas, which may be nested inside a string. How long before this is robust? How tedious will it be to do all this coding? And the resulting code will be huge.
This method can do the same in an hour or so and produces a very simple package.
Cheers,
Paul Ibison

I'll admit I've never had to solve this exact problem before, where simply stripping the headers and footers away would do the trick. The header and footer problems I've had to solve have sometimes involved files with complex hierarchical structures from legacy systems, like this:

FILE
ORDERS|2
HDR|100293|987|20080326
ITM|897654|9876.87|3
ITM|098643|76.34|12
FTR|100293|2
HDR|100294|456|20080326
ITM|765432|11.99|6
FTR|100294|1
ENDORDERS|2
CUST|2
HDR|987
ITM|Joe|Jackson|98 Palomino Way|Los Angeles|CA|90823
FTR|987
HDR|456
ITM|Lisa|Lewis|123 Sesame Street|New York|NY|10014
FTR|456
ENDCUST|2
ENDFILE

How does your process work for files like this? Ignoring the header and footer information in this file isn't an option, since you would lose important information in the process: the order numbers and order dates, the line item counts, and other auditing details in the file such as record counts.
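
To make the requirement concrete, here is a rough Python sketch (not an SSIS package) of the bookkeeping any loader for a file like this has to do: keep each header with its items, and check the footer and section counts instead of throwing them away. The function name parse_hierarchical and my reading of the fields are assumptions for illustration only. In particular, I'm assuming the second footer field is the item count and that the number on each section line is its record count.

```python
SAMPLE = """FILE
ORDERS|2
HDR|100293|987|20080326
ITM|897654|9876.87|3
ITM|098643|76.34|12
FTR|100293|2
HDR|100294|456|20080326
ITM|765432|11.99|6
FTR|100294|1
ENDORDERS|2
CUST|2
HDR|987
ITM|Joe|Jackson|98 Palomino Way|Los Angeles|CA|90823
FTR|987
HDR|456
ITM|Lisa|Lewis|123 Sesame Street|New York|NY|10014
FTR|456
ENDCUST|2
ENDFILE
"""


def parse_hierarchical(text):
    """Parse the pipe-delimited sample into {section: [record, ...]}.

    Hypothetical helper: assumes well-ordered input (HDR, then ITM rows,
    then FTR) and raises ValueError when an audit count does not match.
    """
    sections = {}   # section name -> list of completed records
    expected = {}   # section name -> record count declared on the opening line
    section = None  # name of the section currently open, e.g. "ORDERS"
    record = None   # record being built between HDR and FTR

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line in ("FILE", "ENDFILE"):
            continue
        tag, *fields = line.split("|")

        if tag == "HDR":
            record = {"header": fields, "items": []}
        elif tag == "ITM":
            record["items"].append(fields)
        elif tag == "FTR":
            # The footer repeats the header key; an optional second field
            # is assumed to be the number of ITM rows in the record.
            if fields[0] != record["header"][0]:
                raise ValueError(f"footer {fields[0]} does not match header")
            if len(fields) > 1 and int(fields[1]) != len(record["items"]):
                raise ValueError(f"item count mismatch for {fields[0]}")
            sections[section].append(record)
            record = None
        elif tag.startswith("END"):
            # Closing line: the name and both declared counts must agree.
            count = len(sections[section])
            if tag[3:] != section or int(fields[0]) != count or expected[section] != count:
                raise ValueError(f"record count mismatch in {section}")
            section = None
        else:
            # Any other tag opens a new section, e.g. ORDERS|2 or CUST|2.
            section = tag
            sections[section] = []
            expected[section] = int(fields[0])

    return sections
```

Calling parse_hierarchical(SAMPLE) gives one list of records per section, with each order's header fields (order number, date) still attached to its line items. That association is exactly what is lost if the header and footer rows are simply filtered out.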