importing text files with " " around the text

  • below is the first few lines of a file i'm trying to import

    when i set the text qualifier to " it takes the parentheses off. but i get an error importing the last column, Application Job Description. I think the problem is that the parentheses are way past the text in some cases.

    when i turn the text qualifier off it imports OK but all the text has parentheses around it and it imports as varchar data

    i wanted to change the date columns to date data types. i can create a multi-step SSIS process to change the data types and run update statements with the replace function but i was hoping there was a 1 step way using SSIS.

    "Borough"|"Application House Number"|"Job Street Number"|"Application BIN"|"Application BLOCK"|"Application LOT"|"Application Job Number"|"Application Job Type"|"Application Job Status"|"Application Job Status Date"|"Application Process Date"|"Application Job Description"

    "Bronx"|"2803 "|"WHITE PLAINS ROAD "|"2053549"|"04508"|"00072"|"200682323"|"A2"|"PERMIT ISSUED - PARTIAL JOB "|"2001-08-09 00:00:00"|"2001-08-03 00:00:00"|"INSTALL FIRE SUPPRESION SYSTEM ONLY. NO CHANGE IN USE, EGRESS, OR OCCUPANCY "

    "Brooklyn"|"2828 "|"WEST 16 STREET "|"3339650"|"07021"|"00020"|"302228729"|"A3"|"PLAN EXAM - DISAPPROVED "|"2006-09-26 00:00:00"|"2006-09-19 00:00:00"|"PROVIDE BUILDERS PAVEMENT PLAN FOR A TOTAL OF 80 LINEAR FEET ON WEST 16TH STREET BEGINNING 210'-0" SOUTH OF NEPTUNE AVENUE IN CONJUNCTION WITH NEW BUILDING "

    "Manhattan"|"194 "|"SPRING STREET "|"1007354"|"00489"|"00018"|"104410442"|"A3"|"PERMIT ISSUED - ENTIRE JOB/WORK "|"2006-04-21 00:00:00"|"2006-04-20 00:00:00"|"INTERIOR DEMOLITION AND CLEAN UP AT EXISTING STORE SPACE AS PER PLAN. NO CHANGE IN USE, EGRESS & OCCUPANCY. "

    "Manhattan"|"1 "|"RIVER PL "|"1085803"|"01089"|"00001"|"102719523"|"A2"|"PERMIT ISSUED - ENTIRE JOB/WORK "|"2000-06-27 00:00:00"|"2000-06-23 00:00:00"|"INSTALL 2-NW L.P. BLRS,2-NW COMBINATION GS/OIL BURNERS. INSTALL 1- NEW 20,000 GLN OIL TANK IN NEW 3HR ENCLOSURE ALL STRUCTURAL DETAILS SUPPORT ENCLOSUR CHIMNEY AND GAS LINE FILED UNDER N.B. 10 109359.6-NEW GAS FIRED HOT WATER HEATER CONNECT LOW PRESSURE GAS LINES.NO CHANGE IN MEANS OF EGRESS,USE GROUP OCCUPANCY."

    "Manhattan"|"15 "|"CLAREMONT AVENUE "|"1059818"|"01990"|"00005"|"104357385"|"A2"|"PERMIT ISSUED - PARTIAL JOB "|"2006-03-02 00:00:00"|"2006-02-17 00:00:00"|"PROPOSED INTERIOR RENOVATION TO APT.103ON 10TH FLR.NEW KITCHEN CABINETS ,BATHROOM FIXTURES OAK FLOORS DOORSAND PARTITIONS AS PER PLANS.REPAIR WALLS CEILING AND TRIM NO CHANGE IN USE EGRESSAND OCCUPANCY "

  • alen teplitsky (7/21/2010)


    below is the first few lines of a file i'm trying to import

    when i set the text qualifier to " it takes the parentheses off. but i get an error importing the last column, Application Job Description. I think the problem is that the parentheses are way past the text in some cases.

    when i turn the text qualifier off it imports OK but all the text has parentheses around it and it imports as varchar data

    i wanted to change the date columns to date data types. i can create a multi-step SSIS process to change the data types and run update statements with the replace function but i was hoping there was a 1 step way using SSIS.

    "Borough"|"Application House Number"|"Job Street Number"|"Application BIN"|"Application BLOCK"|"Application LOT"|"Application Job Number"|"Application Job Type"|"Application Job Status"|"Application Job Status Date"|"Application Process Date"|"Application Job Description"

    "Bronx"|"2803 "|"WHITE PLAINS ROAD "|"2053549"|"04508"|"00072"|"200682323"|"A2"|"PERMIT ISSUED - PARTIAL JOB "|"2001-08-09 00:00:00"|"2001-08-03 00:00:00"|"INSTALL FIRE SUPPRESION SYSTEM ONLY. NO CHANGE IN USE, EGRESS, OR OCCUPANCY "

    "Brooklyn"|"2828 "|"WEST 16 STREET "|"3339650"|"07021"|"00020"|"302228729"|"A3"|"PLAN EXAM - DISAPPROVED "|"2006-09-26 00:00:00"|"2006-09-19 00:00:00"|"PROVIDE BUILDERS PAVEMENT PLAN FOR A TOTAL OF 80 LINEAR FEET ON WEST 16TH STREET BEGINNING 210'-0" SOUTH OF NEPTUNE AVENUE IN CONJUNCTION WITH NEW BUILDING "

    "Manhattan"|"194 "|"SPRING STREET "|"1007354"|"00489"|"00018"|"104410442"|"A3"|"PERMIT ISSUED - ENTIRE JOB/WORK "|"2006-04-21 00:00:00"|"2006-04-20 00:00:00"|"INTERIOR DEMOLITION AND CLEAN UP AT EXISTING STORE SPACE AS PER PLAN. NO CHANGE IN USE, EGRESS & OCCUPANCY. "

    "Manhattan"|"1 "|"RIVER PL "|"1085803"|"01089"|"00001"|"102719523"|"A2"|"PERMIT ISSUED - ENTIRE JOB/WORK "|"2000-06-27 00:00:00"|"2000-06-23 00:00:00"|"INSTALL 2-NW L.P. BLRS,2-NW COMBINATION GS/OIL BURNERS. INSTALL 1- NEW 20,000 GLN OIL TANK IN NEW 3HR ENCLOSURE ALL STRUCTURAL DETAILS SUPPORT ENCLOSUR CHIMNEY AND GAS LINE FILED UNDER N.B. 10 109359.6-NEW GAS FIRED HOT WATER HEATER CONNECT LOW PRESSURE GAS LINES.NO CHANGE IN MEANS OF EGRESS,USE GROUP OCCUPANCY."

    "Manhattan"|"15 "|"CLAREMONT AVENUE "|"1059818"|"01990"|"00005"|"104357385"|"A2"|"PERMIT ISSUED - PARTIAL JOB "|"2006-03-02 00:00:00"|"2006-02-17 00:00:00"|"PROPOSED INTERIOR RENOVATION TO APT.103ON 10TH FLR.NEW KITCHEN CABINETS ,BATHROOM FIXTURES OAK FLOORS DOORSAND PARTITIONS AS PER PLANS.REPAIR WALLS CEILING AND TRIM NO CHANGE IN USE EGRESSAND OCCUPANCY "

    The problem is an actual data error. Whoever created the file didn't follow the rules for quotes and they've included...STREET BEGINNING 210'-0" SOUTH in the data which, of course, blows the Text Qualifier thing.

    You could get around this using a BCP format file. \ " (without the space) is the magic code for how to identify double quotes as part of a delimiter.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • thx

    as soon as i got rid of that quote it went further than line 3 and to line 41. now i have to check the entire file which is over 400MB

    didn't see that quote, too many letters around it

  • alen teplitsky (7/21/2010)


    thx

    as soon as i got rid of that quote it went further than line 3 and to line 41. now i have to check the entire file which is over 400MB

    didn't see that quote, too many letters around it

    A BCP format file would be easier. Either that or import based on the pipe delimiter and then replace the value with a substring of itself that doesn't include the first or last character which would be a double quote, of course.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply