EBCDIC TO ASCII conversion in SSIS, packed decimal field

  • I need to convert an EBCIDIC file to ASCII. The caveat is the incoming EBCDIC file has packed decimal fields in it. Can this be done in SSIS? Has anyone done this before? Thx

  • I was asked some time ago to do this in SSIS, but I was able to convince my upstream data provider to use SAS on their mainframe to convert the packed fields to ASCII. I don't know how familiar you are with packed decimals, but when I was researching it, I found a good page about their structure over at http://www.discinterchange.com/TechTalk_Packed_fields_.html.

    The approach that I was hoping to use was to grab a row of the raw EBCDIC hex stream, locate the bytes of the packed field using data from the picture statements I was provided, and parse them out. The problem that I ran into was the variablility of the number of bytes in the packed fields. I couldn't get them isolated from the surrounding bytes in the stream.

    I'm sorry that I can't offer more help on this one.

  • I have had to create this in the past, and came up with a simple technique that relies on a UDF to translate the field. The best approach requires grabbing an entire row of the raw EBCDIC hex stream (using the BINARY datatype), and using the SUBSTRING function, pass the data you wish to unpack as a binary string. Here is the function:

    [font="Courier New"]

    CREATE FUNCTION [dbo].[fnUnpack] (

    @InputStr varbinary(1000) = NULL

    )

    RETURNS varchar(2000)

    AS

    BEGIN

    DECLARE @RetStr varchar(2000)

    DECLARE @pos int

    DECLARE @length int

    DECLARE @temp int

    DECLARE @nibble1 int

    DECLARE @nibble2 int

    DECLARE @hexstring char(16)

    SET @RetStr = ''

    SET @pos = 1

    SET @length = DATALENGTH(@InputStr)

    SET @hexstring = '0123456789ABCDEF'

    WHILE (@pos <= @length)

    BEGIN

    SET @temp = CONVERT(int, SUBSTRING(@InputStr, @pos, 1))

    SET @nibble1 = FLOOR(@temp / 16)

    SET @nibble2 = @temp - (@nibble1 * 16)

    SET @RetStr = @RetStr + SUBSTRING(@hexstring, @nibble1 + 1, 1)

    IF @pos < @length

    SET @RetStr = @RetStr + SUBSTRING(@hexstring, @nibble2 + 1, 1)

    ELSE

    IF SUBSTRING(@hexstring, @nibble2 + 1, 1) = 'D'

    SET @RetStr = '-' + @RetStr

    SET @pos = @pos + 1

    END

    RETURN @RetStr

    END

    [/font]

    I'm not sure if this response will allow me to paste binary text as is, but here is an example of using the function:

    [font="Courier New"]

    SELECT dbo.fnUnpack(CONVERT(varbinary(5), 'P@ '))

    [/font]

    The function should return the string '1050402'. I chose to return all values as a string so type conversion may be necessary.

    To demonstrate some flexibility using this process, the above example returns '1050402' which is actually a date (2005-04-02). It is easy to create another function that takes this type of 7 character date string as an input parameter and converts it to a SQL datetime data type. Something like this:

    [font="Courier New"]

    SELECT dbo.fnUnpackDate(dbo.fnUnpack(CONVERT(varbinary(5), 'P@ ')))

    [/font]

    Or even create a second function that call internally calls the fnUnpack function like:

    [font="Courier New"]

    SELECT dbo.fnUnpackDate(CONVERT(varbinary(5), 'P@ '))

    [/font]

    Personally, I find the second method much easier to read and modify. You may have functions that can unpack data to native data types like Int, Money, Decimal, etc. Here is an example of an UPDATE query that you can use to extract an entire row of EBCDIC data:

    [font="Courier New"]

    UPDATE s

    SET

    -- Data Fields

    LoanID = dbo.fnTranslate(SUBSTRING(s.RawData, 4, 10)), -- This function simply converts an EBCDIC string to ASCII

    [ML-DIST-SEQ] = dbo.fnUnpackInt(SUBSTRING(s.RawData, 50, 1)),

    [ML-PD-TO-DATE] = dbo.fnUnpackDate(SUBSTRING(s.RawData, 54, 4)),

    [ML-INT-RATE] = dbo.fnUnpackDecimal(SUBSTRING(s.RawData, 58, 4), 5),

    [ML-P-I-PYMT] = dbo.fnUnpackMoney(SUBSTRING(s.RawData, 62, 6), 2)

    FROM ProductionTable p

    JOIN StagingEBCDICTable s

    ON p.RecordKey = s.RecordKey

    [/font]

    You could take an entire COBOL copybook, and process the entire row using this technique. It's tedious, but it does work. I have been mapping and translating over 50 binary files daily for the past three years using this same method. Good luck!

  • Michael:

    Thanks for the post. Just by the example you provided looks like you are dealing with data from MortgageServ also. Is that correct?

  • You are absolutely right. Welcome back to the "dark ages" of programming...:(

  • Michael: My EBCDIC file has the record length of 150 bytes per record, fixed block. How do you take such a file and put it into a table? When I try to DTS it in, there is no CrLF at the 150th character-so DTS does not work. I tried to do an activeX script to read the file in and write out a record at every 150th character and that does not work either. I am sure you have figured this out already. Thanks in advance for any light you can shed in this matter.

  • One way you could do this (which I believe is also the fastest), is to use BULK INSERT into a staging table. First, create a staging table:

    [font="Courier New"]

    CREATE TABLE StageEBCDIC (RawData varbinary (5000) NULL)

    [/font]

    Next, create a custom format file using for the extract you will be copying into the staging table. Please refer to "Format File Structure" in BOL or online for detailed info. The contents of this file will look similar to the following:

    [font="Courier New"]

    8.0

    1

    1 SQLBINARY 0 4519 "" 1 RawData ""

    [/font]

    Basically, this format file is set up in the following manner:

    Line 1 = Version

    Line 2 = Number of columns

    Line 3 = Col #, Data Type, Prefix Len, Data Len, Terminator, Col Order, Col Name, File Collation

    As you can see, using this format file will split each stream of 4519 characters into rows of a single column called 'RawData'. This works because what you are importing is fixed-width. Not all extracts are the same width, so you'll need a separate format file for each extract.

    Once the Staging table and format file is created, then you can perform the bulk insert:

    [font="Courier New"]

    BULK INSERT StageEBCDIC

    FROM '\\SERVER01\EBCDICFiles\Filename.txt'

    WITH (FORMATFILE='\\SERVER01\FormatFiles\Filename.fmt', CODEPAGE='RAW', TABLOCK)

    [/font]

    Once the file has been loaded into the Staging table, you can use the functions to extract the raw data into the master table.

  • you also might consider a 3rd party component like www.aminosoftware.com

    EBCDIC gets much more complicated than packed fields...

  • Just a quick update, the EBCDIC component shipped...check out the free trial here

    http://www.aminosoftware.com/lysine/download/

    Good Luck!

    -Ben

  • Did anyone ever converted data to Packed Decimal?

    Thanks

  • Paulo,

    Not sure about your question. Do you mean you want to convert data to packed decimal (as in export) or read in a data field as a decimal value? In that case you would read in the same as any packed field, then supply the precision and do the conversion. In other words, say you read in the following value as a string and it translated to "1234". It takes three bytes to hold this value (which includes the sign). The precision should be less than the number of bytes, so if the copybook shows a value of (5,2) then there will be a precision of 2. to calculate, simply convert the value of the string to an integer and convert that to DECIMAL(6, 2). Remember, three bytes packed will result in six bytes unpacked which is how the DECIMAL(6, 2) was derived. To place the decimal, we divide by 10 to the power of the precision, so the calculation would look something like this:

    [font="Courier New"]SELECT CONVERT(DECIMAL(6, 2), CONVERT(INT, '1234')) / POWER(10, 2)[/font]

    As mentioned earlier in this thread, you could easily create a function to handle packed decimal when translating an entire row of data. Hope this explanation can help you out.

  • I should have explained, I will in fact read a database table, using SSIS, get a decimal value, for example 12.34, and I will have to create a file which will be sent to a mainframe computer, meaning that the number 12.34 will have to be saved as packed decimal into the flat file.

  • There is an error in this

    "@i" is undifined.

  • That code was written so long ago, but by the looks of things, I probably changed the [font="Courier New"]@i[/font] variable to [font="Courier New"]@pos[/font] at the last minute to make the code a little easier to follow, and I overlooked it.

  • Thanks 🙂

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply