SQL 2000, iSeries linked server, CPDTS.DLL and Arabic

  • Hi,

    I'm trying to import physical files from an iSeries that contain mixed English-Arabic data into SQL 2000 tables and to retain the correct Arabic script including Lam-Alef glyphs. Can anyone offer advice on how to achieve this?

    The following is what I have tried so far:

    I have created a system DSN using the iSeries Access ODBC driver from Personnal Communication iSeries Access for Windows 5.7 and setup up a linked server using Microsoft OLE DB provider for ODBC driver referencing the system DSN.

    The physical file on the iSeries is defined as CCSID 65535 so as to enable the input and display of character sets as appropriate to that job and so on the system DSN I have [Convert binary 65535 data to text] enabled. The data contained in the table will have been input using a job CCSID of 420.

    If I use Query Analyser and select from the iSeries physical file via the linked server then the correct English characters are displayed but the Arabic data is shown as accented Latin characters not Arabic. If I disable the [Convert binary 65535 data to text] option on the DSN then the data returned is shown as the hex representation of the EBCDIC data.

    I have created the table in SQL 2000 with NVARCHAR columns and sized larger than the original fields on the iSeries so as to hold the doubled up Lam-Alef glyphs. (On the iSeries Lam-Alef glyphs are held as a single EBCDIC byte, in Windows Lam-Alef glyphs are held as two separate ASCII bytes that are combined by the OS to create a single visual Lam-Alef glyph.)

    I have created a DTS package to import the data and, not suprisingly, I get the same results as with Query Analyser.

    I have loaded and registered the Codepage Data Transformation for SQL 2000 DLL available from Microsoft at http://www.microsoft.com/middleeast/msdn/CPDTS.aspx. This DLL is specifically to handle language transformations and includes the functionality to handle EBCDIC 420 to ASCII 1256 without loss of the Lam-Alef glyphs or so it says.

    The first issue with the example setup that Microsoft gives is that in Step 2 under Data Link Properties I do not have any of these advanced properties, I cannot set the Host CCSID, PC Code page or Process binary as character. I assume these properties are as a result of the input source of the DTS package, and Microsoft do not include details of what the example is using as the input source.

    Anyway, if I continue and setup the package and use the transform data task codepage conversion on a single column and specify 420 to 1256 and run the package then the other columns are imported as before and this specific column with codepage conversion comes through as a series of squares and Chinese squiggles - no English or Arabic in sight.

    Thanks for any suggestions or advice.

    Paul.

     

  • You can use Visual Studio it can let you save you file as Arabic in Microsoft platform then you create your database, tables and columns as Arabic because if you define it as English you will loose the Arabic definition.  Both English and Arabic use the Latin alphabet so with some effort you can do it.  There is a free version of VS but you can test drive the professional or standard because there are six Arabic defined in VS Advanced save option.  Hope this helps.

    http://msdn2.microsoft.com/en-us/library/ms179886.aspx

    http://msdn2.microsoft.com/en-us/library/ms180175.aspx

     

    Kind regards,
    Gift Peddie

  • Gift Peddie,

    Can you offer some more advice on Visual Studio and the processes involved in getting the data into SQL with Lam-Alef and correct sequencing?

    I have VS 2005 up and running and have used VB.NET with Encode() functions to convert a simple file from IBM 420 EBCDIC to ASCII windows-1256 that was FTP'd from the iSeries as binary. The Arabic data has lost the Lam-Alef glyphs and is reversed, the English data is correct.

    Do I need to write reversal routine to flip the Arabic and how do I preserve the single byte Lam-Alef glyphs in EBCDIC to the two byte Lam Alef in windows-1256 or are sending the data in via Unicode?

     

     

     

  • I don't write VB but I can tell you that what you need is more than a simple encode function so I am including .NET text encoding link.  I also found old C# code you can use and I think you can try saving your VB.NET file with all the four or six Arabic defined in VS to see what works.  Try the link below for code you can use to do the conversion, one more thing in VS you can choose UTF 8 which is single byte Unicode instead of UTF16 which is the same with UCS-2.  Remember to use the correct Arabic collation for your SQL Server tables per the previous links I gave you, that is important because I am assuming you are not dealing with the Latin alphabet which creates complications for SQL Server UCS-2. The reason you need both links is the .NET Char is the nineth Integer and it is UTF 16 Unicode by default. Hope this helps.

    http://www.codeproject.com/ce/arabicsupport.asp?df=100&forumid=14111&exp=0&select=795153

    http://www.yoda.arachsys.com/csharp/ebcdic/

    Kind regards,
    Gift Peddie

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply