SED and the Big Bad UNIX File


Peter Ward
Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 46
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/pward/sedandthebigbadunixfile.asp
Bill Geake
SSC Journeyman (78 reputation)

Group: General Forum Members
Points: 78 Visits: 43

Good stuff about dealing with big files, but for the delimiter it's easy enough to set LF rather than CRLF in the file's connection object. I do this for data from a well-known provider of financial information which seems unable to standardize on CRLF or LF for its various files.

Bill.


cneuhold
SSC Veteran (247 reputation)

Group: General Forum Members
Points: 247 Visits: 13
You could also achieve this result with a simple shell script using a regex replace and an ActiveX task before the actual transformation (I have dealt successfully with such things this way before).



Johan Kotze
Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 1
There are also two *NIX and DOS utilities to convert between the *NIX and DOS LF/CRLF formats. They are:

UNIX2DOS and DOS2UNIX

Simple, open source and widely available. ;-)
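
For anyone who cannot install those utilities, here is a minimal Python sketch of the same LF-to-CRLF conversion. It is not how UNIX2DOS is implemented; the function name lf_to_crlf and the chunk size are my own choices for illustration. It reads in chunks so that a very large file never has to fit in memory, and it leaves existing CRLF pairs alone.

```python
import io
import re

# A line feed that is not already preceded by a carriage return.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def lf_to_crlf(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, turning each bare LF into CRLF.

    Existing CRLF pairs are kept as they are, including a pair whose
    CR and LF land in two different chunks.
    """
    prev_cr = False  # True when the previous chunk ended with CR
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if prev_cr and chunk.startswith(b"\n"):
            # This LF completes a CRLF split across chunks: copy it unchanged.
            dst.write(b"\n")
            chunk = chunk[1:]
        dst.write(_BARE_LF.sub(b"\r\n", chunk))
        # An empty chunk here means the last byte written was LF, not CR.
        prev_cr = chunk.endswith(b"\r")


# Small in-memory demonstration with mixed line endings.
source = io.BytesIO(b"a\nb\r\nc\n")
target = io.BytesIO()
lf_to_crlf(source, target, chunk_size=3)
print(target.getvalue())
```

For real files, open both with open(path, "rb") and open(path, "wb") so that Python does not translate the line endings itself.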
SALIM ALI
Old Hand (370 reputation)

Group: General Forum Members
Points: 370 Visits: 247
Yeah, I use UNIX2DOS & DOS2UNIX, very handy little utilities.



Peter Thomas Clarke
Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 1
Or open the file in an editor like TextPad and save it in Windows format.
Edmond Shamon Larson
Forum Newbie (9 reputation)

Group: General Forum Members
Points: 9 Visits: 69

Nice article. I would have never thought to use SED on Windows for that purpose.

I'll keep it in my favorites for just-in-case situations.




Thanks,

Edmond Shamon Larson

Antares686
SSCrazy Eights (8.4K reputation)

Group: Moderators
Points: 8432 Visits: 780

Great article. Fortunately I have not come across this yet, but I always keep this type of info around just in case.

For me the biggest issue has been mainframe files where the columns are fixed width but a line is truncated when a particular column is not there. My approach is simply to import the whole thing into a single char column, then export it back to a new file and run that through the import as originally designed. It would be nice to have a tool that automatically checks for this and fixes the file or data while importing (source code would be great, so I could just make a DTS object). Has anyone got such an animal, perchance?
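
One possible shape for such a check-and-fix step, as a rough Python sketch rather than a finished DTS object. It assumes the missing columns are always the trailing ones, so padding each short line with spaces restores the layout. The function name pad_records and the record width of 5 in the demo are made up for illustration.

```python
def pad_records(lines, width, fill=" "):
    """Yield each line, without its line ending, right-padded to width.

    Lines longer than width are treated as errors instead of being cut,
    because cutting them would silently lose data.
    """
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")
        if len(record) > width:
            raise ValueError(
                f"line {number} is {len(record)} characters, "
                f"expected at most {width}"
            )
        yield record.ljust(width, fill)


# Demonstration: the first record lost its trailing column.
sample = ["AB\n", "ABCDE\n"]
for fixed in pad_records(sample, width=5):
    print(repr(fixed))
```

Because it is a generator, it can be fed an open file object directly and will fix one line at a time, which matters for big files.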





sqltung
SSC Veteran (216 reputation)

Group: General Forum Members
Points: 216 Visits: 585
Good article. I'll have to check out SED.

Regarding the bad record: if the record was embarrassed, you could call it a rouge record. But I think you meant rogue.
Yelena Varshal
Hall of Fame (3.5K reputation)

Group: General Forum Members
Points: 3480 Visits: 593

Hi,

Good article.

The problem is more general than just importing UNIX-generated files. I recently had to show one of the business ladies why the data import jobs for her application, run by third-party software, fail depending on which files another company sends her. I created five files, generated by DTS, by VBScript and by export from Excel, using different row and field delimiters, to show her that a file contains whatever delimiters the particular programmer specifies. I also showed her that if the Comments field contains one of her row or field delimiter characters, her data import job fails too, with a message about an incorrect number of fields.
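
To make the Comments field failure concrete, here is a small Python sketch of a pre-import check that reports rows with an unexpected number of fields. The function name find_bad_rows, the pipe delimiter and the sample rows are invented for the example, and it assumes the file uses no quoting or escaping of delimiters.

```python
def find_bad_rows(lines, field_delim, expected_fields):
    """Return (line_number, field_count) for every row whose count is off."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # N delimiters split a row into N + 1 fields.
        count = line.rstrip("\r\n").count(field_delim) + 1
        if count != expected_fields:
            bad.append((number, count))
    return bad


# The second row has a delimiter inside its comment text.
rows = ["1|fine comment|x\n", "2|comment with | inside|y\n"]
print(find_bad_rows(rows, "|", 3))
```

Running a check like this before the import turns a vague "incorrect number of fields" failure into a list of line numbers to send back to the supplier.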

We all have to determine what the row and field delimiters are before setting up any file processing, and then work with the file supplier to make sure they don't change the processes and technologies they use to produce the files. Here is part of a script that helps me see which characters are used. It echoes a message for each character (for the demo); rewrite it to output to a file.

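' Read the file one character at a time and echo each character with its ASCII code.
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\Temp\Files\MyTextFile.txt", ForReading)
Do Until objFile.AtEndOfStream
    strCharacters = objFile.Read(1)
    WScript.Echo strCharacters & " " & CStr(Asc(strCharacters))
Loop
objFile.Close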
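
For a big file, a per-character message is too much output, so a summary may be more useful. The following Python sketch is a variation on the same idea, not a translation of the script above: it counts how many line endings of each kind (CRLF, bare LF, bare CR) a file contains. The function name count_line_endings is hypothetical.

```python
import io
from collections import Counter


def count_line_endings(stream, chunk_size=1 << 16):
    """Count CRLF, bare LF and bare CR line endings in a binary stream."""
    counts = Counter()
    prev_cr = False  # True when the previous byte was CR
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for byte in chunk:
            if byte == 0x0A:
                # LF right after CR is one CRLF; otherwise a bare LF.
                counts["CRLF" if prev_cr else "LF"] += 1
            elif prev_cr:
                # The earlier CR was not followed by LF.
                counts["CR"] += 1
            prev_cr = byte == 0x0D
    if prev_cr:
        counts["CR"] += 1  # the file ended with a lone CR
    return counts


# Demonstration on an in-memory file with mixed line endings.
print(count_line_endings(io.BytesIO(b"a\r\nb\nc\rd\r")))
```

A file that reports more than one kind of line ending is a sign that the supplier mixes tools, which is worth raising with them before setting up the import.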




Regards,
Yelena Varshal
