Special control characters in file

Hi Guys,

We receive some huge files on our Linux server. The source system uses FTP to transfer these files to us. Occasionally one record gets corrupted during the transfer: some control characters are injected into the file. How can we fix this issue? Please advise.

Sample Corrupted Record:

A|06/02/2013|11/03/2013|90|90|90|92|90|99|99|1|90-90-90-92-90-99-99|90-90-90-92-90-99-99-1|FX|US|6|289814345|289814345|SBC AMERITECH|FX|||US|532024303|U108024050|AT&T INC.|FX|US|1||D31887602||FX|US|0||N|N|S|0|||0|0|0||||D|0||152745w]M-LM-^G|599001.74|0|0|FX|FDFR|

Correct Record is as follows:

A|06/02/2013|11/03/2013|90|90|90|92|90|99|99|1|90-90-90-92-90-99-99|90-90-90-92-90-99-99-1|FX|US|6|289814345|289814345|SBC AMERITECH|FX|||US|532024303|U108024050|AT&T INC.|FX|US|1||D31887602||FX|US|0||N|N|S|0|||0|0|0||||D|0||152745444.00|599001.74|0|0|FX|FDFR|

Is this really corruption, or was the file written with the wrong language or character-encoding (e.g. UTF-8) settings?

A lot of Linux systems have the iconv command, which converts a file from one character encoding to another.
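On the command line iconv is typically run as `iconv -f FROM -t TO infile > outfile`. For anyone who prefers a script, here is a rough Python sketch of the same idea. The function name `convert_encoding` and the encodings in the usage note are placeholders of mine; nobody in this thread has confirmed what encoding the source system really writes.

```python
def convert_encoding(src_path, dst_path, from_enc, to_enc):
    """Re-encode a text file line by line (hypothetical helper, not from the thread).

    Raises UnicodeDecodeError if the input is not valid in from_enc,
    which is itself a useful hint that the guess about the encoding is wrong.
    """
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding=from_enc, newline="") as src, \
         open(dst_path, "w", encoding=to_enc, newline="") as dst:
        for line in src:  # streams the file, so huge files are fine
            dst.write(line)
```

Usage would be something like `convert_encoding("feed.dat", "feed.utf8.dat", "latin-1", "utf-8")`, where both file names and both encodings are assumptions for illustration.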

"Corruption" during an ftp or sftp transfer is usually rare. Can you get a checksum of a corrupted file and a checksum of the same file on the remote source system, to verify whether the transfer is what corrupted it?
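A minimal sketch of the checksum comparison, assuming Python is available on the receiving server. The helper name `file_sha256` is mine. It reads the file in chunks so a huge file never has to fit in memory.

```python
import hashlib

def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The hex string it returns should match what `sha256sum` prints for the same file, so the source side can run either one. If the two digests match, the bad record was already in the file before the transfer.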

If you want a script, it would be a one-time deal, because corruption due to file transfers does not cause the same screw-ups in the data time after time. Because of that, scripting is not a great solution. If you have enough virtual memory, the Linux vim editor or an editor on a Windows desktop like UltraEdit is a better choice.
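Even if the repair is done by hand in an editor, a small read-only scan can tell you which records to look at. This is only a sketch and it assumes the feed is supposed to be plain printable ASCII, which is a guess based on the two sample records. The `M-L` and `M-^G` sequences in the corrupted sample look like `cat -v` style notation for bytes with the high bit set, and bytes like that are what this would flag. The name `find_suspicious_records` is made up.

```python
import re

# Any byte that is not tab, LF, CR or printable ASCII (0x20-0x7e).
# Assumption: a clean record contains nothing outside this set.
SUSPICIOUS = re.compile(rb"[^\t\n\r\x20-\x7e]")

def find_suspicious_records(path):
    """Return (line_number, byte_offset, byte_as_hex) for each flagged record."""
    hits = []
    with open(path, "rb") as f:  # binary mode, so no decoding is attempted
        for lineno, raw in enumerate(f, start=1):
            match = SUSPICIOUS.search(raw)
            if match:
                pos = match.start()
                hits.append((lineno, pos, raw[pos:pos + 1].hex()))
    return hits
```

It only reports the first suspicious byte per record and never modifies the file, so it is safe to run against the original before opening it in vim.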

If you are getting the same kind of corruption every time, it is very likely that the garbage you see was already in the file before you got it, because of a programming error in the code that created the file, such as a buffer overflow. Consider fixing the root cause.

And yeah, I know: the people on the creation side of things will fight the idea. But that is politics, not computing.


Thank you Jim, I will compare the checksum with the source and let you know how it goes.