Help with removing characters like ^M

I have a few files on Unix that are in DOS format. When I copy these files, characters such as ^M and ^@ show up in them.

I tried the dos2unix command on Linux and it doesn't work.
I tried sed to remove these characters, but they won't go away.
I came across the 'tr' command and tried it, but it removes M elsewhere in the file and not ^M.

Please advise on how to get rid of these caret characters.

Thanks
Chiru

For sure you removed uppercase M characters at the beginning of the line.

The reason you failed to remove ^M is that you typed it as 2 characters.

In the file however it is a single character.

You can remove them in "vi", for example.

Once inside "vi" you can create ^M as a single character by typing first CTRL-V and next CTRL-M.

Make sure you are in command mode (press ESC if necessary) and next you type:

:%s/^M//g

Make sure the ^M is a single character created as mentioned above.

This is just kind of weird. I was going to ask about this very thing when I came to the board, and the question is up on top.

I was wondering about using sed to do the trick. I am running Linux with GNU sed version 4.1.5. Anyway, I had a file which was peppered with ^M characters. I tried this:

sed -i 's/^M/\n/g' some_prog

But it did not seem to do the trick. What did I do wrong?

edit::

Ah - I just tried doing <CTRL-V> <CTRL-M> and it worked. So thanks sb008..

So my next question is about the <CTRL-V> <CTRL-M> sequence. Why is <CTRL-V> necessary? What is happening when it is used?

On second thought, you can accomplish the same task with sed. You don't have to replace ^M with the newline character (\n); just remove the ^M character.

sed 's/^M//g' filename
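With GNU sed, the carriage return can also be written as the escape \r, so nothing has to be typed with CTRL-V. This is only a sketch and assumes GNU sed (other sed implementations may not understand \r); the file names are made up for the demonstration.

```shell
# Build a small DOS-format sample file in a scratch directory (names are hypothetical).
tmpdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmpdir/sample.txt"

# GNU sed understands \r; strip a CR only where it ends a line.
sed 's/\r$//' "$tmpdir/sample.txt" > "$tmpdir/sample.unix"

# od -c prints every byte, so any leftover \r would be visible here.
od -c "$tmpdir/sample.unix"
```

Anchoring with $ removes only the CR at the end of each line and leaves any CR in the middle of a line alone.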

As sb008 said

The CTRL-V and CTRL-M procedure is useful for removing "^M"; sometimes "sed" won't work. I faced this problem earlier.

>On second thought, you can accomplish the same task with sed. You don't have to replace ^M with the newline character (\n); just remove the ^M character.

>sed 's/^M//g' filename

Indeed - that is what I actually wanted rather than inserting all the newlines.

I've had to use this command-line often. It will strip carriage returns (^M) and convert null bytes (^@) into spaces.

tr -d '\r' < dos.file | tr '\0' ' ' > unix.file

The newline in a DOS/Windows environment is CRLF.
The newline in a Unix environment is only LF.

When you, for example, ftp a TEXT file from a DOS/Windows system to a Unix system in ASCII mode, the CRLF is automatically converted to LF only.

When ftp'ing this same file in BIN mode the conversion doesn't take place.

This means that in this case the file will contain an unnecessary CR at the end of each line on the Unix system.

This CR is visually represented by the ^M.

Therefore this ^M is a SINGLE character, representing the CR, and not the 2 characters ^ and M.
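A quick way to convince yourself of this is to count the bytes. The sketch below builds a one-line DOS-style file (the file name is made up) and looks at it three ways.

```shell
# One line of text terminated the DOS way: CR followed by LF.
tmpdir=$(mktemp -d)
printf 'hello\r\n' > "$tmpdir/dosline.txt"

# 5 letters + CR + LF = 7 bytes; a literal "^" plus "M" would make it 8.
bytes=$(wc -c < "$tmpdir/dosline.txt" | tr -d ' ')
echo "bytes: $bytes"

# cat -v draws the CR as the two-column picture ^M ...
cat -v "$tmpdir/dosline.txt"

# ... while od -c shows it as the single byte \r.
od -c "$tmpdir/dosline.txt"
```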

CTRL-V can be thought of as a "take the next key literally" function: the keystroke that follows is inserted as a character instead of being acted upon.
So pressing CTRL-V + CTRL-M inserts the SINGLE character ^M.

This always works within "vi", but not always on the command line.

If you are using e.g. a "ksh" with an "emacs" command line interface, pressing CTRL-V on the command line will result in displaying the version of your "ksh". Therefore CTRL-V+CTRL-M will not work on the command line when using ksh/emacs.
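When CTRL-V cannot be used on the command line, one way around it is to let printf produce the CR and keep it in a variable. This is a sketch for a Bourne-style shell; the variable name cr and the file names are made up for illustration.

```shell
# Sample DOS-format input in a scratch directory (names are hypothetical).
tmpdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$tmpdir/in.txt"

# printf emits the CR byte; command substitution strips only trailing
# newlines, so the CR survives in the variable.
cr=$(printf '\r')

# Double quotes let the shell expand ${cr}; \$ passes a literal $ to sed
# as the end-of-line anchor.
sed "s/${cr}\$//" "$tmpdir/in.txt" > "$tmpdir/out.txt"

od -c "$tmpdir/out.txt"
```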

Ok - thanks for the info. Quite helpful.

Thank you all,
I am getting several files with extra characters, so I cannot do it by going into vi; I will have to do it on the command line. As long as I don't use ksh/emacs, I should be OK, right? I am only using sh (Bourne).

Also, I am seeing not just ^M but also ^@. So will ^V-^@ work similarly?

Thanks

You could still go ahead with the sed command I suggested above, extending it for the extra characters, something like this:

sed -e 's/^M//g' \
    -e 's/^@//g' sourcefilename > targetfilename
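For several files at once, a loop around tr avoids typing any control characters. The following is a sketch only: the directory and file names are invented for the demonstration, and it deletes the NUL bytes (as the sed command above does) rather than turning them into spaces.

```shell
# Two sample files in a scratch directory (all names are hypothetical).
tmpdir=$(mktemp -d)
printf 'one\r\n' > "$tmpdir/a.txt"
printf 'two\000\r\n' > "$tmpdir/b.txt"

for f in "$tmpdir"/*.txt; do
    # Delete CR (\r) and NUL (\000) bytes. Write to a temporary file first
    # and replace the original only if tr succeeded.
    tr -d '\r\000' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

od -c "$tmpdir/a.txt" "$tmpdir/b.txt"
```

Try it on copies first, since the loop overwrites each file in place.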

Is the ^@ a single character, or is it 2 characters?
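One way to check is to count bytes. ^@ is how the NUL byte (value 0) is usually displayed, and the sketch below, using a made-up sample file, counts how many of them a file contains.

```shell
# Sample file with two NUL bytes in it (file name is hypothetical).
tmpdir=$(mktemp -d)
printf 'ab\000cd\000\n' > "$tmpdir/nul.txt"

# Total size: 4 letters + 2 NULs + 1 LF, so each NUL counts as one byte.
total=$(wc -c < "$tmpdir/nul.txt" | tr -d ' ')

# -c complements the set, so -cd deletes everything except NUL;
# wc -c then counts the NUL bytes that are left.
nuls=$(tr -cd '\000' < "$tmpdir/nul.txt" | wc -c | tr -d ' ')

echo "total=$total nuls=$nuls"
```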

Why in the world is everyone playing around with sed and tr and who knows what else?! On Solaris and Linux, you can use dos2unix and unix2dos to remove and add the ^M characters. On HP-UX, dos2ux and ux2dos do the same thing.

Maybe because he mentioned in his initial question that he tried dos2unix and it didn't work.

Is it possible instead to convert the file from DOS to Unix format?
I tried dos2unix <filename> <newfilename>
but I guess Linux doesn't support it. Is there any way I can do that? The files are transferred using binary ftp after they leave my control, and that is when the extra characters are introduced. If I change the files from DOS format to Unix format before I send them out, I believe there won't be any problem.

Any advice?

Thanks
Chiru

Windows versions of the dos2unix and unix2dos utils:

http://www.freedownloadscenter.com/Utilities/Misc__Utilities/DOS2UNIX_UNIX2DOS.html

Also, what is the error that you face when running dos2unix on Linux systems?

dos2unix is not installed as part of our OS build. So, when I use it, obviously it says "bash: dos2unix: command not found"
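If dos2unix may or may not be present on a given machine, a small wrapper can fall back to tr. This is a sketch: to_unix is a made-up helper name, and it assumes the installed dos2unix works as a filter (reads standard input, writes standard output), which should be checked against the local man page.

```shell
# to_unix SRC DEST: write SRC to DEST with Unix line endings (hypothetical helper).
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        # Assumption: this dos2unix build acts as a stdin-to-stdout filter.
        dos2unix < "$1" > "$2"
    else
        # Fallback: delete every CR byte.
        tr -d '\r' < "$1" > "$2"
    fi
}

# Demonstration on a scratch file (names are hypothetical).
tmpdir=$(mktemp -d)
printf 'x\r\ny\r\n' > "$tmpdir/dos.txt"
to_unix "$tmpdir/dos.txt" "$tmpdir/unix.txt"
od -c "$tmpdir/unix.txt"
```

Note that the tr fallback removes every CR in the file, not only those that come right before a LF.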