Removing Colors and ^M in a log file

Hi,

I'm trying to send a log file to mailx as the message body, but because the file contains so many control and color characters, mailx turns it into an attachment instead of putting it in the body.

The file looks like this:

Bringing up loopback interface:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 19 Bringing up interface eth0:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 20 Starting portreserve: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 21 Starting system logger: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 22 Starting irqbalance: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 23 Starting rpcbind: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 24 Starting sssd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 25 Starting RPC idmapd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 26 Starting RPC gssd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 27 Starting kdump:^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 28 Starting system message bus: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 29 Starting Avahi daemon... ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 30 Starting NFS statd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 31 Initializing OpenCT smart card terminals:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 32 ^M�
 33 Starting cups: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 34 Mounting other filesystems:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 35 Starting acpi daemon: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 36 Starting HAL daemon: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 37 Retrigger failed udev events^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 38 Starting PC/SC smart card daemon (pcscd): ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 39 Loading autofs4: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 40 Starting automount: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 41 Starting Hyper-V KVP daemon ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 42 Starting sshd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 43 Starting xinetd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 44 Starting ntpd: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 45 Starting postfix: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 46 Starting abrt daemon: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�

I've tried the following with no success.

cat boottest.log | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g"

I then ran dos2unix, but this remains at the beginning:

^[%G Welcome to CentOS��
2 Starting udev: ^[%G[ OK ]^M�

and even after the dos2unix one ^M remains in the file because it's in the middle of the text.

Don't take into account the line numbers, this is only my text editor setup.

Any help would be appreciated.

Do you have dos2unix and have you tried that?

dos2unix infile

--ahamed

Yes. As mentioned, I tried that on top of the sed command, with no success.

Try something like this:

awk '{gsub(/\^M|\^\[\[[0-9]*G\[\^\[\[0;32m|\^\[\[0;39m\]/,x)}1' file

Can you upload a sample file with minimal size?

--ahamed

Hmm... sorry, this one doesn't do anything at all; the output is exactly the same.

Here is the file in question

cat file
Bringing up loopback interface:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 19 Bringing up interface eth0:  ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 20 Starting portreserve: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 21 Starting system logger: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
 22 Starting irqbalance: ^[[60G[^[[0;32m  OK  ^[[0;39m]^M^M�
awk '{gsub(/\^M|\^\[\[[0-9]*G\[|\^\[\[0;[0-9]*m]*| *�/,x)}1' file
Bringing up loopback interface:    OK
 19 Bringing up interface eth0:    OK
 20 Starting portreserve:   OK
 21 Starting system logger:   OK
 22 Starting irqbalance:   OK

This is a copy and paste of the sample in post #1. The content may have changed when you posted it here. Upload the file.

I just did, waiting for the moderator to approve

Try this, just to remove the escape codes.

awk '{gsub(/\^\[\[[0-9]*;*[0-9]*(G\[|m]*)/,x)}1' file

Maybe I'm doing something really wrong, but in my test here the file stays the same. When I paste the file content into another file, though, it works.

Probably once you get the file I uploaded you'll be able to see what is going on

Please show exactly what you do. awk is not like dos2unix -- it won't modify the original file...
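To illustrate that point, here is a minimal sketch of rewriting a file through awk. The helper name `strip_in_place` and the use of `mktemp` are assumptions for illustration, and the awk program is only a stand-in that deletes carriage returns. awk prints its result to standard output, so the original changes only when the result is copied back over it.

```shell
# Hypothetical helper: filter a file through awk, then overwrite the original.
strip_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # awk only prints to stdout; "$file" is untouched until the copy below.
    if awk '{ gsub(/\r/, "") } 1' "$file" > "$tmp"; then
        # Copy the contents back so the original keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Called as `strip_in_place boottest.log`, this would change the log itself instead of only printing a cleaned copy.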

awk '{gsub(/\^\[\[[0-9]*;*[0-9]*(G\[|m]*)/,x)}1' boottest.log > boottest_awk.log

This is the content of the boottest_awk.log file:
I'll post a png file of what is seen from within vi:

In this posting, every appearance of the string <ESC> is a visual representation of the escape control character, and <NUMBER> is a string of one or more decimal digits.

I'm not sure what the escape sequence:

<ESC>%G

is supposed to do, but the following script removes them. The escape sequences of the form:

<ESC>[<NUMBER>;<NUMBER>m

change the background or foreground color (only the foreground color is affected by your sample file). The following script removes them. The escape sequence:

<ESC>[60G

moves the cursor to output column 60 before printing the following text. A much more complicated script could evaluate what has already been output and match this behavior. The following simple script just removes them.

And, as requested, this script removes all carriage return control characters from the file:

CR=$(printf "\r")
ESC=$(printf "\033")   # octal 033 is ESC; "\e" is not understood by every printf
sed "s/$ESC\[[^Gm]*[Gm]//g;s/$ESC%G//g;s/$CR//g" boottest.log
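As a rough sketch of the "more complicated" approach mentioned above, the cursor-positioning sequence can be replaced with padding instead of being deleted. The function name `pad_columns` is hypothetical. The sketch assumes single-byte text and only pads forward: if the text has already passed the requested column, nothing is added, whereas a real terminal would overwrite what is there.

```shell
ESC=$(printf '\033')
CR=$(printf '\r')

# Hypothetical filter: drop colors, <ESC>%G and CRs, then turn each
# <ESC>[<NUMBER>G into enough spaces to reach that output column.
pad_columns() {
    sed "s/$ESC\[[0-9;]*m//g;s/$ESC%G//g;s/$CR//g" |
    awk -v esc="$ESC" '
    {
        out = ""
        rest = $0
        while (match(rest, esc "\\[[0-9]+G")) {
            out = out substr(rest, 1, RSTART - 1)
            # The digits sit between "<ESC>[" and the final "G".
            col = substr(rest, RSTART + 2, RLENGTH - 3) + 0
            # Column N is 1-based, so N - 1 characters must precede it.
            while (length(out) < col - 1) out = out " "
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

Running `pad_columns < boottest.log` should then line up the `[  OK  ]` markers at column 60 in plain text.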

If you weren't aware that there were escape control characters in your file, look at it using:

od -bc boottest.log

Assuming you're on a machine where ASCII is a subset of the code set underlying your current locale, the escape character will appear as the string 033.
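For a quicker check than reading od output, the lines that contain these control characters can be counted with grep. This is only a sketch: the helper name `count_ctrl` is hypothetical, and `LC_ALL=C` is set on the assumption that the file should be compared byte by byte.

```shell
ESC=$(printf '\033')
CR=$(printf '\r')

# Hypothetical helper: report how many lines contain ESC and how many contain CR.
count_ctrl() {
    printf 'ESC lines: %s\n' "$(LC_ALL=C grep -c "$ESC" "$1")"
    printf 'CR lines: %s\n' "$(LC_ALL=C grep -c "$CR" "$1")"
}
```

Running `count_ctrl boottest.log` before and after the sed command shows whether anything was actually removed.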

Thanks a million, that did the job!

sed -n 's/ctrl + v,ctrl + M/ /gp' file.log

thanks for the reply but this doesn't work.

The one that did the job was:

CR=$(printf "\r")
ESC=$(printf "\033")   # octal 033 is ESC; "\e" is not understood by every printf
sed "s/$ESC\[[^Gm]*[Gm]//g;s/$ESC%G//g;s/$CR//g" boottest.log

sed -n 's/ctrl+v,ctrl+M/ /gp' file1

First, note that new readers might not realize that ctrl+v is intended to mean press the v key while holding down the key marked ctl, ctrl, or control (depending on your keyboard manufacturer).

Second, note that many readers might not realize that you don't want them to type a comma in the search pattern.

And, third, if they figured out both of those, this wouldn't remove carriage return characters; it would replace them with a space character.

And, by using the -n option to sed you throw away completely any lines that did not contain a carriage return character.

Furthermore, this sed script does absolutely nothing to remove the terminal escape sequences that were the half of the problem noted in the original poster's request.
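The difference can be seen on a two-line sample. This is a small sketch: the `sample` function is hypothetical, and the carriage return is generated with printf so that nothing has to be typed with ctrl+v.

```shell
CR=$(printf '\r')

# Hypothetical two-line input: only the first line ends in a carriage return.
sample() { printf 'with cr\r\nwithout cr\n'; }

# With -n and the p flag, only lines where a substitution happened are
# printed, and each CR becomes a space instead of disappearing.
sample | sed -n "s/$CR/ /gp"

# Without -n, every line is printed and each CR is simply deleted.
sample | sed "s/$CR//g"
```

The first command prints only the first line, now with a trailing space. The second prints both lines with the carriage return gone.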
