Disabling Backslash Interpretation with "echo -E"?

mrm5102 · August 15, 2012, 4:59pm

Hello All,

In a Bash script I'm writing I have a section where I loop through a text file that was
output by another script. In the text file some of the strings are enclosed in
the BOLD "character sequence" (i.e. "\033[1m") and the "OFF" sequence (i.e. "\033[0m"). I
am reading the file line-by-line in a while loop and saving each line into an element of an array.

What I would like to do is remove/disable these backslash escapes while saving the line from the
file into the array element.

But it's weird, because if I run this command:
echo -E "\033[1m Hello World \033[0m"

The Output is:
\033[1m Hello World \033[0m

And running it with "-e" instead, prints:
Hello World

Here is a part of my while loop. I was hoping that it would disable those backslash escapes, but it doesn't.

    while read line
     do
        if [[ $line =~ $PATTERN ]]
         then
            FILE_Lines[$x]=$(echo -E "$line")
            x=$(($x+1))
        fi
    done < $outputFile

So is there another way to read in the file or while saving the line to remove those sequences..?
I also tried adding a "sed 's/\\033\[1m//g'" to the end of the line in the while loop that assigns the line to the
FILE_Lines Array, but that didn't do anything either... I've also tried every different combination of quotes,
single and double, and still nothing...

If anyone has any suggestions, that would be great!

Thanks in Advance,
Matt

spacebar · August 15, 2012, 5:57pm

Try this:

$ line="\033[1m Hello World \033[0m"
$ echo "$line" | sed 's/\\033\[1m \(.\{1,\}\) \\033\[0m/\1/'
Hello World
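
A possible reason this works at the prompt but not on a real output file: the pattern looks for the literal characters backslash, 0, 3, 3, while text written by "echo -e" contains a single ESC byte instead. A minimal sketch of the difference (the function name strip_literal and both sample strings are made up for illustration):

```shell
#!/usr/bin/env bash
# Pattern from the post above: it matches a literal backslash followed by "033".
strip_literal() {
    sed 's/\\033\[1m \(.\{1,\}\) \\033\[0m/\1/'
}

# Case 1: the text really contains the characters \ 0 3 3 (as typed at the prompt).
literal='\033[1m Hello World \033[0m'
literal_result=$(printf '%s\n' "$literal" | strip_literal)

# Case 2: the text contains real ESC bytes (what "echo -e" would write to a file).
real=$(printf '\033[1m Hello World \033[0m')
real_result=$(printf '%s\n' "$real" | strip_literal)

printf '%s\n' "$literal_result"          # Hello World
printf '%s\n' "$real_result" | cat -v    # ^[[1m Hello World ^[[0m (nothing was removed)
```

So for a file that holds real escape sequences, the regex would have to match the ESC byte itself rather than the text "\033".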

Chubler_XL · August 15, 2012, 6:23pm

I did a search for "Strip VT100 escape sequences" and came up with this perl code (thanks Anonymous Monk):

perl -pe  's/\e\[[\d,\s]*[a-zA-Z]//g; s/\e\][\d];//g; s/\r\n/\n/g; s/[\000-\011]//g; s/[\013-\037]//g'

so..

FILE_Lines[$x]=$(echo -e "$line"|perl -pe  's/\e\[[\d,\s]*[a-zA-Z]//g; s/\e\][\d];//g; s/\r\n/\n/g; s/[\000-\011]//g; s/[\013-\037]//g')

mrm5102 · August 16, 2012, 9:44am

Hey Guys, thank you both for the replies.

I'll give those a try and post back....
I had no idea though that they were called VT100 Escape Sequences... Good to know!

Thanks Again,
Matt

---------- Post updated at 09:44 AM ---------- Previous update was at 09:24 AM ----------

Hey spacebar, I gave that a try and it didn't seem to work...
Probably some character missing or something like that within the REGEX Pattern somewhere...

Hey Chubler_XL thanks again for your suggestion!
Perfect! Thanks that did the trick! That's gotta be one of the longest REGEX's I've ever seen lol...

Many thanks...!

Thanks Again,
Matt

Corona688 · August 16, 2012, 10:50am

Take a closer look; it's five little regexes.

mrm5102 · August 16, 2012, 11:42am

Hey Corona, thanks for the reply...

Yeah funny you say that, I actually just noticed that a little while before I read your post... Thanks!

UPDATE:
So I'm guessing I ran into a little Perl version problem on some of our older servers...
When I ran the REGEX on the machine I'm testing this on, it worked perfectly... But when I transferred it over
to another server running Perl v5.8.2 and another with v5.8.8 it didn't do anything to the "strings" at all.

So what I did was, I tried to simplify the REGEX, because I'm guessing it matched ANY/ALL VT100 Character
Sequences, but I ONLY needed to look out for the ones for BOLD "\033[1m", and the one for OFF "\033[0m"
and possibly the one for UNDERLINE "\033[4m"...

Strangely though, those escape sequences above are how you would enter them into a bash script, like this:
i.e. --> echo -e "\033[1m THIS PRINTS BOLD \033[0m"

Then if I redirect the echo statement into a txt file, they show like this instead of what's above...
i.e. --> "^[[1m THIS PRINTS BOLD ^[[0m"

So here's my new REGEX (also I changed the while loop to just a 'cat' command):

outputFile="/usr/local/test/Output.txt"

### Split the output of the 'cat' command on newlines, and insert it into the array FILE_Lines...
IFS='
'
FILE_Lines=($(cat "$outputFile" | perl -pe 's/\^\[\[(1|0|4)m//g'))

### Print every element of FILE_Lines (indexes 0 .. length-1)...
for (( x=0; x<${#FILE_Lines[@]}; x++ ))
 do
    echo "${FILE_Lines[$x]}"
done

So far this seems to remove ALL the escape sequences from the file, so I guess
I'll use this one until or if/when something else goes wrong lol...

Thanks again you guys for your suggestions... Much appreciated!!

Thanks Again,
Matt

Corona688 · August 16, 2012, 11:59am

You get a useless use of cat award.

All your code amounts to is

perl -pe 's/\^\[\[(1|0|4)m//g' < file

Do you really need to store the entire file into an array? Why not just read it line-by-line?

perl -pe 's/\^\[\[(1|0|4)m//g' < file | while IFS= read -r LINE
do
        echo "do something"
done

mrm5102 · August 16, 2012, 4:23pm

Hey Guys,

EDIT:
READ THIS FIRST:
Not sure what the heck is going on but now the same thing is happening on all the machines I'm trying this on...
So now it is changing the Escape Sequences when viewing inside the text file....
SO YOU CAN IGNORE MY QUESTION BELOW... NOT SURE WHAT HAPPENED..????

I had another question if anyone knows why this is happening?

On the one server I'm testing this on, when I print stuff out to stdout that contains Bolded Text and also
redirect the execution of the script to a file it seems to change the escape sequences like this:
From this --> \033[1m for Bold to this --> ^[[1m...

The Server that, that was run on is:
OS: SLES 11.1
Shell: /bin/bash Version 3.2.51

But then testing this on another server which is:
OS: AIX 6
Shell: /usr/bin/ksh
*But I have Bash Version 3.2.16 installed on this server...

So I guess my question is, is there something that changed between those 2 versions of Bash that would
cause those Escape Sequences to be replaced by different sequences?

I can include an example if I'm not making sense...

UPDATE:
I also just went and tested the exact same code on another server which is:
OS: OpenSUSE 11.4
Shell: Bash Version 4.1.10
And after testing this on this machine as well, it does the same thing of changing the Bold
Escape Sequences from "\033[1m" --to--> "^[[1m"

Thanks in Advance,
Matt

---------- Post updated at 04:23 PM ---------- Previous update was at 02:40 PM ----------

Hey Corona, sorry didn't see your reply.

I forget exactly why I had it being read into an array first... I think it was because like every other way I tried reading the file
it would not preserve whitespace (i.e. empty lines that needed to be there).
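
One way to keep blank lines and leading whitespace while still filling an array is a read loop with IFS= and -r. A rough sketch that should work on Bash 3.2 and later; the sample file contents are invented, and the sed expression is only one assumed way of stripping the bold/underline codes:

```shell
#!/usr/bin/env bash
# Hypothetical sample file standing in for the real script output.
outputFile=$(mktemp)
printf 'first line\n\n  indented line\n\033[1mbold\033[0m line\n' > "$outputFile"

esc=$(printf '\033')   # one real ESC byte
FILE_Lines=()

# IFS= keeps leading/trailing whitespace, -r keeps backslashes,
# and empty lines become empty array elements instead of disappearing.
while IFS= read -r line; do
    FILE_Lines+=("$line")
done < <(sed "s/${esc}\[[0-9;]*m//g" "$outputFile")

rm -f "$outputFile"

# Show each element in brackets so blank lines and indentation are visible.
for line in "${FILE_Lines[@]}"; do
    printf '[%s]\n' "$line"
done
```

The loop reads from a process substitution rather than a pipe, because with "... | while read" the loop runs in a subshell and the array is empty afterwards.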

Thanks Again,
Matt

244an · August 16, 2012, 9:00pm

I think the escape-character (\033) is represented in different ways in different programs. "\033" is one alternative and "^[" is another, but it's still the same character, e.g. "vi" is showing it as "^[".
I'm using bash on FreeBSD, and if I want to take away VT100 escape sequences with sed I use this

sed -E 's/'$(echo -e "\033")'\[[0-9;]+m//g'

The "escape codes" can be combined, so "\033[1;4m" turns on both bold and underline. In the answer from Chubler with perl there are some other escape sequences that are taken away. I don't know anything about them, but one changes Windows line endings (\r\n) to Unix line endings (\n) - I think.

mrm5102 · August 17, 2012, 10:37am

Hey 244an, thanks for the reply!

That's pretty cool... Thanks!
So that part of the REGEX that has $(echo -e "\033"), does this change the Escape Sequence to whatever that local machine uses..?

I came up with this sed command below that seems to work. The problem that I ran into was that some of the servers that will contain this script
when I'm done are pretty old... Most of the older ones are AIX 5 or 6, and after checking out their sed command's man page I realized it doesn't
have the "-r" option, and I just checked, it doesn't have your "-E" option either...
So this is what I came up with:

while read line
 do

        # This matches "^[[" then any single digit followed by an "m"...
        echo "$line" | sed 's/\^\[\[[0-9]m//g'

done < $myFile
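
For sed versions without -r or -E, a basic-regex variant of 244an's command might look like the sketch below. It assumes printf understands octal escapes, uses * instead of + so no extended syntax is needed, and still accepts combined codes like "\033[1;4m". The function name strip_sgr and the sample text are made up:

```shell
#!/usr/bin/env bash
esc=$(printf '\033')   # one real ESC byte, produced without relying on "echo -e"

strip_sgr() {
    # Basic regex only: ESC, "[", any run of digits and semicolons, then "m".
    sed "s/${esc}\[[0-9;]*m//g"
}

sample=$(printf '\033[1;4mbold+underline\033[0m and \033[1mbold\033[0m')
plain=$(printf '%s\n' "$sample" | strip_sgr)
printf '%s\n' "$plain"    # bold+underline and bold
```

Since it filters a whole stream, it can also be run once on the file (strip_sgr < "$myFile") instead of once per line inside the loop.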

Thanks again for the info!

Thanks Again,
Matt

244an · August 17, 2012, 9:12pm

No, it doesn't change to anything. The character with ASCII octal 033 is escape (ESC), the same on all OSes. How this character is shown can differ though. As an example it's shown as ^[ in "vi". And that is one character, as you will notice if you move the cursor in "vi" through e.g. ^[[0m - 4 characters.
I don't understand how you can do this with the regex \^\[\[[0-9]m , the only explanation I have is that someone has done a copy-paste when it was shown as e.g. ^[[1m on the screen. But in that case it's no longer a VT100 escape sequence, and I don't think it will do any underline or bold if you don't replace them. I mean the one character ESC is sometimes shown as ^[ , but if you type the two characters ^[ it doesn't mean ESC.
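
A quick way to check this from the shell, as a rough sketch (the variable names are just for illustration): count the bytes of a real sequence and of the typed look-alike.

```shell
#!/usr/bin/env bash
real=$(printf '\033[0m')   # ESC [ 0 m : a real escape sequence
typed='^[[0m'              # caret, bracket, bracket, 0, m typed as plain text

# tr strips the padding some wc implementations put in front of the number.
real_len=$(printf '%s' "$real" | wc -c | tr -d ' ')
typed_len=$(printf '%s' "$typed" | wc -c | tr -d ' ')

echo "real sequence:    $real_len bytes"     # 4
echo "typed look-alike: $typed_len bytes"    # 5

# od shows the first byte of the real sequence as octal 033.
printf '%s' "$real" | od -c
```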

mrm5102 · August 20, 2012, 10:49am

Hey 244an, thanks for the reply...
Thanks for the insight..!

Sorry I'm not exactly sure what you're asking?
Are you saying that I "shouldn't" be able to remove these character sequences with the REGEX,
because "^[" is NOT literally a "^" or a "[", but "literally" an "ESC"...?

If that's what you're saying, then I'm not sure why it would be working then..? Maybe it's the way I'm
writing the output to the file...

EDIT:
Ok... I see what you're saying now. I just checked the output file in VI and you were right about the "^[", if
I scroll through the file it skips over the "^[" as if it were one character. I see what you mean by it being
weird that it actually works.!?

Maybe it has something to do with the "cat -v" command. The "-v" or aka "--show-nonprinting" will use ^
and M- notation, except for LFD and TAB. I'm assuming this probably is the reason why it's working..?
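
That does look like the explanation. A small sketch, assuming a cat that supports -v (the sample string and variable names are made up):

```shell
#!/usr/bin/env bash
caret_pattern='s/\^\[\[[0-9]m//g'   # the sed expression used in this thread
sample=$(printf '\033[1mBOLD\033[0m text')

# Without cat -v the ESC bytes are still there, so the caret pattern finds nothing.
without_v=$(printf '%s\n' "$sample" | sed "$caret_pattern")

# With cat -v each ESC byte has already been rewritten as the two characters ^[.
with_v=$(printf '%s\n' "$sample" | cat -v | sed "$caret_pattern")

printf '%s\n' "$without_v" | cat -v    # ^[[1mBOLD^[[0m text
printf '%s\n' "$with_v"                # BOLD text
```

One thing to keep in mind: cat -v rewrites every other non-printing byte as well (everything except newline and tab), so any non-ASCII text in the file would come out in M- notation.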

Here is my code for reading the file and removing those sequences:

EXPECT_OUTPUT="/path/to/output/file.txt"

IFS='
'
### If the output file exists, then read in the Output file and split it line-by-line...
if [ -e "$EXPECT_OUTPUT" ]
 then
    TEMP_Lines=( $(cat -v "$EXPECT_OUTPUT" | sed 's/\^\[\[[0-9]m//g') )
fi

Now from the command line if I run "cat myOutputFile.txt" I get:

Then running "cat -v myOutputFile.txt":

So I'm not exactly sure why it's working, but I just know I'm glad it does lol...
But anyway thanks again for your reply...

Thanks Again,
Matt

mrm5102 · August 21, 2012, 11:24am

Hey All,

I just accidentally came across this page while Googling for something totally unrelated (GO FIGURE!!!)...
This link has some pretty cool stuff in terms of removing those color codes from your output...

Remove color codes (special characters) with sed | commandlinefu.com

Hope that may help someone!

Thanks,
Matt