Pattern to replace ^M and ^Y in a 4.2 AIX text file

Browser_ice · May 16, 2009, 3:14pm

I have files on my AIX 4.2 client system where I need to do the replacements below, but I have no clue how. They involve control characters (linefeed, carriage return, ...).

First, replace "^M^Y^M" with ^char_for_end_of_line
Then replace "^M" with " "
Trim all leading spaces

In VI, my files contents look like this :

aaaa zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzz^M
^M
^Y^M
aaaa zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzz^M
^M
^Y^M
aaaa zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzzzz^M
zzzzzzzzzzzzzzzzzzzz^M
^M
^Y^M
...

I want it to be:
aaaa zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz
aaaa zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz
aaaa zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz
aaaa zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz
aaaa zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz zzzzzzzzzzzzzzzzzzzzzz
...

Nb of records is unknown.
'zzzzzz' can have any combinations of "(", ")", "'", """, ",", "[", "]", ".", ";" (in other words anything with printable characters)

TonyFullerMalv · May 16, 2009, 4:16pm

What you could really do with is dos2unix(1) that comes with Solaris and Linux but not with AIX IIRC, so instead you can use sed:

dos2unix:

sed -i 's/\r//' file

unix2dos:

sed -i 's/$/\r/' file

from: UNIX BASH scripting: Linux flip command - alternative of dos2unix,unix2dos

or

dos2unix:

$ sed 's/^M$//'  input.txt > output.txt

unix2dos:

$ sed 's/$'"/`echo \\\r`/"   input.txt > output.txt

from: Howto: UNIX or Linux convert DOS newlines CR-LF to Unix/Linux format

The dos2unix examples will get rid of the carriage returns for you. I will leave it to a scripting guru to work out the removal of the particular unwanted line feeds.

bakunin · May 17, 2009, 5:44am

In your example it looks like you have groups of 3 lines of text followed by 2 lines. You want to combine the three lines of text into a single line and remove the two separating lines completely.

If this is the case:

sed -n 'N;N;s/[^M^Y]//g;s/\n//gp;N;N' /some/file

This will first read two additional lines (to the first read line) from the file and combine these into the pattern space. The first replacement then throws out the control characters (^M and ^Y, enter them via <CTRL-V> in vi), the second replacement removes the newline characters combining the lines to one line and prints it. Then two additional lines (the separator lines) are read and discarded, since they are not printed at all, then repeat from start.
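The fixed-size approach described above can be tried on generated test data. This is only a sketch under stated assumptions: the function name join_fixed_records and the sample records are made up for illustration, the control characters are produced with printf instead of being typed with CTRL-V, and the newlines are replaced with a space (not removed outright) so that the result matches the requested output. It only works when every record is exactly 3 text lines followed by 2 separator lines.

```shell
#!/bin/sh
# Build the two control characters without typing them literally.
cr=$(printf '\r')     # ^M, carriage return, octal 015
em=$(printf '\031')   # ^Y, octal 031

# Hypothetical helper: joins groups of 3 text lines, skips 2 separator lines.
join_fixed_records() {
    # N;N        append the next two lines to the pattern space
    # s/[..]//g  delete every ^M and ^Y
    # s/\n/ /gp  turn embedded newlines into spaces and print the result
    # N;N        swallow the two separator lines (never printed because of -n)
    sed -n "N;N;s/[$cr$em]//g;s/\n/ /gp;N;N"
}

# Made-up sample data: two records in the layout shown in the question.
printf 'a1\r\na2\r\na3\r\n\r\n\031\r\nb1\r\nb2\r\nb3\r\n\r\n\031\r\n' |
    join_fixed_records
```

With the sample data this should print one line per record ("a1 a2 a3" and "b1 b2 b3").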

I hope this helps.

bakunin

Browser_ice · May 19, 2009, 2:12pm

What if the number of lines per record in the original file is unknown?

In my example I gave 3 lines, but it can be anything between 1 and 20 lines. The file contains any number of multi-line records. Each record is totally independent of the previous one. One record could have 2 lines, the next 20, the next 5, ... There is no regular pattern in the number of lines. The file contains a list of system-generated alarms coming from 20 different servers, numerous workstations, ...

Sorry I forgot to mention it.

Browser_ice · May 21, 2009, 12:47pm

I tried the combinations below, which either do not change anything or are not recognized:

\n
\^m
\^Y
Ctrl-V + Ctrl-M
Ctrl-V + Ctrl-Y => nothing is typed in the console, I have to do a Ctrl-C to get out
\x0D$
\xC1$
[^M^Y]
[^M]
[^Y]
\c[m => not recognized

sed 's/.$//' does remove the ^M at the end of each line, but the result is still in multi-line format. It's like removing the last character of each line but keeping the end-of-line linefeed.

[added comments]
Is there a way to find out in VI the ASCII value of the character under the cursor?
It would help me identify the right value to use in a replacement string.

[added comments]
I found out that ^M is actually \015. So I can remove it with tr -d '\015'
But I still haven't found out what ^Y is.

bakunin · May 21, 2009, 8:31pm

In this case you will need some indication that a "record" is complete. Maybe you will also need some criterion for the start of a record, which one could match. Provide some data and I will provide some solution.

CTRL-V followed by CTRL-M is just a way to enter non-printing (control) characters into vi: enter input mode, press "CTRL-V", then press CTRL-M (for example for "^M"). You should still be in input mode and see "^M" under the cursor.

sed 's/.$//' removes the last character in a line, regardless of which character it is - this is the problem. You have to specifically match "^M" (CTRL-M) and throw that out. You can throw out linefeeds by searching for "\n". Try the following with some test file:

sed 'N;s/\n/@/' /some/file

to see the effect: two lines are combined into one and the linefeed is replaced by an at sign (@).
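The same command can be tried without a test file by piping generated lines into it. A minimal sketch, where the function name pair_lines and the four sample lines are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper wrapping the command from the post:
# N appends the next input line to the pattern space (separated by a newline),
# then the substitution replaces that embedded newline with "@".
pair_lines() {
    sed 'N;s/\n/@/'
}

printf 'one\ntwo\nthree\nfour\n' | pair_lines
# one@two
# three@four
```

Only one newline per pair is replaced, so the output still has one line per two input lines. With an odd number of input lines, sed implementations differ on whether the last unpaired line is printed.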

"Is there a way to find out in VI what is the ascii value of the character under the cursor ?"

No, but you can use "od -ax <file> | more".
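As a sketch of how od helps here: the -c option prints printable characters as themselves, common control characters as backslash escapes, and everything else as three-digit octal numbers, which are the values tr accepts. The sample bytes below are made up to mimic the end of a record; the function name show_bytes is hypothetical. In the caret notation vi uses, ^M is Ctrl-M (octal 015) and ^Y is Ctrl-Y (octal 031, decimal 25).

```shell
#!/bin/sh
# Hypothetical helper: dump stdin byte by byte in character/octal form.
show_bytes() {
    od -c
}

# Made-up sample: "zz", ^M, linefeed, ^Y, ^M, linefeed
printf 'zz\r\n\031\r\n' | show_bytes
# The dump should show \r for ^M, \n for the linefeed and 031 for ^Y.
```

On a real file, something like "od -c file | more" gives the same view; the octal numbers can then be used directly, for example in tr -d '\015'.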

I hope this helps.

bakunin

ravager · May 22, 2009, 6:39am

You may also use vi

vi file

:%s/^M//

THIS STRING WILL NOT WORK IF TYPED JUST LIKE THIS, AS THE ^M HAS TO BE ENTERED AS A CONTROL CHARACTER

SO THE KEYSTROKES TO GET THIS SAME STRING ARE
:%s/(ctrl+v)(ctrl+M)//

I hope this helps you

Browser_ice · May 22, 2009, 11:10am

The end of the record is marked with the ^Y value, which I still haven't figured out how to replace. Entering Ctrl-V + Ctrl-Y prints nothing on the console. I have to press Ctrl-C to get out, or enter another Ctrl character that actually prints something on the command prompt (like Ctrl-M).

sed 'N;s/\n/@/' /some/file

I tried something similar, but removing the \n instead. When I tried opening the result with VI, it said the line was too long. I realized afterwards: of course, if I remove every linefeed, the whole file becomes one single record! So your example should produce the same problem.
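One way to avoid the single-giant-line problem is to turn the ^Y record marker into the new line ending instead of discarding it. The following is a sketch, not a tested AIX 4.2 solution: it assumes ^Y is octal 031 (Ctrl-Y in caret notation), that ^Y only ever appears as the record separator, and that the local tr accepts octal escapes (tr -d '\015' was reported to work above). The function name join_records and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: one output line per ^Y-terminated record.
join_records() {
    tr -d '\015' |              # 1. drop every ^M (carriage return)
    tr '\012\031' ' \012' |     # 2. linefeed -> space, ^Y -> linefeed
    sed 's/^ *//;s/ *$//;/^$/d' # 3. trim spaces, drop empty leftovers
}

# Made-up sample: a 3-line record followed by a 1-line record.
printf 'aaaa z1\r\nz2\r\nz3\r\n\r\n\031\r\naaaa y1\r\n\r\n\031\r\n' |
    join_records
```

With the sample data this should print "aaaa z1 z2 z3" and "aaaa y1", each on its own line, regardless of how many lines each record had. For a real file the call would look like "join_records < infile > outfile" (file names hypothetical). Because every record ends up on its own line, the lines stay short enough for vi.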