A portion of my input is as follows:
1087 IKON01,49 A WA- -1 . -1 . 0 W WA- -1 . -1 . 0 . -1 . -1 -1 -1 -1 -1 -1 W
1088 IKON01,49 A J.@QU80MW. 2 !J.@! 0 . 0 QWM[ QUM 7 [W. 0 . 0 . 0 . 0 11 3 3 2 -1 JQMW
1089 IKON01,49 A K.@L& -1 . -1 . -6 KL/ K.@L -1 . 1 / 0 . 0 . -1 -1 -1 1 2 1 KL
I would like the following desired output:
1087 IKON01,49 A WA- -1 . -1 . 0 W WA- -1 . -1 . 0 . -1 . -1 -1 -1 -1 -1 -1 W
1088 IKON01,49 A J.@QU80MW. 2 !J.@! 0 . 0 QM[ QUM 7 [W. 0 . 0 . 0 . 0 11 3 3 2 -1 JQMW
1089 IKON01,49 A K.@L& -1 . -1 . -6 KL/ K.@L -1 . 1 / 0 . 0 . -1 -1 -1 1 2 1 KL
In essence, I would like to delete the W in field $9 on every line where $9 meets the following regex condition, while preserving the line's original, pre-substitution formatting:
if($9 ~/^.W[^H]=*\[$/)
I realize that preserving formatting has been dealt with in previous posts, and I know how to use printf and/or FIELDWIDTHS (with gawk). However, since my file is 61 fields long (NF==61; I've only presented a portion here), that approach is tremendously cumbersome and messy. In addition, I do not know every field width and would like to avoid working them out just to reformat the file.
I've had a similar issue in the past, and RudiC helped me via a very nifty trick taking advantage of NF being recomputed when there is an assignment to $0. Thus, I attempted the following:
gawk '$9 ~/^.W[^H]=*\[$/{X=$9; sub(/W/,"",X); sub ($9, X, $0)}' file.txt
This time, however, it seems as though the field substitution is aborted because of the non-escaped meta-character "[" in my data. It fails with the following error, triggered by field 9 of a different line:
fatal: Invalid regular expression: /BW>[/
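The error above arises because sub() treats the text of $9 as a dynamic regular expression, so the unescaped "[" is parsed as the start of a bracket expression. One way around this, sketched below, is to locate field 9 by position and splice the new text in with substr(), so that no data is ever interpreted as a regex. This is only a minimal sketch, assuming default whitespace field splitting; the function name strip_w_field9 is hypothetical, and plain awk is used here (it should behave the same under gawk).

```shell
# Hypothetical helper: remove the first W from field 9 on matching lines,
# leaving the spacing in front of field 9 exactly as it was.
strip_w_field9() {
    awk '
    $9 ~ /^.W[^H]=*\[$/ {
        old = $9
        new = old
        sub(/W/, "", new)          # the regex is a constant here, not data

        # Walk past the first 8 fields to find where field 9 starts.
        rest = $0
        pos = 0                    # number of characters before field 9
        for (i = 1; i < 9; i++) {
            match(rest, /^[ \t]*[^ \t]+/)
            pos += RLENGTH
            rest = substr(rest, RLENGTH + 1)
        }
        if (match(rest, /^[ \t]+/))
            pos += RLENGTH

        # Splice literally; no part of the data is used as a regex.
        $0 = substr($0, 1, pos) new substr($0, pos + length(old) + 1)
    }
    { print }
    ' "$@"
}
```

It could be called as strip_w_field9 file.txt. Note that everything after field 9 moves one column to the left because the field becomes one character shorter; if later columns must stay aligned, appending a single space to new would compensate.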
In light of this, I've also attempted:
gawk '{if($9 ~/^.W[^H]=*\[$/); sub(/W/,"",$9); print}' file.txt
Not only does this ruin the formatting of the file, but it also matches lines I wouldn't expect it to, such as:
1 IKON01,01 A W:- -1 . -1 . 0 W W:- -1 . -1 . 0 . -1 . -1 -1 -1 -1 -1 -1 W
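A possible explanation for both symptoms, offered as a sketch rather than a definitive diagnosis: the ";" directly after if (...) is an empty statement, so the condition guards nothing and sub() runs on every record (in the line above, $9 is the lone "W", which gets deleted). Separately, any assignment to a field makes awk rebuild $0 using OFS, a single space by default, which collapses the original spacing. The two hypothetical helpers below differ only in that semicolon.

```shell
# With the stray ";" the if-statement has an empty body,
# so sub() is applied to field 9 of every record.
with_semicolon() {
    awk '{ if ($9 ~ /^.W[^H]=*\[$/); sub(/W/, "", $9); print }' "$@"
}

# Without it, sub() only runs when field 9 matches the condition.
# Matching records still get their spacing rebuilt with OFS.
without_semicolon() {
    awk '{ if ($9 ~ /^.W[^H]=*\[$/) sub(/W/, "", $9); print }' "$@"
}
```

Feeding both helpers a line whose ninth field is just "W" shows the difference: the first one empties the field and squeezes the spacing, while the second prints the line unchanged.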
Thank you so much in advance for helping me through this quagmire.
---------- Post updated at 11:00 AM ---------- Previous update was at 10:49 AM ----------
This is probably fairly obvious, but I should say that in my data examples from my post, the numbers on the far left are line numbers and not $1.
Thank you.