Hi,
I have numerous files which have data in the following format:
A|B|123.|Mr.|45.66|33|zz
L|16.|33.45|AC.|45.
I want to remove the decimal point only if it is the last character in a number.
The output should be:
A|B|123|Mr.|45.66|33|zz
L|16|33.45|AC.|45
I tried this
sed -e 's/\.|/|/g'
The problem with the above is that it also removes the '.' from Mr.
Also, I want to remove the '.' from the last field in the second line: 45. should be 45
Basically, any numeric field which has a decimal point but no digits after it should have the decimal point removed.
Note that the suggestions test whether the preceding character is a digit, not whether a field that consists of a number ends with a dot. For example, if one of the fields were A1. then this approach would fail, because its dot would be removed as well.
An alternative would be to split it into fields and test each field if it is numeric and if it ends in a dot. For example:
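One way to sketch that in awk, assuming a field counts as numeric only when it is all digits followed by a single final dot (strip_trailing_dot is just an illustrative name, not something from the thread):

```shell
# strip_trailing_dot: hypothetical helper; reads pipe-delimited lines on stdin
strip_trailing_dot() {
  awk 'BEGIN { FS = OFS = "|" }
  {
    for (i = 1; i <= NF; i++)
      # only touch a field that is nothing but digits plus one trailing dot
      if ($i ~ /^[0-9]+\.$/)
        sub(/\.$/, "", $i)
    print
  }'
}

printf '%s\n' 'A|B|123.|Mr.|45.66|33|zz' 'L|16.|33.45|AC.|45.' | strip_trailing_dot
```

On the sample data this prints A|B|123|Mr.|45.66|33|zz and L|16|33.45|AC.|45, and a field like A1. or Mr. keeps its dot because the whole field is tested, not just the character before the dot.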
Almost works. The colon needs to go in front of the start label, and, due to the |*, it will remove A1.'s dot as well. If you remove the star after the pipe, it no longer catches a field at the start of the line. Right now, I can't see a solution...
Even with ERE, you need to run the substitution twice (perhaps the second time only if the first one hits), because the pattern uses both the leading and the trailing '|' of a field. Otherwise, you miss the next adjacent field on a line like: "123.|456."
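To see why, here is a small sketch, assuming a pattern that needs a pipe on each side of the field (both_pipes is a made-up name). The match for the first field swallows the pipe that the second field would need as its own leading pipe:

```shell
# both_pipes: hypothetical helper; one global pass, pipe required on both sides
both_pipes() {
  sed 's/|\([0-9]\{1,\}\)\.|/|\1|/g'
}

# one pass: the field right after a match is skipped
printf '%s\n' 'x|123.|456.|y' | both_pipes
# a second pass picks up what the first one skipped
printf '%s\n' 'x|123.|456.|y' | both_pipes | both_pipes
```

The first command prints x|123|456.|y and the second prints x|123|456|y. Two passes are enough, since every field skipped in the first pass sits directly after one that was matched.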
You can add pipes to both ends for the substitute and then remove them:
sed '
s/.*/|&|/
s/\(|[0-9]\{1,99\}\)\.|/\1|/g
t again
b end
:again
s/\(|[0-9]\{1,99\}\)\.|/\1|/g
:end
s/^|//
s/|$//
' in_file
I could have removed both added pipes with one substitute "s/^|\(.*\)|$/\1/" but these back references are a bit slower, in my experience, so I avoid them where possible.
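If you would rather not write the substitution out twice, a t loop should do the same job. This is only a sketch of the script above with the two passes folded into a loop; strip_dots_loop is a made-up name, and \{1,\} is used in place of \{1,99\}:

```shell
# strip_dots_loop: hypothetical helper; reads pipe-delimited lines on stdin
strip_dots_loop() {
  sed '
s/.*/|&|/
:again
s/\(|[0-9]\{1,\}\)\.|/\1|/g
t again
s/^|//
s/|$//
'
}

printf '%s\n' 'A|B|123.|Mr.|45.66|33|zz' 'L|16.|33.45|AC.|45.' '123.|456.' | strip_dots_loop
```

This prints A|B|123|Mr.|45.66|33|zz, L|16|33.45|AC.|45 and 123|456. The loop ends once a pass changes nothing. Note that t also fires because of the first s command that adds the pipes, so the substitution always runs at least twice, which is harmless.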
Two passes can be avoided if a less careful pattern will do, like "s/\([0-9]\)\.|/\1|/g", as it encompasses only one pipe. However, it mangles any non-numeric field that ends in a digit and a dot, like "123|The field count of this line is 3.|xyz".
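A quick sketch of that trade-off, using the looser pattern with the dot escaped and the pipe kept in the replacement (loose_strip is just an illustrative name):

```shell
# loose_strip: hypothetical helper; only checks "digit, dot, pipe"
loose_strip() {
  sed 's/\([0-9]\)\.|/\1|/g'
}

printf '%s\n' '12.|34.|x' '123|The field count of this line is 3.|xyz' | loose_strip
```

The first line comes out as 12|34|x in a single pass, but the second becomes 123|The field count of this line is 3|xyz, so the text field loses its full stop. Used on its own, without the extra pipe added at the end of the line, it also leaves a trailing field such as 45. untouched.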