Pattern replacing


I have a text file with lots of text (strings, numbers, special characters, etc.). I am trying to replace any occurrence of these strings:


I want to replace them with:


I am currently using 10 sed commands for the replacement, but I know that's clumsy. There should be a better way. How do I do it with a one-liner?

sed -e 's/[0-9]\{1,\}%/"&"/g' file
perl -pe 's/\d+%/"$&"/g' file

Thanks for the quick reply. I am facing a small issue: it's replacing other numbers too. I want to replace only percentages from 90 to 100.

sed -e 's/\(^\| \)\(9[0-9]%\|100%\)/\1"\2"/g' file
perl -pe 's/\b(?:9[0-9]|100)%/"$&"/g' file

Standard sed uses BREs, not EREs. Basic regular expressions do not include alternation (i.e., BRE|BRE). Furthermore, we don't know what separates a percentage to be quoted from its surroundings. With the following input:

123% 992% 100% 90% 93%92%,100%(90%+10%) 77.99%

a standard sed (with your suggested command) produces the output:

123% 992% 100% 90% 93%92%,100%(90%+10%) 77.99%

(which I do not believe is what is wanted) and your perl script produces:

123% 992% "100%" "90%" "93%""92%","100%"("90%"+10%) 77."99%"

(which I assume is closer to what is wanted).

I believe that what was requested was:

123% 9"92%" "100%" "90%" "93%""92%","100%"("90%"+10%) 77."99%"

but, without confirmation from ctrid that that really is what is wanted, I'm not going to try to produce a different sed or awk script that does what I might think would be a more reasonable interpretation.

Please give us a clear specification of what, if any, characters or strings appearing adjacent to a percentage should keep it from being quoted. (If a period or comma is to be interpreted as part of a percentage, are these characters locale specific?) Should something like 91.50% (in the C Locale) be quoted (since it is in the range 90% to 100%, inclusive)?


At times, overthinking is paralyzing, as just happened to you.
Appreciate you.


Hi Don,

Very good catch. Even I didn't anticipate the decimals.
As you said there could be decimals.







would not be in my input text file. All percentages are delimited by space and no periods or any other characters appear anywhere.

Hence the only danger I see is decimals. Don, thanks once again for pointing this out.
Aia, your one-liner is cool; it works for now. As Don said, I have an issue only if my input file changes to include decimals. How do I modify this perl statement to take care of that?

With a standards-conforming sed utility you could try:

sed -e 's/^100\([.]0*\)\{0,1\}%/"&"/g' \
    -e 's/ \(100\([.]0*\)\{0,1\}%\)/ "\1"/g' \
    -e 's/^9[0-9]\([.][0-9]*\)\{0,1\}%/"&"/g' \
    -e 's/ \(9[0-9]\([.][0-9]*\)\{0,1\}%\)/ "\1"/g' file

On a system using a GNU sed utility, you'd have to change that to:

sed --posix -e 's/^100\([.]0*\)\{0,1\}%/"&"/g' \
    -e 's/ \(100\([.]0*\)\{0,1\}%\)/ "\1"/g' \
    -e 's/^9[0-9]\([.][0-9]*\)\{0,1\}%/"&"/g' \
    -e 's/ \(9[0-9]\([.][0-9]*\)\{0,1\}%\)/ "\1"/g' file

You can flatten these to one-liners by removing the backslashes and <newline>s, but I find them easier to read this way.
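
For readers whose sed accepts the -E option (GNU and BSD sed do; the commands above deliberately do not assume it), the four expressions might be folded into a single one using ERE alternation. This is only a sketch under that assumption, and the function name quote_pct_ere is made up for illustration. Like the commands above, it checks only what precedes the percentage, not what follows it.

```shell
# Sketch: requires a sed that supports -E (extended regular expressions).
# quote_pct_ere is a hypothetical helper name, not something from this thread.
quote_pct_ere() {
    # \1 = start of line or the preceding space, \2 = the percentage itself
    sed -E 's/(^| )((9[0-9]([.][0-9]*)?|100([.]0*)?)%)/\1"\2"/g' "$@"
}

# Try it on two lines of sample input read from standard input.
printf '%s\n' '90% 90% 8.98% 92% 193% 96.96%' '100.00% 100.1%' | quote_pct_ere
```

With no arguments the function reads standard input; given a file name (for example, quote_pct_ere file) it reads that file instead.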

If the file named file contains:

100% 100%
100.00000000% 100.00%
100.0000000000000001% 100.1%
101% 123%
10% 10%
10.0% 10.0%
89.9999999% 89.9%
90% 90%
90.123% 90.987%
99.94% 99.94%
90% 90% 8.98% 92% 193% 96.96%
9.98% 9.98%

it produces the output:

"100%" "100%"
"100.00000000%" "100.00%"
100.0000000000000001% 100.1%
101% 123%
10% 10%
10.0% 10.0%
89.9999999% 89.9%
"90%" "90%"
"90.123%" "90.987%"
"99.94%" "99.94%"
"90%" "90%" 8.98% "92%" 193% "96.96%"
9.98% 9.98%

Try this and see if it does what you want.

perl -pe 's/(?:9\d(?<![0-8]\d)\.\d+|(?<!\S)9\d|(?<!\S)100(\.0+)?)%/"$&"/g' file

That doesn't quote either of the following in my sample input file:

100.00000000% 100.00%

which both seem to meet the stated requirements.


Thanks a ton for the suggestions. It's a privilege to interact with experts like you. Thanks Aia and Don.

This awk should do it.

awk ' { gsub("[0-9]%","&\"",$0); gsub("^[0-9]","\"&",$0); gsub(" [0-9]","\"&",$0); gsub("\" ",FS"\"",$0) }1'

Anything wrong with this?

The request was to quote percentages between 90% and 100% inclusive when the percentage appears at the start of a line or is preceded by a space character. So, with input like:

Get a loan of 1000 dollars at 5% interest.

no change should be made. But your suggestion changes it to:

Get a loan of "1000 dollars at "5% "interest.
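
An awk approach that follows the stated request more closely might test each space-separated field and compare its numeric value against the range. This is a minimal sketch under a few assumptions: fields are separated by blanks, the function name quote_pct_awk is made up for illustration, and the comparison uses awk's floating-point arithmetic, so a value such as 100.0000000000000001% would be treated as 100 and quoted (unlike the sed output shown earlier).

```shell
# Sketch: quote_pct_awk is a hypothetical helper name.
# Assumes fields are separated by blanks; when a field is rewritten, awk
# rejoins that line with single spaces.
quote_pct_awk() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # The whole field must look like a percentage, e.g. 95% or 95.5%
            if ($i ~ /^[0-9]+([.][0-9]+)?%$/) {
                v = $i + 0      # numeric value of the leading digits
                if (v >= 90 && v <= 100)
                    $i = "\"" $i "\""
            }
        }
        print
    }' "$@"
}

# Try it on the loan example plus a line of percentages.
printf '%s\n' 'Get a loan of 1000 dollars at 5% interest.' \
    '90% 8.98% 92% 193% 96.96% 91.50%' | quote_pct_awk
```

Lines in which no field qualifies are printed untouched, so the loan example passes through unchanged.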