Search and replace specific positions of specific lines

dsid · March 29, 2017, 10:56am

Hi,

I have a file with hundreds of lines. I want to find the lines starting with 4000 and, on those lines, replace the characters at positions 137-139 (which will be '000') with '036'. Can all of this be done without writing to a temp file and then moving that temp file to the original file name?

sample input

400300000038655300580061482900007000000287546004100457338000010000000850510006001389990000100000008391600060013799700016000000081649009300137011000020000000773100012000
4004001259770000300000006090000170010599300018000000058339010400095541000080000000463550046000759680001000000002942500580005                                            
4000422985400050462239065593606500000007422985707771046154054910075641MC0318AMWAY OF AUSTRALIA       CASTLE HILL  AU 5599   0000000097950007077Y10001 6022100000     0 0
40010000000000 00000000000000000000                                                                                                                                     
4002467076359965203 0042298585118131 2422240200000000N4597104019M462239          7                                                               0000000000000000000    
4000422985400050406669990541353200000007422985707371044941247610075641MC0314ANGLICARE WA             EAST PERTH   AU 5969   0000000002100007073N10001 6051800000     CR0

$ uname -a
Linux lxserv01 2.6.18-417.el5 #1 SMP Sat Nov 19 14:54:59 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

Scrutinizer · March 29, 2017, 11:30am

Hi, try:

sed '/^4000/s/\(.\{136\}\).../\1036/' file

or

sed -r '/^4000/s/(.{136}).../\1036/' file

You can use the -i (in-place) option with sed on your system, but it is best to also give it a backup suffix, for example -ibak, unless you already have a backup...
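To illustrate the in-place edit with a backup, here is a minimal sketch that runs on a scratch file rather than real data. The file name demo.txt, the 'x' filler and the END tail are made up for the demonstration; only the sed expression comes from the post above.

```shell
# Work on a throwaway file so no real data is touched.
tmpdir=$(mktemp -d)
f="$tmpdir/demo.txt"

# Hypothetical records: 4-character record type, 132 filler characters,
# then positions 137-139, then a short tail.
filler=$(printf '%0132d' 0 | tr '0' 'x')   # 132 'x' characters
printf '4000%s000END\n' "$filler" >  "$f"
printf '4001%s000END\n' "$filler" >> "$f"

# Edit in place; the original is kept as demo.txt.bak.
sed -i.bak '/^4000/s/\(.\{136\}\).../\1036/' "$f"

# Show positions 137-139 after and before the edit.
cut -c137-139 "$f"        # 036 for the 4000 line, 000 for the 4001 line
cut -c137-139 "$f.bak"    # 000 on both lines
```

Only the line starting with 4000 changes; the backup copy keeps the original content.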

dsid · March 29, 2017, 11:41am

If I understand correctly, you are selecting the lines that start with 4000, skipping over any characters up to the 136th position, and replacing the next three characters (the 3 dots, ...) with 036; \1 would be the 1st instance of the search yielding a positive result.

Is the above explanation correct? Also, why did you specify the -r (extended regular expressions) flag?

And yes, I will be using the -i.bak option

RudiC · March 29, 2017, 12:34pm

If you compare the two sed scripts, the one with the -r option is much simpler (fewer escapes needed).
The \1 ("back reference") stands for whatever the first parenthesized part of the regex matched, so that part is put back unchanged.
Scrutinizer's proposal does not check that the three to-be-replaced characters really are "000" as specified in post #1.
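A small sketch of the same idea on a short made-up string (abcdefgh), with a group of 2 characters instead of 136, shows the back reference at work and that the two notations are equivalent. It assumes a sed that accepts -r, as GNU sed does.

```shell
# Basic regex (no option): grouping parentheses and interval braces are escaped.
bre=$(echo 'abcdefgh' | sed 's/\(.\{2\}\).../\1XYZ/')

# Extended regex (-r): the same pattern without the backslashes.
ere=$(echo 'abcdefgh' | sed -r 's/(.{2}).../\1XYZ/')

# \1 puts back the two characters matched by the group ("ab");
# the three dots match "cde", which is replaced by XYZ.
echo "$bre"   # abXYZfgh
echo "$ere"   # abXYZfgh
```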

Scrutinizer · March 29, 2017, 1:13pm

That is correct, a dot in regex means "any character".

--

Hi RudiC, IMO that was not a specification, but rather a remark. It says: "which will be '000'", therefore I used '...' instead.

dsid · March 29, 2017, 1:49pm

position 137-139 will definitely be 000 and that needs to be replaced by 036.

thank you both for the explanations

Actually, I just thought about it: RudiC is correct, (...) would mean any characters. Can I replace the 3 dots with 000, since I am sure positions 137-139 will be zeros?

Scrutinizer · March 29, 2017, 2:33pm

You can use 000 instead of ... , which means that the replacement with 036 only happens if positions 137-139 are zeroes on a line that starts with 4000. Any other value will not be replaced and is left as is.

If you use ... then any value there will be replaced.

Also, if positions 137-139 are always zeroes on lines that start with 4000 (the way I interpreted your specification), then it does not matter which one you use.

--- edit ---
As is apparent in another thread, if 000 is used instead of ... then the pattern needs to be anchored to the beginning of the line with a caret ( ^ ); otherwise sed will accept any 000 that has 136 arbitrary characters before it, which can match further along the line than positions 137-139.
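The effect of the missing anchor can be reproduced with a made-up record in which positions 137-139 hold 123 and a 000 appears a little later in the line. The filler and the tail are hypothetical; this is only a sketch of the behaviour described above.

```shell
# Hypothetical record: '4000', 132 filler characters, then "123" at
# positions 137-139, a 'Y' at 140 and "000" at positions 141-143.
filler=$(printf '%0132d' 0 | tr '0' 'x')
line="4000${filler}123Y000END"

# Without ^ the 136 characters may start anywhere, so the later 000 is hit.
unanchored=$(printf '%s\n' "$line" | sed '/^4000/s/\(.\{136\}\)000/\1036/' | cut -c137-143)

# With ^ the 136 characters are counted from the start of the line,
# so 000 must sit at positions 137-139; here it does not, so nothing changes.
anchored=$(printf '%s\n' "$line" | sed '/^4000/s/^\(.\{136\}\)000/\1036/' | cut -c137-143)

echo "unanchored: $unanchored"   # 123Y036
echo "anchored:   $anchored"     # 123Y000
```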

dsid · March 29, 2017, 2:38pm

this is exactly what i want

I am pretty sure there will never be any characters other than 000, but it's good to learn how I can use the 3 dots while searching for a pattern.
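If there is any doubt about that assumption, it could be checked before editing. The sketch below uses awk, which was not part of the answers above, on a made-up scratch file; it lists every line starting with 4000 whose positions 137-139 are not 000.

```shell
# Scratch file with hypothetical records (filler and tail are made up).
tmpfile=$(mktemp)
filler=$(printf '%0132d' 0 | tr '0' 'x')
printf '4000%s000END\n' "$filler" >  "$tmpfile"
printf '4000%s123END\n' "$filler" >> "$tmpfile"
printf '4001%s999END\n' "$filler" >> "$tmpfile"

# Print the line number and the offending characters for each 4000 record
# where positions 137-139 are something other than 000.
report=$(awk '/^4000/ && substr($0, 137, 3) != "000" { print NR ": " substr($0, 137, 3) }' "$tmpfile")
echo "$report"   # 2: 123

rm -f "$tmpfile"
```

An empty report would mean that ... and 000 give the same result on that file.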