Replace dashes at positions 351-357 & 024-043 with 0 & replace "  " if it exists with 04 at positions 381-382

lancesunny · October 1, 2012, 9:42am

I need to replace dashes (i.e. -), if present, in positions 351-357 with zero (i.e. 0). I also need to replace any dash (i.e. -) present in positions 024-043 with zero (i.e. 0), and to replace "  " (i.e. 2 space characters), if present at positions 381-382, with "04". The total length of a record is 413.
Here is the example of the one record
B123456Aero 12345678901234-1234 abcd Abcdef 1234 Abc Rd Ste 1 Abc Treak WI12345 0000123 0-1234567890 000000000
B223456Bero 09876543214321-2345 abcd Abcdef 1234 Abc Rd Ste 1 Abc Treak WI12345 0000123 002345678901 02 000000000
B323456Cero 23456789012345-3236 abcd Abcdef 1234 Abc Rd Ste 1 Abc Treak WI12345 0000123 00-456789012 000000000
B423456Dero 32345678901234-4237 abcd Abcdef 1234 Abc Rd Ste 1 Abc Treak WI12345 0000123 004567890123 01 000000000

any help on this will be appreciated.
Thanks in advance.

jim_mcnamara · October 1, 2012, 10:06am

How long is the longest record?
What OS and shell are you using? (some tools are limited as to the max size of a record)

lancesunny · October 1, 2012, 10:10am

The OS is Red Hat Linux and every record has a length of 421. We are using the bash shell.
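
Since the thread quotes both 413 and 421 as the record length, a quick count of how many records have each length can settle the question. This is a minimal sketch, assuming records are newline-terminated lines; the function name `length_histogram` is made up for illustration and does not come from the thread.

```shell
#!/bin/sh
# length_histogram (hypothetical helper): read records on stdin and print
# "<length> <count>" for each distinct record length, shortest first.
length_histogram() {
    awk '{ count[length($0)]++ }
         END { for (len in count) print len, count[len] }' | sort -n
}

# Example call (input_file is a placeholder name):
#   length_histogram < input_file
```

If the file came from a DOS/Windows system, a trailing carriage return on each line would be counted as one extra character.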

msabhi · October 1, 2012, 10:29am

awk -F "" '{for(i=1;i<=NF;i++){if((i>=351 && i<=357) || (i>=24 && i <=43)){gsub("-","0",$i);}} if($381==" " && $382==" "){$381="0";$382="4";}}1' OFS="" input_file

lancesunny · October 1, 2012, 10:40am

Hey msabhi,
awk '{if((NR>=351 && NR<=357) || (NR>=24 && NR <=43)){gsub("-","0");} else if(NR>=381 && NR<=382){gsub(" ","04");}}1' input_file
is not working.
We need to replace the character - (i.e. dash) with 0 wherever it occurs between positions 351 and 357, and likewise between positions 24 and 43. We also need to replace two blank characters "  " with 04 if they occur at positions 381 to 382 of the same record. Each record's length is 413.
I have attached the input file to this thread. If we open it in Notepad++ or any other editor, we can see dashes at positions 351 to 357 and 24 to 43 in some or all records, and "  " at positions 381 to 382 in some records.
Thanks for your help in advance.

pamu · October 1, 2012, 10:52am

Hi msabhi,

I think lancesunny is talking about position of the string..

so NR can be used like this...

i know this is not the efficient way..
try..

while IFS= read -r line
do
[ "${line:380:2}" = "  " ] && line="${line:0:380}04${line:382}"
printf '%s\n' "$line" | sed -e 's/.\{1\}/&\n/g' | awk '{if((NR>=351 && NR<=357) || (NR>=24 && NR <=43)){gsub("-","0");}}1' | paste -sd ""
done<file

msabhi · October 1, 2012, 11:01am

I got you guys pamu and lancesunny
my vision :wall:

Lancesunny : I have updated new untested code in my previous post...Can you please check once..

RudiC · October 1, 2012, 11:10am

Is that ONE record that you present in four lines in post #1? Line 1 and 3 have 110 chars, line 2 and 4 have 113 char, totalling to 446 char, not 413 nor 421. Are the lines <lf> terminated/separated, and do you count the <lf> chars (=449 in total)? There's no dashes in pos 351 to 357 (which would be line 4 pos 18 - 24), even if you'd count the <lf>s. One dash in line 1 pos 27 could be replaced, but that's all. There's no two blanks in your input example that could be replaced either.
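
One way to check the points raised above is to print, for each record, its length and the text that actually sits in the three position ranges. This is a minimal sketch, assuming one record per line; `show_ranges` is a hypothetical name chosen for this example.

```shell
#!/bin/sh
# show_ranges (hypothetical helper): for every record on stdin, print the
# record number, its length, and the text at positions 24-43, 351-357 and
# 381-382. Brackets make blanks and empty ranges visible.
show_ranges() {
    awk '{
        a = substr($0, 24, 20)    # positions 24-43
        b = substr($0, 351, 7)    # positions 351-357
        c = substr($0, 381, 2)    # positions 381-382
        printf "%d len=%d [%s] [%s] [%s]\n", NR, length($0), a, b, c
    }'
}

# Example call (input_file is a placeholder name):
#   show_ranges < input_file
```

A record shorter than a range simply shows empty brackets for it, which makes a wrong length assumption easy to spot.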

Pls. provide a meaningful problem and an adequate input sample as well as desired output.

msabhi · October 1, 2012, 11:14am

The latest tested code :

awk -F "" '{for(i=1;i<=NF;i++){if((i>=351 && i<=357) || (i>=24 && i <=43)){gsub("-","0",$i);}}
if($381==" " && $382==" "){$381="0";$382="4";}}1' OFS="" input_file

:o

lancesunny · October 1, 2012, 11:14am

Thanks pamu and msabhi for your help on this.
But after executing the recommended command, the output file still does not show the expected results.
1) For example, the values at positions 24 to 43 were 01234-1234, 14321-2345, 12345-3236 & 01234-4237 for records 1, 2, 3 and 4. After the command these values remain the same.
2) The values at positions 351 to 357 were 0-1234567890, 002345678901, 00-456789012 & 004567890123 for records 1, 2, 3 and 4, and they never got modified.
3) The values at positions 381-382 were "  ", 02, "  " and 01 for records 1, 2, 3 and 4. Please ignore the quotes; I used them only to show the 2 spaces. The 2 spaces in those two records never got changed by the modified command.

I have attached both input and output file for the reference. Please see the attached.
Thanks a lot for helping me out.

RudiC · October 1, 2012, 11:34am

Try this:

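awk     'BEGIN{OFS=FS=""}   # empty FS: gawk splits the record into one field per character
         {
          if ($381$382=="  ") {$381="0";$382="4"}        # two blanks at 381-382 become 04
          for (i= 24; i<= 43; i++) if ($i=="-") $i="0"   # dashes in 24-43 become 0
          for (i=351; i<=357; i++) if ($i=="-") $i="0"   # dashes in 351-357 become 0
         }1
        ' InputFile.dat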
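
The same three edits can also be sketched with substr(), which does not depend on an empty FS splitting the record into single characters (POSIX leaves that behavior unspecified, so not every awk does it). This is only an illustrative variant under the thread's stated positions; the names `fix_record_fields` and `zero_dashes` are made up for the example.

```shell
#!/bin/sh
# fix_record_fields (hypothetical helper): apply the three fixed-position
# edits from this thread to every record read from stdin.
fix_record_fields() {
    awk '
    # Return s with every "-" inside positions first..last replaced by "0".
    function zero_dashes(s, first, last,    mid) {
        mid = substr(s, first, last - first + 1)
        gsub(/-/, "0", mid)
        return substr(s, 1, first - 1) mid substr(s, last + 1)
    }
    {
        rec = zero_dashes($0, 24, 43)
        rec = zero_dashes(rec, 351, 357)
        # Two blanks at positions 381-382 become "04".
        if (substr(rec, 381, 2) == "  ")
            rec = substr(rec, 1, 380) "04" substr(rec, 383)
        print rec
    }'
}

# Example call (file names are placeholders):
#   fix_record_fields < InputFile.dat > Output.dat
```

Records shorter than a given range pass through unchanged for that range, and dashes outside the listed positions are left alone.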

---------- Post updated at 05:34 PM ---------- Previous update was at 05:28 PM ----------

Input.dat and Output.dat are identical.

$ diff InputFile.dat Output.dat 
$

lancesunny · October 1, 2012, 11:36am

Thank you all for your help and your time on this.
Thanks a lot RudiC and msabhi.
The code given by RudiC worked like a charm, as expected.
Thank you so much for your help on this, guys. You made my day.
-Lancesunny