Editing lines in a text file: adding a number to a value in the file

say170 · January 19, 2012, 5:49pm

I have a text file that has data like:

Data    "12345#22"
Fred
ID 12345
Age 45
Wilma
Dino
Data  "123#22"
Tarzan
ID 123
Age 33
Jane

I need to figure out a way of adding 1,000,000 to the specific lines (always same format) in the file, so it becomes:

Data    "1012345#22"
Fred
ID 1012345
Age 45
Wilma
Dino
Data  "1000123#22"
Tarzan
ID 1000123
Age 33
Jane

Of course this data is completely made up, but the principle is the same. One number is "VVVV#22" or "VVVVV#22" (i.e., the number always comes after a double quote and before a # on a line starting with Data), and the other number is always the second word on an ID line. (Both numbers are the same in each stanza.)

I can prepend the number, but can't figure out how to treat it as a number and add a value to it and replace...

Thanks in advance for any help...

rmohanty · January 19, 2012, 8:02pm

Hi,

You can use the script below.

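file=inputfile   # the file is read and edited in place
y=1              # current line number, used as the sed address
while read line
do
fld1=`echo $line |cut -d' ' -f1|tr -d ' '`
if [ "$fld1" = 'Data' ]; then
   # number between the opening quote and the #
   fld2=`echo $line|cut -d'"' -f2|cut -d'#' -f1|tr -d ' '`
   fld3=`expr $fld2 + 1000000`
   # sed -i is not POSIX (GNU sed accepts it as written)
   sed -i "$y s/$fld2/$fld3/" "$file"
elif [ "$fld1" = 'ID' ]; then
   # number is the second word on the line
   fld2=`echo $line|cut -d' ' -f2|tr -d ' '`
   fld3=`expr $fld2 + 1000000`
   sed -i "$y s/$fld2/$fld3/" "$file"
fi
y=`expr $y + 1`
done < "$file"
exit 0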

Input file :
 
Data "12345#22"
Fred
ID 12345
Age 45
Wilma
Dino
Data "123#22"
Tarzan
ID 123
Age 33
Jane

Output file : 
 
Data "1012345#22" 
Fred 
ID 1012345 
Age 45 
Wilma 
Dino 
Data "1000123#22" 
Tarzan 
ID 1000123 
Age 33 
Jane

If the final number has to be < 2000000 then you will need to put a check before using 'sed' so that the original value is not updated.
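A minimal sketch of such a check, assuming the limit really is 2000000. The helper name bump and the variable limit are made up for illustration and are not part of the script above:

```shell
# Hypothetical limit, taken from the "< 2000000" remark above
limit=2000000

# Print the value plus 1000000, or the original value if the sum reaches the limit
bump() {
    new=`expr "$1" + 1000000`
    if [ "$new" -lt "$limit" ]; then
        echo "$new"      # safe to substitute
    else
        echo "$1"        # leave the original value alone
    fi
}

small=`bump 12345`
large=`bump 1500000`
echo "$small $large"
```

In the loop, the sed call would then only run when the printed value differs from the original one.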

Regards,
RM

rdcwayx · January 20, 2012, 1:37am

awk  '/^Data/{split($2,a,"[#|\"]");$2="\"" a[2]+1000000 "#" a[3] "\""} 
      /^ID/{$2+=1000000}1' infile

say170 · January 20, 2012, 5:44am

Thanks. The AWK line works, and I can run it as a one-liner. The only downside is that it destroys the 'look' of the original file. It was:

Data        "1#15"
  CreateTime           ""
  Id                   1

and is now

Data "1001#15"
{
  CreateTime           ""
Id 1001

I had to add an extra space, i.e. ' Id', as I had Id, xxxId and yyyId in the file and it was changing all 3.

awk '/^Data/{split($2,a,"[#|\"]");$2="\"" a[2]+1000 "#" a[3] "\""}
/ Id/{$2+=1000}1' infile

As SED would just replace, I don't think it would destroy the formatting.

I can do it by making two changes. The first one is easy...add spaces before the \ :

....{split($2,a,"[#|\"]");$2="       \""....

I can change the second one by piping through SED

.... | sed -e 's/^Id /  Id                   /g'

but this seems an ugly way of doing it
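For what it's worth, the spacing is lost because assigning to a field makes awk rebuild the whole line from its fields, joined by a single space (the default OFS). A small sketch of the difference on a made-up line in the same layout; the variable names are just for illustration. Using sub() on the line itself leaves the spacing alone:

```shell
line='  Id                   1'

# Assigning to $2 makes awk rebuild $0 from its fields, joined by OFS (one space)
rebuilt=$(printf '%s\n' "$line" | awk '{ $2 += 1000; print }')

# sub() edits $0 in place, so the original spacing survives
kept=$(printf '%s\n' "$line" | awk '{ sub(/[0-9]+$/, $2 + 1000); print }')

printf '[%s]\n[%s]\n' "$rebuilt" "$kept"
```

The first result has lost its indentation and column alignment; the second keeps both.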

Scrutinizer · January 20, 2012, 7:31am

You can preserve formatting like so:

awk -F '[ \t"#]*' '/ID|Data/{sub($2,$2+1000000)}1' infile

say170 · January 20, 2012, 9:23am

That doesn't add to the ID field for me, it does to the data line though. The ID line doesn't have quotes around the value - I think that's why it doesn't work.

This is my data file:

Data        "1#15"
{
  CreateTime           ""
  Id                   1

Scrutinizer · January 20, 2012, 9:49am

No, it is because, unlike in your sample, Id is written with a lowercase d here.
Try:

awk -F '[ \t"#]*' '/Id|Data/{sub($2,$2+1000000)}1' infile

say170 · January 20, 2012, 10:04am

I changed that of course, but when I run exactly as you've put it, I get:

Data        "1000001#15"
{
  CreateTime           ""
  1000000                   1

I'm actually running this on AIX, but I wouldn't expect the implementation of awk to be that different.

Scrutinizer · January 20, 2012, 10:17am

I think it is because Id is not at the beginning of the line...
Try this instead:

awk '/^Data/{split($2,N,"[#\"]");n=N[2]} /^ID/{n=$2} {sub(n,n+1000000)}1' infile

Change to Id if required...
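To illustrate why the -F version put the number where Id used to be: with a regex field separator, the leading blanks on an indented line count as a separator too, so $1 is empty and $2 is the label. A rough sketch on a made-up line. It uses + instead of * in the separator, which is an assumption on my part to sidestep how different awks treat a pattern that can match nothing:

```shell
line='  Id                   1'

# With a regex separator, the leading blanks split off an empty first field,
# so $2 is the label rather than the number.
fields=$(printf '%s\n' "$line" | awk -F '[ \t"#]+' '{ print NF ": [" $1 "] [" $2 "] [" $3 "]" }')

# "Id" evaluates to 0 in arithmetic, hence the bare 1000000 in the output
value=$(printf '%s\n' "$line" | awk -F '[ \t"#]+' '{ print $2 + 1000000 }')

printf '%s\n%s\n' "$fields" "$value"
```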

say170 · January 20, 2012, 10:33am

I thought it worked fine, but when I applied to my full file, it seems to be adding 1000000 to other occurrences of 1, so

  Updated           "UTC 20120112 16:30:05"

becomes

  Updated           "UTC 20100000120112 16:30:05"

it seems to be adding 1000000 to every first occurrence of 1 on each line

Scrutinizer · January 20, 2012, 10:47am

Slowly, but surely

awk '/^Data/{split($2,N,"[#\"]");n=N[2]} /^I[dD]/{n=$2} n""{sub(n,n+1000000);n=x}1' infile
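The difference from the previous attempt, as far as I can tell: there, n kept its value after the ID line and sub() ran on every following line, so any line containing that number was changed. Here n is tested as a string (n"") and cleared after use (x is never set, so it is empty). A cut-down comparison on two made-up lines:

```shell
input='ID 1
Updated "UTC 20120112 16:30:05"'

# n survives past the ID line, so sub() also hits the first "1" on the next line
stale=$(printf '%s\n' "$input" | awk '/^ID/{n=$2} {sub(n,n+1000000)}1')

# n is only used on the line that set it, then reset to empty
reset=$(printf '%s\n' "$input" | awk '/^ID/{n=$2} n""{sub(n,n+1000000);n=x}1')

printf '%s\n---\n%s\n' "$stale" "$reset"
```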

say170 · January 20, 2012, 11:25am

Now it ignores Id completely

Data        "1000001#15"
{
  CreateTime           ""
  Id                   1

mirni · January 20, 2012, 11:53am

Id doesn't start at the beginning of the line. Modifying the regex slightly should fix this:

awk '/^Data/{split($2,N,"[#\"]");n=N[2]} / *I[dD]/{n=$2} n""{sub(n,n+1000000);n=x}1'

say170 · January 23, 2012, 6:51am

It's now doing something very odd:

Data        "1#15"
{
  CreateTime           ""
  Id                   1
  AAId                 20
  LongName             "NO VALID ENTRY"

is becoming:

Data        "1000001#15"
{
  CreateTime           ""
  Id                   1000001
  AAId                 1000020
  LongName             1000000 VALID ENTRY"

so the AAId is being changed, and "NO is being changed too

mirni · January 23, 2012, 8:58am

It's because the unanchored pattern matches "Id" in "AAId" and "ID" in "VALID".
Add an anchor here:

awk '/^Data/{split($2,N,"[#\"]");n=N[2]} /^ *I[dD]/{n=$2} n""{sub(n,n+1000000);n=x}1'

I should have put it there right away...
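If the file has more labels that could collide, a stricter variant is possible. This is only a sketch under my own assumptions, not a replacement for the one-liner: it compares the first word exactly instead of using a pattern, and it locates the digits with match() rather than using the number itself as a regex. The function name bump and the variable add are made up for the example.

```shell
sample='Data        "1#15"
{
  CreateTime           ""
  Id                   1
  AAId                 20
  LongName             "NO VALID ENTRY"
  Updated           "UTC 20120112 16:30:05"'

result=$(printf '%s\n' "$sample" | awk -v add=1000000 '
# Replace the first run of digits in the line, leaving everything else as is
function bump(   head, num, tail) {
    if (!match($0, /[0-9]+/)) return
    head = substr($0, 1, RSTART - 1)
    num  = substr($0, RSTART, RLENGTH)
    tail = substr($0, RSTART + RLENGTH)
    $0 = head (num + add) tail
}
# Data line: second word must look like "<number>#...
$1 == "Data" && $2 ~ /^"[0-9]+#/ { bump() }
# Id line: exact label only, so AAId and VALID are skipped
($1 == "Id" || $1 == "ID") && $2 ~ /^[0-9]+$/ { bump() }
{ print }
')

printf '%s\n' "$result"
```

Because $0 is rebuilt from substrings and no field is assigned, the original spacing is kept.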

say170 · January 23, 2012, 9:27am

Great - working now. Thanks for your help.

mirni · January 23, 2012, 10:13am

I am glad it worked. But really, all the credit should go to Scrutinizer. I did not contribute much at all.