Subtract columns based on some criteria

Please advise if you know how to solve this.

I have a tab-delimited input file in which the records are separated by lines of -----:

-----
ABC      4935402        4936680          Pattern=Cheers07080.1
ABC      4932216        4932368          Pattern=Cheers07080.1
ABC      4931932        4932122          Pattern=Cheers07080.1
-----
ABC      4675209        4676057          Pattern=Cheers06520.1
ABC      4676269        4676713          Pattern=Cheers06520.1
ABC      4682346        4682510          Pattern=Cheers06520.1
ABC      4682606        4682796          Pattern=Cheers06520.1
-----
ABC      48341587       48344548         Pattern=Cheers45590.1
-----
ABC      34297519       34298743         Pattern=Cheers31410.1
ABC      34298957       34299678         Pattern=Cheers31410.1
-----

The required output file is:

-----
Xyz    (4935402-4932368)-1    Pattern=Cheers07080.1
Xyz    (4932216-4932122)-1    Pattern=Cheers07080.1
-----
Xyz    (4676269-4676057)-1    Pattern=Cheers06520.1
Xyz    (4682346-4676713)-1    Pattern=Cheers06520.1
Xyz    (4682606-4682510)-1    Pattern=Cheers06520.1
-----
Xyz    0    Pattern=Cheers45590.1
-----
Xyz    (34298957-34298743)-1    Pattern=Cheers31410.1
-----

The output is based on these criteria:

Within a record, if column2(row1) > column2(row2), subtract row2(column3) from row1(column2), then repeat for each following pair of rows until the record ends. If column2(row1) < column2(row2), subtract row1(column3) from row2(column2) instead, and so on.
If a record has only one row, print 'Xyz 0' followed by the value of column 4.

Expressions such as (4935402-4932368)-1 are written out only for clarity; the output should contain the value of the expression.
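
As a quick check of the arithmetic, the two expressions for the first record can be evaluated directly. This is only a sketch: the function name first_record_gaps is made up for illustration, and the numbers are copied from the sample above.

```shell
# Evaluate the two expressions listed for the first record of the sample.
first_record_gaps() {
  awk 'BEGIN {
    print (4935402 - 4932368) - 1   # row1(column2) - row2(column3) - 1
    print (4932216 - 4932122) - 1   # row2(column2) - row3(column3) - 1
  }'
}
first_record_gaps
```

This prints 3033 and 93, the values the output file should hold in place of the written-out expressions.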

Thanks in advance.

Something like this?

$ cat file
-----
ABC      4935402        4936680          Pattern=Cheers07080.1
ABC      4932216        4932368          Pattern=Cheers07080.1
ABC      4931932        4932122          Pattern=Cheers07080.1
-----
ABC      4675209        4676057          Pattern=Cheers06520.1
ABC      4676269        4676713          Pattern=Cheers06520.1
ABC      4682346        4682510          Pattern=Cheers06520.1
ABC      4682606        4682796          Pattern=Cheers06520.1
-----
ABC      48341587       48344548         Pattern=Cheers45590.1
-----
ABC      34297519       34298743         Pattern=Cheers31410.1
ABC      34298957       34299678         Pattern=Cheers31410.1
-----
$
$ awk '/-----/{
  if(f){
    print "Xyz\t0"  "\t" s
  }
  print; getline
  a=$2; s=$NF; f=1
  next
}
/ABC/{
  print "Xyz\t" a-$3-1 "\t" $NF
  a=$2; f=0
}' file
-----
Xyz     3033    Pattern=Cheers07080.1
Xyz     93      Pattern=Cheers07080.1
-----
Xyz     -1505   Pattern=Cheers06520.1
Xyz     -6242   Pattern=Cheers06520.1
Xyz     -451    Pattern=Cheers06520.1
-----
Xyz     0       Pattern=Cheers45590.1
-----
Xyz     -2160   Pattern=Cheers31410.1
-----
$

Thanks for your response, Franklin. I'll take care of the text formatting. There is a problem with the output, though: it contains negative values, whereas the smaller number has to be subtracted from the larger one each time.

Can you post the desired output from the given input file?

The desired output file is:

-----
Xyz    3033    Pattern=Cheers07080.1
Xyz    93        Pattern=Cheers07080.1
-----
Xyz    211       Pattern=Cheers06520.1
Xyz    5632    Pattern=Cheers06520.1
Xyz    97        Pattern=Cheers06520.1
-----
Xyz    0          Pattern=Cheers45590.1
-----
Xyz    213      Pattern=Cheers31410.1
-----

The difference between records is that the numbers in column 2 are in either descending or ascending order, and the subtraction varies accordingly.
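
One way to read this rule, sketched under assumptions: for each pair of adjacent rows, take the row with the larger column 2, subtract the other row's column 3 from it, then subtract 1. The function name gaps, the variable names, and the off argument are hypothetical and not from the thread; this is a getline-free alternative, not the approach posted above.

```shell
# gaps [off]: read "-----"-separated records on stdin, print one gap per adjacent row pair.
# off is a hypothetical knob: 1 (default) gives (larger - smaller) - 1, 0 the plain difference.
gaps() {
  awk -v off="${1:-1}" '
    function emit_single() {           # a record with exactly one row prints 0
      if (rows == 1) print "Xyz\t0\t" tag
    }
    /-----/ { emit_single(); rows = 0; print; next }
    NF >= 4 {
      if (rows > 0) {
        # start of the higher row minus end of the lower row
        gap = ($2 + 0 > ps + 0) ? $2 - pe : ps - $3
        print "Xyz\t" (gap - off) "\t" $NF
      }
      ps = $2; pe = $3; tag = $NF; rows++
    }
    END { emit_single() }              # covers a file with no closing separator
  '
}

# Example: one ascending record, one single-row record, one descending record.
printf '%s\n' '-----' 'ABC 10 20 P1' 'ABC 30 40 P1' '-----' 'ABC 5 9 P2' \
  '-----' 'ABC 50 60 P3' 'ABC 30 40 P3' '-----' | gaps
```

Fed the input file from the first post (gaps < file), this sketch should give 3033 and 93 for the first record, 211, 5632 and 95 for the second, 0 for the third and 213 for the fourth. Note that (4682606-4682510)-1 is 95, so the 97 in the listing above appears to be an arithmetic slip. Calling gaps 0 drops the -1.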

Thanks.

If I understand your question then this should be the criteria:

In that case you can't get the desired output as you posted.

This command uses the criteria above:

awk '/-----/{
  if(f){
    print "Xyz\t0"  "\t" s
  }
  print; getline
  a=$2; b=$3; s=$NF; f=1		# a = column2(row1), b = row1(column3)
  next
}
/ABC/{
  if(a>$2){				# if column2(row1) > column2(row2)
    print "Xyz\t" a-$3-1 "\t" $NF	# + print row1(column2)-row2(column3)-1
  }
  else {
    print "Xyz\t" $2-b-1 "\t" $NF	# else print row2(column2)-row1(column3)-1
  }
   
  a=$2; f=0
}' file

and the output is:

-----
Xyz     3033    Pattern=Cheers07080.1
Xyz     93      Pattern=Cheers07080.1
-----
Xyz     211     Pattern=Cheers06520.1
Xyz     6288    Pattern=Cheers06520.1
Xyz     6548    Pattern=Cheers06520.1
-----
Xyz     0       Pattern=Cheers45590.1
-----
Xyz     213     Pattern=Cheers31410.1
-----

Regards

I have tried to simplify my problem. Please see if you can help.
Now the numbers in the columns are only ever increasing.

INPUT FILE

-----
ABC      4675209        4676057          Pattern01
ABC      4676269        4676713          Pattern01
ABC      4682346        4682510          Pattern01
ABC      4682606        4682796          Pattern01
-----
ABC      48341587       48344548         Pattern09
-----
ABC      34297519       34298743         Pattern10
ABC      34298957       34299678         Pattern10
-----

OUTPUT FILE

-----
Xyz    212 [4676269 - 4676057]     Pattern01
Xyz    5633 [4682346 - 4676713]    Pattern01
Xyz    96 [4682606 - 4682510]      Pattern01
-----
Xyz    0           Pattern09
-----
Xyz    214 [34298957 - 34298743]   Pattern10
-----

The values written in [ ] are for explanation only.

Thanks in advance.

Where did the -1 go? Anyway, there was a bug in my earlier code (I forgot to update a variable at the end of the row block: b=$3), but this should work:

$ cat file
-----
ABC      4675209        4676057          Pattern01
ABC      4676269        4676713          Pattern01
ABC      4682346        4682510          Pattern01
ABC      4682606        4682796          Pattern01
-----
ABC      48341587       48344548         Pattern09
-----
ABC      34297519       34298743         Pattern10
ABC      34298957       34299678         Pattern10
-----
$ awk '/-----/{
  if(f){
    print "Xyz\t0"  "\t" s
  }
  print; getline
  a=$2; b=$3; s=$NF; f=1		# a = column2(row1), b = row1(column3)
  next
}
/ABC/{
  if(a>$2){				# if column2(row1) > column2(row2)
    print "Xyz\t" a-$3 "\t" $NF	# + print row1(column2)-row2(column3)
  }
  else {
    print "Xyz\t" $2-b "\t" $NF	# else print row2(column2)-row1(column3)
  }
   
  a=$2; b=$3; f=0
}' file
-----
Xyz     212     Pattern01
Xyz     5633    Pattern01
Xyz     96      Pattern01
-----
Xyz     0       Pattern09
-----
Xyz     214     Pattern10
-----
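
To see what the missing b=$3 did, the third row of the Pattern01 record can be worked by hand. This is a small sketch with illustrative names only (stale_vs_fresh and its variables are not from the thread), using the -1 form of the earlier script.

```shell
# Compare the gap for row 3 when b is stuck at row 1 versus updated to row 2.
stale_vs_fresh() {
  awk 'BEGIN {
    row1_end = 4676057; row2_end = 4676713; row3_start = 4682346
    print "stale b:", row3_start - row1_end - 1   # b never updated after row 1
    print "fresh b:", row3_start - row2_end - 1   # b updated to the previous row
  }'
}
stale_vs_fresh
```

The stale value, 6288, matches the wrong figure in the earlier output, while the fresh value, 5632, is the one originally requested.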

Thanks a lot, Franklin. :)