Using awk to add the length of matching characters between fields in a file

The awk below produces the current output, which adds 1 to $3. However, I am trying to add the length of the matching characters between $5 and $6 to $3. I tried using sub and a variable to store the length, but could not get it to work. I added comments to each line; the description gives the rule for each line, and the math is zero-based. Thank you :).

description

since line 1 has 4 matching characters between $5 and $6 (GAAA), 4 is added to $3
since line 2 has 5 matching characters between $5 and $6 (GAAAA), 5 is added to $3

file tab-delimited

id1	1	116268178		GAAA	GAAAA
id2	1	116268200		GAAAA	GAAAAA

current output tab-delimited

id1	1	116268179	116268179	GAAA	GAAAA
id2	1	116268201	116268201	GAAAA	GAAAAA

desired output tab-delimited

id1	1	116268181	116268181	GAAA	GAAAA
id2	1	116268204	116268204	GAAAA	GAAAAA

awk

awk 'BEGIN{FS=OFS="\t"}  # set input and output field separators to tab
     FNR==NR{  # always true when a single file is read, so every line is processed
         if(length($5) < length($6)) {  # condition 2
             sub($5,"",$6) && sub($6,"",$5)  # remove the matching characters
             print $1,$2,$3+1,$3+1,"-",$6  # print with 1 added to $3
             next
         }
     }' file > output

See:

       match(s, r)
              the  position  in s where the regular expression r occurs, or 0 if it does not.  The variables RSTART and RLENGTH are set
              to the position and length of the matched string.
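
A minimal sketch of what match() reports for the sample values in the question. The shell function name match_info is hypothetical, and passing the pattern through -v assumes it contains no backslashes or other characters that awk would treat specially.

```shell
# Hypothetical helper: print RSTART and RLENGTH after searching string $2 for regex $1.
match_info() {
    awk -v r="$1" -v s="$2" 'BEGIN {
        match(s, r)            # sets RSTART and RLENGTH as a side effect
        print RSTART, RLENGTH  # on no match these are 0 and -1
    }'
}

match_info GAAA GAAAA    # $5 and $6 of line 1: prints "1 4"
match_info GAAAA GAAAAA  # $5 and $6 of line 2: prints "1 5"
```

RLENGTH is therefore the count of matching characters described in the question (4 and 5).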

Hello cmccabe,

Could you please try the following and let me know if it helps?

awk 'BEGIN{FS=OFS="\t"} match($NF,$(NF-1)){$3+=RLENGTH-1;$3=$3 OFS $3} 1'  Input_file
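
If column 4 is already present but empty, as the sample input appears to show, appending OFS and a second copy to $3 would leave the empty column in place and yield seven columns. A possible variant under that assumption assigns $4 directly. The function name add_match_len is hypothetical, and the sketch assumes exactly six tab-separated columns.

```shell
# Hypothetical helper; assumes 6 tab-separated columns with column 4 empty.
add_match_len() {
    awk 'BEGIN { FS = OFS = "\t" }
         match($6, $5) {          # $5 is used as a regex and searched for in $6
             $3 += RLENGTH - 1    # zero-based: 4 matching characters add 3
             $4 = $3              # fill the empty column instead of adding a new one
         }
         1' "$@"
}

printf 'id1\t1\t116268178\t\tGAAA\tGAAAA\n' | add_match_len
```

Called with a file name instead of a pipe (add_match_len Input_file), it reads that file.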

Thanks,
R. Singh


Shouldn't 116268178 + 4 be 116268182?
I am unsure about the requirement.

/bin/awk 'BEGIN {
    FS=OFS="\t";
}
match($NF,$(NF-1)) {
    $3+=RLENGTH;
    $3=$3 OFS $3;
} 1' ./Input_file
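
To make the difference between the two readings concrete, here is a small sketch that prints both candidates for one input row. The function name offsets is hypothetical; the input row is line 1 of the sample file.

```shell
# Hypothetical helper: for one tab-separated row, print $1, then $3 plus the
# match length minus 1 (zero-based), then $3 plus the full match length.
offsets() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = OFS = "\t" }
        match($NF, $(NF-1)) { print $1, $3 + RLENGTH - 1, $3 + RLENGTH }'
}

offsets "$(printf 'id1\t1\t116268178\t\tGAAA\tGAAAA')"
```

For this row the first number is 116268181, which agrees with the desired output in the question, and the second is 116268182.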

Thank you all :).