awk to adjust text and count based on value in field

The awk below executes as is and produces the current output. It is very close, but what I cannot seem to do is add the _exon... suffix. The ... portion comes from $1, and the _exon part is static and will never change. If there is a + sign in $4, then the ... is in ascending (sequential) order. If there is a - in $4, then the order is descending (reversed). I think I need an if statement, but I am not sure how to increment or decrement the value correctly. Thank you :).

example of ordering based on $4

+ = exon 1,2,3
- = exon 3,2,1
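
The numbering rule above can be sketched on its own before touching the real file. This is only an illustration: exon_number is a hypothetical helper name, and it assumes i is the 1-based position of a block and n is the total number of blocks on the line.

```shell
# Hypothetical helper: print the exon number for block i of n on a given strand.
exon_number() {
  awk -v strand="$1" -v i="$2" -v n="$3" 'BEGIN {
    # + strand counts up from 1; - strand counts down from n
    print ((strand == "+") ? i : n - i + 1)
  }'
}

exon_number + 1 3   # first block, plus strand  -> 1
exon_number - 1 3   # first block, minus strand -> 3
```

With three blocks, the - strand gives 3, 2, 1 for positions 1, 2, 3, which matches the ordering listed above.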

file tab-delimited

208	NR_120664.1	chr5	+	141704857	141843619	141843619	141843619	4	141704857,141724980,141732790,141843534,	141704935,141725050,141733148,141843619,	0	SPRY4-AS1	unk	unk	-1,-1,-1,-1,
1161	NM_021615.4	chr16	-	75507021	75528926	75512538	75513726	3	75507021,75515714,75528837,	75513742,75515789,75528926,	0	CHST6	cmpl	cmpl	0,-1,-1,
1799	NM_002036.3	chr1	+	159173802	159176290	159174749	159176240	2	159173802,159175250,	159174770,159176290,	0	ACKR1	cmpl	cmpl	0,0,

current output tab-delimited

4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935     chr5:141724980-141725050     chr5:141732790-141733148     chr5:141843534-141843619     
3	-	CHST6	NM_021615.4	chr16:75507021-75513742     chr16:75515714-75515789     chr16:75528837-75528926     
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770     chr1:159175250-159176290

desired output tab-delimited

4	+	SPRY4-AS1	NR_120664.1	chr5:141704857-141704935_exon1,chr5:141724980-141725050_exon2,chr5:141732790-141733148_exon3,chr5:141843534-141843619_exon4
3	-	CHST6	NM_021615.4	chr16:75507021-75513742_exon3,chr16:75515714-75515789_exon2,chr16:75528837-75528926_exon1
2	+	ACKR1	NM_002036.3	chr1:159173802-159174770_exon1,chr1:159175250-159176290_exon2

awk

awk -F '\t' '{sf="";len1=split($10,s1,",");split($11,s2,","); for (i=1;i<len1;i++){sf=sf $3":"s1[i]"-"s2[i]"     "}print $9,$4,$13,$2,sf}' OFS='\t' file > out
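BEGIN {
  FS=OFS="\t"
  suf="_exon"
}
{
   sf=""
   # $10 and $11 end with a comma, so split() returns one extra, empty element
   len1=split($10,s1,",")
   split($11,s2,",")
   # + strand numbers the exons 1..n, - strand numbers them n..1;
   # the comma is added before every entry except the first, so none trails
   for (i=1;i<len1;i++)
     sf=sf (i>1 ? "," : "") $3 ":" s1[i] "-" s2[i] suf (($4=="+")?i:len1-i)
   print $9,$4,$13,$2,sf
}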
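
The reason the loop stops at i<len1, and why len1-i gives the right number on the - strand, is that the lists in $10 and $11 end with a comma. A small check of that, as a sketch only (count_fields is a hypothetical helper name, and the list is the $10 value from the CHST6 line):

```shell
# Hypothetical check: count the elements split() produces for a comma-terminated list.
count_fields() {
  awk -v list="$1" 'BEGIN {
    # the empty string after the final comma still counts as an element
    n = split(list, parts, ",")
    print n
  }'
}

count_fields "75507021,75515714,75528837,"   # three starts plus one empty element -> 4
```

So len1 is one more than the number of exons, and len1-i runs 3, 2, 1 as i runs 1, 2, 3.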

Thank you very much :).