Hi,
I need your help getting the min and max values from a file, based on the value in $5.
File1
SP12.3 stc 2240806 2240808 + ID1_N003 ID2_N003T0
SP12.3 sto 2241682 2241684 + ID1_N003 ID2_N003T0
SP12.3 XE 2239943 2240011 + ID1_N003 ID2_N003T0
SP12.3 XE 2240077 2241254 + ID1_N003 ID2_N003T0
SP12.3 CD 2240806 2241254 + ID1_N003 ID2_N003T0
SP12.3 XE 2241471 2241684 + ID1_N003 ID2_N003T0
SP12.3 CD 2241471 2241681 + ID1_N003 ID2_N003T0
SP12.3 stc 2245127 2245129 + ID1_N005 ID2_N005T0
SP12.3 sto 2246954 2246956 + ID1_N005 ID2_N005T0
SP12.3 XE 2244762 2247195 + ID1_N005 ID2_N005T0
SP12.3 CD 2245127 2246953 + ID1_N005 ID2_N005T0
SP12.3 stc 2253115 2253117 - ID1_N006 ID2_N006T0
SP12.3 sto 2249759 2249761 - ID1_N006 ID2_N006T0
SP12.3 XE 2253090 2254054 - ID1_N006 ID2_N006T0
SP12.3 CD 2253090 2253117 - ID1_N006 ID2_N006T0
SP12.3 XE 2252492 2252908 - ID1_N006 ID2_N006T0
SP12.3 CD 2252492 2252908 - ID1_N006 ID2_N006T0
SP12.3 XE 2251730 2251882 - ID1_N006 ID2_N006T0
SP12.3 CD 2251730 2251882 - ID1_N006 ID2_N006T0
SP12.3 XE 2251591 2251664 - ID1_N006 ID2_N006T0
SP12.3 CD 2251591 2251664 - ID1_N006 ID2_N006T0
SP12.3 XE 2249887 2251530 - ID1_N006 ID2_N006T0
SP12.3 CD 2249887 2251530 - ID1_N006 ID2_N006T0
SP12.3 XE 2249087 2249821 - ID1_N006 ID2_N006T0
SP12.3 CD 2249762 2249821 - ID1_N006 ID2_N006T0
SP12.3 stc 2252073 2252075 - ID1_N006 ID2_N006T1
SP12.3 sto 2249759 2249761 - ID1_N006 ID2_N006T1
SP12.3 XE 2252492 2252973 - ID1_N006 ID2_N006T1
SP12.3 XE 2251730 2252227 - ID1_N006 ID2_N006T1
SP12.3 CD 2251730 2252075 - ID1_N006 ID2_N006T1
SP12.3 XE 2251591 2251664 - ID1_N006 ID2_N006T1
SP12.3 CD 2251591 2251664 - ID1_N006 ID2_N006T1
SP12.3 XE 2249887 2251530 - ID1_N006 ID2_N006T1
SP12.3 CD 2249887 2251530 - ID1_N006 ID2_N006T1
SP12.3 XE 2249090 2249821 - ID1_N006 ID2_N006T1
SP12.3 CD 2249762 2249821 - ID1_N006 ID2_N006T1
SP12.5 stc 3001307 3001309 + ID1_N01140 ID2_N01140T0
SP12.5 sto 3005026 3005028 + ID1_N01140 ID2_N01140T0
SP12.5 XE 3000439 3001397 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3001307 3001397 + ID1_N01140 ID2_N01140T0
SP12.5 XE 3001572 3002765 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3001572 3002765 + ID1_N01140 ID2_N01140T0
SP12.5 XE 3002821 3004797 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3002821 3004797 + ID1_N01140 ID2_N01140T0
SP12.5 XE 3004855 3004929 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3004855 3004929 + ID1_N01140 ID2_N01140T0
SP12.5 XE 3004994 3005417 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3004994 3005025 + ID1_N01140 ID2_N01140T0
I tried the following code:
awk -F"\t" '$2=="CD"{if ($5~/\+/) {print $1"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7} else {print $1"\t"$4"\t"$3"\t"$5"\t"$6"\t"$7}}' file1
But the result shows every line containing the "CD" pattern, like below:
SP12.3 CD 2240806 2241254 + ID1_N003 ID2_N003T0
SP12.3 CD 2241471 2241681 + ID1_N003 ID2_N003T0
SP12.3 CD 2245127 2246953 + ID1_N005 ID2_N005T0
SP12.3 CD 2253090 2253117 - ID1_N006 ID2_N006T0
SP12.3 CD 2252492 2252908 - ID1_N006 ID2_N006T0
SP12.3 CD 2251730 2251882 - ID1_N006 ID2_N006T0
SP12.3 CD 2251591 2251664 - ID1_N006 ID2_N006T0
SP12.3 CD 2249887 2251530 - ID1_N006 ID2_N006T0
SP12.3 CD 2249762 2249821 - ID1_N006 ID2_N006T0
SP12.3 CD 2251730 2252075 - ID1_N006 ID2_N006T1
SP12.3 CD 2251591 2251664 - ID1_N006 ID2_N006T1
SP12.3 CD 2249887 2251530 - ID1_N006 ID2_N006T1
SP12.3 CD 2249762 2249821 - ID1_N006 ID2_N006T1
SP12.5 CD 3001307 3001397 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3001572 3002765 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3002821 3004797 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3004855 3004929 + ID1_N01140 ID2_N01140T0
SP12.5 CD 3004994 3005025 + ID1_N01140 ID2_N01140T0
The output I actually want has one line per ID2 ($7), showing only the min and max values of its "CD" lines, chosen according to the value in $5. If $5 is "+", then $3 of the first "CD" line and $4 of the last "CD" line for that ID2 should be printed in $3 and $4 of the output, respectively. If $5 is "-", then $4 of the first "CD" line and $3 of the last "CD" line for that ID2 should be printed in $4 and $3, respectively, like below:
SP12.3 CD 2240806 2241681 + ID1_N003 ID2_N003T0
SP12.3 CD 2249762 2253117 - ID1_N006 ID2_N006T0
SP12.3 CD 2249762 2252075 - ID1_N006 ID2_N006T1
SP12.5 CD 3001307 3005025 + ID1_N01140 ID2_N01140T0
If an ID2 ($7) has only one "CD" line, that line should be omitted (as with ID2_N005T0 in the sample). I would appreciate any help with this. Thanks.
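
For reference, the rule described above could be sketched in awk roughly as follows. This is only a sketch under some assumptions: the fields are whitespace-separated (tabs or spaces), all lines for one ID2 ($7) are contiguous in the file as in the sample, and the names cd_span and emit are made up for illustration.

```shell
# Collapse the "CD" lines of each ID2 ($7) into one line spanning the
# first and last CD line. IDs with only one CD line are skipped.
# Assumption: lines belonging to the same ID2 are adjacent in the input.
cd_span() {
    awk '
    # Print the finished group, but only if it had at least two CD lines.
    function emit() {
        if (n > 1) {
            if (strand == "+") { lo = first3; hi = last4 }   # + : $3 of first, $4 of last
            else               { lo = last3;  hi = first4 }  # - : $3 of last, $4 of first
            print chrom, "CD", lo, hi, strand, id1, id2
        }
    }
    $2 == "CD" {
        if ($7 != id2) {          # a new ID2 starts here
            emit()
            n = 0
            chrom = $1; strand = $5; id1 = $6; id2 = $7
            first3 = $3; first4 = $4
        }
        n++
        last3 = $3; last4 = $4    # always remember the most recent CD line
    }
    END { emit() }                # do not forget the final group
    ' "$@"
}
```

Called as `cd_span file1`, this should print space-separated lines; adding `-v OFS='\t'` to the awk call would give tab-separated output instead. If the lines of an ID2 were not adjacent, the first and last CD values would have to be stored in arrays keyed by $7 instead.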