Hi,
I need help finding the values between two fields, based on $6 (NUM) and $1 (CD), within the same ID. The result should show the ranges that lie between consecutive NUMs, computed from $3 (the end of one CD row) and $2 (the start of the next CD row) in the data.txt file below.
data.txt
ex 139 142 Sc_1000004 ID 4
CD 139 142 Sc_1000004 ID 4 Num1
sta 139 140 Sc_1000004
ex 143 144 Sc_1000004 ID 4
CD 148 150 Sc_1000004 ID 4 Num2
ex 153 156 Sc_1000004 ID 4
CD 153 156 Sc_1000004 ID 4 Num3
sto 156 158 Sc_1000004
ex 160 163 Sc_1000005 ID 5
CD 160 163 Sc_1000005 ID 5 Num1
sta 160 161 Sc_1000005
ex 167 170 Sc_1000005 ID 5
CD 167 170 Sc_1000005 ID 5 Num2
ex 175 205 Sc_1000005 ID 5
CD 175 205 Sc_1000005 ID 5 Num3
sto 205 207 Sc_1000005
ex 212 221 Sc_1000006 ID 6
CD 212 221 Sc_1000006 ID 6 Num2
sto 212 215 Sc_1000006
ex 224 227 Sc_1000006 ID 6
CD 224 227 Sc_1000006 ID 6 Num1
sta 227 229 Sc_1000006
ex 243 248 Sc_1000007 ID 7
CD 243 248 Sc_1000007 ID 7 Num1
sta 243 243 Sc_1000007
ex 251 257 Sc_1000007 ID 7
CD 251 257 Sc_1000007 ID 7 Num2
ex 261 263 Sc_1000007 ID 7
CD 261 263 Sc_1000007 ID 7 Num3
sto 263 265 Sc_1000007
ex 275 288 Sc_1000008 ID 8
CD 275 288 Sc_1000008 ID 8 Num1
sta 275 277 Sc_1000008
I want output like this:
NewVal 143 - 147 ID 4
NewVal 151 - 152 ID 4
NewVal 164 - 166 ID 5
NewVal 171 - 174 ID 5
NewVal 222 - 223 ID 6
NewVal 249 - 250 ID 7
NewVal 258 - 260 ID 7
In the above output, for example, "143 - 147" is the range between the CD rows for Num1 and Num2 of ID 4: 143 is one past the end of Num1 (142) and 147 is one before the start of Num2 (148). Likewise, "151 - 152" is the range between the CD rows for Num2 and Num3 of ID 4 in the input file (data.txt), and so on. If an ID has only one NUM (such as Num1 for ID 8), meaning only one CD exists for that ID, then no NewVal should be extracted.
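To make the rule concrete, here is a rough awk sketch of the logic I am after, wrapped in a shell function. It is only a sketch under some assumptions: the function name `newval` is made up for illustration, fields are split on any whitespace (so the NumN tag is the last field and the ID value is the field just before it), CD rows of one ID are adjacent in the file and sorted by position as in the sample, and a pair of CD rows with nothing between them prints no line.

```sh
#!/bin/sh
# newval: hypothetical helper name. Prints the range lying between
# consecutive CD rows that share the same ID.
newval() {
    awk '
    # Only CD rows that carry a NumN tag in the last field are considered.
    $1 == "CD" && $NF ~ /^Num[0-9]+$/ {
        id = $(NF - 1)                  # ID value sits just before the NumN tag
        if (id == prev_id) {
            gap_start = prev_end + 1    # one past the end ($3) of the previous CD
            gap_end   = $2 - 1          # one before the start ($2) of this CD
            # Assumption: adjacent CDs with no gap between them print nothing.
            if (gap_start <= gap_end)
                print "NewVal", gap_start, "-", gap_end, "ID", id
        }
        prev_id  = id
        prev_end = $3
    }
    ' "$@"
}
```

After defining the function, it would be called as `newval data.txt`. Because the previous-row state carries over between input files, it seems safer to call it once per file (for example inside a `for f in ...; do newval "$f"; done` loop) rather than passing many files in a single call.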
I have thousands of values that I need to extract from hundreds of files like this :(. I would appreciate your help or advice on doing this in awk or sed. Thanks.