It is not.
I'm new to shell scripting.
I tried to grep for specific patterns before I posted, and extracting each pattern individually worked. But I want to do the same for all patterns at once.
My grep commands
grep "DC" input1
c3 100 120 TF03_X2 + AABDDAAABDDBCBADCBBC
c4 100 120 TF03_X3 + AABDCAAABDDBCBADCBBC
grep "DBCA" input1
c1 100 120 TF01_X1 + AABDDAAABDDBCADBDABC
And my awk command for bumblebee is
awk '{ if ($5 == "+") print $0}' input1| grep DBCADB | awk '{print $1,"\t",$2,"\t",$3,"\t",$4,"\t",$5,"\t","DBCADB",$6}'
output
c1 100 120 TF01_X1 + DBCADB AABDDAAABDDBCADBDABC
I can write similar code for Megatron, but it is becoming complex because I have many patterns.
Thanks for your time
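What I am picturing for the many-pattern case is something table-driven, roughly like the sketch below. It is only a sketch under my own assumptions: the function name tag_patterns and the three-column patterns file (strand, pattern, label) are made up for illustration, and it reports only the first table entry that matches each line.

```shell
# Hypothetical helper: tag each line of FILE with the first matching entry
# from a lookup table. PATTERNS has three columns: strand, pattern, label.
# Usage: tag_patterns PATTERNS FILE
tag_patterns() {
    awk '
    NR == FNR {                      # first file: load the pattern table
        n++
        strand[n] = $1; pat[n] = $2; label[n] = $3
        next
    }
    {                                # second file: try the entries in table order
        for (i = 1; i <= n; i++)
            if ($5 == strand[i] && index($6, pat[i])) {
                print $1, $2, $3, $4, $5, label[i], $6
                break                # stop at the first entry that matches
            }
    }' "$1" "$2"
}
```

It would be called as tag_patterns patterns.txt input1, where patterns.txt (a hypothetical file name) holds lines such as "+ DBCADB DBCADB". Adding a pattern then means adding a line to that file instead of another else-if branch.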
---------- Post updated 09-14-10 at 12:43 AM ---------- Previous update was 09-13-10 at 10:10 PM ----------
awk -F'\t' '{ if ($5 == "+" && $6 ~ /DC/) { print $0,"\t","YDCY" } else if ($5 == "-" && $6 ~ /CD/) { print $0,"\t","YCDY" } else if ($6 ~ /DBCADB/) { print $0,"DBCADB" } }' input1 | awk '!a[$0]++'
c1 100 120 TF01_X1 + AABDDAAABDDBCADBDABC DBCADB
c2 100 120 TF02_X2 - AABDDAAABDDBCBACDBBC YCDY
c3 100 120 TF03_X2 + AABDDAAABDDBCBADCBBC YDCY
c4 100 120 TF03_X3 + AABDCAAABDDBCBADCBBC YDCY
---------- Post updated at 03:44 AM ---------- Previous update was at 12:43 AM ----------
Here is the code I tried. Everything is fine except for the manual work and a small bug: it cannot pick up a repeated occurrence of a pattern (CD).
input1
c1 100 120 TF01_X1 + AABDDAAABDDBCADBDABC
c2 100 120 TF02_X2 - AABDDAAABDDBCBACDBBC
c3 100 120 TF03_X2 + AABDDAAABDDBCBACDBBC
c4 100 120 TF03_X3 + AABCDAAABDDBCBACDBBC
Script
awk '{
  if ($5 == "+" && $6 ~ /CD/) { print index($6,"CD"),"\t",length($6),"\t",$0,"\t","YCDY" }
  else if ($5 == "-" && $6 ~ /CD/) { print index($6,"CD"),"\t",length($6),"\t",$0,"\t","YDCY" }
  else if ($5 == "+" && $6 ~ /DBCADB/) { print index($6,"DBCADB"),"\t",length($6),"\t",$0,"\t","DBCADB" }
  else if ($5 == "-" && $6 ~ /DBCADB/) { print index($6,"DBCADB"),"\t",length($6),"\t",$0,"\t","BDACBD" }
}' input1 |
awk '{
  if ($7 == "+" && $9 == "DBCADB") { print $3,"\t",$4+$1,"\t",($4+$1)+6,"\t",$0 }
  else if ($7 == "-" && $9 == "BDACBD") { print $3,"\t",($4+$1)-6,"\t",$4+$1,"\t",$0 }
  else if ($7 == "+" && $9 == "YCDY") { print $3,"\t",($4+$1)-2,"\t",($4+$1)+2,"\t",$0 }
  else if ($7 == "-" && $9 == "YDCY") { print $3,"\t",($4+$1)-2,"\t",($4+$1)+2,"\t",$0 }
}' |
awk '!a[$0]++' |
awk '{print $1,"\t",$2,"\t",$3,"\t",$9,"\t",$10,"\t",$12,"\t",$11}'
output
c1 111 117 TF01_X1 + DBCADB AABDDAAABDDBCADBDABC
c2 114 118 TF02_X2 - YDCY AABDDAAABDDBCBACDBBC
c3 114 118 TF03_X2 + YCDY AABDDAAABDDBCBACDBBC
c4 102 106 TF03_X3 + YCDY AABCDAAABDDBCBACDBBC
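The missing hit is the second CD in c4 (position 16 of the string), because index() only returns the first match. One possible way around it, which I have only sketched, is to keep calling index() on the remainder of the string. The function name all_hits and passing the label as an argument are my own assumptions, and this version ignores the strand test and uses the same -2/+2 arithmetic as my CD branch.

```shell
# Hypothetical helper: report every occurrence of one pattern in column 6.
# Usage: all_hits PATTERN LABEL FILE
all_hits() {
    awk -v pat="$1" -v label="$2" '
    {
        seq = $6
        offset = 0                           # characters already skipped
        while ((pos = index(seq, pat)) > 0) {
            hit = offset + pos               # 1-based position in the full string
            print $1, $2 + hit - 2, $2 + hit + 2, $4, $5, label, $6
            offset += pos                    # resume just after the start of this hit
            seq = substr(seq, pos + 1)
        }
    }' "$3"
}
```

For the c4 line above, all_hits CD YCDY input1 should give both 102 106 and 114 118 instead of only the first.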