Pattern match with awk/sed - help

I need to extract the text inside the square brackets that is marked in red, but not the text marked in green. My current code extracts both, which I don't want.

Input file

ref|XP_002371341.1| oxoacyl-ACP reductase, putative [Toxoplasma gondii ME49] gb|EPT24759.1| 3-ketoacyl-(acyl-carrier-protein) reductase [Toxoplasma gondii ME49] gb|ESS34081.1| 3-ketoacyl-(acyl-carrier-protein) reductase [Toxoplasma gondii VEG](376)	-	243	134	61.4617940199336	1	230	2e-71	80.7308970099668
gb|EPR63881.1| 3-ketoacyl-(acyl-carrier-protein) reductase [Toxoplasma gondii GT1](376)	-	243	134	61.4617940199336	1	230	2e-71	80.7308970099668
ref|XP_003885852.1| 3-ketoacyl-(Acyl-carrier-protein) reductase, related [Neospora caninum Liverpool] emb|CBZ55826.1| 3-ketoacyl-(Acyl-carrier-protein) reductase, related [Neospora caninum Liverpool](376)	-	242	137	61.7940199335548	1	229	8e-71	80.3986710963455
emb|CDJ42835.1| oxoacyl-ACP reductase, putative [Eimeria tenella](347)	-	240	141	61.7940199335548	1	211	3e-64	79.734219269103
emb|CDJ64722.1| oxoacyl-ACP reductase, putative [Eimeria necatrix](347)

My current code

while read line
do
echo $line |  awk 'NR>1{print $1}' RS=[ FS=] >> $OUTPUTFILE
done <$list

Any help or suggestions, please.

Hint: the one reliable marker is that each pattern in red is followed directly by a number in parentheses, e.g. (347), which can be used to identify it.

You do not need the shell loop, since awk already loops over its input records implicitly:

awk 'NR>1{print $1}' RS='[' FS=']' "$list" >> "$OUTPUTFILE"

will accomplish the same.
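To see why this works, here is a minimal sketch of how the two separators cut up the input. The one-line sample and its X1/X2 accession IDs are made up for illustration; the real input is the file named in $list.

```shell
# Hypothetical sample line (shortened; not taken from the real file).
sample='gb|X1| reductase [Toxoplasma gondii ME49] gb|X2| reductase [Toxoplasma gondii VEG](376)'

# RS='[' starts a new record after every opening bracket.
# FS=']' makes $1 the bracket content and $2 whatever follows the
# closing bracket. Only the first 5 characters of $2 are shown.
split_demo=$(printf '%s\n' "$sample" |
  awk 'NR>1 { printf "%d: <%s> <%s>\n", NR, $1, substr($2, 1, 5) }' RS='[' FS=']')

printf '%s\n' "$split_demo"
```

Only the record whose second field starts with "(" carries the number marker, which is the property a filter can test for.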

It does not print the part in parentheses which you also indicated in red. So it is unclear whether you want that printed or not.

If not, try this modification:

awk 'NR>1 && $2~/^\(/{print $1}' RS='[' FS=']' "$list" >> "$OUTPUTFILE"

If so, try:

awk 'NR>1 && $2~/^\(/{sub(/\).*/,")",$2); print $1 $2}' RS='[' FS=']' "$list" >> "$OUTPUTFILE"

or if your grep has the -o option, try:

grep -o '\[[^]]*\]([^)]*)' "$list" >> "$OUTPUTFILE"

But that will include the square brackets in the output.
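If the brackets are unwanted, one possible follow-up is to pipe the grep output through sed. This is only a sketch on a made-up sample line (the X1/X2 IDs are hypothetical), and it assumes you want the name alone, without the number:

```shell
# Hypothetical sample line (shortened; not taken from the real file).
sample='gb|X1| reductase [Toxoplasma gondii ME49] gb|X2| reductase [Toxoplasma gondii VEG](376)'

# grep -o keeps only "[name](number)" matches; the first sed expression
# drops the leading "[" and the second drops everything from "](" onwards.
stripped=$(printf '%s\n' "$sample" |
  grep -o '\[[^]]*\]([^)]*)' |
  sed 's/^\[//; s/](.*$//')

printf '%s\n' "$stripped"
```

Because grep -o prints each match on its own line, this also copes with several marked names on one input line.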

To remove the marked patterns from the lines instead (making use of your footnote hint), try:

sed 's/\[[^][]*\]([0-9]\{1,3\})//' file3
ref|XP_002371341.1| oxoacyl-ACP reductase, putative [Toxoplasma gondii ME49] gb|EPT24759.1| 3-ketoacyl-(acyl-carrier-protein) reductase [Toxoplasma gondii ME49] gb|ESS34081.1| 3-ketoacyl-(acyl-carrier-protein) reductase     -    243    134    61.4617940199336    1    230    2e-71    80.7308970099668
gb|EPR63881.1| 3-ketoacyl-(acyl-carrier-protein) reductase     -    243    134    61.4617940199336    1    230    2e-71    80.7308970099668
ref|XP_003885852.1| 3-ketoacyl-(Acyl-carrier-protein) reductase, related [Neospora caninum Liverpool] emb|CBZ55826.1| 3-ketoacyl-(Acyl-carrier-protein) reductase, related     -    242    137    61.7940199335548    1    229    8e-71    80.3986710963455
emb|CDJ42835.1| oxoacyl-ACP reductase, putative     -    240    141    61.7940199335548    1    211    3e-64    79.734219269103
emb|CDJ64722.1| oxoacyl-ACP reductase, putative
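
The same regular expression can be turned around to print only the bracketed name instead of deleting it. This is a sketch, assuming at most one marked name per line (as in the posted input) and using a made-up two-line sample with hypothetical X1/X2/X3 IDs:

```shell
# Hypothetical sample lines (shortened; not taken from the real file).
sample='gb|X1| reductase [Toxoplasma gondii ME49] gb|X2| reductase [Toxoplasma gondii VEG](376)
emb|X3| reductase, putative [Eimeria tenella](347)'

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded. \1 is the text between the brackets that are
# followed directly by a 1-3 digit number in parentheses.
names=$(printf '%s\n' "$sample" |
  sed -n 's/.*\[\([^][]*\)]([0-9]\{1,3\}).*/\1/p')

printf '%s\n' "$names"
```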