I have two files:
file-gene_families.txt, which contains 30,000 rows of 30 columns. Column 1 is the ID column:
Col1 Col2 Col3 ...
One gene-encoded CBPs ABC 111 ...
One gene-encoded CBPs ABC 222 ...
One gene-encoded CBPs ABC 212 ...
Two gene encoded CBPs EFC 223 ...
Two gene encoded CBPs EFC 133 ...
Two gene encoded CBPs EFC 103 ...
Two gene encoded CBPs EFC 323 ...
Three gene(encoded) CBPs CGC 20 ...
Four gene/encoded (CBPs) GGH NULL ...
Four gene/encoded (CBPs) GGH 0 ...
Four gene/encoded (CBPs) GGH 1 ...
Four gene/encoded (CBPs) GGH 2 ...
Four gene/encoded (CBPs) GGH 3 ...
Four gene/encoded (CBPs) GGH 56 ...
and
file-group.list.
One gene-encoded CBPs
Two gene encoded CBPs
Three gene(encoded) CBPs
Four gene/encoded (CBPs)
I want to split file-gene_families.txt into separate files based on file-group.list, using each line of file-group.list as the name of an output file, with brackets, spaces and slashes replaced by a hyphen "-":
One-gene-encoded-CBPs.tmp
Two-gene-encoded-CBPs.tmp
Three-gene-encoded-CBPs.tmp
Four-gene-encoded-CBPs.tmp
For example, One-gene-encoded-CBPs.tmp should contain:
One gene-encoded CBPs ABC 111 ...
One gene-encoded CBPs ABC 222 ...
One gene-encoded CBPs ABC 212 ...
I could not get my script working. Can someone help me out?
#!/usr/bin/bash
IFS=$'\n'
for line in $(cat At-GeneFamily-Unique-group.list)
do
name=$(sed 's/\ |\//-/g' $line)
grep $line gene_families.txt >> $name.tmp
done
Thanks a lot!
YF
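
A possible fix, offered as a sketch rather than a definitive answer. Two things appear to go wrong in the script above: sed is given $line as a file name to read instead of as the text to edit, and in sed's basic regular expressions an unescaped | is a literal character, not "or". The version below pipes the group name into sed, replaces parentheses, spaces and slashes with "-", squeezes repeated hyphens and drops a trailing one, so the names come out as in the expected list. The function names sanitize_name and split_by_group are made up for illustration, and "brackets" is assumed to mean ( and ).

```shell
#!/usr/bin/env bash

# Turn a group name into a file-name stem:
#   1. parentheses, spaces and slashes become "-"
#   2. runs of "-" are squeezed to a single "-"
#   3. a trailing "-" is removed
sanitize_name() {
    printf '%s\n' "$1" | sed -e 's|[() /]|-|g' -e 's|--*|-|g' -e 's|-$||'
}

# split_by_group LIST TABLE
# For every non-empty line of LIST, write the rows of TABLE that contain
# that line to <sanitized-name>.tmp in the current directory.
split_by_group() {
    local list=$1 table=$2 line name
    # IFS= and -r keep each line exactly as written; the "|| [ -n ... ]"
    # part also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue
        name=$(sanitize_name "$line")
        # -F treats the group name as a fixed string, not a regex.
        # ">" overwrites, so re-running does not duplicate rows.
        # grep exits 1 when nothing matches; that is not an error here.
        grep -F -- "$line" "$table" > "$name.tmp" || true
    done < "$list"
}
```

With the file names from the script above, it would be called as: split_by_group At-GeneFamily-Unique-group.list gene_families.txt. Note that grep -F matches the group name anywhere in a row, not only in column 1, so if one group name can occur inside another a stricter match would be needed.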