Dear Users,
I have installed a standalone program for multiple sequence alignment that prompts the user for parameters when it runs. I have many sequence files and want to automate the process with a bash script. I tried writing a small script, but it's throwing errors.
Kindly help me to debug the program.
Thanks in advance.
#!/bin/bash
cat <<EOF > data.txt
(here goes the file name but dont know how to pass the filename argument through the file )
auto
n
/home/multalign/dna.tab
n
msf
y
aligned
100.90
EOF
for x in {A..H}
do
echo $x
for y in PreS1 PreS2 S X
do
/home/multalign/tab/ma < data.txt
echo Done: ${x}_${y}.fas
done
done
Throwing which errors? Be specific.
Why are you putting the filename into data.txt? If you want it on the command line, put it on the command line:
/home/multalign/tab/ma filename < data.txt
Or if you meant you want the contents to be dynamic, don't store it in a file at all:
/home/multalign/tab/ma <<EOF
Text which can include ${VARIABLES}
EOF
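A small sketch of what "dynamic" means here, assuming bash. With an unquoted delimiter the shell expands variables inside the here-document; with a quoted delimiter the text is passed through literally. The variable seqfile and its value are made up for illustration, and cat stands in for ma:

```shell
#!/bin/bash
seqfile="H_X.fas"   # hypothetical example value

# Unquoted delimiter: ${seqfile} is expanded before the command reads it.
expanded=$(cat <<EOF
file: ${seqfile}
EOF
)

# Quoted delimiter: the text is passed through without expansion.
literal=$(cat <<'EOF'
file: ${seqfile}
EOF
)

echo "$expanded"
echo "$literal"
```

For feeding a different filename on each loop iteration, the unquoted form is the one you want.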
Dear Corona688,
Thank you for your reply. I incorporated the changes, but it seems the program is not able to find the data file (although I gave an explicit path).
Below is the code and error:
#!/bin/bash
cat <<EOF > data.txt
auto
n
/home/multalign/tab/dna.tab
n
msf
y
aligned
100.90
EOF
#for x in {A..H}
for x in H
do
echo $x
for y in PreS1 PreS2 S X
do
/home/multalign/tab/ma ${x}_${y}.fas < data.txt
echo Done: ${x}_${y}.fas
done
done
Error:
H
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Reading H_PreS1.fas
Sequence # 30
Error: Error reading blosum62.tab.
Done: H_PreS1.fas
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Reading H_PreS2.fas
Sequence # 30
Error: Error reading blosum62.tab.
Done: H_PreS2.fas
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Reading H_S.fas
Sequence # 72
Error: Error reading blosum62.tab.
Done: H_S.fas
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Reading H_X.fas
Sequence # 26
Error: Error reading blosum62.tab.
Done: H_X.fas
Thanks
It's giving errors about a file you didn't even ask it to read. I think there's more wrong with it than your input.
How exactly would you be running this by hand? Show me one successful sequence, word for word, letter for letter, keystroke for keystroke.
Dear Corona688,
Below is the word-for-word execution of the program from the terminal. I have the executable set in my PATH:
[host@localhost temp]$ ma
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Sequence file: H_X.fas
Input format (gcg, mul, embl, genbank, auto): (def = auto) auto
Other Input parameters (y/n) ?: (def=n) n
Symbol comparison table: (def = blosum62.tab) /home/multalign/dna.tab
Gap value ? (def = value in Symbol comparison file):
Other Alignment parameters (y/n) ?: (def=n) n
Output format (msf, mul, doc): (def = msf) msf
Other Output parameters (y/n) ?: (def=n) y
Ouput order as (aligned, input)) ?: (def=aligned) aligned
Save cluster as a drawing or as a list (d/l) ?: (def = l) l
Consensus levels (default = 90 and 50) ? : 100.90
Reading H_X.fas
Sequence # 26
Reading /home/multalign/dna.tab
Saving Configuration.
Clustering with fast
Action successful
Saving H_X.clu
Aligning
gnl|hbvcds|AB059659_ ..................with gnl|hbvcds|EF157291_ Position # gnl|hbvcds|AB059659_ ...................with gnl|hbvcds|EU498228_ Position # gnl|hbvcds|AB059659_ ....................with gnl|hbvcds|FJ356715_ Position # gnl|hbvcds|AB059659_ .....................with gnl|hbvcds|FJ356716_ Position # gnl|hbvcds|AB059659_ ......................with gnl|hbvcds|HM066946_ Position #gnl|hbvcds|AB059659_ .......................with gnl|hbvcds|HM117850_ Position gnl|hbvcds|AB059659_ ........................with gnl|hbvcds|HM117851_ Positiongnl|hbvcds|AB179747_ ................with gnl|hbvcds|AB059660_ ..Position # gnl|hbvcds|AB059660_ ...................with gnl|hbvcds|AB059659_ Position # gnl|hbvcds|AB059659_ ....................with gnl|hbvcds|AB205010_ Position # gnl|hbvcds|AB059659_ .....................with gnl|hbvcds|HM117851_ Position # gnl|hbvcds|AB059659_ ......................with gnl|hbvcds|AB064315_ Position #gnl|hbvcds|AB059659_ .......................with gnl|hbvcds|AY090454_ .PositionAction successful 2 iteration(s)
Saving H_X.cl2
Saving H_X.msf
[host@localhost temp]$
The above prompts come up when I run the program from the terminal. I am trying to run it automatically so that I don't have to do this interactively for thousands of sequences.
I really appreciate your time and response Corona688.
Regards
For the third time, please use [code] tags instead of [icode] tags. Moderators have been fixing your posts to make them legible.
Extremely sorry, this is coming from the code icon in the text editor.
Again, I apologize.
Which code icon are you clicking? Use this one: If that's the one you're clicking, then the editor must have a bug.
I notice that, in your shell, you just type 'ma', but in the script, you give /full/path/to/ma. Does your script run with a different PATH than your shell? Is it being run by cron or CGI or some other automatic method? Important things may be missing from your environment which ma needs to work.
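One way to compare the two environments is to run the same few lines both from the terminal and from inside the script, then compare the output. This is only a sketch: where_is is a made-up helper name, and command -v is the standard shell builtin for looking a command up.

```shell
#!/bin/bash
# Report where a command would be found, or say that it is not on PATH.
where_is() {
    local found
    if found=$(command -v "$1"); then
        echo "$1 -> $found"
    else
        echo "$1 not found in PATH"
    fi
}

echo "PATH=$PATH"
where_is ma
```

If the two runs print different PATH values, or ma resolves to different locations, the environment is the first thing to look at.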
Also, interactive programs don't always work well with automatic input. Some do, some don't. It might even decide "OK, I don't have a terminal -- this means I shouldn't bother asking questions and just use all-default options". Let's test it. Does this work, typed into your terminal by itself?
ma <<EOF
H_X.fas
auto
n
/home/multalign/dna.tab
n
msf
y
aligned
l
100.90
EOF
Dear Corona688, the program is executing successfully using the above commands. Below is the terminal output:
[host@localhost temp]$ ma <<EOF
H_X.fas
auto
n
/home/multalign/dna.tab
n
msf
y
aligned
l
100.90
EOF
Multalin version 5.4.1
Copyright I.N.R.A. France 1989, 1991, 1994, 1996
Published research using this software should cite
Multiple sequence alignment with hierarchical clustering
F. CORPET, 1988, Nucl. Acids Res., 16 (22), 10881-10890
Sequence file:
Input format (gcg, mul, embl, genbank, auto): (def = auto)
Other Input parameters (y/n) ?: (def=n)
Symbol comparison table: (def = blosum62.tab)
Gap value ? (def = value in Symbol comparison file):
Other Alignment parameters (y/n) ?: (def=n)
Output format (msf, mul, doc): (def = msf)
Other Output parameters (y/n) ?: (def=n)
Ouput order as (aligned, input)) ?: (def=aligned)
Save cluster as a drawing or as a list (d/l) ?: (def = l)
Consensus levels (default = 90 and 50) ? : Reading H_X.fas
Sequence # 26
Reading /home/multalign/dna.tab
Saving Configuration.
Clustering with fast
Action successful
Saving H_X.clu
Aligning
gnl|hbvcds|AB059659_ ..................with gnl|hbvcds|EF157291_ Position # gnl|hbvcds|AB059659_ ...................with gnl|hbvcds|EU498228_ Position # gnl|hbvcds|AB059659_ ....................with gnl|hbvcds|FJ356715_ Position # gnl|hbvcds|AB059659_ .....................with gnl|hbvcds|FJ356716_ Position # gnl|hbvcds|AB059659_ ......................with gnl|hbvcds|HM066946_ Position #gnl|hbvcds|AB059659_ .......................with gnl|hbvcds|HM117850_ Position gnl|hbvcds|AB059659_ ........................with gnl|hbvcds|HM117851_ Positiongnl|hbvcds|AB179747_ ................with gnl|hbvcds|AB059660_ ..Position # gnl|hbvcds|AB059660_ ...................with gnl|hbvcds|AB059659_ Position # gnl|hbvcds|AB059659_ ....................with gnl|hbvcds|AB205010_ Position # gnl|hbvcds|AB059659_ .....................with gnl|hbvcds|HM117851_ Position # gnl|hbvcds|AB059659_ ......................with gnl|hbvcds|AB064315_ Position #gnl|hbvcds|AB059659_ .......................with gnl|hbvcds|AY090454_ .PositionAction successful 2 iteration(s)
Saving H_X.cl2
Saving H_X.msf
[host@localhost temp]$
So does this mean it should also take its input from a here-document in a script?
Regards
It is already taking its input from a here-document. It should see no difference between this one and one in a script. Though -- I just found documentation for multalin here. It has a command-line mode which should be used instead of cramming input into it.
ma -c:/home/multalign/dna.tab -k:100.90 H_X.fas
Other options may be needed, I can't test this here, but here's a starting point. Get something that works when typed, then you should be able to script a loop around it.
If all else fails, you can use the here-document as above, but that's not really how it's intended to be used.
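If the here-document route is ever needed, it could be wrapped in a loop roughly like this. This is a sketch, not a tested recipe: feed_answers is a made-up function name, the answer lines are copied from the here-document that worked in the terminal above, and cat is used as a stand-in aligner so the loop can be dry-run without ma.

```shell
#!/bin/bash
# Feed the interactive answers to the aligner for one sequence file.
# Usage: feed_answers <aligner> <file.fas>
feed_answers() {
    local aligner=$1 seq=$2
    "$aligner" <<EOF
${seq}
auto
n
/home/multalign/dna.tab
n
msf
y
aligned
l
100.90
EOF
}

# Dry run: with cat as the "aligner", the answers are simply printed.
for y in PreS1 PreS2 S X; do
    feed_answers cat "H_${y}.fas"
done
```

Once the printed answers look right, replacing cat with ma (or its full path) would send them to the real program.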
Just an observation...
Your error says Error: Error reading blosum62.tab.
...
Note the period at the end of .......tab. <- Is this correct?
Dear Corona688,
You are awesome, and thank you very much for your help. The command-line options worked like a charm. Below is the sample script I put together for the automatic alignment:
#!/bin/bash
# Only H for now; the full run will loop over {A..H}.
for x in H
do
    echo "$x"
    for y in PreS1 PreS2 S X
    do
        # Quote the filename so unusual characters cannot split it.
        /home/multalign/tab/ma -c:/home/multalign/dna.tab -k:100.90 "${x}_${y}.fas"
        echo
        echo "Done: ${x}_${y}.msf"
        echo
    done
done
Kindly let me know if this needs further optimization, or whether everything looks OK syntax-wise.
Again, thank you for your help.
Regards
It looks OK, assuming that ma -c:... -k:... file is all you need for options. I don't know that. I would test it with a single operation before bothering to put it in a loop.
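A single-operation test could be wrapped in a small function that also checks the result, so the same function can later be called from the loop. This is a sketch under assumptions: align_one is a made-up name, the -c: and -k: options are the ones from this thread, and whether ma sets a meaningful exit status is unknown to me, which is why the output file is checked as well.

```shell
#!/bin/bash
# Run one alignment and verify that the .msf output appeared.
# Usage: align_one <aligner> <file.fas>
align_one() {
    local aligner=$1 seq=$2
    local out="${seq%.fas}.msf"

    # Refuse to start if the input file is missing or unreadable.
    if [ ! -r "$seq" ]; then
        echo "Skipping: cannot read $seq" >&2
        return 1
    fi

    # Assumption: a non-zero exit status means the aligner failed.
    "$aligner" -c:/home/multalign/dna.tab -k:100.90 "$seq" || return 1

    # Fall back on checking that a non-empty output file exists.
    if [ ! -s "$out" ]; then
        echo "No output produced for $seq" >&2
        return 1
    fi
    echo "Done: $out"
}
```

For example, align_one /home/multalign/tab/ma H_X.fas would run a single file, and the call can then be dropped into the nested for loops unchanged.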
Dear Corona688,
I first checked with a single operation, including other command-line options, and then put it into the loop for all the sequences. So far it is working well.
Thanks again.
Regards