Hi all,
I am very new to programming and need some help.
I have an input file like this:
Number of disabled taxa: 9
Loading mapping file: ncbi.map
Load mapping:
taxId2TaxLevel: 469951
--- Subsample reads (20%): 66680 of 334386
Processing: tree-from-summary
Running tree-from-summary algorithm
Taxonomy:
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Not assigned: 1445
No hits: 220253
+++++++++++End of summary for file: B-Red-sum.txt
--- Subsample reads (20%): 67037 of 334386
Processing: tree-from-summary
Running tree-from-summary algorithm
Taxonomy:
Gammaproteobacteria: 2809
Alphaproteobacteria: 4001
Deltaproteobacteria: 1208
Epsilonproteobacteria: 15
Not assigned: 299
No hits: 461890
+++++++++++End of summary for file: B-Red-sum.txt
... and so on.
I want to create output files (outfile1.txt, outfile2.txt, ...), each containing the lines from the line after "Taxonomy:" up to, but not including, the "+++++++++++End" line, with no spaces in front of the lines.
So the desired output would be:
outfile1.txt
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Not assigned: 1445
No hits: 220253
outfile2.txt
Gammaproteobacteria: 2809
Alphaproteobacteria: 4001
Deltaproteobacteria: 1208
Epsilonproteobacteria: 15
Not assigned: 299
No hits: 461890
and so on.
Can anybody please help me with this?
I tried some code like this, but it did not work:
--------------------------------------------------------------------------
#!/bin/tcsh
if $#argv != "1" then
echo "Usage: process-file-script 1st-output-file-as-inputfile"
exit 0
endif
FIL_NM=$1
str=""
cat $FIL_NM | while read LINE
do
if [ "`echo $LINE | awk '{print $1}'`" = "+++++++++++Begin" ] ; then
n=1
c=1
fi
if [ "`echo $LINE |grep Gamma`"] ; then
NEW_FIL_NM=$FIL_NM"_"$n.txt"
fi
fi
if [ "`echo $LINE | awk '{print $1}'`" = "+++++++++++End" ] ; then
n=0
fi
done
--------------------------------------------------------
Please help...
Many thanks in advance...
Best wishes,
Mitra
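If you have Python, here's an alternative solution:
f = 0   # 1 while we are inside a Taxonomy block
i = 0   # number of the current output file
o = None
for line in open("file"):
    line = line.strip()   # also removes the spaces in front of each line
    if line.startswith("+++++++++++"):
        # end of a block: stop writing and close the current output file
        f = 0
        if o:
            o.close()
        continue
    if "Taxonomy:" in line:
        # start of a block: open the next numbered output file
        f = 1
        i = i + 1
        o = open("out_" + str(i) + ".txt", "w")
        continue   # do not copy the "Taxonomy:" header itself
    if f:
        o.write(line + "\n")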
Hello ghostdog74,
Thanks for your reply. I am sorry, I forgot to mention that the blocks in my input file do not always have only 6 lines; I just copied a few of them. The number of lines varies from 100 to 200, so the program has to read up to "+++++++++++End".
Hello durden_tyler,
Your Perl code works, thanks a lot. But there is still one problem.
As I said, in my input file there are several spaces in front of the desired lines.
Is there a way to get rid of these spaces directly?
At the moment it gives:
The spaces disappear here because you do not enclose your file data or code within the "code" tags. (Notice how the actual code posted by the forum members has a nice little box around it with the title "Code:" at the top.)
If you sandwich the desired text within "code" tags, without any space between "code", "]", "[" and "/" :
[ code ] <your_text_here> [ / code ]
then the leading spaces will be preserved.
Alternatively, if you are feeling too lazy to actually type the "code" tags, then you can do this -
(a) select the desired text, and
(b) click on the "#" icon in your Message Box right above the response area
The dynamic script associated with the web page will insert the "code" tags for you.
HTH,
tyler_durden
Hello durden_tyler,
First of all, thank you for your help. Thanks a lot... I am very new to scripting. Could you please explain the part (.*?)\+{11}/msgi of the code you posted in my thread?
I am trying to learn, so it would be really helpful. One more question: how can I make this script executable?
My attempt was:
#!/usr/bin/perl -w
$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";
$input=shift;
perl -ne '{$/=""; $i=1;
while (/^Taxonomy:.(.*?)\+{11}/msgi) {
$x = $1; $x =~ s/(^|\n)\s+/\1/g;
open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
}}' $1;
-----------------------------
which didn't work.
Can you please help me to learn this?
Thank you very much once again.
Have a nice time.
Best wishes,
Mitra
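On the regular expression asked about above, a rough Python counterpart may make the pieces easier to see. This is only a sketch and not durden_tyler's code; the function name extract_blocks and the sample text are made up for illustration. "Taxonomy:." matches the header plus one more character (the newline, because the s flag lets "." match newlines), "(.*?)" lazily captures everything up to the first following run of eleven "+" characters ("\+{11}"), m lets "^" match at the start of every line, g repeats the match through the whole input, and i ignores case.

```python
import re

# Python counterpart of the Perl pattern /^Taxonomy:.(.*?)\+{11}/msgi
# MULTILINE ~ m, DOTALL ~ s, IGNORECASE ~ i; finditer() plays the role of g.
BLOCK_RE = re.compile(
    r"^Taxonomy:.(.*?)\+{11}",
    re.MULTILINE | re.DOTALL | re.IGNORECASE,
)


def extract_blocks(text):
    """Return each Taxonomy block with the leading whitespace of every line removed."""
    blocks = []
    for match in BLOCK_RE.finditer(text):
        body = match.group(1)
        # same idea as the Perl substitution s/(^|\n)\s+/\1/g
        blocks.append(re.sub(r"(^|\n)\s+", r"\1", body))
    return blocks


# made-up sample in the shape of the input shown in the first post
sample = (
    "Running tree-from-summary algorithm\n"
    "Taxonomy:\n"
    "   Gammaproteobacteria: 2767\n"
    "   No hits: 220253\n"
    "+++++++++++End of summary for file: B-Red-sum.txt\n"
    "Taxonomy:\n"
    "   Gammaproteobacteria: 2809\n"
    "+++++++++++End of summary for file: B-Red-sum.txt\n"
)

for number, block in enumerate(extract_blocks(sample), start=1):
    print("outfile%d.txt" % number)
    print(block, end="")
```

The lazy "?" matters here: without it, "(.*)" would run on to the last "+++++++++++" in the file and merge all blocks into one.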
If you want to use Perl, here's another version that is more "understandable", as it relies less on regular expressions.
$i=0;
while (<>){
    chomp;
    if (/\+*End of summary for file/ ){
        # end of a block: stop writing and close the current file
        $f=0;close(FH);next;
    }
    if (/Taxonomy:/ ) {
        # start of a block: open the next output file (output_0, output_1, ...)
        open(FH,">>","output_".$i++) or die "Cannot open for writing:$!\n";
        $f=1; next;
    }
    if ($f) {
        s/^\s+//g; #get rid of spaces in front
        print FH $_."\n";
    }
}
Dear ghostdog74,
My main problem is that I am very new to programming and still learning, so I am not familiar with either Perl or Python; both are new to me. Can you please help me understand how to make these files executable, like a script? With the other replies too, the code works when I type it directly in the terminal, but in every case I am still unable to turn it into an executable script that takes an input file as $1.
Can you or anyone else please help me with this?
Thanks a lot for your help.
With best regards,
Mitra.
Dear All,
Thanks for your replies, code and advice.
My main problem is that I am very new to programming and still learning, so I am not familiar with either Perl or Python; both are new to me. Can anybody please help me understand how to make these files executable, like a script that I can call afterwards? Suppose I call the script code.perl or code.anything.
Each time, I want to run it as ./code.perl input.txt
My first attempt was:
#!/usr/bin/perl -w
$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";
$name=shift;
$inputfile="`pwd`/$name";
perl -ne '{$/=""; $i=1;
while (/^Taxonomy:.(.*?)\+{11}/msgi) {
$x = $1; $x =~ s/(^|\n)\s+/\1/g;
open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
}}' inputfile;
and my second attempt was:
#!/usr/bin/perl -w
$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";
$name=shift;
$inputfile="`pwd`/$name";
open $fh,"<", $inputfile;
my ($flag,$n)=(0,0);
while(<$fh>){
if(/Taxonomy:/){
$n++;
$file=sprintf("outfile%s.txt",$n);
open FH,"+>$file";
$flag=1;
next;
}
if(/\++/){
$flag=0;
next;
}
print FH $_ if $flag==1;
}
But neither of them worked as desired.
Can anybody please help me?
With best regards and many thanks,
Mitra.
ghostdog74,
Thank you for your help. Your last script works, but the files it produces still have spaces in front of the lines. How can I get rid of the spaces?
The output looks like:
mitra:testNextPart mitra$ more output_0
Dear All,
I tried the following to get rid of the spaces in front of the lines (see the previous post).
#!/usr/bin/perl -w
$#ARGV==0 or die "Usage: 2ndprocess-megan-script 1st-output-file-as-inputfile\n";
$i=1;
while (<>){
chomp;
if (/Taxonomy:/ ) {
$x = $1; $x =~ s/^\s+|\s+$//g;
open(OUT,">>","output_".$i++) or die "Cannot open for writing:$!\n";
$f=1; next;
}
if (/\+*End of summary for file/ ){
$f=0;close(OUT);next;
}
if ($f) { print OUT $_."\n";}
}
But it's not working.
Can anybody please help me get the output in this form:
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Betaproteobacteria: 397
unclassified Proteobacteria: 48
Elusimicrobium: 2
candidate division WWE1: 9
Flavobacteria: 2358
Sphingobacteria: 136
Bacteroidia: 162
environmental samples: 21
Chlorobia: 77
Planctomycetacia: 40
Spirochaetes (class): 15
Nitrospira (class): 1
Bacilli: 25
Not assigned: 1445
No hits: 220253
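One possible way to get output in that form from a script that is called with the input file as its argument is sketched below in Python 3. It is a sketch under assumptions, not something tested against the real summary files: the script name split_taxonomy.py, the function name split_summary and the outfileN.txt naming are hypothetical, and it assumes every block starts with a "Taxonomy:" line and ends with a line starting with "+++++++++++End", as in the first post. Leading spaces are removed by stripping each line before it is written.

```python
#!/usr/bin/env python3
import sys


def split_summary(lines, prefix="outfile"):
    """Write each Taxonomy block to prefix1.txt, prefix2.txt, ... and return the file names."""
    written = []
    out = None
    for line in lines:
        stripped = line.strip()  # drops leading spaces and the trailing newline
        if stripped.startswith("Taxonomy:"):
            # a new block starts: open the next numbered output file
            if out is not None:
                out.close()
            name = "%s%d.txt" % (prefix, len(written) + 1)
            out = open(name, "w")
            written.append(name)
        elif stripped.startswith("+++++++++++End"):
            # the block is finished: close the current output file
            if out is not None:
                out.close()
                out = None
        elif out is not None and stripped:
            # a non-empty line inside a block: write it without leading spaces
            out.write(stripped + "\n")
    if out is not None:
        out.close()
    return written


if __name__ == "__main__":
    if len(sys.argv) == 2:
        with open(sys.argv[1]) as handle:
            for name in split_summary(handle):
                print("wrote", name)
    else:
        print("Usage: split_taxonomy.py input-file", file=sys.stderr)
```

To call it like a script, save it under a name of your choice, make it executable once with chmod +x split_taxonomy.py, and then run ./split_taxonomy.py input.txt. The first line (the "#!" line) tells the shell which interpreter to use, which is also what the Perl attempts above need instead of wrapping perl -ne '...' inside a Perl file.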