Hi
I have a file like this:
name1 xxxxx
name1 xxxxx
name1 yyyyy
name 1 zzzzz
name1 Uniprot Id 1234
name2 sssss
name2 eeeee
name2 bengamine
name2 Uniprot Id 3456
... and so on.
I have to capture only the Uniprot IDs in a separate file, so that the output contains only:
Uniprot ID for name 1
Uniprot ID for name 2 ... and so on.
Can anybody help me write a Perl program for this?
My other question is about comparing two files.
One file contains:
Uniprot ID 1 (any number)
Uniprot ID 6
Uniprot ID 7
... and so on, in random order.
The other file contains:
Uniprot ID 1
Uniprot ID 2
Uniprot ID 6
Uniprot ID 8
... and so on.
I have to compare the two files to find which entries, and how many, are common to both. If an entry is common, print "common" before it; otherwise print "not common".
Please let me know a Perl program for these questions. I would be very thankful.
Thanks
Mani
Possibly something like this:
cat yourfile | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot Id\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}' > outfile
Thanks for the reply. I am receiving the following error:
Unquoted string "perl" may clash with future reserved word at test.pl line 3.
syntax error at test.pl line 3, near "n -e"
Execution of test.pl aborted due to compilation errors.
Can anybody guide me?
Are you using Windows?
When you ask a question, mention the OS and shell.
Hi Kamraj
I am using a Unix OS with a Unix shell.
Can you post your command?
Hi
My program is in a file named test.pl:
#!/usr/bin/perl -w
cat TTDtargets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2;
END
My command and error are:
bash-3.2$ perl test.pl
Unquoted string "perl" may clash with future reserved word at test.pl line 3.
syntax error at test.pl line 3, near "n-e "
Execution of test.pl aborted due to compilation errors.
Here the input file name is TTD Targets and the output file name is TTD2.
Just run the line below on the command line.
Don't put it in your Perl file.
cat TTDtargets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2
Thanks Kamraj
This time it gave me the file TTD2, but it is completely blank.
Is there something wrong with the output command?
My input file contains:
TTDS00001 Antagonist GSK233705 DCL000823
TTDS00001 Antagonist NVA237 DCL000901
TTDS00001 Antagonist Org-23366 DCL000911
TTDS00001 Antagonist OrM3 DCL000913
TTDS00001 Multitarget Org-23366 DCL000911
TTDS00001 Antagonist Aprophen DNC000245
TTDS00001 Antagonist Benactyzine DNC000293
TTDS00001 Antagonist Hyoscine DNC000757
TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
TTDS00001 Antagonist Ipratropium bromide DNC000806
TTDS00001 Agonist Muscarine DNC000970
TTDS00001 Agonist RS 86 DNC001236
TTDS00001 Target Validation TTDS00001
TTDS00002 UniProt ID P11229
TTDS00002 Name Muscarinic acetylcholine receptor M1
TTDS00002 Type of target Successful target
TTDS00002 Synonyms M1 receptor
TTDS00002 Disease Alzheimer's disease
TTDS00002 Disease Bronchospasm (histamine induced)
TTDS00002 Disease Cognitive deficits
TTDS00002 Disease Schizophrenia
TTDS00002 Function The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase.................. so on for TTD 03/04/..14447
I have to capture only the UniProt ID and put it in the output.
So, as per the above input, the expected output is P11229?
Please use [code] tags when posting data and commands:
http://www.unix.com/how-post-unix-linux-forums/167686-forum-video-tutorial-how-use-code-tags.html
Try this.
$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets
P11229
Thanks for the reply. My input contains many UniProt entries like the one I posted. That was only one part of the file, which contains approximately 1400 UniProt entries.
Code:
Method 1:
cat TTD Targets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2;
Error: I am receiving a blank TTD2 file.
Method 2:
$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets
Error: command not found
Can you post your code with code tags?
Otherwise I can't help, sorry.
Post the entire error message and the command you tried.
Method 1:
bash-3.2$ $ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets
bash: $: command not found
Method 2:
bash-3.2$ cat Uniprot | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2
Why is there a $ before perl?
bash-3.2$ $
You are still not posting with code tags.
You will be banned if you do not take this seriously (and then there will be no more help from this forum).
(B) Use code tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)
OK, thanks Kamraj.
I got it now, but the IDs are not written to a separate output file, and I can't copy all the UniProt IDs from the Unix shell.
Also, the video wasn't working on my system. I checked it just now on my iPad, so here is the command with code tags. Thanks for the reply.
$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' Uniprot
Also, I was planning to do this as a Perl script in a .pl file. Is that possible?
Redirect the output to another file:
perl -lane 'print $F[-1] if $_=~/UniProt ID/' input.txt > output.txt
Read up on the basics of Perl (search on Google).
The output file is still blank, without any output:
perl -lane 'print $F[-1] if $_=~/UniProt ID/' input.txt > output.txt
In the code below:
1) your input file name is test.txt
2) your output file name is output.txt
When you say "it's not working", "no output", etc.,
always post the command you tried.
$ cat test.txt
TTDS00001 Antagonist GSK233705 DCL000823
TTDS00001 Antagonist NVA237 DCL000901
TTDS00001 Antagonist Org-23366 DCL000911
TTDS00001 Antagonist OrM3 DCL000913
TTDS00001 Multitarget Org-23366 DCL000911
TTDS00001 Antagonist Aprophen DNC000245
TTDS00001 Antagonist Benactyzine DNC000293
TTDS00001 Antagonist Hyoscine DNC000757
TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
TTDS00001 Antagonist Ipratropium bromide DNC000806
TTDS00001 Agonist Muscarine DNC000970
TTDS00001 Agonist RS 86 DNC001236
TTDS00001 Target Validation TTDS00001
TTDS00002 UniProt ID P11229
TTDS00002 Name Muscarinic acetylcholine receptor M1
TTDS00002 Type of target Successful target
TTDS00002 Synonyms M1 receptor
TTDS00002 Disease Alzheimer's disease
TTDS00002 Disease Bronchospasm (histamine induced)
TTDS00002 Disease Cognitive deficits
TTDS00002 Disease Schizophrenia
TTDS00002 Function The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase.................. so on for TTD 03/04/..14447
$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' test.txt > output.txt
$ cat output.txt
P11229
OK, now I got the results. Many thanks to you.
Actually, I was not writing a star before the input file name and after the output file name.
Now I changed the file names to Uniprot for input and Uniprot2 for output, and I got results.
bash-3.2$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' * Uniprot> Uniprot2*
So many thanks once again. I can't print the whole output here now, because if I run it again it reports an "ambiguous redirect" for Uniprot2, but I did get results.
Answer my questions:
1) What is your input file name? (Is it multiple files or only one file?)
2) What should your output file name be?