Perl program to find one entry and put it in a different file

Hi

I have a file

name1 xxxxx
name1 xxxxx
name1 yyyyy
name 1 zzzzz
name1 Uniprot Id 1234
name2 sssss
name2 eeeee
name2 bengamine
name2 Uniprot Id 3456
......................and so on

I have to capture only the Uniprot IDs in a separate file, so that the output contains only:

Uniprot ID for name 1
Uniprot ID for name 2............... and so on

Can anybody help me write a Perl program for this?

My other question is about comparing two files:

One file contains:
Uniprot ID 1 (any number)
Uniprot ID 6
Uniprot ID 7
.......and so on, in random order

The other file contains:

Uniprot ID 1
Uniprot ID 2
Uniprot ID 6
Uniprot ID 8
..and so on

I have to compare the two files to find which entries, and how many, are common to both. If an entry is common, print "common" before it; otherwise print "not common".

Please let me know a Perl program for these questions. I would be very thankful for any help.

Thanks
Mani

Possibly something like this:

cat yourfile | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot Id\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}' > outfile

Thanks for the reply. I am receiving the following error:

Unquoted string "perl" may clash with future reserved word at test.pl line 3.
syntax error at test.pl line 3, near "n -e"
Execution of test.pl aborted due to compilation errors.

Can anybody guide me?

Are you using Windows?

When you ask a question, mention the OS and shell.

Hi Kamraj

I am using a Unix OS with a Unix shell.

Can you post your command?

Hi

My program is in a file named test.pl:

#!/usr/bin/perl -w

cat TTDtargets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2;

END

My command and error are:

bash-3.2$ perl test.pl
Unquoted string "perl" may clash with future reserved word at test.pl line 3.
syntax error at test.pl line 3, near "n-e "
Execution of test.pl aborted due to compilation errors.

---------- Post updated at 01:20 AM ---------- Previous update was at 01:19 AM ----------

Here the input file name is TTD Targets and the output file name is TTD2.

Just execute the line below at the command line.

Don't put it in your Perl file; it is a shell command that calls perl, not Perl code, which is why perl reported a syntax error.

 
cat TTDtargets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2

Thanks Kamraj

This time it gave me the file TTD2, but it is completely blank.

Is there something wrong with the output command?

My input file contains:

TTDS00001 Antagonist GSK233705 DCL000823
TTDS00001 Antagonist NVA237 DCL000901
TTDS00001 Antagonist Org-23366 DCL000911
TTDS00001 Antagonist OrM3 DCL000913
TTDS00001 Multitarget Org-23366 DCL000911
TTDS00001 Antagonist Aprophen DNC000245
TTDS00001 Antagonist Benactyzine DNC000293
TTDS00001 Antagonist Hyoscine DNC000757
TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
TTDS00001 Antagonist Ipratropium bromide DNC000806
TTDS00001 Agonist Muscarine DNC000970
TTDS00001 Agonist RS 86 DNC001236
TTDS00001 Target Validation TTDS00001
TTDS00002 UniProt ID P11229
TTDS00002 Name Muscarinic acetylcholine receptor M1
TTDS00002 Type of target Successful target
TTDS00002 Synonyms M1 receptor
TTDS00002 Disease Alzheimer's disease
TTDS00002 Disease Bronchospasm (histamine induced)
TTDS00002 Disease Cognitive deficits
TTDS00002 Disease Schizophrenia
TTDS00002 Function The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase.................. so on for TTD 03/04/..14447

I have to capture only the UniProt ID and put it in the output.

So, as per the above input, the expected output is P11229?

Please use [code] tags while posting data and commands:
 
http://www.unix.com/how-post-unix-linux-forums/167686-forum-video-tutorial-how-use-code-tags.html

---------- Post updated at 12:15 PM ---------- Previous update was at 12:14 PM ----------

Try this.
 
 
$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets
P11229

Thanks for the reply. My input contains many UniProt entries like the one I showed; that was only one part of the file, which contains approximately 1400 UniProt entries.

Code:

Method 1:

cat TTD Targets | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2;

Error: I am receiving a blank TTD2 file.

Method 2:$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets

Error: command not found

Can you post your code with code tags?

Otherwise I can't help, sorry.

Post the entire error message and the command you tried.

Method 1:

bash-3.2$ $ perl -lane 'print $F[-1] if $_=~/UniProt ID/' TTDtargets
bash: $: command not found

Method2:

bash-3.2$ cat Uniprot | perl -n -e 'chomp; if(m/^\s*(\D+)\s*(\d+)\s+Uniprot ID\s+(\d+)/){printf("%s for %s %s\n",$3,$1,$2)}'> TTD2

Why the $ before perl?

 
bash-3.2$ $ 

You are still not posting with [code] tags.

You will be banned if you do not take this seriously (so no more help from this forum).

(B) Use [code] tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)

OK, thanks Kamraj.
I got it now, but the output is not going to a separate file with all the IDs, and I can't copy all the UniProt IDs from the Unix shell.

Also, the video wasn't working on my system. I checked it now on my iPad, so here is the command in code tags. Thanks for the reply.

$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' Uniprot

---------- Post updated at 02:33 AM ---------- Previous update was at 02:31 AM ----------

Also, I was planning to do this in a Perl script, in a Perl file. Is that possible?

Redirect the output to another file:

perl -lane 'print $F[-1] if $_=~/UniProt ID/' input.txt  > output.txt

Read the basics of Perl (search on Google).

The output file is still blank, without any output:

perl -lane 'print $F[-1] if $_=~/UniProt ID/' input.txt  > output.txt

In the code below:

1) your input file name is test.txt
2) your output file name is output.txt

When you say "it's not working", "no output", etc., always post the command you tried.

 
$ cat test.txt 
TTDS00001 Antagonist GSK233705 DCL000823
TTDS00001 Antagonist NVA237 DCL000901
TTDS00001 Antagonist Org-23366 DCL000911
TTDS00001 Antagonist OrM3 DCL000913
TTDS00001 Multitarget Org-23366 DCL000911
TTDS00001 Antagonist Aprophen DNC000245
TTDS00001 Antagonist Benactyzine DNC000293
TTDS00001 Antagonist Hyoscine DNC000757
TTDS00001 Antagonist Hyoscyamine sulfate DNC000758
TTDS00001 Antagonist Ipratropium bromide DNC000806
TTDS00001 Agonist Muscarine DNC000970
TTDS00001 Agonist RS 86 DNC001236
TTDS00001 Target Validation TTDS00001
TTDS00002 UniProt ID P11229
TTDS00002 Name Muscarinic acetylcholine receptor M1
TTDS00002 Type of target Successful target
TTDS00002 Synonyms M1 receptor
TTDS00002 Disease Alzheimer's disease
TTDS00002 Disease Bronchospasm (histamine induced)
TTDS00002 Disease Cognitive deficits
TTDS00002 Disease Schizophrenia
TTDS00002 Function The muscarinic acetylcholine receptor mediates various cellular responses, including inhibition of adenylate cyclase.................. so on for TTD 03/04/..14447

$ perl -lane 'print $F[-1] if $_=~/UniProt ID/' test.txt > output.txt
 
$ cat output.txt 
P11229

OK, now I got the results. Many thanks to you indeed.

Actually, I was not writing a star before the input file name and after the output file name.
Now I changed the file names to Uniprot for input and Uniprot2 for output, and I got results.

bash-3.2$  perl -lane 'print $F[-1] if $_=~/UniProt ID/' * Uniprot> Uniprot2*

So many thanks once again. I can't print the whole output here now, because if I run it again it says "Uniprot2 ambiguous redirect", but I did get results.

Answer my questions.

1) What is your input file name? (Is it multiple files or only one file?)
2) What should your output file name be?