Hi all
I have a file with the following input.
It contains 5 columns:
gene name, drug, drug ID, disease, approved
The same gene is repeated many times, with different data in columns 2, 3, 4 and 5.
I want to arrange the data so that each gene appears only once in column 1 (no repeated entries), while columns 2, 3, 4 and 5 remain as they are,
so the output should look like this:
Kindly suggest a script for this.
awk '!arr[$0]++' inputfile | sort > outputfile
Please try that.
Hi all
Sorry, the output should contain each entry in the first column only once, like this:
---------- Post updated at 10:32 PM ---------- Previous update was at 10:22 PM ----------
Hi Jim
The output still contains repeated entries; it is just sorted alphabetically. Using this code,
it shows
I want the output to be
1,3-Beta-Glucan synthase Anidulafungin DAP000546 Fungal infections Approved
Caspofungin DAP000547 Fungal infections Approved
Cilofungin DCL000331 Candida infections Discontinued
Eraxis/Vfend DCL000522 Beta-D Glucan Synthase Inhibitor, Cyp P450 Mediated Alpha-lanosterol Demethylation Phase III
Micafungin DAP000548 Fungal infections Approved
16S rRNA
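One way to get the layout shown above, offered as a rough sketch rather than a tested solution for the real file: sort the rows so that all rows of a gene are adjacent, then blank column 1 whenever it repeats the previous row's gene. This assumes the file is tab-separated (the sample rows do not show the delimiter, and gene names such as "1,3-Beta-Glucan synthase" contain spaces, so whitespace splitting would not work). The function name blank_repeats is made up for illustration.

```shell
# Sketch only: assumes tab-separated input with the gene name in column 1.
# blank_repeats is a hypothetical helper name, not an existing tool.
blank_repeats() {
    # Sort by gene, then by drug, so rows of the same gene sit together.
    # LC_ALL=C keeps the sort order independent of the current locale.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 "$1" |
    awk -F '\t' -v OFS='\t' '
        $1 == prev { $1 = "" }     # same gene as the previous row: clear column 1
        $1 != ""   { prev = $1 }   # first row of a new gene: remember its name
        { print }
    '
}
```

It would be called as blank_repeats inputfile > outputfile. Rows after the first one for a gene keep an empty first field, so they start with a tab and columns 2 to 5 stay aligned; if the file uses a different delimiter, the -t, -F and OFS values would need to change accordingly.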
---------- Post updated 07-24-12 at 10:18 AM ---------- Previous update was 07-23-12 at 10:32 PM ----------