Hello everybody,
I want to process a data file with awk. I am new to awk and I need your help. The data file has the following fields and contains thousands of records.
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.73 0.03 frp 21:10 28
0.64 0.04 Fre 42:86 63
0.47 0.08 nie 25:76 32
0.37 0.01 veb 00:71 26
0.63 0.48 Fre 42:86 55
0.65 0.32 frp 21:10 19
0.53 0.56 nie 25:76 52
0.32 0.43 veb 00:71 18
Now I want to search this data for every specific record (e.g. Fre in Col3) and then select the record with the minimum Col5 value for that specific record. Consider Col4 as a primary key (unique value for every record).
The output should be like this:
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.65 0.32 frp 21:10 19
0.47 0.08 nie 25:76 32
0.32 0.43 veb 00:71 18
Thank you so much for your cooperation.
Regards,
Ubee
kshji
July 14, 2009, 11:24am
2
This example gives you the tools to build a dynamic solution for your needs.
# search only some lines
awk -v searchfld=3 -v searchvalue="Fre" -v minfld=5 -v keyfld=4 '
BEGIN {
minvalue=9999999999999999
}
( $minfld < minvalue ) && ( $searchfld == searchvalue ) {
minkey=$keyfld
minvalue=$minfld
}
END {
print "minkey:",minkey," minvalue:",minvalue
} ' inputfile
And a solution for your needs:
awk -v keyfld=3 -v minfld=5 '
BEGIN {
minvalue=9999999999999999
}
!($keyfld in Min) { Min[$keyfld]=minvalue }
$minfld < Min[$keyfld] {
lines[$keyfld]=$0
Min[$keyfld]=$minfld
}
END {
for (id in lines) {
print lines[id]
}
} ' inputfile
Thanks a lot, kshji.
It is working perfectly!
Well done.
Try:
awk ' NR>1 {if(arr[$3]=="") arr[$3]=$5; if(arr[$3] >= $5) { arr[$3]=$5; sarr[$3]=$0; }} END { for (i in sarr) { print sarr[i]; } }' filename
Yes Dennis, that also works, but in this case we are treating Col3 as the key field.
Thank you for your response.
---------- Post updated at 11:02 PM ---------- Previous update was at 09:43 PM ----------
OK, after finding the specific records with the minimum Col5 value, the output looks like this:
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.65 0.32 frp 21:10 19
0.47 0.08 nie 25:76 32
0.32 0.43 veb 00:71 18
Now, how can I find the record with the maximum Col5 value, so that the output is:
0.47 0.08 nie 25:76 32
I appreciate your help.
Regards,
Ubee
kshji
July 15, 2009, 5:02am
6
# print the line with the max value of a given field (assumes the input has no header line)
awk -v fld=5 '
BEGIN {
value=-9999999999999999
}
$fld > value {
line=$0
value=$fld
}
END {
print line
} ' inputfile
awk 'NR > 1 && $NF > a{a=$NF;line=$0}END{print line}' file
while(<DATA>){
chomp;
if($.==1){
print $_,"\n";
next;
}
my @tmp =split;
my $key=$tmp[2]." ".$tmp[3];
if(not exists $hash{$key}){
$hash{$key}->{val}=$_;
$hash{$key}->{min}=$tmp[4];
}
else{
if($tmp[4] < $hash{$key}->{min}){
$hash{$key}->{val}=$_;
$hash{$key}->{min}=$tmp[4];
}
}
}
foreach my $key(keys %hash){
print $hash{$key}->{val},"\n";
}
__DATA__
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.73 0.03 frp 21:10 28
0.64 0.04 Fre 42:86 63
0.47 0.08 nie 25:76 32
0.37 0.01 veb 00:71 26
0.63 0.48 Fre 42:86 55
0.65 0.32 frp 21:10 19
0.53 0.56 nie 25:76 52
0.32 0.43 veb 00:71 18
Thank you so much for your help
---------- Post updated 07-16-09 at 05:13 PM ---------- Previous update was 07-15-09 at 11:02 PM ----------
OK, as I mentioned in my first post, suppose we have the following dataset:
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.73 0.03 frp 21:10 28
0.64 0.04 Fre 42:86 63
0.47 0.08 nie 25:76 32
0.37 0.01 veb 00:71 26
0.63 0.48 Fre 42:86 55
0.65 0.32 frp 21:10 19
0.53 0.56 nie 25:76 52
0.32 0.43 veb 00:71 18
After processing it and selecting the records with the minimum Col5 value,
we get the following dataset:
Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.65 0.32 frp 21:10 19
0.47 0.08 nie 25:76 32
0.32 0.43 veb 00:71 18
Now, after getting this dataset, I want to select the record with the maximum Col5 value.
So the final record should be
0.47 0.08 nie 25:76 32
Is it possible to do this in one program?
In the previous posts these are separate programs. I mean, how can we merge the two programs together?
Thanks a lot,
Regards,
Ubee
kshji
July 16, 2009, 2:11pm
10
You can put both into the same program, but why?
You have two nice, small, working tools.
Run the first one and redirect its output to newfile (> newfile).
Then run the second one using newfile as input.
Or do you want to write a bigger ruleset that is harder to understand? Why?
A better solution is to make a "mother script" that handles input and output using those little solutions. That is the main idea of *nix (for me).
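A minimal sketch of such a "mother script", assuming whitespace-separated fields and a header line in the input. The function names min_per_key and max_overall and the inlined sample data are only illustrative and do not come from the posts above:

```shell
#!/bin/sh
# Sample data from the thread, inlined so the sketch is self-contained.
data='Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.73 0.03 frp 21:10 28
0.64 0.04 Fre 42:86 63
0.47 0.08 nie 25:76 32
0.37 0.01 veb 00:71 26
0.63 0.48 Fre 42:86 55
0.65 0.32 frp 21:10 19
0.53 0.56 nie 25:76 52
0.32 0.43 veb 00:71 18'

# Step 1 (hypothetical helper): for each Col3 value keep the line with the
# smallest Col5. The header line is skipped, so the output has no header.
min_per_key() {
    awk 'NR > 1 && (!($3 in min) || $5 + 0 < min[$3]) { min[$3] = $5 + 0; line[$3] = $0 }
         END { for (k in line) print line[k] }'
}

# Step 2 (hypothetical helper): print the line with the largest Col5.
# Expects input without a header line, i.e. the output of step 1.
max_overall() {
    awk 'NR == 1 || $5 + 0 > max { max = $5 + 0; line = $0 }
         END { if (NR) print line }'
}

# The "mother script" part: chain the two small tools with a pipe.
result=$(printf '%s\n' "$data" | min_per_key | max_overall)
echo "$result"
```

With a real file, the last step would read it instead, for example min_per_key < inputfile | max_overall. The pipe replaces the intermediate newfile, and each tool stays small and can still be run on its own.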
Hi,
Can anyone please tell me what is wrong with the following code?
BEGIN {FS="\t"; keyfield=4; minfield=5; minvalue=999999999999 }
$0!="" {
min[$keyfield]<1 { min[$keyfield]=minvalue }
$minfield<min[$keyfield] { lines[$keyfield]=$0; min[$keyfield]=$minfield }
}
END {
for (id in lines) {
print lines[id]
}
}
When I run this program, I get two syntax errors:
min[$keyfield]<1 { min[$keyfield]=minvalue }
^ syntax error
$minfield<min[$keyfield] { lines[$keyfield]=$0; min[$keyfield]=$minfield }
^ syntax error
The syntax errors are at the two opening braces. Please, I need your help.
Thanks,
Regards,
Ubee
Ygor
July 16, 2009, 10:57pm
12
You can't nest pattern/action pairs. Try...
BEGIN {FS="\t"; keyfield=4; minfield=5; minvalue=999999999999 }
$0=="" {next}
min[$keyfield]<1 { min[$keyfield]=minvalue }
$minfield<min[$keyfield] { lines[$keyfield]=$0; min[$keyfield]=$minfield }
END {
for (id in lines) {
print lines[id]
}
}
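For comparison, the same logic can also stay under one guard by using if statements inside a single action instead of nested patterns. This is only a sketch under a few assumptions: the input has a header line, fields are split on default whitespace rather than FS="\t", the min[...]<1 test is swapped for an "in" check, and the sample data is inlined for illustration:

```shell
#!/bin/sh
# Sample data from the thread, inlined so the sketch is self-contained.
data='Col1 Col2 Col3 Col4 Col5
0.85 0.07 Fre 42:86 25
0.73 0.03 frp 21:10 28
0.64 0.04 Fre 42:86 63
0.47 0.08 nie 25:76 32
0.37 0.01 veb 00:71 26
0.63 0.48 Fre 42:86 55
0.65 0.32 frp 21:10 19
0.53 0.56 nie 25:76 52
0.32 0.43 veb 00:71 18'

out=$(printf '%s\n' "$data" | awk -v keyfield=4 -v minfield=5 '
# Skip the header and empty lines; everything else happens in ONE action.
NR > 1 && $0 != "" {
    k = $keyfield
    v = $minfield + 0                  # force a numeric comparison
    if (!(k in min) || v < min[k]) {   # first time this key is seen, or a new minimum
        min[k] = v
        lines[k] = $0
    }
}
END { for (id in lines) print lines[id] }   # output order is unspecified
')
echo "$out"
```

An "in" check does not depend on the data values, so it also works when a minimum is smaller than 1.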
awk ' NR>1 {if(a[$3]=="") a[$3]=$5;if(a[$3]>=$5) {a[$3]=$5; sa[$3]=$0;}}NR>1&&$NF>x{x=$NF;line=$0}END{for(i in sa){print sa[i]};printf "\n%s",line}' file
0.85 0.07 Fre 42:86 25
0.47 0.08 nie 25:76 32
0.65 0.32 frp 21:10 19
0.32 0.43 veb 00:71 18
0.47 0.08 nie 25:76 32
Please read the Forum Rules.