Hi,
I am new to awk/nawk and need help.
I want to merge the rows that share the same emplid into a single row in the following file. The real file will have around 50k rows.
Here is my input file:
id|emplid|firstname|dep|lastname
1|001234|test|1001|1
2|002345|test|1032|2
3|001234|test|1020|1
4|123456|test|1044|4
If we look closely, lines 1 and 3 have the same emplid (001234) but different dep values.
I want to merge lines 1 and 3 into one line, like this:
id|emplid|firstname|dep|lastname
1|001234|test|1001 1020|1
2|002345|test|1032|2
3|123456|test|1044|4
Essentially, I am trying to combine the rows whose emplid matches and collect their dep values into a single field.
Could you please help me do this in awk/nawk?
Please don't hesitate to ask if it needs more explanation.
Thanks in advance for your help.
Regards,
There are plenty of similar threads - please use the 'Search' function first.
Please do come back if you have a specific implementation question.
Hi, Perl seems a little easier for this problem.
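use strict;
use warnings;
# %hash maps each emplid to { SEQ => first-seen position, VAL => merged row }.
my ($n,%hash)=(1);
open FH,"<","a.txt" or die "Cannot open a.txt: $!";
# Pass the header line through so it is not counted as a record.
my $header=<FH>;
print $header if defined $header;
while(<FH>){
    chomp;
    my @tmp=split("[|]",$_);
    # First row for this emplid: record its position and keep the whole row.
    if ( not exists $hash{$tmp[1]}){
        $hash{$tmp[1]}->{SEQ}=$n;
        $hash{$tmp[1]}->{VAL}=$_;
        $n++;
    }
    # Repeated emplid: append this row's dep to the stored row.
    else{
        my @t=split("[|]",$hash{$tmp[1]}->{VAL});
        my $pre=join "|",@t[0..2];
        my $mid=$t[3]." ".$tmp[3];
        my $post=$t[4];
        $hash{$tmp[1]}->{VAL}=$pre."|".$mid."|".$post;
    }
}
close FH;
# Print rows in first-seen order, replacing the original id with SEQ.
for my $key (sort { $hash{$a}->{SEQ} <=> $hash{$b}->{SEQ} } keys %hash){
    print $hash{$key}->{SEQ}."|".substr($hash{$key}->{VAL},index($hash{$key}->{VAL},"|")+1),"\n";
}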
Thank you for your valuable input. I'll try the scripts. Regards
Hello Summer Cherry,
It's really a great help! I appreciate it from my heart. You are great!
-Kumar
ripat
March 3, 2009, 2:22pm
I don't know about performance, but an awk solution seems terser than Perl:
awk -F'|' '{a[$2] = a[$2] " " $4} END {for (i in a) {nb += 1; print nb, i, a[i]}}' file
Although the id numbering may not be what you expect, because `for (i in a)` visits the keys in no guaranteed order.
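If the order and numbering matter, here is a rough single-pass sketch that remembers the order in which each emplid first appears. It assumes the first line is a header and that firstname and lastname can be taken from the first row of each emplid. The function name merge_dep and the array names order, fname, lname and dep are my own, not from the posts above.

```shell
# merge_dep FILE: hypothetical helper name, used only for this sketch.
merge_dep() {
    awk -F'|' -v OFS='|' '
        NR == 1 { print; next }            # pass the header through unchanged
        !($2 in dep) {                     # first row for this emplid
            order[++n] = $2                # remember first-seen order
            fname[$2] = $3                 # keep firstname from the first row
            lname[$2] = $5                 # keep lastname from the first row
            dep[$2] = $4
            next
        }
        { dep[$2] = dep[$2] " " $4 }       # repeated emplid: append dep only
        END {
            for (i = 1; i <= n; i++) {     # renumber ids in first-seen order
                k = order[i]
                print i, k, fname[k], dep[k], lname[k]
            }
        }
    ' "$1"
}
```

Call it as merge_dep file. Everything is held in memory until END, which should be fine for around 50k rows.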
Another approach:
awk -F "|" '
NR==FNR{a[$2]=a[$2]?a[$2]" "$4:$4;next}
FNR==1{print;next}
a[$2]{$4=a[$2];a[$2]="";$1=++c;print}
' OFS="|" file file
Regards
Hi,
The Perl code given above works fine. Can anyone please explain the following lines from it?
if ( not exists $hash{$tmp[1]}){
for my $key (sort { $hash{$a}->{SEQ} <=> $hash{$b}->{SEQ} } keys %hash){
print $hash{$key}->{SEQ}."|".substr($hash{$key}->{VAL},index($hash{$key}->{VAL},"|")+1),"\n";
Appreciate your help,
Regards
-Kumar