Display each unique combination of the first 4 fields along with the concatenated 5th and 6th fields

udhal · October 16, 2015, 6:10pm

Table

ACN|NAME|CITY|CTY|NO1|NO2
115|AKKK|ASH|IND|10|15
115|AKKK|ASH|IND|20|20
115|AKKK|ASH|IND|30|35
115|AKKK|ASH|IND|30|35
112|ABC|FL|USA|15|15
112|ABC|FL|USA|25|20
112|ABC|FL|USA|25|45

I have written a shell script using the cut command and an awk program, but I am getting errors. Please correct them and add a header. We can use echo for the header; is there another way to display it?

Please ignore spaces and treat | as the separator.

sed -n '2,$p' test > test1
for j in `cat test1|cut -d "|" -f1|uniq`
do
a=" "
b=" "
for i in `cat test1`
do
  if [ `echo $i|cut -d "|" -f1` -eq $j ]; then
 no1=`echo $i|grep "$j"|cut -d "|" -f5`
 no1=`echo $i|grep "$j"|cut -d "|" -f6`
 acn=`echo $i|cut -d "|" -f1`
 name=`echo $i|cut -d "|" -f2`
 city=`echo $i|cut -d "|" -f3`
cnt=`echo $i|cut -d "|" -f4`
a=$a$no1"|"
b=$b$no2"|"
fi
done
echo $acn "|" $name "|" $city "|" $cnt "|" $a "|" $b
done
-------
awk '
  {print $1 FS $2 FS $3
             
        }
        {IX=$1 FS $2 FS $3 
         MAX[IX]=MAX[IX] DL[IX] $4
         MIN[IX]=MIN[IX] DL[IX] $5
         DL[IX]="|"
        }
END     {for (m in MAX) print m, MAX[m], MIN[m]}
' FS="|" file

The output below should be displayed.
The first 4 fields should be displayed once per unique combination, and the last two fields should be concatenated (all NO1 values, then all NO2 values):

115|AKKK|ASH|IND|10|20|30|30|15|20|35|35
112|ABC|FL|USA|15|25|25|15|20|45

I would appreciate your reply.

Don_Cragun · October 17, 2015, 2:26am

There are some strange things in your scripts that don't seem to match your stated requirements:

Your shell script works on 6 variables per line; your awk script works on 5 variables per line.
Your shell script gathers the input from field #5 into a variable named no1 and then overwrites that variable with the input gathered from field #6.
Neither script handles field separators consistently and the echo in your shell script is adding unwanted spaces.
You talk about using echo to add a header, but neither script does that and your desired output does not show any header.
I don't understand why you name your awk arrays MIN[] and MAX[] when the next to the last line in your sample input has MIN[IX]=25 < MAX[IX]=20 .
And, I don't understand why you use a file named test as the input file for your shell script and a file named file as the input for your awk script.

Assuming that your input file is named file and that you do want to keep the header that appears on the first line in your input file, you could try something like:

awk '
BEGIN {	FS = OFS = "|"
}
NR == 1 {
	print
	next
}
{	IX = $1 OFS $2 OFS $3 OFS $4
	n1[IX] = n1[IX] OFS $5
	n2[IX] = n2[IX] OFS $6
}
END {	for(IX in n1)
		printf("%s%s%s\n", IX, n1[IX], n2[IX])
}' file

which, with the sample input you provided, produces the output:

ACN|NAME|CITY|CTY|NO1|NO2
115|AKKK|ASH|IND|10|20|30|30|15|20|35|35
112|ABC|FL|USA|15|25|25|15|20|45

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk .
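One caveat worth keeping in mind: the order in which `for(IX in n1)` visits array elements is unspecified, so on some awk implementations the groups may not be printed in the order they first appear in the input. A minimal sketch of a variant that preserves first-seen order is shown here; the function name `group_in_order` and the `order` and `cnt` variables are assumptions introduced for illustration and are not part of the script above.

```shell
# Hypothetical wrapper: prints the header, then one line per unique
# combination of fields 1-4, in the order each combination first appears.
group_in_order() {
    awk '
    BEGIN { FS = OFS = "|" }
    NR == 1 { print; next }            # keep the header line as-is
    {
        IX = $1 OFS $2 OFS $3 OFS $4
        if (!(IX in n1))               # first time this key is seen
            order[++cnt] = IX          # remember its position
        n1[IX] = n1[IX] OFS $5
        n2[IX] = n2[IX] OFS $6
    }
    END {
        for (i = 1; i <= cnt; i++) {   # walk keys in first-seen order
            IX = order[i]
            printf("%s%s%s\n", IX, n1[IX], n2[IX])
        }
    }' "$1"
}
```

It would be called as `group_in_order file`. The extra array costs one entry per distinct key, which is small compared with the concatenated strings already being stored.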

Note that the shell script as it is currently written invokes cut six times for each line in the input file multiplied by the number of different values of the 1st field in the file (and once more), invokes cat one plus the number of different values of the 1st field times, invokes uniq once, and invokes sed once. Invoking awk once to process the entire input file is obviously MUCH faster and more efficient. Note, however, that the shell script could also be rewritten without invoking cat , cut , sed , or grep just using one or two while read loops perhaps with one invocation of sort or uniq .
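As a rough illustration of the while read approach, here is a sketch that uses a single loop and no external utilities. It rests on an assumption that goes beyond what was stated in the question: lines sharing the same first four fields must be adjacent in the input, as they are in the sample (otherwise the data would need to be sorted first). The function name `group_fields` and the variable names are hypothetical.

```shell
# Hypothetical helper: assumes rows with the same first four fields are
# adjacent in the input file named by $1. The header line is skipped.
group_fields() {
    {
        IFS= read -r header            # consume and discard the header
        prev=
        a=
        b=
        while IFS='|' read -r acn name city cty no1 no2; do
            key="$acn|$name|$city|$cty"
            if [ "$key" != "$prev" ]; then
                # Key changed: flush the previous group, if there was one.
                if [ -n "$prev" ]; then
                    printf '%s%s%s\n' "$prev" "$a" "$b"
                fi
                prev=$key
                a=
                b=
            fi
            a="$a|$no1"                # collect field 5 values
            b="$b|$no2"                # collect field 6 values
        done
        # Flush the final group.
        if [ -n "$prev" ]; then
            printf '%s%s%s\n' "$prev" "$a" "$b"
        fi
    } < "$1"
}
```

Calling `group_fields file` reads each line exactly once and lets the shell's own `read` do the field splitting, which is what removes the need for cut, cat, sed, and grep.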

udhal · November 1, 2015, 10:44am

Hi Don,

If there are 2 extra fields, $7 (SUM1) and $8 (SUM2), how can we display each unique combination of the first 4 fields
along with the concatenated 5th and 6th fields and the sums of the 7th and 8th fields?

Table
---
ACN|NAME|CITY|CTY|NO1|NO2|SUM1|SUM2
115|AKKK|ASH|IND|10|15|20|10
115|AKKK|ASH|IND|20|20|40|50
115|AKKK|ASH|IND|30|35|45|35
115|AKKK|ASH|IND|30|35|25|25
112|ABC|FL|USA|15|15|45|25
112|ABC|FL|USA|25|20|45|25
112|ABC|FL|USA|25|45|25|35

Output-----
115|AKKK|ASH|IND|10|20|30|30|15|20|35|35|130|120
112|ABC|FL|USA|15|25|25|15|20|45|115|85

I would appreciate your reply.

Aia · November 1, 2015, 12:50pm

Please, try:

perl -anlF'\|' -e '
     # ignore header
     if ($. != 1) {
         # create a unique id
         $id = join "|", @F[0,1,2,3];
         # structure the information
         for $i (0..3) {   
             push @{$record{$id}{$i}}, $F[4+$i];
         }
     }
     # format and display data structure
     END { for $r (keys %record){
               $sum7 = 0;
               $sum8 = 0;
               # sum all seventh fields
               map {$sum7 += $_} @{$record{$r}{2}};
               # sum all eighth fields
               map {$sum8 += $_} @{$record{$r}{3}};
               # produce the pipe-formatted record
               print join "|", ($r, @{$record{$r}{0}}, @{$record{$r}{1}}, $sum7, $sum8);
           }
     }
' udhal.file

Don_Cragun · November 1, 2015, 1:36pm

Or you could make some trivial changes to the awk script I suggested before:

awk '
BEGIN {	FS = OFS = "|"
}
NR == 1 {
	#print	# Header is no longer desired.
	next
}
{	IX = $1 OFS $2 OFS $3 OFS $4
	n1[IX] = n1[IX] OFS $5
	n2[IX] = n2[IX] OFS $6
	# Accumulate field 7 & 8 totals.
	s1[IX] += $7
	s2[IX] += $8
}
END {	for(IX in n1)
		printf("%s%s%s|%d|%d\n", IX, n1[IX], n2[IX], s1[IX], s2[IX])
}' file2

udhal · November 2, 2015, 12:00pm

Hi Don,
Thanks a lot for your quick reply.