kshuser
October 26, 2009, 12:30pm
1
I have a .DAT file like the one below:
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227
I would like to combine duplicate records into a single record, with the extra fields appended at the end of that record (see the example below). In the example file above, the first field is the unique field. The rules and the output I want are as follows:
If duplicate records exist (in this case 288506890 has 3 records), check only the records that have ODL-SP or ODL-CH in the benefit-type position:
if ODL-SP exists, get the amount at positions 34-40
if ODL-CH exists, get the amount at positions 34-40
then append those amounts to the first record for that number (288506890), as shown below. If no duplicate record exists, just write the line as is.
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
ben_type=`echo $line|cut -c28-33`
(you get ODL-SP spouse, ODL-CH child)
amount=`echo $line|cut -c34-40`
(you get spouse=5921234, child=5921235)
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
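The column positions quoted above can be checked against one of the sample records before any merging logic is written. This is only a small sketch: the variable name id and the choice of columns 2-10 for it are assumptions based on where 288506890 sits in the record; the other two cut ranges are the ones from the post.

```shell
# One record from the sample file (fixed-width, no delimiters).
line='22885068900000652 B86860003ODL-SP592123420081227'

# cut -c counts columns from 1 and both ends of the range are inclusive.
id=$(printf '%s\n' "$line" | cut -c2-10)          # assumed id columns
ben_type=$(printf '%s\n' "$line" | cut -c28-33)   # benefit type
amount=$(printf '%s\n' "$line" | cut -c34-40)     # amount

printf '%s %s %s\n' "$id" "$ben_type" "$amount"
```

For this record the three fields come out as 288506890, ODL-SP and 5921234.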
Can someone please help me with a solution using Unix ksh scripting? Thank you.
binlib
October 26, 2009, 2:07pm
2
With a name like kshuser, and since you asked for a ksh-only solution, I assume you use ksh93.
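while read x; do
    k=${x:0:17}                     # key: first 17 characters of the record
    if [ "$k" = "$ok" ]; then
        p="$p ${x:33:7}"            # same key as before: append the 7-character amount
    else
        [ -n "$p" ] && echo "$p"    # key changed: print the finished record
        ok=$k
        p=$x
    fi
done
echo "$p"                           # print the last record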
If you can add a blank line at the end of the file, e.g. with (cat file; echo), you can omit the last echo outside the loop.
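In case the shell turns out to be ksh88 or some other shell without the ${x:0:17} substring expansion, the same idea can be sketched with cut. Treat this as a sketch under assumptions, not a drop-in answer: merge_records is a made-up name, and like the loop above it assumes that duplicate records are adjacent and that the first 17 characters form the key.

```shell
# Merge consecutive records that share the same 17-character key.
# Reads records on stdin and writes the merged records to stdout.
merge_records() {
    prev_key=''
    out=''
    # The "|| [ -n ... ]" part also picks up a last line with no newline.
    while IFS= read -r rec || [ -n "$rec" ]; do
        [ -n "$rec" ] || continue                 # skip blank lines
        key=$(printf '%s\n' "$rec" | cut -c1-17)
        if [ "$key" = "$prev_key" ]; then
            # Same key as the buffered record: append only the amount.
            out="$out $(printf '%s\n' "$rec" | cut -c34-40)"
        else
            # New key: flush the buffered record, then start a new one.
            if [ -n "$out" ]; then printf '%s\n' "$out"; fi
            prev_key=$key
            out=$rec
        fi
    done
    # Flush the final buffered record.
    if [ -n "$out" ]; then printf '%s\n' "$out"; fi
}

merged=$(merge_records <<'EOF'
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227
EOF
)
printf '%s\n' "$merged"
```

With a real file it would be called as merge_records < file > outfile. Running cut once or twice per record is slow on large files, so this only makes sense where ksh93 is not available.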
I am kind of new to KSH scripting.
FILE1.DAT has the following records.
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-Sp592123420081227
22885068900000652-B94030001ODL-Ch592123520081227
I wrote the script below, and it produces the output file mondaytest.txt:
rec_cnt=1
while read line
do
    no=`echo $line|cut -c2-10`
    ben_type=`echo $line|cut -c28-33`
    amount=`echo $line|cut -c34-40`
    if [[ $rec_cnt -eq 1 ]]
    then
        echo $line >> mondaytest.txt
        prior_no=$no
        prev_line=$line
    else
        if [[ $no -eq $prior_no ]]
        then
            if [[ $ben_type = "ODL-SP" ]]
            then
                spouse_amt=$amount
                prev_line="$prev_line $spouse_amt"
            elif [[ $ben_type = "ODL-CH" ]]
            then
                child_amt=$amount
                #prev_line="$prev_line $spouse_amt"
            else
                echo 'invalid ben_type'
            fi
            #echo $prev_line $spouse_amt $child_amt>> mondaytest.txt
            echo 'Insert_1' $prev_line $child_amt >> mondaytest.txt
        else
            echo 'Insert_2' $line >> mondaytest.txt
            prev_line=$line
        fi
        spouse_amt=""
        child_amt=""
    fi
    (( rec_cnt=rec_cnt + 1 ))
    prior_no=$no
done <FILE.DAT
Output file mondaytest.txt:
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
I want the output file to have only 4 records, like this:
23666483030000653-B94030001ODL-Ch000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
Can you please correct my code so that it produces the expected result above?
E.g. like so?
#!/bin/ksh
echo|cat infile -|while read line; do
    case ${line:27:6} in
        ODL-SP|ODL-CH)
            prev+=" ${line:33:7}" ;;
        *)  [[ -n $prev ]] && print $prev
            prev=$line ;;
    esac
done > outfile
But when I ran your code, it generated the output file with no changes compared to the input file.
>echo|cat FILE2.DAT -|while read line
> do
> case {$line:27:6} in
> ODL-SP|ODL-CH)
> prev+=" ${line:33:7}" ;;
> *) [[ -n $prev ]] && print $prev
> prev=$line ;;
> esac
> done > OUT.txt
OUT.txt is the same as the input file FILE2.DAT:
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227
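A likely reason for the unchanged output is visible in the run above: the case line was typed as {$line:27:6}, with the brace before the dollar sign, whereas the posted script has ${line:27:6}. The sketch below shows what each form produces for one of the ODL-SP records. It uses cut for the intended value so that it runs in any POSIX shell; the variable names wrong, right and verdict are only for illustration.

```shell
line='22885068900000652 B86860003ODL-SP592123420081227'

# As typed in the failing run: the shell expands $line in full and keeps
# "{" and ":27:6}" as literal text around it.
wrong="{$line:27:6}"
printf 'case word as typed:    %s\n' "$wrong"

# What ${line:27:6} is meant to yield (offset 27, length 6), reproduced
# here with cut so the sketch does not depend on ksh93.
right=$(printf '%s\n' "$line" | cut -c28-33)
printf 'case word as intended: %s\n' "$right"

# The typed word can never match ODL-SP or ODL-CH, so every record falls
# through to the default branch and is printed unchanged.
case $wrong in
    ODL-SP|ODL-CH) verdict='appended to previous record' ;;
    *)             verdict='printed as a new record' ;;
esac
printf 'typed version: %s\n' "$verdict"
```

If that is the cause, retyping the line as case ${line:27:6} in should give the merged record.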
What about:
# awk 'NF{a[substr($0,0,9)]=(a[substr($0,0,9)])?a[substr($0,0,9)] FS substr($0,34,7):$0}END{for(i in a)print a[i]}' file
22885068900000652 B86860003OLFXXX592123320081227 5921234 5921235
23797049900000654-E71060001OLFXXX000000220081227
23666483030000653-B94030001OLFXXX000000120081227
23699281320000655 E71060002OLFXXX000000320081227
In your awk code below, where is the input file being passed and where is the output file? I see "a" in your code; is that the input file name? I also see the word "file"; is that the input or the output file? What is the command doing?
thanks
# awk 'NF{a[substr($0,0,9)]=(a[substr($0,0,9)])?a[substr($0,0,9)] FS substr($0,34,7):$0}END{for(i in a)print a[i]}' file
awk 'NF{a[substr($0,0,9)]=(a[substr($0,0,9)])?a[substr($0,0,9)] FS substr($0,34,7):$0}END{for(i in a)print a[i]}' Input_file > Output_file
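To spell out the pieces: a is an awk array held in memory, not a file. The name after the closing quote is the input file, and the output goes to the terminal unless it is redirected with >. Since for (i in a) does not promise any particular order, the sketch below is a longer variant that also keeps the records in input order. The names merge_with_awk, merged and order, and the use of columns 2-10 as the key, are assumptions made for this illustration.

```shell
# Merge records by id (columns 2-10); the input file is passed as an argument.
merge_with_awk() {
    awk '
        NF {                                  # skip empty lines
            key = substr($0, 2, 9)            # id, columns 2-10
            if (key in merged) {
                # Seen before: append only the amount, columns 34-40.
                merged[key] = merged[key] " " substr($0, 34, 7)
            } else {
                merged[key] = $0              # first record for this id
                order[++n] = key              # remember first-seen order
            }
        }
        END {
            for (i = 1; i <= n; i++) print merged[order[i]]
        }
    ' "$@"
}

# Demonstration with a throwaway input file and output file.
workdir=$(mktemp -d)
cat > "$workdir/Input_file" <<'EOF'
23666483030000653-B94030001OLFXXX000000120081227
23797049900000654-E71060001OLFXXX000000220081227
23699281320000655 E71060002OLFXXX000000320081227
22885068900000652 B86860003OLFXXX592123320081227
22885068900000652 B86860003ODL-SP592123420081227
22885068900000652-B94030001ODL-CH592123520081227
EOF
merge_with_awk "$workdir/Input_file" > "$workdir/Output_file"
result=$(cat "$workdir/Output_file")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Unlike the shell loops earlier in the thread, this does not need duplicate records to be adjacent, at the cost of holding every merged record in memory until the end.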