Need help in Unix shell script

srichakra · July 19, 2007, 12:51pm

Hi
I am new to this forum. I need help with the following:
We receive a pipe-delimited file with
transaction ID,tran_date,Quest_cd,Ans_cd,ans_value.
Same transaction ID can be repeated with different quest_cd and ans_cd.

Basically I need to check that if a particular pair of quest_cd and ans_cd (say Q1234,A1234) is present under a transaction, then another pair of quest_cd and ans_cd (say Q5678,A5678) is also present.

Otherwise I need to remove the isolated row from the input file and write it to a bad file, which should contain the complete defective row with a comment that Q1234A1234 is present but Q5678A5678 is not present, or vice versa.

The remaining good records should be written to a good file.

I hope I am clear in my explanation. As this is VERY urgent I appreciate immediate help
Regards
SRK

matrixmadhan · July 19, 2007, 2:02pm

As this is VERY urgent I appreciate immediate help

This is not appreciated.

Just a few pointers to help you

awk -F'|' '{ if ( ( $3 == "Q1234" ) && ( $4 == "A1234" ) ) { print > "good.file" } else { print > "bad.file" } }' inputfile

If you had provided a snippet of the input file, it would have been easier to help you!

srichakra · July 19, 2007, 2:14pm

Hi
Thanks for the response. Your code looks at one line only for the Q/A pair, but I need to look at two consecutive lines as below. (The file is sorted on Tran Id.)
As per your suggestion I am giving a few lines of my file:
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561

In the above, the two lines with transaction ID 671320070521330000000003 should be written to a good file as they have both Q/A pairs, and the others should be written to a bad file with comments like below:

991320070521330000000003|Q13251|A18741|234567890123456 --> ONLY Driver's Licence is present Credit card number missing
471320070521330000000003|Q13260|A18765|2345678901234561 --> ONLY Credit card number is present Driver's Licence missing

Once again thanks for your help.
Regards
SRK

matrixmadhan · July 19, 2007, 2:27pm

I'm sorry, I don't really understand your requirement.

Why should tran id 671... be written to the good file? Is there any selection condition for that?

I can see a Q/A pair on all 4 lines.

Sorry about that, could you please explain a bit more?

srichakra · July 19, 2007, 2:33pm

Hi
I am sorry for the confusion. The reason why the two lines with transaction id 671320070521330000000003 should be written to a good file is that they have both Q/A pairs, i.e. Q13251|A18741 and Q13260|A18765, while the other transactions have one of these pairs but not both. I should read the file in a while loop and get the lines, but I am not clear on how to code that.
Regards
SRK

matrixmadhan · July 19, 2007, 2:56pm

#! /opt/third-party/bin/perl

open(FILE, "<", "inputfile") || die "Unable to open file 'inputfile' <$!>";

while(<FILE>) {
  chomp;
  my @arr = split(/\|/);
  $fileHash{$arr[0]} .= ( "-" . $arr[1] . "-" . $arr[2] . "-" . $arr[3] );
}

close(FILE);

foreach my $k ( sort keys %fileHash) {
  my $val = $fileHash{$k};
  if ( $val =~ /Q13251/ && $val =~ /A18741/ && $val =~ /Q13260/ && $val =~ /A18765/ ) {
    ### Writing to good-file code
    print "$k $val in good-file\n";
  }
  else {
    ### Writing to bad-file code
  }
}

exit 0

srichakra · July 19, 2007, 3:07pm

Hi
Thanks for your help. But unfortunately it has to be a UNIX Korn shell script, as otherwise production support will not be able to handle any errors.
My sincere apologies for troubling you!
Regards
SRK

matrixmadhan · July 19, 2007, 3:28pm

Does that mean you have to use shell builtins only?

What is the problem with running Perl under ksh?
It's going to run fine.

If you want a ksh-only solution, are other utilities like sed and awk not allowed either?
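
On the builtins-only question: a loop that uses nothing but shell builtins looks feasible as long as the file is grouped by transaction ID, as the OP says it is. A minimal sketch, not a tested production script. It assumes the four-field layout of the sample lines, that Q13251|A18741 is the driver's licence pair and Q13260|A18765 the credit card pair, and that each pair occurs at most once per transaction. The names sample.txt, good.file and bad.file are placeholders.

```shell
#!/bin/sh
# Runs under ksh as well; the loop itself uses only POSIX shell builtins.
# Sample rows copied from the thread (file name is a placeholder).
cat > sample.txt <<'EOF'
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561
EOF

: > good.file
: > bad.file
prev="" dl_row="" cc_row=""

# Decide what happens to the pair rows buffered for one transaction ID.
flush() {
    if [ -n "$dl_row" ] && [ -n "$cc_row" ]; then
        printf '%s\n%s\n' "$dl_row" "$cc_row" >> good.file
    elif [ -n "$dl_row" ]; then
        printf '%s --> %s\n' "$dl_row" \
            "ONLY Driver's Licence is present Credit card number missing" >> bad.file
    elif [ -n "$cc_row" ]; then
        printf '%s --> %s\n' "$cc_row" \
            "ONLY Credit card number is present Driver's Licence missing" >> bad.file
    fi
    dl_row="" cc_row=""
}

while IFS='|' read -r id q a val; do
    # A change of transaction ID closes the previous group.
    if [ "$id" != "$prev" ]; then
        flush
    fi
    prev=$id
    case "$q|$a" in
        'Q13251|A18741') dl_row="$id|$q|$a|$val" ;;
        'Q13260|A18765') cc_row="$id|$q|$a|$val" ;;
        *) printf '%s\n' "$id|$q|$a|$val" >> good.file ;;   # unrelated codes pass through
    esac
done < sample.txt
flush   # the last group has no following line to trigger it
```

Rows with other codes go straight to good.file, so within one transaction the output order can differ slightly from the input order.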

srichakra · July 20, 2007, 4:07pm

Hi
I checked with the Tech Mgr. It has to be in Unix Korn shell as the production people are not well versed in Perl.
Please see if you can help me.
Thanks
Sri

Shell_Life · July 20, 2007, 4:21pm

sort input_file | sed 's/|.*//' | uniq -d > $$All_Trans
egrep -f $$All_Trans input_file
rm -f $$All_Trans

denn · July 20, 2007, 4:52pm

Assuming I understand your requirements, I modified your example file so that
there were good pairs and bad "single lines" in the file, and the script below does what
you're looking for.

file to read:
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561
991320070521330000000003|Q23251|A28741|234567890123456
471320070521330000000003|Q23260|A28765|2345678901234561

---------------
script:

#!/usr/bin/ksh

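# loop over each distinct quest_cd|ans_cd pair found in the input file ($1)
for i in `awk -F\| '$2~/Q/{ print $2 "|" $3 }' "$1" | sort | uniq`
do
# count the rows carrying this pair
COUNT=`grep -c "$i" "$1"`
if [[ $COUNT -eq 2 ]]; then
grep "$i" "$1" >> goodfile
else
grep "$i" "$1" >> badfile
fi
done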

---------------
just need to type in filename you want to sort as argv1
i.e. ./scriptname filename
---------------
2 output files:

more goodfile
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561

more badfile
991320070521330000000003|Q23251|A28741|234567890123456
471320070521330000000003|Q23260|A28765|2345678901234561

---------------
I notice in the example that you added some text to the end of the lines in the bad file, but didn't give any logic behind what is inserted. You should be able to add whatever logic testing you need inside the else statement to accommodate that, hopefully without a lot of problems.
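
For what it's worth, here is one way that comment logic might look if the check is keyed on the transaction ID instead. It is a sketch only, under these assumptions: the four-field layout of the sample, Q13251|A18741 meaning driver's licence and Q13260|A18765 meaning credit card (inferred from the OP's expected output), and placeholder file names sample.txt, good.file and bad.file. The file is read twice by awk: the first pass notes which pairs each transaction has, the second pass routes each row.

```shell
#!/bin/sh
# Sample rows copied from the thread (file name is a placeholder).
cat > sample.txt <<'EOF'
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561
EOF

# Make sure both output files exist even if one ends up empty.
: > good.file
: > bad.file

awk -F'|' \
    -v dl_msg="ONLY Driver's Licence is present Credit card number missing" \
    -v cc_msg="ONLY Credit card number is present Driver's Licence missing" '
# Pass 1 (NR == FNR): remember which pair each transaction ID carries.
NR == FNR {
    if ($2 == "Q13251" && $3 == "A18741") dl[$1] = 1
    if ($2 == "Q13260" && $3 == "A18765") cc[$1] = 1
    next
}
# Pass 2: a pair row whose partner is missing is rejected with a comment;
# every other row is written to the good file unchanged.
{
    is_dl = ($2 == "Q13251" && $3 == "A18741")
    is_cc = ($2 == "Q13260" && $3 == "A18765")
    if (is_dl && !($1 in cc))
        print $0 " --> " dl_msg > "bad.file"
    else if (is_cc && !($1 in dl))
        print $0 " --> " cc_msg > "bad.file"
    else
        print > "good.file"
}' sample.txt sample.txt
```

Because the pairs are collected for the whole file first, this version does not depend on the rows of a transaction being adjacent.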

matrixmadhan · July 21, 2007, 9:18am

I doubt this solution, because it uses the first field and then filters the output, whereas the OP asked for something different: specific Q/A pairs must be present.

I scanned through this solution and I don't think it would work!

Shell_Life · July 23, 2007, 10:29am

Matrix, as we all agree, the OP was not and is not really clear in his specification,
thus we can have several different interpretations.

Also, the OP did not even reply yet to let us know if our solution works for him.

In any event, if the OP requires specific Q/A pairs, here is one possible solution:

egrep 'Q13251\|A18741' input_file > TempStrs
egrep 'Q13260\|A18765' input_file >> TempStrs
sort TempStrs | sed 's/|.*//' | uniq -d > TempKeys
egrep -f TempKeys TempStrs
rm -f TempKeys TempStrs
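
The pipeline above prints only the rows of transactions that have both pairs. A sketch of how the same temp-file approach might also produce the rejects, under a few assumptions of mine: `grep -E` stands in for `egrep`, the keys are anchored so they match the first field only, duplicate rows for the same pair are collapsed before counting, and good.file and bad.file are placeholder names. Rows with other Q/A codes are ignored here, as in the original pipeline.

```shell
#!/bin/sh
# Sample rows copied from the thread.
cat > input_file <<'EOF'
991320070521330000000003|Q13251|A18741|234567890123456
671320070521330000000003|Q13251|A18741|456787654328977
671320070521330000000003|Q13260|A18765|4567890987654321234
471320070521330000000003|Q13260|A18765|2345678901234561
EOF

# Rows carrying either Q/A pair (in an ERE, \| is a literal pipe).
grep -E 'Q13251\|A18741|Q13260\|A18765' input_file > TempStrs

# Transaction IDs that have both distinct pairs, turned into anchored
# patterns of the form ^<id>| so they match the first field only.
cut -d'|' -f1-3 TempStrs | sort -u | cut -d'|' -f1 | uniq -d |
    sed 's/^/^/; s/$/|/' > TempKeys

grep -f TempKeys TempStrs > good.file       # both pairs present
grep -v -f TempKeys TempStrs > bad.file     # partner pair missing
rm -f TempKeys TempStrs
```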

benefactr · August 1, 2007, 11:20am

I'm assuming columns 2 and 3 are codes, and where a row goes depends on them. Simple code would be as below, using ifs to test them and write each row out to its respective file with a descriptor.

awk '
BEGIN{
FS="|"
}

{
# route each row by its question/answer codes
if ($2 == "Q13260" && $3 == "A18741")
printf("%s >> Good Transaction\n",$0) >> "/u/data/goodfile.log"

if ($2 == "<whatever>" && $3 == "<whatever>")
printf("%s >> BADTransaction\n",$0) >> "/u/data/badfile.log"
}' "$inputfile"