Change in Input feed based on condition file

Sorry, guys, for not explaining this clearly in one of my earlier posts.

I am now posting my requirement along with the input file and the desired output file.

In the input file below:

Transaction code is at position 31:40.
Business code is at position 318:321.

TSCM00000005837               CM0002N  -0000000001906.072010-12-10XML MM 201002081000000   YORK
 003007XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008112               CM0002N  -0000000001906.072010-12-10XML MM 201002081000000   YORK
 007777XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM80000005282               CM0019NM +0000000002254.982010-12-10XML MM 201002081000001   YORK
 000440XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000000215               CM0019NM +0000400002254.982010-12-10XML MM 201002081000001   YORK
 000500XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000000215               CN0001N  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000292XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM80000005282               CN0001N  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 007843XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008107               CN0001N  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000012XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008093               CN0001P  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000379XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000002646               CN0001P  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 007847XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000002646               CN0001P  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 003400XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045

For records 1-10, the combinations of transaction code and business code are as follows:

CM0002N  -    3007
CM0002N  -    7777
CM0019NM +    0440
CM0019NM +    0500
CN0001N  -    0292
CN0001N  -    7843
CN0001N  -    0012
CN0001P  -    0379
CN0001P  -    7847
CN0001P  -    3400
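
A list like the one above can be produced directly from the file by position. This is only a minimal sketch: it assumes each record is a single physical line and that positions are 1-based (31:40 is 10 characters, 318:321 is 4 characters). The function name list_codes is made up for illustration.

```shell
# Sketch only: assumes one record per physical line and 1-based columns.
# list_codes is a hypothetical helper name, not part of any existing tool.
list_codes() {
  # substr(s, start, length): columns 31-40 hold the transaction code,
  # columns 318-321 hold the business code.
  awk '{ print substr($0, 31, 10) "|" substr($0, 318, 4) }' "$1"
}
```

Called as list_codes input.txt, it prints one "transaction code|business code" pair per record.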

We also have a condition file, a comma-separated file with 3 values per record.

The first value is the business code, the second is the transaction code, and the third is the derived transaction code. The file is given below:

3007,CM0002N  -,CMCAMTN  -
3037,CM0002N  -,CMCAMTN  -
3059,CM0002N  -,CMCAMTN  -
3007,CM0002N  +,CMCAMTN  +
3037,CM0002N  +,CMCAMTN  +
3059,CM0002N  +,CMCAMTN  +
3007,CM0002P  -,CMCAMTP  -
3037,CM0002P  -,CMCAMTP  -
3059,CM0002P  -,CMCAMTP  -
3007,CM0002P  +,CMCAMTP  +
3037,CM0002P  +,CMCAMTP  +
3059,CM0002P  +,CMCAMTP  +
0440,CM0019N  -,CMDDPTN  -
0440,CM0019N  +,CMDDPTN  +
0440,CM0019NM -,CMDDPTNM -
0440,CM0019NM +,CMDDPTNM +
0292,CN0001N  -,CNQCSHN  -
0379,CN0001N  -,CNQCSHN  -
1038,CN0001N  -,CNQCSHN  -
7810,CN0001N  -,CNQCSHN  -
7811,CN0001N  -,CNQCSHN  -
7812,CN0001N  -,CNQCSHN  -
7842,CN0001N  -,CNQCSHN  -
7843,CN0001N  -,CNQCSHN  -
0292,CN0001N  +,CNQCSHN  +
0379,CN0001N  +,CNQCSHN  +
1038,CN0001N  +,CNQCSHN  +
7810,CN0001N  +,CNQCSHN  +
7811,CN0001N  +,CNQCSHN  +
7812,CN0001N  +,CNQCSHN  +
7842,CN0001N  +,CNQCSHN  +
7843,CN0001N  +,CNQCSHN  +
0292,CN0001P  -,CNQCSHP  -
0379,CN0001P  -,CNQCSHP  -
1038,CN0001P  -,CNQCSHP  -
7810,CN0001P  -,CNQCSHP  -
7811,CN0001P  -,CNQCSHP  -
7812,CN0001P  -,CNQCSHP  -
7842,CN0001P  -,CNQCSHP  -
7843,CN0001P  -,CNQCSHP  -
7719,CN0001P  -,CNCSHBP  -
7800,CN0001P  -,CNCSHBP  -
7801,CN0001P  -,CNCSHBP  -
7802,CN0001P  -,CNCSHBP  -
7830,CN0001P  -,CNCSHBP  -
7831,CN0001P  -,CNCSHBP  -
7846,CN0001P  -,CNCSHBP  -
7847,CN0001P  -,CNCSHBP  -
 

The requirement is:

Check the business code and transaction code of each record in the input file.

Compare these values with the conditions in the condition file. If the business code (position 318:321) and the transaction code (position 31:40) match the first and second values of any record in the condition file, replace the transaction code with the third value of that record.

After running the command/script, the output file should look like this (the transaction code at position 31:40 is changed wherever a condition is met):

TSCM00000005837               CMCAMTN  -0000000001906.072010-12-10XML MM 201002081000000   YORK
 003007XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008112               CM0002N  -0000000001906.072010-12-10XML MM 201002081000000   YORK
 007777XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM80000005282               CMDDPTNM +0000000002254.982010-12-10XML MM 201002081000001   YORK
 000440XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000000215               CM0019NM +0000400002254.982010-12-10XML MM 201002081000001   YORK
 000500XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000000215               CNQCSHN  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000292XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM80000005282               CNQCSHN  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 007843XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008107               CN0001N  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000012XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000008093               CNQCSHP  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 000379XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000002646               CNCSHBP  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 007847XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045
TSCM00000002646               CN0001P  -0000400002254.982010-12-10XML MM 201002081000001   YORK
 003400XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045

My input file has around 1 million records, and each of them needs to be checked against all the conditions in the condition file.

TSCM00000005837               CMCAMTN  -0000000001906.072010-12-10XML MM 201002081000000   YORK
 003007XML MM 2010000000000*                   00000000*
                                                                                  2010-12-10-00.00.00123456789123450 GB 5045

In your input file this record spans three lines, but in your description 3007 is at position 318:321. So I guess each record should be on one line.

A single record takes 3 lines. All records start with TSCM. There are 10 records in total.

# first get your condition mapping into a hash
open FH,"<yourconditionfile";
while(<FH>){
  chomp;
  my @tmp = split(",",$_);
  $hash{$tmp[0]."-".$tmp[1]}=$tmp[2];
}
close FH;

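# then modify your first file; first let's separate it into one record per 3 lines
local $/="\n\n";
open FH,"awk '{print;if(NR %3 == 0) print \"\"}' youroriginalfile|";
while(<FH>){
  # $1 = transaction code (up to the sign), $2 = 4-digit business code on the second line
  if(/\S*\s*(.*?[-+]).*?^\s*\d{2}(\d{4})/sm){
    my ($tcode,$bcode) = ($1,$2);
    my $key = $bcode."-".$tcode;
    if(exists $hash{$key}){
      # replace the transaction code with the derived one; \Q..\E escapes the "+" sign
      s/\Q$tcode\E/$hash{$key}/;
    }
  }
  # print every record, changed or not, without the extra separator newline
  s/\n$//;
  print;
}
close FH;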

Sorry for my earlier message. In the Unix file, 1 record is on 1 line only. There is no newline character within a single record; it only takes 3 lines when displayed.

Also, the above solution seems to be in Perl. Can't it be done using sed or awk?

awk '
NR==FNR {split($1,a,","); split($2,b,",");c[a[2] FS "00" a[1] "XML"]=b[2];next} 
(c[$2 FS $7]){gsub($2,c[$2 FS $7])}1
' condition.txt input.txt

Just for fun (with the original "3 lines" requirement)...

# cat gen.pl
#!/usr/bin/perl

use strict;

my $meta=shift;
my $dat=shift;
my $destination=shift;

open (META,"<",$meta ) || die "can't open file $meta \n";

my (%metadata,@line);
while (<META>) {
   /^(\d+),(\S+)\s+(\S),(\S+)\s+(\S)/;
   $metadata{$1.$2.$3} = $4.":".$5;
   }
close(META);


open (DAT,"<",$dat ) || die "can't open file $dat \n";
open (DEST,">",$destination ) || die "can't open file $destination \n";

while (<DAT>) {
   /^\S+\s+(\S+)\s+(\S)\S+\s+\S+\s+\S+\s+\S+$/;
   if ($1 && $2 ) { 
      $line[0]=$_;
      my ($tCode,$sig)=($1,$2);
      $_=<DAT>;
      $line[1]=$_;
      /\s+\S+(\d{4})\S+\s+.*$/;
      if ( $1) {
         my $bCode=$1;
         if ($metadata{$bCode.$tCode.$sig}) {
            $metadata{$bCode.$tCode.$sig}=~/(\S+):(\S)/;
            my ($newTcode,$newSig)=($1,$2);
            $_=$line[0];
            s/^(\S+\s+)(\S+)(\s+)(\S)(\S+\s+\S+\s+\S+\s+\S+)$/${1}$newTcode${3}$newSig${5}/;
            print DEST;
            print DEST $line[1];
            $_=<DAT>;
            print DEST ;
            }
         else {
            print DEST $line[0].$line[1];
            $_=<DAT>;
            print DEST;
            }
         }
      }
   }
close(DAT);
close(DEST);

Usage:

gen.pl  condFile inputFile newFile

I am afraid this solution is not working. Also, I can see that the command uses hardcoded values (XML). That will not hold for the real records; any value can appear at any place.

I need a solution based on positions. It's getting urgent for me, and I would be thankful if someone could help.

---------- Post updated at 08:26 AM ---------- Previous update was at 08:25 AM ----------

I would appreciate it if somebody could provide a solution that treats each record as 1 line, using sed or awk.

---------- Post updated at 11:56 PM ---------- Previous update was at 08:26 AM ----------

Can somebody please provide a pointer or guide me on my problem?
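
A position-based approach in awk might look like the sketch below. It is not a tested production solution, and it rests on the following assumptions: each record is one physical line, positions are 1-based, the transaction code occupies columns 31-40 and the business code columns 318-321, and the condition file holds exactly "business code,transaction code,derived transaction code" with no trailing blanks or carriage returns. The function name remap_tcodes is made up for illustration.

```shell
# Sketch only, under the assumptions stated above.
# remap_tcodes is a hypothetical name; $1 = condition file, $2 = input file.
remap_tcodes() {
  awk -F',' '
    # First file (condition file): map[business code, transaction code] = derived code.
    # Lines with fewer than 3 comma-separated values (e.g. blank lines) are skipped.
    NR == FNR { if (NF >= 3) map[$1, $2] = $3; next }

    # Second file (fixed-width records): look up by position, not by content.
    {
      tcode = substr($0, 31, 10)
      bcode = substr($0, 318, 4)
      if ((bcode, tcode) in map) {
        # Pad the derived code to 10 characters so later columns do not shift.
        $0 = substr($0, 1, 30) sprintf("%-10s", map[bcode, tcode]) substr($0, 41)
      }
      print
    }
  ' "$1" "$2"
}
```

Usage would be something like remap_tcodes condition.txt input.txt > output.txt. The condition file is read once into an in-memory table and every record needs a single lookup, so the input is processed in one pass, which should be workable for around 1 million records. Note that the NR == FNR idiom expects a non-empty condition file.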