File filter

Dastard · September 6, 2007, 1:18am

Hi everyone, have a nice day.
I need a little help with this.
I have a file which contains blocks such as the ones given below:

<hgsdp:msisdn=923228719047,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228719047 410072110070614 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002011 923210002011
MS PURGED IN VLR

END
<hgsdp:msisdn=923228276174,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228276174 410072520066962 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002002 923210002002
MS PURGED IN VLR

END

Now I want to filter the MSISDNs based on MSC NUMBER (the second value on the line below the "VLR ADDRESS MSRN MSC NUMBER LMSID" header above).
For example, I only want those MSISDNs in my output file which have MSC NUMBER = 923210002022.

Thanks in Anticipation
Regards

robotronic · September 6, 2007, 3:52am

nawk -v "in_msc=923210002002" '
   /^</ {
      split($0, a, "=");
      split(a[2], b, ",");
      msisdn=b[1];
   }
   /^VLR/ {
      getline;
      msc=$2;

      if (msc == in_msc) { print(msisdn); }
   }
' input_file.txt

Dastard · September 6, 2007, 4:04am

Thanks, it's working like a charm.

But could you describe this code line by line, so that I don't need to tally the whole output? I would just analyze the code, and that would be enough to make sure the output is correct.

Regards and Thanks

robotronic · September 6, 2007, 1:50pm

Basically, the logic is:

Line 1) On the command line, pass the value of the MSC number to find to the awk script with -v. You can also define this variable in the body of the script if you want.

Lines 2-6) When you find a line beginning with "<", extract the msisdn number. The first split generates the array "a", which contains two string elements. For the first block of the sample, the first element is "<hgsdp:msisdn" and the second is "923228719047,loc;".
The second split takes the second element of the "a" array as input and creates the "b" array by splitting the string on the "," delimiter. So, the first element of array "b" is the number we need.
Assuming the msisdn numbers are all 12 chars in length, we could have used a much simpler function: substr($0, 15, 12).

Lines 7-9) When you find a line beginning with "VLR", read the next line with getline. There, in the 2nd field, we have the msc number that belongs to the msisdn found before.

Lines 10-12) If the msc found is equal to the msc we specified in the command line, print the msisdn number.

Line 13) The input file fed to the awk script.

 1   nawk -v "in_msc=923210002002" '
 2      /^</ {
 3         split($0, a, "=");
 4         split(a[2], b, ",");
 5         msisdn=b[1];
 6      }
 7      /^VLR/ {
 8         getline;
 9         msc=$2;
10
11         if (msc == in_msc) { print(msisdn); }
12      }
13   ' input_file.txt
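
The two alternatives mentioned in the walkthrough (setting the variable in the body of the script, and using substr for the msisdn) can be sketched together as follows. This is only an illustration under some assumptions: the file name sample_hlr.txt is made up, the msisdn is assumed to always be 12 characters starting at column 15, and plain awk is used here (on Solaris you would call nawk, as in the thread).

```shell
# Scratch copy of the two sample blocks from the thread (hypothetical file name).
cat > sample_hlr.txt <<'EOF'
<hgsdp:msisdn=923228719047,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228719047 410072110070614 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002011 923210002011
MS PURGED IN VLR

END
<hgsdp:msisdn=923228276174,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228276174 410072520066962 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002002 923210002002
MS PURGED IN VLR

END
EOF

matches=$(awk '
   # Variable defined in the script body instead of with -v.
   BEGIN { in_msc = "923210002002" }

   # Fixed-width extraction: the msisdn starts at column 15 and is 12 characters long.
   /^</ { msisdn = substr($0, 15, 12) }

   # Same logic as before: the msc number is the 2nd field of the line after the header.
   /^VLR/ {
      getline;
      msc = $2;
      if (msc == in_msc) { print msisdn }
   }
' sample_hlr.txt)

echo "$matches"
```

With the two sample blocks, only the second one has the requested msc number, so only its msisdn is printed. Note that the substr shortcut silently breaks if an msisdn is shorter or longer than 12 digits, which is why the split version is the safer default.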

Well, now that I've re-read the script, it is possible to use the same logic as lines 7-9 to extract the msisdn:

nawk -v "in_msc=923210002002" '
   /^MSISDN/ {
      getline;
      msisdn=$1;
   }
   /^VLR/ {
      getline;
      msc=$2;

      if (msc == in_msc) { print(msisdn); }
   }
' input_file.txt

As usual, there are surely plenty of other ways to extract the same information, maybe better ones... The important thing is getting the result.
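
Just to show one of those other ways, here is a rough sketch that avoids getline and instead remembers, with a flag, that the next line holds the msc number. The flag name want_msc and the file name sample_hlr2.txt are made up for the example, and it still assumes that the msc number is the 2nd field on the line after the "VLR ADDRESS" header, as in the sample (if the MSRN column were filled in, the field position would presumably shift).

```shell
# Scratch copy of the two sample blocks from the thread (hypothetical file name).
cat > sample_hlr2.txt <<'EOF'
<hgsdp:msisdn=923228719047,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228719047 410072110070614 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002011 923210002011
MS PURGED IN VLR

END
<hgsdp:msisdn=923228276174,loc;
HLR SUBSCRIBER DATA

SUBSCRIBER IDENTITY
MSISDN IMSI STATE AUTHD
923228276174 410072520066962 CONNECTED AVAILABLE

NAM
1

LOCATION DATA
VLR ADDRESS MSRN MSC NUMBER LMSID
4-923210002002 923210002002
MS PURGED IN VLR

END
EOF

found=$(awk -v in_msc=923210002002 '
   # Command line: split on "=" or "," so that a[2] is the msisdn.
   /^</     { split($0, a, /[=,]/); msisdn = a[2]; next }

   # Flag is set: this is the line right after the VLR header.
   want_msc { if ($2 == in_msc) print msisdn; want_msc = 0; next }

   # VLR header seen: the msc number is on the following line.
   /^VLR/   { want_msc = 1 }
' sample_hlr2.txt)

echo "$found"
```

The order of the rules matters here: the want_msc rule comes before the /^VLR/ rule, so the flag set on the header line is only acted on when the following line is read.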