File formatting in Unix

smalya · June 11, 2010, 3:42am

Hi,

I have a text file, noname.txt, containing 1000+ records like this. One of the records is given below.

The input looks like this:

BOT:
2010/06/01 00:25:59        
      21 = "private"        
      Access-Method = 31        
     NCC = GBR        
      01 = "340806@osiris.fr.ft"        
      04 = 57.250.0.210        
      05 = 337        
      06 = "Framed"        
      07 = "PPP"        
      1E = "08001961001"        
      1F = "441224615080"        
      28 = Stop        
      29 = 0        
      2A = 9321        
      2B = 5409        
      2C = "00045DDF"        
      2D = "RADIUS"        
      2E = 125        
      2F = 151        
      30 = 114        
      31 = "Lost Carrier"        
      3D = "Async"
:EOT

The output I am expecting is:

20100601#002559#private#31#GBR#osiris.fr.ft#340806#57.250.0.210#337#Framed#PPP#08001961001#441224615080#Stop#0#9321#5409#00045DDF#RADIUS#125#151#114#Lost Carrier#Async

Please let me know whether this is possible in UNIX.
Please help, I am in trouble.

thegeek · June 11, 2010, 6:39am

Yes it is possible in Unix.

If you know some scripting language you should be able to do it very easily.

merajh · June 16, 2010, 7:36am

I am very new to Perl and have tried to write some very basic code in Perl; I hope it helps.

It would be excellent if someone could reduce the code length.

#!/usr/bin/perl

use strict;
use warnings;

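open(FILE, "noname.txt") || die "Can't open noname.txt: $!\n";

my @arr = ();
while (<FILE>) {
    my $LINE = $_;
    chomp($LINE);
    $LINE =~ s/\s+$//;    # drop trailing whitespace so it does not end up in the fields

    # date/time line, e.g. 2010/06/01 00:25:59
    if (/\//) {
        my @arr1 = split(/ /, $LINE);
        push @arr, join('', split(/\//, $arr1[0]));    # 2010/06/01 -> 20100601
        push @arr, join('', split(/:/, $arr1[1]));     # 00:25:59 -> 002559
    }
    # key = value line
    if (/=/) {
        my @arr2;
        if (/"/) {
            # quoted value: take the text between the quotes
            @arr2 = split(/"/, $LINE);
            if (/@/) {
                # user@domain is written out as domain, then user
                my @arr3 = split(/@/, $arr2[1]);
                push @arr, $arr3[1];
                push @arr, $arr3[0];
            }
            else {
                push @arr, $arr2[1];
            }
        }
        else {
            # unquoted value: everything after "= "
            @arr2 = split(/= /, $LINE);
            push @arr, $arr2[1];
        }
    }
    # end of record: print the collected fields and reset
    if (/EOT/) {
        print join('#', @arr);
        @arr = ();
        print "\n";
    }
}

close(FILE);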

This code can easily be converted to a shell script.
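As a rough idea of what such a conversion might look like, here is a minimal sketch using awk inside a shell function. The function name format_records is made up for illustration, and the sketch assumes every record is delimited by BOT: and :EOT lines, that the date/time line has the form YYYY/MM/DD HH:MM:SS, and that every other line is "key = value". It has only been reasoned through against the sample record above, not against the full file.

```shell
#!/bin/sh
# format_records is a hypothetical helper name, not something from the thread.
# It reads BOT:/:EOT records on stdin and prints one #-separated line per record.
format_records() {
    awk '
    /^[ \t]*BOT:/ { n = 0; next }            # start of record: reset the field list
    /^[ \t]*:EOT/ {                          # end of record: join the fields with "#"
        line = ""; sep = ""
        for (i = 1; i <= n; i++) { line = line sep f[i]; sep = "#" }
        print line
        n = 0
        next
    }
    {
        gsub(/^[ \t]+|[ \t]+$/, "")          # trim leading and trailing blanks
        if ($0 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][ \t]/) {
            split($0, dt, /[ \t]+/)          # dt[1] = date, dt[2] = time
            gsub(/\//, "", dt[1])
            gsub(/:/, "", dt[2])
            f[++n] = dt[1]; f[++n] = dt[2]
            next
        }
        p = index($0, "=")
        if (p == 0) next                     # not a key = value line
        v = substr($0, p + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        gsub(/"/, "", v)                     # strip the double quotes
        at = index(v, "@")
        if (at > 0) {                        # user@domain becomes domain, then user
            f[++n] = substr(v, at + 1)
            f[++n] = substr(v, 1, at - 1)
        } else {
            f[++n] = v
        }
    }'
}

# Usage (file name taken from the original post):
#   format_records < noname.txt
```

Taking the value as everything after the first "=" rather than a whitespace-separated field keeps values such as "Lost Carrier" in one piece.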

3junior · June 16, 2010, 9:04am

awk '{if ( $1 ~ "BOT") {printf "\n"} else if ( $1 ~ "EOT" ) {printf "\n"} else {printf "%s#", $3}}' file.txt | sed -e 's/"//g'

#private#31#GBR#340806@osiris.fr.ft#57.250.0.210#337#Framed#PPP#08001961001#441224615080#Stop#0#9321#5409#00045DDF#RADIUS#125#151#114#Lost#Async#

#private#31#GBR#340806@osiris.fr.ft#57.250.0.210#337#Framed#PPP#08001961001#441224615080#Stop#0#9321#5409#00045DDF#RADIUS#125#151#114#Lost#Async#

panyam · June 16, 2010, 9:07am

Something like this:

 
awk -F"=" '/BOT|EOT/ { next } $0 !~ /=/ { print ;next } { print $2 ;next } END { print "\n"}' ORS="#" input_file | sed -e 's/#[        ]*/#/g' -e 's/[     ]*#/#/g' -e 's/"//g'

Surely this code length can be reduced further.

anon64183241 · June 16, 2010, 11:53am

Here's a pass at this in Perl:

#!/usr/bin/perl

use strict;
use warnings;

open(FILE, "noname.txt") || die "Can't open noname.txt: $!\n";
my @results;
while (<FILE>) {
        if ($_ =~ m/:EOT/) {
                # end of a record, print and reset
                print join("#", @results);
                @results = ();
                print "\n";
        } else {
                chomp($_);
                next if ($_ =~ m/^BOT:/); # skip start of record marker
                if ($_ =~ m%[:/]%) {
                        $_ =~ s%\s+%#%;  # replace spaces in datestamp with #
                }
                $_ =~ s%[/:]%%g; # strip / and :
                $_ =~ s%^\s+%%; # strip leading whitespace
                $_ =~ s%\s+$%%;  # strip trailing whitespace
                $_ =~ s%"%%g;    # strip quotes
                $_ =~ s%.*\s+=\s+(.*)%$1%g; # strip everything before the data
                push @results, $_;
        }
}
close (FILE);

Edited to note: I just noticed that the OP wants to have the data string "340806@osiris.fr.ft" transposed in the output, as "osiris.fr.ft#340806". That makes the above incorrect, and provides a more interesting challenge.
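Taken on its own, the transposition is a small step once the quotes have been stripped from the value. A minimal sketch in plain POSIX shell, where swap_at is a hypothetical helper name and the value is assumed to contain at most one @ that matters (the split happens at the first one):

```shell
#!/bin/sh
# swap_at is a hypothetical helper: user@domain is printed as domain#user,
# and any value without an @ is printed unchanged.
swap_at() {
    case $1 in
        *@*) printf '%s#%s\n' "${1#*@}" "${1%%@*}" ;;   # after first @, then before it
        *)   printf '%s\n' "$1" ;;
    esac
}

# Example call, using the value from the sample record:
#   swap_at '340806@osiris.fr.ft'
```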

---------- Post updated at 08:53 AM ---------- Previous update was at 07:50 AM ----------

OK, here's another pass that handles the field with the @ correctly:

#!/usr/bin/perl

use strict;
use warnings;

open(FILE, "noname.txt") || die "Can't open file: $!\n";

while(<FILE>) {
        chomp($_);
        next if $_ =~ m/BOT:/;
        if ($_ =~ m/:EOT/) {
                print "\n";
        } else {
                #handle the time/date stamp
                if ($_ =~ s%^(\d{4})/(\d{2})/(\d{2})\s+(\d{2}):(\d{2}):(\d{2})\s*%$1$2$3#$4$5$6%) {
                        print "$_";
                        next;
                }
                # handle the transposition of what looks like an email address
                if ($_ =~ s%.*=\s+"(.*)@(.*)"\s*%$2#$1%) {
                        print "#$_";
                        next;
                }
                # handle everything else
                # assumes a pretty standard format of <whitespace>ID<whitespace>=<data><maybe whitespace>
                $_ =~ s/.*=\s+(.*)\s*/$1/; # pull out the data after the =
                $_ =~ s/"//g; # strip quotes
                $_ =~ s/^\s+//g; # strip leading spaces
                $_ =~ s/\s+$//g; # strip trailing spaces
                print "#$_";
        }
}

close(FILE);