Extract n-digits from string in perl

Hello,

I have a log file with logs such as

01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main

How can I use Perl to extract the 8-digit number below from the string?

01170255

Thanks

I would suggest splitting the line into an array and then printing the appropriate field. Perhaps something like this might help:

  # Single quotes stop Perl from interpolating $0 (the script name) into the string
  my $line = '01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main' ;

  my @tmp_array = split (/ /,$line) ;
  print $tmp_array[9] ;

Does that help?
Robin

# Single quotes stop the shell from expanding $0 inside the sample line
line='01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main'

echo "$line" | perl -e 'foreach my $str (split(/ /, <>)) {if (length($str)==8 && $str=~/^\d+$/) {print "$str\n"}; } '

echo "$line" | sed -n -r 's/.*\b([0-9]{8})\b.*/\1/p'

echo "$line" | awk 'length==8 && /^[0-9]+$/' RS=" "

for w in $line ; do if [ ${#w} = 8 ] && [ "${w/[^0-9]*}" = "$w" ] ; then printf '%s\n' "$w" ; fi ; done
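
The same field-by-field test can also be written inside a Perl script. The following is only a sketch: it assumes fields are separated by whitespace and that the wanted number is a field of exactly eight digits, and the sub name eight_digit_fields is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a line on whitespace and keep only the
# fields that consist of exactly eight digits.
sub eight_digit_fields {
    my ($text) = @_;
    return grep { /\A\d{8}\z/ } split ' ', $text;
}

# Single quotes keep Perl from interpolating $0 in the sample line.
my $line = '01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main';

print "$_\n" for eight_digit_fields($line);
```

Since the test is applied to every field, it does not matter where in the line the number appears.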

If the 8-digit number is always preceded by the word "STATE:" then you could use regular expressions as well:

$
$ cat mylog.txt
01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main
$
$ perl -lne 'print $1 if /STATE:\s+(\d+)/' mylog.txt
01170255
$

If it could be preceded by more than one word, then specify them all in your regular expression, like so:

$
$ cat mylog_1.txt
01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main
something else
over here
01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, BLAH: 12345678 (mode main
some other stuff
$
$ perl -lne 'print $2 if /(STATE|BLAH):\s+(\d+)/' mylog_1.txt
01170255
12345678
$
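
If you also want to know which keyword preceded the number, named captures are one option. This is a minimal sketch, assuming Perl 5.10 or later; the sub name keyword_and_number and the capture names key and num are invented for the example, and the sample lines are shortened versions of the ones in mylog_1.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return (keyword, number) for the first
# "KEYWORD: digits" pair in the text, or an empty list if none matches.
sub keyword_and_number {
    my ($text) = @_;
    return unless $text =~ /(?<key>STATE|BLAH):\s+(?<num>\d+)/;
    return ($+{key}, $+{num});
}

for my $line ('LEVEL=critical, STATE: 01170255 (mode main',
              'LEVEL=critical, BLAH: 12345678 (mode main',
              'something else') {
    # Skip lines where the helper found nothing.
    my ($key, $num) = keyword_and_number($line) or next;
    print "$key=$num\n";
}
```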
 

Thanks Robin, but I'm looking at a situation where the 8-digit number can appear at another position in the string. In that case it will not always be at index 9 of the array.

How can I do this inside a Perl script rather than on the command line?

The Perl code

print $1 if /STATE:\s+(\d+)/;

works on whatever line is currently held in $_.
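
If the line is in a named variable instead, the match is bound to it with =~. The sketch below also drops the STATE: prefix and looks for the number anywhere in the line. It assumes the number is always exactly eight digits and stands alone as a word; the sub name first_8digit is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first standalone run of exactly
# eight digits in a string, or undef if there is none.
sub first_8digit {
    my ($text) = @_;
    # \b on both sides rejects longer digit runs and digits glued to letters.
    return $text =~ /\b(\d{8})\b/ ? $1 : undef;
}

# Single quotes keep Perl from interpolating $0 in the sample line.
my $line = '01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main';

my $number = first_8digit($line);
print "$number\n" if defined $number;
```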


In a Perl program, you'd accept the input log file name (if so desired), open it, loop/process through it and then close it.
Here's a short program called "process_log.pl".
Hopefully the inline comments are descriptive enough.

$
$
$ cat -n mylog_1.txt
     1  01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, STATE: 01170255 (mode main
     2  something else
     3  over here
     4  01/05/2017 10:23:41 [ABCD-22357$0]: file.log.38: database error, MODE=SINGLE, LEVEL=critical, BLAH: 12345678 (mode main
     5  some other stuff
$
$
$ cat -n process_log.pl
     1  #!/usr/bin/perl -w
     2  # ------------------------------------------------------------------------------
     3  # Name : process_log.pl
     4  # Desc : A short and quick Perl program to process log files. It performs
     5  #        rudimentary input validation and error handling.
     6  # Usage: perl process_log.pl <log_file>
     7  # ------------------------------------------------------------------------------
     8  use strict;
     9
    10  # Print usage and quit if incorrect number of parameters were passed.
    11  if ($#ARGV != 0) {
    12      print "Usage: perl process_log.pl <log_file>\n";
    13      exit 1;
    14  }
    15
    16  # Set the file name. Hopefully it can be opened!
    17  my $file = $ARGV[0];
    18
    19  # Open file, loop through each line and process, close file
    20  open(FH, "<", $file) or die "Can't open $file: $!";
    21  while (<FH>) {
    22      # Remove the end of line character
    23      chomp;
    24      if (/(STATE|BLAH):\s+(\d+)/) {
    25          print $2, "\n";
    26      }
    27  }
    28  close(FH) or die "Can't close $file: $!";
    29
$
$ # Test with no parameters
$ perl process_log.pl
Usage: perl process_log.pl <log_file>
$
$ # Test with a file name that cannot be found or opened
$ perl process_log.pl does_not_exist.txt
Can't open does_not_exist.txt: No such file or directory at process_log.pl line 20.
$
$ # Test with the correct file
$ perl process_log.pl mylog_1.txt
01170255
12345678
$
$

A few notes if I may.

For demonstration this is all right, but there is no reason to remove the newline:

22      # Remove the end of line character
23      chomp;

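The first group does not need to capture: writing it as (?:...) makes it non-capturing, which saves a little work and moves the number into $1. The newline can also be interpolated into the string instead of being passed to print as a separate argument.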

24      if (/(STATE|BLAH):\s+(\d+)/) {
25          print $2, "\n";
26      }

becomes

if (/(?:STATE|BLAH):\s+(\d+)/) {
    print "$1\n";
}

Now, that code only accepts input from one file. Another version, which can accept input from multiple files or even from a pipe, follows:

#!/usr/bin/perl

use strict;
use warnings;

while(<>) {
    /(?:STATE|BLAH):\s+(\d+)/ and print "$1\n";
}

This can be used as:

perl process_log.pl mylog_1.txt mylog_2.txt [...]

or

another_program_stream_output | perl process_log.pl