How to extract part of a string from all log files?

Hi All,

Let us say we have 5 log files. I want to extract data from all of them and save the output in a file.

home/log/first.log
home/log/second.log
home/log/third.log
home/log/four.log
home/log/five.log

I want to extract the following text from the log files and save the output in a file.

numberoffiles=5 numberofRows=3459
Time taken: 10.65 seconds

The log files look like this:

/ABC/RTE/AD_900_VOP_123/OPP
/ABC/RTE/TRE/AD_900_VOP_145/BBB
/ABC/RTE/AN_900_VFP_124/FBF
/ABC/RTE/HD_900_FOP_153/WEW
/ABD/RDV/AD_900_VOP_123/OPP
/ABC/RTE/WD_900_VOP_123/GRR/TRD
/ABC/RTE/RTD/AR_900_VOP_443/SDD

Thanks in Advance.

---------- Post updated at 03:43 PM ---------- Previous update was at 03:10 PM ----------

Sorry for violating code tags.

Please help me.

Thanks

Try this:

>output.txt
for filename in /home/log/*.log
do
   echo "$filename"
  tr -s '[,\]\[]' ' ' < filename | 
  awk '/numberoffiles/ {print $4, $5}
         /Time taken/ {print} '
  echo " "
done > output.txt
  • tr replaces brackets and commas with a space
  • awk prints the 4th and 5th columns when numberoffiles appears in the line
  • awk prints the whole line when "Time taken" appears

Hi,

Thanks for your response.

I tried this, but only the file names are written; the data is not coming through.

output.txt

/home/log/first.log

Please help me.

Thanks

You have to help us here.
What is the name and kind of UNIX/Linux you have?
Please try these commands and post the output.

uname -a
cat /etc/release

What shell are you using? e.g., bash, ksh ...

I can guess what is wrong but that will take forever.

Hi,

I found the issue.

The $ was missing in front of filename.

After adding it, the data appears in the file.

But one extra bracket is coming through:

[numberoffiles=8 numberofRows=112
Time taken: 19.43 seconds

One more thing: instead of field position, can we match on the string only?

If some more fields are added to the log, the position-based script below will be a problem.

for filename in /home/log/*.log
do
   echo "$filename"
  tr -s '[,\]\[]' ' ' < $filename | 
  awk '/numberoffiles/ {print $4, $5}
         /Time taken/ {print} '
  echo " "
done > output.txt

Please help me.

Thanks

Sorry about the error - I was away from where I could test the script.

I will show you a template for dealing with this in the context I already have.
You have to enumerate all of the fields on the line with numberoffiles.
I made the code verbose so you can modify it for the next change that we do not know about yet.

Change the awk script:

tr -s '[,\]\[]' ' ' < $filename | 
     awk '
     {  if (index($0, "stats" ) > 0)
        {
         { for(i=1;i<=NF; i++)                               
             {               
                if( $(i) ~ /numberoffiles/ || $(i) ~ /numberofRows/ ) 
                {
                 printf("%s ", $(i) )
                }
             }
             printf("\n")
         }
       }
       if (index($0, "Time")> 0 )  
       {
         print $0
       } 
      
     } '

What this does is step through each field and test what it matches.
When a match is found it prints just that field. Notice it uses printf(), which lets us
use output format specifiers just like in the C language.
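
To see the field-by-field idea in isolation, here is a minimal sketch you can run without any log files. The sample line is modelled on the one quoted elsewhere in this thread, the function name pick_fields is made up for illustration, and sed stands in for tr to turn brackets and commas into spaces. Treat it as an assumption about your log format, not a tested fix for your real logs.

```shell
#!/bin/sh
# Hypothetical sample line, modelled on the log line quoted in this thread.
sample='Table abc.dat stats: [numberoffiles=5, blocksize=100, numberofRows=3459, totalSize=4531,datasize=123]'

# pick_fields (illustrative name): read log lines on stdin and print only
# the numberoffiles= and numberofRows= fields, wherever they sit on the line.
pick_fields() {
    # Replace "[", "]" and "," with spaces so awk can split on whitespace.
    sed 's/[][,]/ /g' | awk '
    /stats/ {
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^numberoffiles=/ || $i ~ /^numberofRows=/)
                out = out (out == "" ? "" : " ") $i
        print out
    }'
}

printf '%s\n' "$sample" | pick_fields
```

With the sample line above this prints numberoffiles=5 numberofRows=3459. Because each field is matched by name, adding or reordering other fields on the line does not change the result.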


Try also

awk '
BEGIN                   {SRCH1 = "numberoffiles[^ ]*"
                         SRCH2 = "numberofRows[^ ]*"
                        }
                        {gsub (/[][,]/, "")
                        }
FNR == 1                {FND = 0
                        }
match ($0, SRCH1)       {RST = RSTART
                         RLN = RLENGTH  
                         match ($0, SRCH2)
                         print FILENAME
                         print substr ($0, RST, RLN), substr ($0, RSTART, RLENGTH)
                         FND = 1
                        }
/Time/ && FND
' /home/log/*.log

Thanks a lot

---------- Post updated at 01:38 PM ---------- Previous update was at 06:29 AM ----------

Hi,

The script below is working fine:

for filename in /home/log/*.log
do
tr -s '[,\]\[]' ' ' < $filename | 
     awk '
     {  if (index($0, "stats" ) > 0)
        {
         { for(i=1;i<=NF; i++)                               
             {               
                if( $(i) ~ /numberoffiles/ || $(i) ~ /numberofRows/ ) 
                {
                 printf("%s ", $(i) )
                }
             }
             printf("\n")
         }
       }
       if (index($0, "Time")> 0 )  
       {
         print $0
       } 
      
     } '
done > logs_output.txt

I also want the table name in the output.

I want the output as below; the ':' after stats should be replaced with ',':

Table abc.dat stats, numberoffiles=5, blocksize=100, numberofRows=3459

I have tried:

if (index($0, "Table" ) > 0)
{
print $0
}

It prints the complete line:

Table abc.dat stats: [numberoffiles=5, blocksize=100, numberofRows=3459, totalSize=4531,datasize=123]

Please help me

Thanks in Advance

---------- Post updated at 03:48 PM ---------- Previous update was at 01:38 PM ----------

Hi All,

Please help me.

Thanks
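
For the table line, one possible approach is to keep the text up to "stats", drop the brackets, split the rest on commas, and print only the keys you want. This is a sketch under assumptions: it assumes every such line looks like the sample quoted above (starts with "Table", contains " stats:", and has the fields inside one pair of brackets), and the function name format_table_line is made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample line, copied from the format quoted in this thread.
sample='Table abc.dat stats: [numberoffiles=5, blocksize=100, numberofRows=3459, totalSize=4531,datasize=123]'

# format_table_line (illustrative name): read log lines on stdin, print the
# table line in "Table <name> stats, key=value, ..." form, and pass
# "Time taken" lines through unchanged.
format_table_line() {
    awk '
    /^Table .* stats:/ {
        head = $0
        sub(/ stats:.*/, " stats", head)   # keep "Table abc.dat stats"
        rest = $0
        sub(/^[^[]*\[/, "", rest)          # drop everything up to "["
        sub(/\].*/, "", rest)              # drop "]" and anything after it
        n = split(rest, kv, /, */)         # one key=value pair per element
        out = head
        for (i = 1; i <= n; i++)
            if (kv[i] ~ /^(numberoffiles|blocksize|numberofRows)=/)
                out = out ", " kv[i]
        print out
    }
    /Time taken/ { print }
    '
}

printf '%s\n' "$sample" | format_table_line
```

To use it inside the existing loop, something like format_table_line < "$filename" in place of the tr and awk pipeline should work, provided the real log lines match the assumed format. To keep other keys, add them to the list in the regular expression.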