[ksh] finding last file with keyword in it

Hi,

In short: I have several log files and I need to find the most recent one that contains a certain keyword.

# ls -1tr logs
log_hostX.Jan01_0100.gz
log_hostX.Jan01_0105.gz
log_hostX.Jan01_0110.gz
log_hostX.Jan01_0115.gz
log_hostX.Jan01_0120.gz
log_hostX.Jan01_0125.gz
log_hostX.Jan01_0130.gz
log_hostX

On the Internet I found a way to examine the last file, the one before it, and the one before that, but for some reason I cannot determine the name of the file before those.
My purpose is to find the begin date & time and the end date & time.
The end time is always in the last file, but the begin time can be in any of the last 4 files.
Because of the number of log files, I need to go from newest to oldest.

I will post my code so far in the next posting.

Greetings,

E.J.

---------- Post updated at 10:49 AM ---------- Previous update was at 10:38 AM ----------

Here is the code.
Perhaps not the best you have seen, but it almost serves its purpose.
Line 109 is where I need help.
You can copy and paste it to try it out.
But of course, if there are better solutions I would like to hear them too.
Always nice to learn something.

I am on a Solaris 10 machine with ksh88 (not allowed to upgrade it).

#!/usr/bin/ksh

# Prepare log dir
mkdir -p /tmp/logs
cd /tmp/logs

rm -f *.gz log_hostX

# Prepare some log files to play with
echo "log1 continue 01:00:00" > log_hostX.Jan01_0100
gzip log_hostX.Jan01_0100
sleep 1
echo "log2 continue 01:05:00 \nlog2 END 01:05:00" > log_hostX.Jan01_0105
gzip log_hostX.Jan01_0105
sleep 1
echo "log3 BEGIN 01:10:00\nlog3 continue 01:10:00" > log_hostX.Jan01_0110
gzip log_hostX.Jan01_0110
sleep 1
echo "log4 continue 01:15:00\nlog4 END 01:15:00" > log_hostX.Jan01_0115
gzip log_hostX.Jan01_0115
sleep 1
echo "log5 BEGIN 01:20:00\nlog5 continue 01:20:00" > log_hostX.Jan01_0120
gzip log_hostX.Jan01_0120
sleep 1
echo "log6 continue 01:25:00" > log_hostX.Jan01_0125
gzip log_hostX.Jan01_0125
sleep 1
echo "log7 continue 01:30:00" > log_hostX.Jan01_0130
gzip log_hostX.Jan01_0130
sleep 1
echo "log8 continue 01:35:00\nlog8 END 01:35:00" > log_hostX


### Actual code ###

# Set variables to default values
BEGIN_Mmm_DD="-----"
BEGIN_HH_MM_SS="--:--:--"
END_Mmm_DD="-----"
END_HH_MM_SS="--:--:--"

HOST=hostX
FILE=log_${HOST}

# Determine end time based on last file with END
if grep -i END ${FILE} > /dev/null
then
  END_HH_MM_SS=`grep -i END ${FILE} | awk '{print $3}'`
fi

# Determine begin and end MmmDD based on log file
BEGIN_Mmm_DD=`ls -als ${FILE} | awk '{printf "%s%0.2d", $7, $8}'`
END_Mmm_DD=$BEGIN_Mmm_DD

# Try to determine begin time based on last file with BEGIN
if grep BEGIN ${FILE} > /dev/null
then
  BEGIN_HH_MM_SS=`grep BEGIN ${FILE} | awk '{print $3}'`
fi

# BEGIN not found ?
# Try to determine the begin time in the previous trace file
if [[ $BEGIN_HH_MM_SS = "--:--:--" ]]
then
  # Determine filename
  FILE=`ls -1tr log_${HOST}.*.gz 2> /dev/null | awk '{X=$0} END {print X}'`
  if [ -n "$FILE" ]
    then
    # Determine date
    BEGIN_Mmm_DD=`ls -als ${FILE} | awk '{print $7 $8}'`
    # Determine begin time
    if gzcat ${FILE} | grep BEGIN > /dev/null
    then
      BEGIN_HH_MM_SS=`gzcat ${FILE} | grep BEGIN | awk '{print $3}'`
    else
      BEGIN_Mmm_DD="-----"
      BEGIN_HH_MM_SS="--:--:--"
    fi
  fi
fi


# BEGIN still not found ?
# Try to determine the begin time in the previous previous trace file
if [[ $BEGIN_HH_MM_SS = "--:--:--" ]]
then
  # Determine filename
  FILE=`ls -1tr log_${HOST}.*.gz 2> /dev/null | awk '{y=x "\t" $0; x=$0}; END{print y}' | awk '{print $1}'`
  if [ -n "$FILE" ]                                                      
    then
    # Determine date
    BEGIN_Mmm_DD=`ls -als ${FILE} | awk '{print $7 $8}'`
    # Determine begin time
    if gzcat ${FILE} | grep BEGIN > /dev/null
    then
      BEGIN_HH_MM_SS=`gzcat ${FILE} | grep BEGIN | awk '{print $3}'`
    else
      BEGIN_Mmm_DD="-----"
      BEGIN_HH_MM_SS="--:--:--"
    fi
  fi
fi

# What ? BEGIN still not found ?
# Try to determine the begin time in the previous previous previous trace file
if [[ $BEGIN_HH_MM_SS = "--:--:--" ]]
then
  # Determine filename
  FILE=`echo "THAT IS THE QUESTION !!"`
  FILE=log_hostX.Jan01_0120.gz
  if [ -n "$FILE" ]
    then
    # Determine date
    BEGIN_Mmm_DD=`ls -als ${FILE} | awk '{print $7 $8}'`
    # Determine begin time
    if gzcat ${FILE} | grep BEGIN > /dev/null
    then
      BEGIN_HH_MM_SS=`gzcat ${FILE} | grep BEGIN | awk '{print $3}'`
    else
      BEGIN_Mmm_DD="-----"
      BEGIN_HH_MM_SS="--:--:--"
    fi
  fi
fi

# Print the header lines
printf "Host    Start             End\n"
printf "\n"

# Print the hostname
printf "%-8s" ${HOST}

# Print the results
printf "%-5s  %-8s   %-5s %-8s\n" ${BEGIN_Mmm_DD} ${BEGIN_HH_MM_SS} ${END_Mmm_DD} ${END_HH_MM_SS}

Here is the result:

# ./LogStat.sh
Host    Start             End

hostX   Jan15  01:20:00   Jan15 01:35:00
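
For the open question at line 109 (the name of the third-newest compressed file), one possible sketch is to list the files newest first with `ls -1t` and pick line N with `sed -n`. The demo directory, the sample files and the variable `N` below are made up for illustration, and the approach assumes file names without whitespace and distinct modification times.

```shell
# Hypothetical demo directory, only used for this illustration
DEMO_DIR=${TMPDIR:-/tmp}/nth_newest_demo.$$
mkdir -p "$DEMO_DIR" && cd "$DEMO_DIR" || exit 1

# Create three compressed logs with distinct, known modification times
for STAMP in 0120 0125 0130
do
  printf 'log continue\n' > log_hostX.Jan01_$STAMP
  gzip log_hostX.Jan01_$STAMP
  touch -t 20200101$STAMP log_hostX.Jan01_$STAMP.gz
done

HOST=hostX
N=3   # 1 = newest .gz file, 2 = the one before it, and so on

# ls -1t lists newest first; sed -n "${N}p" prints only line N
FILE=$(ls -1t log_${HOST}.*.gz 2> /dev/null | sed -n "${N}p")
echo "$FILE"

# Clean up the demo directory
cd / && rm -rf "$DEMO_DIR"
```

If fewer than N files exist, `FILE` stays empty, so the existing `[ -n "$FILE" ]` check would still apply.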

---------- Post updated at 04:04 PM ---------- Previous update was at 10:49 AM ----------

Okay, I was clearly on the wrong track.
There is an easier way to go through a list of files.

This is what works for me to find the begin date & time.
It can probably be made more compact, but it serves my needs.

BEGIN_Mmm_DD="-----"
BEGIN_HH_MM_SS="--:--:--"

HOST=hostX

cd /tmp/logs
LOG_LIST=`ls -1t log_${HOST}*`

for LOG_FILE in $LOG_LIST
  do
  
  # Determine the extension of the log file
  EXTENSION=$(echo ${LOG_FILE}|awk -F\. '{print $3}')
  EXTENSION=${EXTENSION:-none}
  if [ "${EXTENSION}" != "gz" ]
  then
    # Try to determine begin date and time 
    if grep BEGIN ${LOG_FILE} > /dev/null
    then
      BEGIN_Mmm_DD=`ls -als ${LOG_FILE} | awk '{print $7 $8}'`
      BEGIN_HH_MM_SS=`grep BEGIN ${LOG_FILE} | awk '{print $3}'`
      break # we found the date and time, so we can leave the for loop
    fi
  else
    # Try to determine begin date and time
    if gzcat ${LOG_FILE} | grep BEGIN > /dev/null
    then
      BEGIN_Mmm_DD=`ls -als ${LOG_FILE} | awk '{print $7 $8}'`
      BEGIN_HH_MM_SS=`gzcat ${LOG_FILE} | grep BEGIN | awk '{print $3}'`
      break # we found the date and time, so we can leave the for loop
    fi
  fi

done
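
A more compact variant of the loop above might move the gz/plain distinction into a small helper, so the BEGIN test is written only once. This is a sketch under assumptions: `show_log` and `FOUND` are hypothetical names, `gzip -dc` is used as a stand-in for `gzcat`, the sample files are invented, and only the first BEGIN line of a file is used. It records the file name instead of parsing `ls` output for the date.

```shell
# Hypothetical demo directory and sample data, for illustration only
DEMO_DIR=${TMPDIR:-/tmp}/begin_scan_demo.$$
mkdir -p "$DEMO_DIR" && cd "$DEMO_DIR" || exit 1

printf 'log5 BEGIN 01:20:00\nlog5 continue 01:20:00\n' > log_hostX.Jan01_0120
printf 'log6 continue 01:25:00\n' > log_hostX.Jan01_0125
gzip log_hostX.Jan01_0120 log_hostX.Jan01_0125
printf 'log8 continue 01:35:00\nlog8 END 01:35:00\n' > log_hostX
touch -t 202001010120 log_hostX.Jan01_0120.gz
touch -t 202001010125 log_hostX.Jan01_0125.gz
touch -t 202001010135 log_hostX

# Hypothetical helper: write a log to stdout, decompressing *.gz files
show_log() {
  case $1 in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac
}

HOST=hostX
BEGIN_FILE=""
BEGIN_HH_MM_SS="--:--:--"

# Newest first; stop at the first file that contains BEGIN
for LOG_FILE in $(ls -1t log_${HOST}*)
do
  # Third field of the first BEGIN line, empty if there is none
  FOUND=$(show_log "$LOG_FILE" | awk '/BEGIN/ {print $3; exit}')
  if [ -n "$FOUND" ]
  then
    BEGIN_FILE=$LOG_FILE
    BEGIN_HH_MM_SS=$FOUND
    break
  fi
done

echo "$BEGIN_FILE $BEGIN_HH_MM_SS"

# Clean up the demo directory
cd / && rm -rf "$DEMO_DIR"
```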

Not sure I understand your request.
So the end time (keyword END) will be in the last file, and the begin time (keyword BEGIN) will be in the same file or in one of its four predecessors. Why not loop up to five times through ls -t log_hostX* (watch out, no -r option!), grepping each file for "BEGIN"? If there's more than one BEGIN and you need the last one, try tac-ing the files.
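
A rough sketch of that suggestion, under some assumptions: the list is capped at five files with `sed 5q`, and the last BEGIN in a file is picked with an awk `END` rule instead of `tac`, which may not be installed everywhere. `MAX_FILES`, `LAST` and the sample data are invented for the example, and `gzip -dc` stands in for `gzcat`.

```shell
# Hypothetical demo directory and sample data, for illustration only
DEMO_DIR=${TMPDIR:-/tmp}/last_begin_demo.$$
mkdir -p "$DEMO_DIR" && cd "$DEMO_DIR" || exit 1

printf 'x BEGIN 01:30:00\nx END 01:31:00\nx BEGIN 01:32:00\n' > log_hostX.Jan01_0130
gzip log_hostX.Jan01_0130
printf 'y continue 01:35:00\ny END 01:35:00\n' > log_hostX
touch -t 202001010130 log_hostX.Jan01_0130.gz
touch -t 202001010135 log_hostX

MAX_FILES=5
BEGIN_HH_MM_SS="--:--:--"

# Newest first, at most MAX_FILES files (sed Nq quits after line N)
for LOG_FILE in $(ls -1t log_hostX* | sed "${MAX_FILES}q")
do
  # Remember the third field of every BEGIN line, print only the last one
  case $LOG_FILE in
    *.gz) LAST=$(gzip -dc "$LOG_FILE" | awk '/BEGIN/ {t = $3} END {print t}') ;;
    *)    LAST=$(awk '/BEGIN/ {t = $3} END {print t}' "$LOG_FILE") ;;
  esac
  if [ -n "$LAST" ]
  then
    BEGIN_HH_MM_SS=$LAST
    break
  fi
done

echo "$BEGIN_HH_MM_SS"

# Clean up the demo directory
cd / && rm -rf "$DEMO_DIR"
```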

Indeed, this was the solution; I was on the wrong track.
That happens when you keep expanding the code, or adapting it to small changes.
In my case, the BEGIN/END sequence was first in the last file only, then spread over 2 files, then over 3, and currently over 4.
So I kept looking for the next file instead of looking for a sound solution.
Now it can be in any file, so the script is future-proof.
Thanks for responding.