Script to check file sequence

Hi everyone,

I need help creating a script that checks whether the files in a particular directory are in sequence. These are log files that are generated throughout the day.

An example file name is ABC01_YYMMDDHHMM###### (ABC01_0904161829000001).

Sometimes a file in the sequence is skipped, and I need to determine which one.

Could someone please help me?

Thank you all very much.

I assume the last 6 digits are the running sequence and the maximum it can reach is 999999.
First, make a base list:

for i in {1..99999}; do printf "%.6d\n" $i; done > base.txt

Then build the list to check:

ls ABC01_* | sed "s/.*\(......\)$/\1/"|sort -n > check.txt

Finally, use diff:

diff base.txt check.txt
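
The three steps above can also be sketched as one function. This is only a sketch under assumptions: it generates the base list with awk rather than `{1..N}` (brace expansion is not available in every shell), it assumes a fixed-width 4-digit suffix (widen the pattern and the printf format for 6 digits), and it compares only the range between the lowest and highest suffix found, so unused numbers outside that range are not reported. The name `missing_by_diff` is hypothetical.

```shell
#!/bin/sh
# Sketch: list sequence numbers missing between the lowest and highest
# suffix found in a directory. Assumes names look like ABC01_<digits> and
# end in a 4-digit sequence. missing_by_diff is a hypothetical helper name.
missing_by_diff() {
    dir=$1
    tmp=${TMPDIR:-/tmp}/seqcheck.$$
    # Step 1: suffixes actually present. They are fixed width, so text
    # order is the same as numeric order.
    ls "$dir" | sed -n 's/^ABC01_.*\([0-9][0-9][0-9][0-9]\)$/\1/p' | sort -u > "$tmp.check"
    first=$(head -n 1 "$tmp.check")
    last=$(tail -n 1 "$tmp.check")
    if [ -n "$first" ]; then
        # Step 2: the base list, generated with awk instead of {1..N}.
        awk -v a="$first" -v b="$last" \
            'BEGIN { for (i = a + 0; i <= b + 0; i++) printf "%04d\n", i }' > "$tmp.base"
        # Step 3: lines that appear only in the base list are missing.
        comm -23 "$tmp.base" "$tmp.check"
    fi
    rm -f "$tmp.check" "$tmp.base"
}

# Usage (directory path is an example):
#   missing_by_diff /data/data01/ARCHIVE/ABC
```

`comm -23` is used here instead of `diff` so that the output contains only the missing numbers, one per line.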

Hi ghostdog74,

Thank you for your suggestion.

But when I ran your code

for i in {1..99999}; do printf "%.6d\n" $i; done > base.txt
it returned the following error

printf: {1..99999}: invalid number

The other two commands ran fine, though.

The other thing I found out is that only the last 4 digits are running numbers as in the example below.

ABC01_YYMMDDHHMMSS#### (ABC01_0904161829290001)

I currently have a simple script that lists the day's files in the directory, and I was hoping to integrate the sequence check into it.

Please advise me if that is possible. My script is below:

while true
do
ll /data/data01/ARCHIVE/ABC | grep 'ABC01_' | grep "`date '+%b %e'`" | awk '{print $5, $6, $7, $8, $9}'
echo
date
sleep 60
done

My script output is:

510149 Apr 22 12:01 ABC01_0904221153032890
508721 Apr 22 12:01 ABC01_0904221154042891
509632 Apr 22 12:01 ABC01_0904221155052892
508150 Apr 22 12:01 ABC01_0904221156082893
508451 Apr 22 12:01 ABC01_0904221157092894
509378 Apr 22 12:01 ABC01_0904221158072895
509437 Apr 22 12:01 ABC01_0904221159072896
508824 Apr 22 12:01 ABC01_0904221200012897
508270 Apr 22 12:01 ABC01_0904221200592898

Wed Apr 22 12:18:47 MYT 2009

Ideally the script would check the files for the correct sequence and display a message such as "File missing from sequence ABC01_090416182929####" when a file is missing.

Thank you for your input, really appreciate it.

Which OS are you using?

Please try:

for each in `seq -w 1 9999`
do
[[ -f "ABC01_090416182929$each" ]] || echo "File ABC01_090416182929$each does not exist";
done
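
Because the timestamp part of the name changes with every file, a fixed prefix such as ABC01_090416182929 only matches files written in that one second. A minimal sketch that matches on the sequence suffix instead, whatever the timestamp is, might look like this. It assumes every name ends in the 4-digit sequence; `seq_exists` and `report_missing_range` are hypothetical helper names, and the loop uses `while` rather than `seq`, which is not installed on every system.

```shell
#!/bin/sh
# Sketch: test for a file by its 4-digit sequence suffix, ignoring the
# timestamp in the middle of the name.
# seq_exists and report_missing_range are hypothetical helper names.
seq_exists() {
    # $1 = directory, $2 = zero-padded sequence, e.g. 0042
    for f in "$1"/ABC01_*"$2"; do
        [ -f "$f" ] && return 0
    done
    return 1
}

report_missing_range() {
    # $1 = directory, $2 = first number, $3 = last number
    # Pass plain decimal numbers without leading zeros (8, not 0008).
    n=$2
    while [ "$n" -le "$3" ]; do
        padded=$(printf '%04d' "$n")
        seq_exists "$1" "$padded" || echo "File ABC01_*$padded does not exist"
        n=$((n + 1))
    done
}

# Usage (directory path and range are examples):
#   report_missing_range /data/data01/ARCHIVE/ABC 2890 2898
```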

Hi,

Check the code below.

infile is the input file containing the list of ABC01_* files.
It can be created by adding the following commands to the start of the script below:

cd <dirname>
ls -l | awk '$9 ~ /^ABC01_/ {print $9}' > infile

outfile is the file containing the missing sequence numbers.

I ran the script; the output is shown below.

 
# more test
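#!/bin/sh
# Report gaps in the 4-digit sequence that ends each name listed in infile.
# infile must be sorted by sequence number.
flag=0          # stays 0 until the first line has been read
j=0             # next sequence number we expect to see
: > outfile     # start with an empty outfile
while read line
do
        i=`echo $line | cut -c19-`      # characters 19 onward hold the sequence
        # every number between the expected one and the one found is missing
        while [ $flag -ne 0 -a $j -lt $i ]
        do
        echo "file ending with the sequence $j is missing" >> outfile
        j=`expr $j + 1`
        done
        j=`expr $i + 1`
        flag=1
done < infile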
# more infile
ABC01_0904221153032890
ABC01_0904221154042892
ABC01_0904221154042893
ABC01_0904221154042895
ABC01_0904221154042896
# echo
# more outfile
file ending with the sequence 2891 is missing
file ending with the sequence 2894 is missing
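
The same check can be sketched as a single awk pass, which avoids starting `expr` and `cut` for every line. This is a sketch under the same assumptions as the script above: the list is sorted, and every name ends in a 4-digit sequence. The name `report_gaps` is hypothetical, and wrap-around from 9999 back to 0000 is not treated as a gap.

```shell
#!/bin/sh
# Sketch: report every gap in one awk pass over a sorted list of names.
# Assumes each name ends in a 4-digit sequence.
# report_gaps is a hypothetical helper name.
report_gaps() {
    # $1 = file with one name per line (like infile above)
    awk '{
        # last 4 characters of the name, converted to a number
        seq = substr($0, length($0) - 3) + 0
        if (NR > 1)
            for (n = prev + 1; n < seq; n++)
                printf "file ending with the sequence %04d is missing\n", n
        prev = seq
    }' "$1"
}

# Usage:
#   report_gaps infile > outfile
```

If several files in a row are missing, each missing number gets its own line.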