Checking for missing files based on sequence number

Hi All,

I have a requirement to list only the missing sequence numbers using a Unix script.

For Example:
Input:

FILE_001.txt
FILE_002.txt
FILE_005.txt
FILE_006.txt
FILE_008.txt
FILE_009.txt
FILE_010.txt
FILE_014.txt

Output:

FILE_003.txt
FILE_004.txt
FILE_007.txt
FILE_011.txt
FILE_012.txt
FILE_013.txt 

(or)

003
004
007
011
012
013

If anyone can help me out, that would be great.

Regards,
Arun

Hi, the following works in ksh.

This assumes that file names have the form FILE_{num}.txt and that every file number has exactly 3 digits.

#!/bin/ksh

#Do a file listing and count and set variables

ls FILE* > answers.txt
tot_line=`wc -l answers.txt | awk '{print $1}'`
typeset -Z3 file_missing
current_line=1
next_line=2

#loop through list of files getting the file number

while [ $current_line -lt $tot_line ]
do
current_file=`awk 'NR=='$current_line'' answers.txt | sed 's/\(.*_\)\(.*[0-9]\)\(.*$\)/\2/'`
next_file=`awk 'NR=='$next_line'' answers.txt | sed 's/\(.*_\)\(.*[0-9]\)\(.*$\)/\2/'`

#find out if files are sequential

sequence=`expr $next_file - $current_file`

#If files not sequential start loop to print list of unsequential files

if [ $sequence -ne 1 ]
then
curr_file=$current_file
while [ $sequence -gt 1 ]
do

file_missing=`expr $curr_file + 1`
curr_file=`expr $curr_file + 1`

echo "FILE_$file_missing.txt is missing"
sequence=`expr $sequence - 1`
done

fi

current_line=`expr $current_line + 1`
next_line=`expr $next_line + 1`

done

output

FILE_003.txt is missing
FILE_004.txt is missing
FILE_007.txt is missing
FILE_011.txt is missing
FILE_012.txt is missing
FILE_013.txt is missing

bash

for i in {1..14}
do
seq=`printf "%03d" $i`
if [ ! -f "FILE_${seq}.txt" ]
then
echo "FILE_${seq}.txt"
fi
done
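
The loop above hardcodes the range 1 through 14 and relies on bash brace expansion. As a rough sketch only, the same check can be wrapped in a function that takes the range as arguments and sticks to POSIX shell features. The name list_missing_in_range is made up for this example, and it assumes the FILE_NNN.txt naming from the question and that the two numbers are passed without leading zeros.

```shell
# Sketch: print every FILE_NNN.txt name in the given range that does not
# exist in the current directory.
# list_missing_in_range is a hypothetical helper, not part of any posted script.
# Assumption: $1 and $2 are plain decimal numbers without leading zeros.
list_missing_in_range() {
    first=$1
    last=$2
    i=$first
    while [ "$i" -le "$last" ]
    do
        # Build the zero-padded name for this sequence number.
        name=$(printf 'FILE_%03d.txt' "$i")
        # Report the name only when no regular file with that name exists.
        [ -f "$name" ] || printf '%s\n' "$name"
        i=$((i + 1))
    done
}
```

For the example in the question it would be called as: list_missing_in_range 1 14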

You might also try the following ksh script, which uses only shell built-ins (so it runs a little faster than andy391791's script). Unlike andy391791's script, it adds some error checking and works with any shell that performs the basic parameter expansions required by the POSIX standards, not just a Korn shell. And, unlike looney's bash script, it reports missing files between the lowest numbered and the highest numbered existing files whose names match FILE_[0-9][0-9][0-9].txt, instead of only looking for missing files in the range 001 through 014.

#!/bin/ksh
first=1
missing=0

nextfile() {
	seq=$((seq + 1))
	next=$(printf 'FILE_%03d.txt' "$seq")
}

for i in FILE_[0-9][0-9][0-9].txt
do	if [ "$first" = 1 ]
	then	if [ ! -f "$i" ]
		then	printf 'No files matching pattern found.\n' >&2
			exit 1
		fi
		seq=${i#FILE_}
		seq=${seq#0}	# Avoid problems with some shells treating
		seq=${seq#0}	# numbers with leading zeros as octal.
		seq=${seq%.txt}
		first=
		continue
	fi
	nextfile
	while [ "$next" != "$i" ]
	do	printf '%s is missing\n' "$next"
		missing=$((missing + 1))
		nextfile
	done
done
if [ "$missing" = 0 ]
then	printf 'No missing files detected.\n'
fi
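
The two seq=${seq#0} lines in the script above are there because some shells read a number such as 008 inside $(( )) as octal and reject it. A small sketch of the same idea for numbers of any width follows; strip_zeros is a hypothetical helper name used only for illustration.

```shell
# Sketch: remove leading zeros from a decimal string so that shell
# arithmetic does not treat it as octal.
# strip_zeros is a hypothetical helper, not part of the script above.
strip_zeros() {
    n=$1
    # Drop one leading zero at a time, but always keep the last digit
    # so that "000" becomes "0" rather than an empty string.
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]
    do
        n=${n#0}
    done
    printf '%s\n' "$n"
}
```

For example, seq=$(strip_zeros 008) leaves 8 in seq, which can then be used safely in $((seq + 1)).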

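Another one with awk. It prints only the missing numbers, and any name that contains no number is echoed with a >> prefix.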

ls | awk '
function xmatch() {
  if ( match($0,/[0-9]+/) ) {
    x=substr($0,RSTART,RLENGTH)+0
  } else {
    print ">>"$0
    x=0
  }
}
n {
  xmatch(); if (x) {
    for ( ; ++n<x; ) { printf "%0*d\n", RLENGTH, n }
  }
  next
}
{
  xmatch(); n=x
}
'
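
The awk script above prints just the numbers (the second output format asked for). If full file names are wanted instead, a variation along the following lines might do; treat it as a sketch. missing_names is a made-up function name, and it assumes the names arrive on standard input one per line, already sorted, each with a single zero-padded number.

```shell
# Sketch: read sorted file names on stdin and print the names that are
# missing from the numeric sequence, keeping prefix, padding and suffix.
# missing_names is a hypothetical helper, not part of the script above.
missing_names() {
    awk '
    match($0, /[0-9]+/) {
        num   = substr($0, RSTART, RLENGTH) + 0   # numeric value (+0 drops leading zeros)
        width = RLENGTH                           # width of the zero padding
        pre   = substr($0, 1, RSTART - 1)         # text before the number
        post  = substr($0, RSTART + RLENGTH)      # text after the number
        if (seen) {
            # Build the printf format by concatenation, e.g. "%s%03d%s\n".
            fmt = "%s%0" width "d%s\n"
            for (n = prev + 1; n < num; n++)
                printf fmt, pre, n, post
        }
        prev = num
        seen = 1
    }'
}
```

It can be fed from a listing, for example: ls FILE_*.txt | missing_names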