Check if file is present using input from another file

Hello,

I have a comma delimited file such as:

cat /statistics/support/input.txt
ID,Serial,quantity,attribute1,attribute2
1,89569698,5,800,9900,
1,35568658,8,1200,5550
1,89569698,8,320,5500
1,68753584,85,450,200

ID should always have 1 digit, Serial 8 digits, and the others may vary.
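
The format rule above can be checked before anything else runs. This is only a minimal sketch, assuming the header is always the first line of the file; `validate_input` is a made-up helper name, not something from the original post.

```shell
# validate_input FILE  (hypothetical helper, not part of the original post)
# Prints every data line whose ID is not 1 digit or whose Serial is not
# 8 digits, and returns non-zero if any such line was found.
validate_input() {
  awk -F, '
    NR == 1 { next }                                  # skip the header line
    $1 !~ /^[0-9]$/ || length($2) != 8 || $2 !~ /^[0-9]+$/ {
      printf "line %d: bad ID or Serial: %s\n", NR, $0
      bad = 1
    }
    END { exit (bad + 0) }                            # 0 = all lines OK
  ' "$1"
}
```

It could be called as `validate_input /statistics/support/input.txt` before the comparison against /logs/ is attempted.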

In a separate folder, I should have a txt file for each entry in the above file, such as:

ls /logs/
abc_89569698_2013_01_21.txt
bcd_35568658_2013_01_21.txt
cyz_89569698_2013_01_22.txt
ccc_68753584_2013_01_21.txt
zbv_21456774_2013_01_21.txt

If possible, I would like an awk script that compares the serials in input.txt against the file names in the /logs/ folder and does the following:

  • build a log in /statistics/support/statistics.log with the number of matched serials and the total number of files in /logs;

  • if duplicates are found, output them in the log, for example:
89569698: 2013_01_21, 2013_01_22

  • for every file whose serial isn't in input.txt, create a directory and subdirectory in /logs/ called /missing/21456774 (named after the serial) and then move the file there.

Thanks in advance for any help.

Try this:

find /logs -type f -print | awk '
FNR==NR{if ($1!="ID") S[$2]; next}
{
  if($3 in F) {
    F[$3]=F[$3]", "$4"_"$5"_"$6
    D[$3]=F[$3] }
  else F[$3]=$4"_"$5"_"$6
}
!($3 in S) {
    system("mkdir -p /missing/"$3)
    system("mv "$0" /missing/"$3"/")
}
END {
   for(serial in S) { count++ }
   for(serial in F) { found++ }
   print "Serials: " count " Matched: " found " Files: "FNR
   for(serial in D) {
      print serial": " D[serial]
   }
}' FS=, /statistics/support/input.txt FS='[_.]' - > /statistics/support/statistics.log
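
A quick way to see which field holds the serial is to split a single path by hand. This is just a sketch: `show_fields` is a hypothetical helper, and the path is one of the file names from the question with the /logs prefix that `find /logs` would print.

```shell
# show_fields PATH  (hypothetical helper)
# Prints each field awk sees when the path is split on "_" and ".".
show_fields() {
  printf '%s\n' "$1" |
    awk -F'[_.]' '{ for (i = 1; i <= NF; i++) printf "$%d=%s\n", i, $i }'
}

show_fields /logs/abc_89569698_2013_01_21.txt
# $1=/logs/abc
# $2=89569698
# $3=2013
# $4=01
# $5=21
# $6=txt
```

Because the directory part stays attached to the first field, the serial lands in $2 and the date parts in $3, $4 and $5 for paths of this shape.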

Hi,
Thank you for your reply,
To get it working I made the following adjustments (I also added a listing of the serials that are missing from input.txt):

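find /logs -type f -print | awk '
# First file (input.txt): remember each serial, skipping the header line
FNR==NR{if ($1!="ID") S[$2]; next}
# Paths from find look like /logs/abc_89569698_2013_01_21.txt,
# so with FS="[_.]" the serial is $2 and the date is $3, $4, $5
{
  if($2 in F) {
    F[$2]=F[$2]", "$3"_"$4"_"$5
    D[$2]=F[$2] }
  else F[$2]=$3"_"$4"_"$5
}
# Serial not listed in input.txt: move the file to /missing/<serial>/
!($2 in S) {
  system("mkdir -p /missing/"$2)
  system("mv "$0" /missing/"$2"/")
  print "Missing serial: " $2
}
END {
  for(serial in S) { count++ }
  # Count only serials that appear both in /logs and in input.txt
  for(serial in F) { if (serial in S) found++ }
  print "Serials: " count " Matched: " found " Files: "FNR
  for(serial in D) {
    print serial": " D[serial]
  }
}' FS=, /statistics/support/input.txt FS='[_.]' - > /statistics/support/statistics.log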
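
For trying this out on copies before touching the real /logs, the same idea can be wrapped in a function that takes the paths as arguments. This is only a sketch under a few assumptions: `check_logs` and its three parameters are names made up here, only `*.txt` files are considered, and file names are assumed not to contain double quotes or `$`. It splits the bare file name rather than the whole path, so the directory name does not shift the field numbers.

```shell
# check_logs INPUT_CSV LOGS_DIR MISSING_DIR  (hypothetical helper name)
# Prints a summary line plus one line per duplicated serial, and moves
# files whose serial is not in INPUT_CSV to MISSING_DIR/<serial>/.
check_logs() {
  input=$1 logs=$2 missing=$3
  find "$logs" -type f -name '*.txt' -print | awk -F, -v missing="$missing" '
    # First file: remember every serial listed in the CSV (skip the header)
    FNR == NR { if ($1 != "ID") wanted[$2]; next }
    {
      files++
      n = split($0, part, "/")          # part[n] is the bare file name
      split(part[n], f, "[_.]")         # f[2] = serial, f[3..5] = date
      serial = f[2]
      date = f[3] "_" f[4] "_" f[5]
      if (!(serial in wanted)) {
        # Not in the CSV: move the file out of the way
        dir = missing "/" serial
        system("mkdir -p \"" dir "\" && mv \"" $0 "\" \"" dir "/\"")
        moved++
        next
      }
      if (serial in dates) { dates[serial] = dates[serial] ", " date; dup[serial] }
      else { dates[serial] = date; matched++ }
    }
    END {
      printf "Matched serials: %d Files: %d Moved: %d\n", matched, files, moved
      for (s in dup) print s ": " dates[s]
    }
  ' "$input" -
}
```

A call such as `check_logs /statistics/support/input.txt /logs /missing > /statistics/support/statistics.log` would then write the log. The order of the dates on a duplicate line follows the order in which find lists the files, which is not guaranteed to be sorted.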

Best Regards