Grep -c text processing of a log file

I have a log file in the format below.

Log File:

1 started job on date & time JOB-A
2 started job on date & time JOB-B
3 completed job on data & time JOB-A
4 started job on date & time JOB-C
5 started job on date & time JOB-D
6 completed job on data & time JOB-B
7 started job on date & time JOB-E
8 started job on date & time JOB-F
9 completed job on data & time JOB-C
10 completed job on data & time JOB-D 
12 completed job on data & time JOB-F

As shown above, JOB-E never completed, so its "completed" entry is missing from the log file. I am trying to find all jobs like JOB-E, that is, jobs that have a "started" entry but no "completed" entry. I could run

grep -c "started" Logfile
grep -c "completed" Logfile 

and compare the counts, but that doesn't tell me which jobs did not complete. I am sure there is a better way to do it. One more thing: I don't have a list of the job names.

Try:

while read -r _ _ _ _ _ _ _ JOBID
do
	# Count the log lines that mention this job ID
	count=$(grep -c -- "$JOBID" "./LOGFILE")
	if [ "$count" -eq 2 ]
	then
		echo "Job $JOBID: All good!"
	else
		echo "Job $JOBID: Invalid amount of job entries ($count)..."
	fi
done < LOGFILE

hth
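The loop above prints a result for every log line, so a finished job is reported twice, and it rescans the file for each line. A minimal sketch of one way to report each job only once, assuming the job ID is always the last whitespace-separated field; `unfinished_jobs` is a hypothetical helper name, not something from the thread:

```shell
# Hypothetical helper: report every job ID that does not appear exactly twice.
# Assumes the job ID is the last whitespace-separated field of each line.
unfinished_jobs() {
    logfile=$1
    # Collect each job ID once, skipping blank lines.
    awk 'NF { print $NF }' "$logfile" | sort -u |
    while read -r jobid
    do
        # Count the lines whose last field is exactly this job ID.
        count=$(awk -v id="$jobid" '$NF == id { n++ } END { print n + 0 }' "$logfile")
        [ "$count" -eq 2 ] || echo "Job $jobid: unexpected number of entries ($count)"
    done
}
```

Called as `unfinished_jobs LOGFILE`, it stays silent for jobs with exactly two entries. Comparing the last field exactly, rather than grepping for a substring, should also keep a name like JOB-A from matching JOB-AB.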

Try also

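declare -A CNT
while read -r _ _ _ _ _ _ _ JOBID
do
	# A job seen twice (started + completed) is done, so drop it
	if ((++CNT[$JOBID] == 2))
	then
		# Quoted so the shell cannot treat [...] as a glob pattern
		unset "CNT[$JOBID]"
	fi
done < file
echo "${!CNT[@]}"
JOB-E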

Thanks, sea and RudiC. It was fast and elegant as usual.

I am getting an error with:

bash -x script
+ declare -A CNT
+ read _ _ _ _ _ _ _ JOBID
+ (( ++CNT[JOB-A] == 2 ))
+ read _ _ _ _ _ _ _ JOBID
+ (( ++CNT[JOB-B] == 2 ))
+ read _ _ _ _ _ _ _ JOBID
+ (( ++CNT[JOB-A] == 2 ))
+ unset 'CNT['
./rfnc.v: line 4: unset: `CNT[': not a valid identifier
+ echo 'JOB-A]'
JOB-A]
+ read _ _ _ _ _ _ _ JOBID

The unset somehow complains but works!?

When you typed in the code, you introduced an extra space, as in then unset CNT[ $JOBID]. Make sure it reads unset CNT[$JOBID], with no space between the [ and the $.

Ala, thanks. I had found my mistake, real close to what you wrote:

then unset CNT[$JOBIDl

'l' instead of ].
Must change the colors of my terminal session!
Thanks to all!

Yes. With the extra character appended to $JOBID, the variable the shell sees is $JOBIDl, which does not exist and therefore expands to nothing, so unset is handed just CNT[ and rejects it as an invalid identifier.
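The expansion described above can be reproduced in isolation. This is a minimal sketch using the variable name from the thread; the function names are hypothetical, and the `set -u` (nounset) check is an added suggestion, not something the posters proposed:

```shell
# What the mistyped subscript expands to when nounset is off (the default).
typo_expansion() (
    set +u
    JOBID=JOB-A
    # "$JOBIDl" names a different, unset variable, so it expands to nothing.
    printf '%s\n' "CNT[$JOBIDl"
)

# The same expansion with nounset on: the shell reports an unbound variable
# and the subshell fails instead of silently substituting an empty string.
typo_with_nounset() (
    set -u
    JOBID=JOB-A
    printf '%s\n' "CNT[$JOBIDl"
)

typo_expansion                      # prints: CNT[
if typo_with_nounset 2>/dev/null
then
    echo "typo went unnoticed"
else
    echo "typo caught by set -u"
fi
```

Putting `set -u` near the top of a script is one way to have a misspelled variable name fail loudly on first use.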

awk version, similar logic:

awk '{if($NF in A) delete A[$NF]; else A[$NF]} END{for(i in A) print i}' file

More concise version:

awk 'A[$NF]++{delete A[$NF]} END{for(i in A) print i}' file

Different logic, keyed on the started/completed keywords instead of on how often a job appears:

awk '/started/{A[$NF]} /completed/{delete A[$NF]} END{for(i in A) print i}' file
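For a self-contained run of the keyword-based approach, here is a sketch that feeds the sample log from the question to awk. It assumes the status is always field 2 and the job ID is always the last field; `pending_jobs` is a hypothetical helper name, and the `sort` is only there to make the output order predictable:

```shell
# Hypothetical helper: print jobs with a "started" line but no "completed" line.
# Assumes field 2 holds the status and the last field holds the job ID.
pending_jobs() {
    awk '
        $2 == "started"   { pending[$NF] = 1 }      # remember every started job
        $2 == "completed" { delete pending[$NF] }   # forget it once it completes
        END { for (job in pending) print job }      # awk leaves this order unspecified
    ' "$1" | sort
}

# Sample data copied from the question (there is no "completed" line for JOB-E).
log=$(mktemp)
cat > "$log" <<'EOF'
1 started job on date & time JOB-A
2 started job on date & time JOB-B
3 completed job on data & time JOB-A
4 started job on date & time JOB-C
5 started job on date & time JOB-D
6 completed job on data & time JOB-B
7 started job on date & time JOB-E
8 started job on date & time JOB-F
9 completed job on data & time JOB-C
10 completed job on data & time JOB-D
12 completed job on data & time JOB-F
EOF
pending_jobs "$log"    # prints: JOB-E
rm -f "$log"
```

One difference from the counting variants: a job that logs "started" twice without ever completing is still listed here, whereas counting to two would treat it as finished.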