UNIX ksh - To print the PID number and repeat count

I was asked this question in an interview today: produce output listing each PID number and the count of how many times that PID was logged today. Here is the script I wrote. Can you confirm whether it would work? The interviewer didn't say whether my answer was correct. Can someone review this for me? They want each unique PID:number and its count printed, separated by a tab, like PID:5674 10

Log file abc.log

PID:6543 ...
...
PID:4325 ...
...
PID:6543 ...

Log file xyz.log

PID:8888 ...
...
PID:9992 ...
...
PID:6543 ...

Note: The PID numbers can repeat in a file. And also one PID number can appear in multiple log files.

• If today's and the previous day's log files are in the same folder

#!/bin/ksh

cd /A/B/ 
for a in `ls -lrt | grep "Mar 24" | awk '{print $9}'`;    # list of files generated today
do
grep "^PID:" $a | cut -d " " f1  >> /tmp/abc.log   # saving first column, which looks like PID:23456
done

for b in `cat /tmp/abc.log | sort -u`;
do
x=grep $b /tmp/abc.log | grep -v grep | wc;
echo $b"    "$x    # will print like PID:23456  10(count)
done

• If today's log files are in a different folder

#!/bin/ksh

cd /A/B/
for a in `ls /A/B/*.log`
do
grep "^PID:" $a | cut -d " " f1 >> /tmp/abc.log
done

for b in `cat /tmp/abc.log | sort -u`;
do
x=grep $b /tmp/abc.log | grep -v grep | wc;
echo $b"    "$x
done

Now - what would be YOUR OWN results running your scripts (which BTW don't really differ except for the ls targets and the date selection)?

I can immediately see two syntax errors and one semantic error, as well as several opportunities for improvement / optimisation. With the errors removed, the scripts should deliver what was requested.

Hi RudiC,

I don't have a Unix terminal at the moment, so I have not tested it; I can do that on Monday when I reach home. I was curious to know whether my scripts would work. Can you point out the errors you can see? (Is it the ; after the for statement in the first for loop?)

I - and not I alone - would usually recommend trying to find errors or opportunities oneself, in order to

  • experience a good learning step.
  • become a real IT person (programmer, admin, whatever).

So - how about you check on Monday, and if you don't get it running, you come back to get help? And, it's NOT the ; as this is not necessary on a line break.

You posting the corrected versions here would also be appreciated.

Hi Rudic,

The code is working fine and the output as expected. Thanks for your help on it.

Here is an improved version of your script.

#!/bin/ksh

cd /A/B/ || exit    # exit if failed
for a in `ls -lrt | awk '/ Mar 24 / {print $9}'`    # list of files generated today
do 
  grep "^PID:" "$a" | cut -d " " -f1   # print the first column, which looks like PID:23456
done > /tmp/abc.log   # all output in the for-do-done block goes here

# for each unique PID ...
for b in `sort -u < /tmp/abc.log`   # cat file |  is overhead compared to  < file
do
  x=`fgrep -xc "$b" /tmp/abc.log`    # whole line match! And grep can count
  echo "$b    $x"    # will print like PID:23456  10(count)
done

In principle, a $var in a command argument should be in quotes, like "$var"; otherwise the shell tries to expand it (word splitting, globbing).
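A small sketch of what the quoting changes, in case it helps. The function count_args and the file name are made up for illustration; the name contains a space on purpose, and the default IFS is assumed.

```shell
# Hypothetical helper: prints how many arguments it received
count_args() { echo "$#"; }

# Hypothetical file name containing a space
name="two words.log"

count_args $name      # unquoted: word splitting turns it into 2 arguments
count_args "$name"    # quoted: passed through as 1 argument
```

With the unquoted form, grep or cut would be handed two file names that do not exist instead of the one that does.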
This is standard shell code. If you know awk and its associative (hashed) arrays, a smart all-in-awk solution comes into sight...
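One way such an all-in-awk approach might look, as a sketch rather than the definitive answer. The function name count_pids is invented here. It assumes the PID is the first space-separated field of lines starting with PID:, as in the sample logs, and that the log files are passed as arguments. The final sort is only there to give a stable order, since awk's for-in order is unspecified.

```shell
# Hypothetical helper: count how often each PID:number appears
# across all files given as arguments, in one awk pass.
count_pids() {
  awk '
    /^PID:/ { count[$1]++ }    # $1 is the first field, e.g. PID:6543
    END {
      for (pid in count)       # one output line per unique PID
        printf "%s\t%d\n", pid, count[pid]
    }
  ' "$@" | sort
}
```

It could be called as count_pids /A/B/*.log, or with the list of today's files selected as in the script above. No temporary file is needed, and each log file is read only once.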