Kill pid

I created a script to kill long-running processes by PID.

I am getting the following error message:

-f command cannot be found.

I also want to count the number of pids that are killed and append the results to a text file. I am new to shell script programming.

1. The first part of the code exports a variable naming a text file with a column of PPIDs.

pid.txt contents are as follows:

Ppid
5569000
6789034
4567890
1234567
5678908
3457892
2. The second part of the code changes the permissions on the PpidFile.

3. The third part of the code loops through the column in the text file and kills each Ppid.

4. Finally, a line of text is appended to a text file. The text contains the total number of PIDs killed and the date.

Here is my code:

export PpidFile="/path/to/pids.txt"
if [[ -f "$PpidFile" ]]
then
	/bin/chmod 755 $PpidFile
	Ret=$?
	if [ 0 -eq $Ret ]
	then
		for RelatedEachPid in `/bin/grep -v "Ppid" $PpidFile | /usr/bin/tr "\n" " "`
		do
			/bin/echo "kill -9 $RelatedEachPid"
			/bin/kill -9 $RelatedEachPid
			Ret=$?
			if [ 0 -ne $Ret ]
			then
				/bin/echo "kill -9 $RelatedEachPid Fail"
			else
				/bin/echo "kill -9 $RelatedEachPid PIDkill"
			fi

echo "total of pids killed: wc-l $PIDkill -  $date" >> pidkill.txt

What shell are you using?

There is no reason to export a variable that you are not passing to another shell execution environment.

There is no reason to make a file that does not contain any executable text executable. Consider changing:

if [[ -f "$PpidFile" ]]
then
	/bin/chmod 755 $PpidFile
	Ret=$?
	if [ 0 -eq $Ret ]
	then	...

to something more like:

if [ -r "$PpidFile" ]
then	...

There is no need for the tr command in your command substitution. And a:

while read RelatedEachPid
do	if [ "$RelatedEachPid" = "Ppid" ]
	then	continue
	fi
	...
done < "$PpidFile"

would be much more efficient than:

for RelatedEachPid in `/bin/grep -v "Ppid" $PpidFile | /usr/bin/tr "\n" " "`
do	...
done

There is no done terminating your for loop.

There is no fi terminating your first two if statements.

There is nothing that sets the variables PIDkill and date before they are used.

It looks like you might be trying to count the lines in the file named by the variable PIDkill with the command:

echo "total of pids killed: wc-l $PIDkill -  $date"

but printing the string wc-l - won't do that. If you had set PIDkill to the pathname of a file, then something more like:

echo "total of pids killed: $(wc -l $PIDkill) -  $date"

might come closer to doing what you seem to be trying to do.
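A minimal sketch of that difference, assuming a scratch file created with mktemp (the names KilledList, WithName, and Count are made up for this illustration and are not from your script). With a pathname operand, wc prints the pathname after the count; with redirection, it prints only the count:

```shell
#!/bin/bash
# Hypothetical scratch file standing in for a list of killed PIDs.
KilledList=$(mktemp)
printf '%s\n' 5569000 6789034 4567890 > "$KilledList"

# With a pathname operand, wc prints the count followed by the pathname.
WithName=$(wc -l "$KilledList")

# With redirection, wc reads standard input and prints just the count.
# Some implementations pad the count with blanks; the arithmetic
# expansion strips them.
Count=$(( $(wc -l < "$KilledList") ))

echo "with operand:     $WithName"
echo "total of pids killed: $Count - $(date)"
rm -f "$KilledList"
```

Note that $(date) runs the date utility, whereas $date expands a variable that nothing has set.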

I do not think my syntax is correct.

PpidFile="/path/to/pids.txt"

Ret=$?     
if[0 -eq $Ret]

then 

while read RelatedEachPid [ /bin/grep -v "Ppid" ]
do			
                     /bin/echo "kill -9 $RelatedEachPid"
                        
                     /bin/kill -9 $RelatedEachPid
			Ret=$?

			if [ 0 -ne $Ret ]
			then
				/bin/echo "kill -9 $RelatedEachPid Fail"
			else
				/bin/echo "kill -9 $RelatedEachPid PIDkill"
   fi
fi
done <  $PpidFile 

echo "total of pids killed: $(wc -l $PIDkill) -  $date" >> pid_rm.txt



You are correct. Your syntax is not correct.

I repeat: What shell are you using?

I am using bash

#!/bin/bash
PpidFile="/path/to/pids.txt"

Ret=$?     
if[0 -eq $Ret]

then 

while read RelatedEachPid [ /bin/grep -v "Ppid" ]
do			
                     /bin/echo "kill -9 $RelatedEachPid"
                        
                     /bin/kill -9 $RelatedEachPid
			Ret=$?

			if [ 0 -ne $Ret ]
			then
				/bin/echo "kill -9 $RelatedEachPid Fail"
			else
				/bin/echo "kill -9 $RelatedEachPid PIDkill"
   fi
fi
done <  $PpidFile 

echo "total of pids killed: $(wc -l $PIDkill) -  $date" >> pid_rm.txt

Making several wild guesses based on statements in earlier posts, try:

#!/bin/bash
PIDkill=0			# # of processes successfully killed.
PlogFile="/path/to/pidkill.txt"	# File to receive log entries.
PpidFile="/path/to/pid.txt"	# File containing list of PIDs to kill.

# Verify that $PpidFile exists and is readable...
if [ ! -r "$PpidFile" ]
then	printf 'Cannot read file "%s"\n' "$PpidFile"
	exit 1
fi

# Process the PIDs in $PpidFile.
while read RelatedEachPid
do	# Check for the header line...
	if [ "$RelatedEachPid" = 'Ppid' ]
	then	# Header found, skip to next line from $PpidFile.
		continue
	fi

	# Kill the process.
	echo "kill -9 $RelatedEachPid"
	if kill -9 $RelatedEachPid
	then	# kill succeeded: note status and increment counter.
		echo "kill -9 $RelatedEachPid Success"
		PIDkill=$((PIDkill + 1))
	# Following two lines are commented out because the kill command will
	# print a diagnostic message if it fails; why produce two outputs?
	# else	# kill failed: note status.
	#	echo "kill -9 $RelatedEachPid Fail"
	fi
done < "$PpidFile"

# Log the results from this run.
echo "total of pids killed: $PIDkill - $(date)" >> "$PlogFile"

Note that:

  1. I added a line to explicitly use bash (since you didn't say how you invoked this script, the default shell on AIX is a 1988 version of the Korn shell, and the unknown -f command diagnostic could have come from ksh not recognizing [[ -f file ]] although I would have expected that to yield a complaint about [[ instead of about -f ),
  2. a PIDkill variable has been created and it is incremented every time kill succeeds,
  3. a variable ( PlogFile ) has been added to specify the pathname of the output log file (that file was the only file in your script that didn't use an absolute pathname and there is nothing in your script to control the directory in which it runs),
  4. the filename that is the last component in the PpidFile variable has been changed to match the filename specified in post #1 in this thread, and
  5. your reference to the undefined variable date was replaced with a command substitution of the date utility.

Thanks, you are so awesome!

In general this is a very bad idea. First, processes might be running for a long time because they need to run for such a long time. If you kill the db-writer process of a DB, it will not help the DB at all but will most probably corrupt it beyond repair. If you kill a system daemon, you might halt the system but will most probably achieve nothing productive.

As a rule of thumb: never let scripts kill processes they have not spawned themselves. A script may kill a process it has started before in the background, but any other process should only be killed interactively! The reason is that admins (in general) are a lot smarter than scripts and can analyse the situation before they kill something vital.
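As a rough sketch of the acceptable case, a script can remember the PID of a job it started itself and signal only that one. Here "sleep 300" is merely a stand-in for real background work, and the variable names are invented for the illustration:

```shell
#!/bin/bash
# Start a background job; "sleep 300" is only a placeholder for real work.
sleep 300 &
ChildPid=$!		# PID of the most recently started background job.

# ... the script would do its other work here ...

# Ask our own child to terminate, then collect its exit status.
kill -TERM "$ChildPid"
ChildStatus=0
wait "$ChildPid" || ChildStatus=$?	# Greater than 128 if a signal ended it.
echo "child $ChildPid finished with status $ChildStatus"
```

Because the PID comes from $! rather than from a file, the script cannot accidentally hit a process it knows nothing about.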

This is, for an already bad idea, an even worse realisation. With kill -9 you are killing a process from outside without any chance to clean up: temporary files, shared memory segments, semaphores and any other item a process can allocate will remain there instead of being cleaned up by the exiting process.

If you really need to stop a process, try kill -15 first, then wait for some time. Signal 15 tells a process to terminate itself, and well-written processes honour this signal, cleaning up whatever they have allocated. (Processes not doing so should be moved to /dev/null instead of being run, their programmers should be beaten with a UNIX manual.) Only then might you use kill -9, but NEVER from a script and never routinely. This is the desperate measure of last resort and should be used that way.
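The "signal 15, then wait" step could look something like the sketch below. stop_gently is a hypothetical helper (name, arguments, and the 10-second default are all assumptions), and it deliberately never escalates to kill -9; it only reports whether the process went away:

```shell
#!/bin/bash
# stop_gently PID [SECONDS]: hypothetical helper for illustration only.
# Sends SIGTERM, then polls once per second until the process is gone or
# the grace period is over. Returns 0 if the process is gone, 1 otherwise.
# Caveat: kill -0 also succeeds for a terminated child that has not been
# waited for yet, so such a child can still count as "present".
stop_gently() {
	StopPid=$1
	StopGrace=${2:-10}	# Seconds to wait; 10 is an arbitrary default.

	# kill prints its own diagnostic if the signal cannot be sent.
	kill -TERM "$StopPid" || return 1

	while [ "$StopGrace" -gt 0 ]
	do	# kill -0 sends no signal; it only tests whether we could.
		kill -0 "$StopPid" 2>/dev/null || return 0
		sleep 1
		StopGrace=$((StopGrace - 1))
	done
	kill -0 "$StopPid" 2>/dev/null && return 1
	return 0
}
```

A caller might then write: stop_gently "$SomePid" 30 || echo "$SomePid did not exit; investigate by hand". What happens after that is a decision for a person at a terminal.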

Finally, your choice of shells:

In AIX the default shell is ksh (in fact a ksh88). You can of course use any shell you want, but it is always a good idea to write scripts in a way so that they assume as little as possible. Don't make your scripts dependent on a non-standard shell without any necessity. Nothing in your script couldn't be written in ksh with the same ease as in bash.

I hope this helps.

bakunin

Thanks Don,

I am getting one error and I do not know how to fix it. In the text file pid.txt, the ppid column contains a value of 1.

Here is my pid.txt contents

125895
345679
456789
1234567
1
1
1

Here is the error I am getting:

line  31 kill : 9 invalid signal specification

+ read RelatedEachPid
+'[' 1= Ppid ']'
+ echo 'kill 9 1'

How can this be fixed so that I do not get the above error messages?

Here is the code I used:

#!/bin/bash
PIDkill=0			# # of processes successfully killed.
PlogFile="/path/to/pidkill.txt"	# File to receive log entries.
PpidFile="/path/to/pid.txt"	# File containing list of PIDs to kill.

# Verify that $PpidFile exists and is readable...
if [ ! -r "$PpidFile" ]
then	printf 'Cannot read file "%s"\n' "$PpidFile"
	exit 1
fi

# Process the PIDs in $PpidFile.
while read RelatedEachPid
do	# Check for the header line...
	if [ "$RelatedEachPid" = 'Ppid' ]
	then	# Header found, skip to next line from $PpidFile.
		continue
	fi

	# Kill the process.
	echo "kill -9 $RelatedEachPid"
	if kill -9 $RelatedEachPid
	then	# kill succeeded: note status and increment counter.
		echo "kill -9 $RelatedEachPid Success"
		PIDkill=$((PIDkill + 1))
	# Following two lines are commented out because the kill command will
	# print a diagnostic message if it fails; why produce two outputs?
	# else	# kill failed: note status.
	#	echo "kill -9 $RelatedEachPid Fail"
	fi
done < "$PpidFile"

# Log the results from this run.
echo "total of pids killed: $PIDkill - $(date)" >> "$PlogFile"


Hi Della,

Filter out PID 1: it's the init process, and you shouldn't be able to kill it. But on the off chance that you somehow manage it, I'll point out that you are unlikely to like the results.

Regards

Dave

PID 1 is the init process, indispensable for the entire system to be up and running. I wouldn't kill that by any means if I were in your shoes...

The error message shown does not quite relate to the script - line 31 is way down below the kill command?

BTW - do you know the difference between PID and PPID? If your file really contains PPIDs (and the 1 in there seems to indicate such), I'd think twice before killing those.
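If you do keep reading values from a file, one way to filter them is sketched below. is_safe_pid is a hypothetical helper, not something from the script posted above; it accepts only decimal numbers greater than 1, so the header line, blank lines, and PID 1 are all rejected. The loop only prints what it would do and kills nothing:

```shell
#!/bin/bash
# is_safe_pid VALUE: hypothetical helper for illustration only.
# Succeeds only when VALUE consists of digits and is greater than 1.
is_safe_pid() {
	case $1 in
	(''|*[!0-9]*)	return 1;;	# Empty or contains a non-digit.
	esac
	[ "$1" -gt 1 ]
}

# Demonstration on sample values; nothing is killed here.
for Candidate in Ppid 1 0 125895 -9 ''
do	if is_safe_pid "$Candidate"
	then	echo "would kill: $Candidate"
	else	echo "skipping:   '$Candidate'"
	fi
done
```

This only guards against obviously wrong input. It does not tell you whether a number is the PID of the process you mean or the PPID of something else entirely.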