Would appreciate a quick second set of eyes on a script (regarding doing things in the background)

What I'm trying to do is leave tcpdump running all the time on a server, and at the end of every day kill it and start a new one. For some reason my boss doesn't want to set this up as a cron job, so I've come up with the following:

#!/bin/bash

PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap

tcpdump -i eth0 -nn -vv -s 1500 net 192.168.5 -w $PCAPFILE   2>&1 & 
dumpPid=$!


TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
DATE=$(date "+%m.%d.%y")


killDump ()
{
    kill $dumpPid
}


zipPcap ()
{
    gzip -c $1 > $1.gz && rm $1
}


    while true; do
        until [[ "$DATE" == "$TDATE" ]]; do
            sleep 1800
        done
        killDump
        zipPcap $PCAPFILE &
        PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap
        tcpdump -i eth0 -nn -vv -s 1500 net 192.168.5 -w $PCAPFILE   2>&1 & 
        dumpPid=$!
        TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
        DATE=$(date "+%m.%d.%y")
    done

I've tested this on a small scale on my personal laptop, but I've made a minor change (running zipPcap in the background with a trailing &) that I want to make sure shouldn't cause any problems.

Since by the end of the day the pcap is probably going to be rather large, I don't want the script to have to wait for the old one to finish zipping before it starts the next capture, so is running the zipPcap function in the background like that an acceptable way of kicking off the compression, and having the rest of the script move forwards?
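To make sure I understand my own question, here's a stripped-down sketch of the pattern I mean (the function and the marker file are just stand-ins, not the real script):

```shell
rm -f /tmp/zip_done        # clean slate for the demo

# Stand-in for the long-running gzip: sleeps, then drops a marker file.
slowZip () {
    sleep 2
    touch /tmp/zip_done
}

slowZip &                  # backgrounded, so the script continues immediately
echo "moving on while slowZip runs in the background (pid $!)"
wait                       # block only here, when we actually need it finished
```

The echo prints right away; only the final wait blocks.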

I'm pretty confident this should be fine, but as I'm working without a lab to test on right now, I wanted to get a second set of eyes.

How many CPU cores does the system have? On a one-core system there's no advantage to running multiple gzips in parallel, since each will run half as fast; on a four-core system you could run four, and so on. There's also the question of disk speed -- your disk can probably keep up with one running gzip easily, but how about 4, or 8? And having four different large files being written to disk simultaneously is a recipe for bad fragmentation. There's also the question of memory use, potentially unlimited if you have a huge number of files.

The point? Don't go too nuts. A few may help, dozens probably won't. You may also want to create the file in temp space then move it once complete, to avoid too much fragmentation on the main filesystem.

You'll want a way to limit the number of processes created, create n processes for n cores then start waiting for individual processes before creating another.
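A rough sketch of that idea, done as simple batching (NPROC and the demo file names are made up for illustration; a fancier version would refill slots one at a time with bash's wait -n instead of draining a whole batch):

```shell
NPROC=4
count=0

for i in 1 2 3 4 5 6; do                 # stand-ins for large pcap files
    head -c 50000 /dev/urandom > /tmp/demo$i.pcap
done

for f in /tmp/demo[1-6].pcap; do
    gzip -f "$f" &                       # gzip removes the original itself
    count=$((count + 1))
    if [ "$count" -ge "$NPROC" ]; then
        wait                             # batch is full: drain it first
        count=0
    fi
done
wait                                     # pick up the last partial batch
```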

I'd also note you might want a way to check whether a background compression succeeded or failed. Again, you can tell this by wait-ing for a specific background PID.
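For example, a minimal sketch (the sample file here is a stand-in for the real capture):

```shell
head -c 100000 /dev/urandom > /tmp/sample.pcap   # stand-in for a day's capture

gzip -c /tmp/sample.pcap > /tmp/sample.pcap.gz &
zipPid=$!

# ... the next tcpdump could already be running here ...

if wait "$zipPid"; then                  # wait returns gzip's exit status
    rm /tmp/sample.pcap                  # delete the original only on success
else
    echo "compression of /tmp/sample.pcap failed" >&2
fi
```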

I think this bit will loop forever. Neither variable changes within the loop.

Also: why not have tcpdump pipe its output through gzip in the first place, instead of compressing it afterwards in one big batch?
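Roughly like this (untested on your setup, and note that with a backgrounded pipeline $! is the PID of gzip, the last command in the pipe, not of tcpdump, so you'd need something else, e.g. pkill tcpdump, to stop the capture):

```shell
PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap.gz

# -w - writes the raw capture to stdout; -U flushes packet by packet so
# the pipe doesn't sit on buffered data for ages.
tcpdump -i eth0 -nn -vv -s 1500 -U -w - net 192.168.5 2>/dev/null \
    | gzip > "$PCAPFILE" &
```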

I'm not sure what you mean; there are two loops, the while true loop and the until loop.

All the until loop does is check every 30 minutes whether the dates match. Once they do, it does all the work in the while loop, which includes updating the dates:

while true; do  
        until [[ "$DATE" == "$TDATE" ]]; do
            sleep 1800
        done
        killDump
        zipPcap $PCAPFILE &
        PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap
        tcpdump -i eth0 -nn -vv -s 1500 net 192.168.5 -w $PCAPFILE   2>&1 & 
        dumpPid=$!
        TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
        DATE=$(date "+%m.%d.%y")
    done

Every time the until loop completes, it does the rest of the while-loop body, which includes updating the variables that the until loop relies on.

---------- Post updated at 12:56 PM ---------- Previous update was at 12:44 PM ----------

Well my main concern isn't having multiple zips going on at the same time. I mean it should only be zipping the file once a day, and I highly doubt that by the time the next zip comes around the last one will still be running.

My concern is that I don't want the next instance of tcpdump to have to wait for the zip to complete. I wanted the zip to run in the background while the next tcpdump starts.

I decided to do some more testing on a VM on my laptop, and I noticed that, unlike what I wanted, tcpdump was waiting for the zip to complete before starting again, so I moved the ampersand inside the zipPcap function, and now it seems to be working. I'm watching the files get created in real time, and when one stops writing two more files appear: the .gz and the new .pcap. Once the .gz is done, the original file gets deleted.

My testing version:

#!/bin/bash

PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap

tcpdump -i eth0 -nn -vv -s 1500 -w $PCAPFILE   2>&1 & 
dumpPid=$!
i=1
killnum=3

TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
DATE=$(date "+%m.%d.%y")


killDump ()
{
    kill $dumpPid
}


zipPcap ()
{
    gzip -c $1 > $1.gz && rm $1 &
}


    while true; do
        until [[ "$i" == "$killnum" ]]; do
            sleep 30
            let i=i+1
            echo $i
            echo $dumpPid
        done
        killDump
        zipPcap $PCAPFILE
        PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap
        tcpdump -i eth0 -nn -vv -s 1500 -w $PCAPFILE 2>&1 &
        dumpPid=$!
        TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
        DATE=$(date "+%m.%d.%y")
        i=0
    done

Oh, and to answer your question, the box I'm working on has 16 cores...

As far as zipping the pcap in real time goes...honestly, I didn't know that was an option.

The until loop never completes, because nothing changes $DATE or $TDATE within it.

This might work (untested in bash):

        until [[ "$DATE" == "$TDATE" ]]; do
            sleep 1800
            DATE=$(date "+%m.%d.%y")
        done

---------- Post updated at 05:57 PM ---------- Previous update was at 01:07 PM ----------

Just an update: this is the version I wound up going with. It seems to be working fine, but I suppose I won't really know for sure until after 12:30.

#!/bin/bash
######################################################
# Program:      networkCap.sh
# Date Created: 7 July 2010
# Date Updated: NA
# Developer:    G B (Support Manager) && D D (Support Engineer)
# Description:  runs tcpdump on the management network and automatically starts a new capture each day and zips the old one
######################################################

PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap

tcpdump -i eth0 -nnvv -s 1500 net 10.248.89 -w $PCAPFILE   2>&1 &
dumpPid=$!


TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
DATE=$(date "+%m.%d.%y")


killDump ()
{
    kill $dumpPid
}


zipPcap ()
{
    gzip -c $1 > $1.gz && rm $1 &
}


    while true; do
        until [[ "$DATE" == "$TDATE" ]]; do
            sleep 1800
            DATE=$(date "+%m.%d.%y")
        done
        killDump
        zipPcap $PCAPFILE
        PCAPFILE=/tmp/mgmt.$(date "+%H.%M.%S.%m.%d").pcap
        tcpdump -i eth0 -nnvv -s 1500 net 10.248.89 -w $PCAPFILE   2>&1 &
        dumpPid=$!
        TDATE=$(date --date='-1 days ago' "+%m.%d.%y")
        DATE=$(date "+%m.%d.%y")
    done

I don't know what timezone you are in, but this is going to loop forever:

        until [[ "$DATE" == "$TDATE" ]]; do
            sleep 1800
        done

BTW, many thanks for posting the current version of the script. Many people forget to do that.

Son of a gun...you know what...you're right. And the weird thing is, as soon as I looked at it again just now, it's obvious that you're right.

Well I just feel silly. Kudos.