Array Length Reports as Having Length when it is Empty?

mrm5102 · September 13, 2012, 12:08pm

Hello All,

I have this script that does stuff like "starting, stopping & restarting" a Daemon Process running on my machine...

My main question is why in part of my code (which you will see below) does the Array Length (i.e. ${#PIDS[@]} )
return "1" when I know the Array is empty..?

Here is a little explanation of the problem...
Occasionally when running the "ps auxww" command, the output shows more than one instance of the Daemon. Because of
this I declare an array at the start of the script, then inside a function I
save the PID(s) into that array. Like this below...

For Example:

### Declare the Array for the PID(s)
declare -a PIDS

    :
    :
.... Other Code ....
    :
    :

### START: Function check_ps()...
#####################################
check_ps()
{
    ### Reset ${PIDS[]} to nothing...
    PIDS=""

    ### Save the Output into an array:
    PS_OUTPUT=( $(ps auxww | grep -v grep | grep "/usr/local/bin/Daemon_Process") )

    ### If this array is NOT empty, then start saving the PIDs
    if [ ! -z $PS_OUTPUT ]
     then
        x=0
        for line in "${PS_OUTPUT[@]}"
         do
            PIDS[$x]=$(echo "$line" | awk -F ' ' {'print $2'})
            x=$(($x+1))
        done
    else
        echo "The Daemon is NOT Running!"
    fi
}    
#####################################
### END: Function check_ps()...

Now if I pass the "stop" argument to the script, after I collect all the PIDs using the code above, I then try to
kill the PIDs. Like this...

stop_daemon()
{
    # Call Function check_ps to get the PIDs
    check_ps

    ### If the Array PIDS contains data, then...
    if [ ${#PIDS[@]} -gt 0 ]
     then
        ### While The Array still contains data, do...
        while [ ${#PIDS[@]} -ne 0 ]
         do
            ### PRINT SOME STUFF FOR TESTING...
            echo "LENGTH = ${#PIDS[@]}"
            echo -e "ELEMENT 0 = \"${PIDS[0]}\"\n"

            kill ${PIDS[0]}
            KILL_RETCODE=$?

            ### Re-Check 'ps auxww' for the daemon
            check_ps
        done
    fi
}

Ok so, I was testing in the while loop in the "stop_daemon()" Function above. Let's say that there was only one PID found.
I had added some echo commands to see what the values of the variables were, since it seemed to be an endless
loop. I echoed out the Length of the array (i.e. ${#PIDS[@]}) and it ONLY prints "1" every time, and I also print the
first element of the array (i.e. ${PIDS[0]} ) which prints nothing AFTER the first iteration through the loop.

Example Output:

LENGTH = 1
ELEMENT 0 = "25589"

LENGTH = 1
ELEMENT 0 = ""

LENGTH = 1
ELEMENT 0 = ""

LENGTH = 1
ELEMENT 0 = ""

LENGTH = 1
ELEMENT 0 = ""
^C            --> HAVE TO KILL THE SCRIPT TO STOP THE LOOP.

Does anyone know why this loop never terminates? I know some of you will say that I don't need an array since I'm re-checking
the 'ps auxww' output on each loop, but I needed the array in another section of my script. Even still I feel like the loop
should be terminating since it has no elements... I think?

Any thoughts would be great!

Thanks in Advance,
Matt

Corona688 · September 13, 2012, 12:35pm

You're not even using the array, though. You're just shoving it into a list with @. Your code is so much more complicated than it'd be without, especially since it's possible -- as you've discovered -- to have blank elements in an array.
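
For reference, the blank element in the original script comes from the reset line PIDS="". In bash, assigning a plain string to an array name only sets element 0, so the array keeps one (empty) element and ${#PIDS[@]} stays at 1. A minimal sketch, assuming bash; the PID value is just the one from the example output:

```shell
#!/usr/bin/env bash
declare -a PIDS
PIDS[0]=25589

# Assigning a string to the array name is the same as PIDS[0]="".
PIDS=""
len_after_string=${#PIDS[@]}
echo "after PIDS=\"\": length = $len_after_string"

# An empty list (or "unset PIDS") really empties the array.
PIDS=()
len_after_list=${#PIDS[@]}
echo "after PIDS=():  length = $len_after_list"
```

With PIDS=() as the reset, a length test such as [ ${#PIDS[@]} -ne 0 ] can become false once no PIDs are found.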

It's silly to run awk 3,000 times to process 3,000 lines. awk can process them all at once, and do the grep and grep -v grep too. It's not a glorified cut.

You don't need to use grep -v grep to exclude itself. grep '[p]rocessname' for example won't match '[p]rocessname' but will match 'processname'. The awk expression below won't match itself because of the need to use \/ for /.
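
One way to check the bracket trick without a live daemon is to run the pattern over canned text. The two sample lines below are made up to look like ps output, and the same regex works in awk:

```shell
#!/usr/bin/env bash
# Hypothetical ps-style lines: the daemon itself, and the grep looking for it.
sample='root 101 0.0 /usr/local/bin/Daemon_Process
matt 202 0.0 grep [D]aemon_Process'

# The regex [D]aemon_Process matches the text "Daemon_Process",
# but not the literal text "[D]aemon_Process" on the grep line.
match_count=$(printf '%s\n' "$sample" | grep -c '[D]aemon_Process')
matched_pid=$(printf '%s\n' "$sample" | awk '/[D]aemon_Process/ { print $2 }')
echo "matches: $match_count, pid: $matched_pid"
```

Note the single quotes around the pattern: unquoted, the shell could treat [D]aemon_Process as a filename glob.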

Given a list like "PID1 PID2 PID3 PID4" you can loop over it with "for X in $PIDS" (not quoted!), no arrays required. It will do what you expect, and not give you surprises like your array did.

If you need the ability to access elements, you can do set -- $PIDS to get a nice $1 $2 ... $N list, ${!N} to access element N in bash, and even get to use shift to pop off the front element one at a time this way.
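
A short sketch of those three operations together, assuming bash (for ${!N}) and a made-up PID list. It is wrapped in a function, here called show_pids purely for illustration, so that set -- does not touch the script's own arguments:

```shell
#!/usr/bin/env bash
PIDS="101 202 303"   # made-up PIDs

show_pids() {
    set -- $PIDS            # unquoted on purpose: split into $1 $2 $3
    count=$#
    N=2
    second=${!N}            # indirect expansion: the value of $2
    shift                   # drop $1; the old $2 becomes $1
    first_after_shift=$1
    echo "count=$count second=$second first_after_shift=$first_after_shift"
}
show_pids
```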

You can get a nice space-separated list of PIDs in one operation:

check_ps()
{
        PIDS="$(ps auxww | awk -v ORS=" " '/\/usr\/local\/bin\/Daemon_Process/ { print $2 } END { printf("\n") }')"

        # Get count of strings in $# by setting $1, $2, ...
        # You weren't using function parameters anyway, so no harm.
        set -- $PIDS

        # Return true when PIDs exist to make your loop simpler
        [ "$#" -gt 0 ] && return 0
        # Return false when there's no PIDs
        return 1
}

while check_ps
do
        for X in $PIDS
        do
                kill $X
                break # Kill only the first by leaving the loop
        done
done

mrm5102 · September 13, 2012, 1:01pm

Hey Corona688, thanks for the reply.

Yea, sorry you have to bear with me I'm no expert at this as you can see...

My problem with coding has always been OVER-Complicating things, so thanks for the suggestions! I have seen the
"grep [p]rocessname" before but hadn't a clue what it was for... Cool thanks!

Also, I need to store all the output from the "ps auxww" command because I'm also getting the start time, cpu, as well as some other stuff
that I didn't include in my code above for the sake of over-complicating my original question... Which I've seemed to do anyway, ha!

And learning the whole awk language isn't really an option at the moment due to time constraints. But I
appreciate the info and hopefully I can dive into that later...

As for arrays, I thought I remember reading that putting a command in "ARRAY=( $(...) )" would automatically split the output on the IFS and
save it into an array..? I'm guessing that's incorrect?

Just so I'm understanding arrays correctly, would I have to be explicit in giving it an index in order for it to be considered an array?
Like this: not using this but just an example...

OUTPUT=$(ps auxww | grep [p]rocessname)

IFS='
'

x=0
for line in $OUTPUT
 do
        PID[$x]=$(echo "$line" | awk -F ' ' {'print $2'})
        x=$(($x+1))
done

Anyway thanks for your suggestion, much appreciated!

Thanks Again,
Matt

Corona688 · September 13, 2012, 1:10pm

You're making the same mistakes again. This:

for X in 9000 lines of junk
do
        echo "single line" | awk '{ extract something from single line }'
done

is almost always wrong since you can do it all with awk '{ extract something } ' file-with-9000-lines-of-junk in one operation instead of 9,000.

If you must save all its output, then:

# Quote the pattern so the shell can't expand [p]rocessname as a filename glob.
OUTPUT="$(ps auxww | grep '[p]rocessname')"

# Don't run awk 9,000 times to process 9,000 lines.  Run it ONCE for everything.
PIDS="$(echo "$OUTPUT" | awk '{ print $2 }')"

# Get count in one operation
set -- $PIDS
echo "There are $# PID's"

for X in $PIDS
do
        echo "$X"
done

Note that set -- overwrites your $1 $2 ... commandline variables.

Also note that functions have their own, independent set of $1 $2 ... variables. set -- there does not overwrite the global ones.
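
A small sketch of that scoping, assuming bash. The function name count_words and the variable WORD_COUNT are made up for the example:

```shell
#!/usr/bin/env bash
count_words() {
    set -- $1          # replaces only this function's $1 $2 ...
    WORD_COUNT=$#
}

set -- outer1 outer2   # stand-ins for the script's real arguments
count_words "101 202 303"
outer_count=$#
outer_first=$1
echo "words=$WORD_COUNT outer_count=$outer_count outer_first=$outer_first"
```

The function sees three words, while the script-level $# and $1 are unchanged after it returns.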

As for arrays, I thought I remember reading that putting a command in "ARRAY=( $(...) )" would automatically split the output on the IFS and
save it into an array..? I'm guessing that's incorrect?

It splits it, yes. But as you've discovered it's possible to get blank elements. And usually there's no point in doing so -- string splitting works everywhere, not just in arrays, so why not cut out the middleman and do it directly?
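
To make the splitting concrete: with the default IFS, ARRAY=( $(...) ) splits on every space and newline, so one ps line becomes several elements, and empty output gives an array of length 0. A sketch with made-up ps lines, assuming bash 4 or later for mapfile:

```shell
#!/usr/bin/env bash
# Two hypothetical lines of ps output.
output='root 101 0.0 Daemon_Process
root 202 0.1 Daemon_Process'

# Default IFS: every whitespace-separated word becomes its own element.
words=( $output )

# mapfile keeps one element per line instead.
mapfile -t lines <<< "$output"

# Splitting empty output produces no elements at all.
empty=( $(printf '') )

echo "words=${#words[@]} lines=${#lines[@]} empty=${#empty[@]}"
```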

Corona688 · September 13, 2012, 1:15pm

How about you explain exactly what you want to do, instead of the way you're trying to do it? There might be a simpler way you're missing.

mrm5102 · September 13, 2012, 1:48pm

mrm5102:

Just so I'm understanding arrays correctly, would I have to be explicit in giving it an index in order for it to be considered an array?
Like this: not using this but just an example...
OUTPUT=$(ps auxww | grep [p]rocessname)

IFS='
'

x=0
for line in $OUTPUT
 do
   PID[$x]=$(echo "$line" | awk -F ' ' {'print $2'})
   x=$(($x+1))
done
Anyway thanks for your suggestion, much appreciated!

Thanks Again,
Matt

Like I said, I'm NOT using this code, I was simply asking if that's how you assign into array elements...!

I get what you're saying, and that would be better if I was processing a file with 9,000 lines. BUT in this case I'm only checking "ps auxww" which wouldn't
have more than a TOTAL of a couple hundred processes running and I have never seen more than 3-4 instances of the Daemon that I'm looking for, so "9000 times"
is kind of silly to say...

I get it's MUCH more efficient, but after just trying both ways side-by-side, the difference in speed and cpu is unnoticeable...

Anyway, I appreciate your help and explanations with all this but now I think I'm going to go back and rewrite this whole thing... And try and spend a little time with awk.

So thanks AGAIN for your help!

Thanks Again,
Matt

Corona688 · September 13, 2012, 1:55pm

Ever wonder why flash ads are so good at making computers lag and freeze? They can't have been that bad on the author's computer. And certainly, running by themselves on a studio-grade machine, a flash ad can chug along just fine even jammed on 100% quality maximum framerate. It's when you start putting 3 of them on one page that they start misbehaving and competing with each other. But they work fine for the author, and that's good enough for them!

If your program's the only thing ever running on that computer, it will be okay. But that's not a good habit to learn; someday you won't be able to get away with it. And it makes life so difficult for you anyway...

mrm5102 · September 13, 2012, 2:30pm

Yea I hear ya Corona...
It definitely makes sense what you were saying, but this script will only be run manually
when myself or another person needs to restart/start/stop/status the daemon...

I definitely agree with you and I appreciate you taking the time to explain it!
I just need to get some more practice with awk's language before I can utilize it
with any confidence, and be able to know what exactly, for instance, the one line
you wrote does, and be able to debug later if need be.

Thanks Again,
Matt

Corona688 · September 13, 2012, 3:58pm

Here's an example of how awk works.

$ cat <<"EOF" >example1.awk

# Comments begin with # like in shell

# It takes a list of expressions and code blocks, and for every line
# it reads, runs the code blocks in order.
# next will cause it to skip to the next line without running statements below.
{ print "This runs first" }
{ print "This runs second" }
{ print "This runs third"; next }
{ print "This would run fourth except for that next" }
EOF

$ echo | awk -f example1.awk

This runs first
This runs second
This runs third

$ cat <<"EOF" >example2.awk

# Instead of code-blocks, you can put simple expressions.
# Whenever the expression is true, the current line will be printed.
# /string/ is an expression.  Like in sed, that means to look for string.

/string/
EOF

$ echo "string" | awk -f example2.awk

string

$ echo "strong" | awk -f example2.awk

$ cat <<"EOF" >example3.awk
# You can put an expression in front of a code block on the same line.
# This will cause it to run the code block only when it is true.
/string/ {
        print "String matched"
}
EOF

$ echo "string" | awk -f example3.awk 

String matched

$ echo "strong" | awk -f example3.awk

$ cat <<"EOF" >example4.awk
# awk supports variables and columns.  $ does NOT mean variable, $ is
# an operator!  It means column.  So $5 is the 5th column.
# If you did N=5; print $N, that would print the 5th column.

# If the first column is 1, it will print column 1.
# If the first column is 2, it will print column 2.  etc.
{ print $($1) }
EOF

$ echo "1 a b c d" | awk -f example4.awk

1

$ echo "4 a b c d" | awk -f example4.awk

c

$ cat <<"EOF" >example5.awk
# BEGIN and END are special code-blocks which run before the program
# begins reading lines and once it finishes reading lines.
BEGIN { print "This runs before" }
END { print "This runs after" }
# This expression is always true, so always prints
1
EOF

$ echo "asdf" | awk -f example5.awk

This runs before
asdf
This runs after

$ cat <<"EOF" >example6.awk

# awk uses spaces as separators by default, but can split on any
# character.  GNU awk even lets you use a regex.
BEGIN { FS="." }
{ print $2 }
EOF

$ echo "A.b.c.d" | awk -f example6.awk

b

$ # You can assign columns, not just read them!
$ echo "1 2 3 4 5 6 7 8 9 10" | awk '{ $5="asdf" } 1'

1 2 3 4 asdf 6 7 8 9 10

$ # There are lots of special variables available.  NF is the number of
$ # columns, starting at 1 ( which makes $NF the last column ).
$ # NR is the total number of lines read so far.
$ # FNR is the line-number in the current file being read.
$ # You can tell it to read stdin with -.  If no files are given, it also reads stdin.
$ # FILENAME is a special variable which tells you what file awk is currently reading.
$ printf "%s\n" a b c d e f g | awk '{ print "File", FILENAME, "Total", NR, "Current", FNR, $0 }' - example1.awk

File - Total 1 Current 1 a
File - Total 2 Current 2 b
File - Total 3 Current 3 c
File - Total 4 Current 4 d
File - Total 5 Current 5 e
File - Total 6 Current 6 f
File - Total 7 Current 7 g
File example1.awk Total 8 Current 1
File example1.awk Total 9 Current 2 # Comments begin with # like in shell
File example1.awk Total 10 Current 3
File example1.awk Total 11 Current 4 # It takes a list of expressions and code blocks, and for every line
File example1.awk Total 12 Current 5 # it reads, runs the code blocks in order.
File example1.awk Total 13 Current 6 # next will cause it to skip to the next line without running statements below.
File example1.awk Total 14 Current 7 { print "This runs first" }
File example1.awk Total 15 Current 8 { print "This runs second" }
File example1.awk Total 16 Current 9 { print "This runs third"; next }
File example1.awk Total 17 Current 10 { print "This would run fourth except for that next" }

$ # The OFS variable controls the output field separator.  It defaults to a space.
$ # You can set variables outside the awk program with -v VAR=VALUE.
$ # So let's show what OFS does.
$ printf "%s\n" a b c d e f g | awk -v OFS="," '{ print "File", FILENAME, "Total", NR, "Current", FNR, $0 }' - example1.awk

File,-,Total,1,Current,1,a
File,-,Total,2,Current,2,b
File,-,Total,3,Current,3,c
File,-,Total,4,Current,4,d
File,-,Total,5,Current,5,e
File,-,Total,6,Current,6,f
File,-,Total,7,Current,7,g
File,example1.awk,Total,8,Current,1,
File,example1.awk,Total,9,Current,2,# Comments begin with # like in shell
File,example1.awk,Total,10,Current,3,
File,example1.awk,Total,11,Current,4,# It takes a list of expressions and code blocks, and for every line
File,example1.awk,Total,12,Current,5,# it reads, runs the code blocks in order.
File,example1.awk,Total,13,Current,6,# next will cause it to skip to the next line without running statements below.
File,example1.awk,Total,14,Current,7,{ print "This runs first" }
File,example1.awk,Total,15,Current,8,{ print "This runs second" }
File,example1.awk,Total,16,Current,9,{ print "This runs third"; next }
File,example1.awk,Total,17,Current,10,{ print "This would run fourth except for that next" }

$ # ORS is like OFS, except it controls what's printed at the end of a line.  Defaults to newline.
$ printf "%s\n" a b c d e f g | awk -v ORS="," '{ print "File", FILENAME, "Total", NR, "Current", FNR, $0 }' - example1.awk
File - Total 1 Current 1 a,File - Total 2 Current 2 b,File - Total 3 Current 3 c,File - Total 4 Current 4 d,File - Total 5 Current 5 e,File - Total 6 Current 6 f,File - Total 7 Current 7 g,File example1.awk Total 8 Current 1 ,File example1.awk Total 9 Current 2 # Comments begin with # like in shell,File example1.awk Total 10 Current 3 ,File example1.awk Total 11 Current 4 # It takes a list of expressions and code blocks, and for every line,File example1.awk Total 12 Current 5 # it reads, runs the code blocks in order.,File example1.awk Total 13 Current 6 # next will cause it to skip to the next line without running statements below.,File example1.awk Total 14 Current 7 { print "This runs first" },File example1.awk Total 15 Current 8 { print "This runs second" },File example1.awk Total 16 Current 9 { print "This runs third"; next },File example1.awk Total 17 Current 10 { print "This would run fourth except for that next" },

$

mrm5102 · September 14, 2012, 5:08pm

Cool, thanks for the examples!

Thanks,
Matt

---------- Post updated at 05:08 PM ---------- Previous update was at 09:38 AM ----------

Hey Corona, I'm playing with awk a bit and was trying some examples and just general testing of awk.

If say you have a 'string' that is basically just a list of numbers separated by a space, how do you print from
'{print $1}' to say the last element/column in the string..?

I thought it had something to do with "END" but I couldn't figure it out...

For Example: I had been doing stuff with Unix Timestamps previously and thought to return to it with a different approach using awk for some practice...

DATE_VARS=$(date "+%S %M %k %d %m %Y")
# OUTPUT -->     DATE_VARS == "09 02 17 14 09 2012"

### Use 'awk' to set the variable PERL_VARS...
PERL_VARS=$(echo "$DATE_VARS" | awk -F ' ' -v ORS=',' '{print $1} END')

# OUTPUT of $PERL_VARS should be:
# "09,02,17,14,09,2012"

Not sure what I'm missing in the "awk" command..?
I'm sure there's a million different ways to do this but I was curious how exactly I would do it like the above example?

Any suggestions??

Thanks in Advance,
Matt

pamu · September 14, 2012, 5:27pm

you can do like this..

$ echo "$DATE_VARS" | tr " " ","
50,16,17,14,09,2012

and for your command..

PERL_VARS=$(echo "$DATE_VARS" | awk -F ' ' -v ORS=',' '{print $1} END')

1) Space is the default field separator, so there is no need to specify it.
2) $1 prints only the first field.
3) END needs an action part (a code block) after it.

try this..

echo "$DATE_VARS" | awk '{ for (i = 1; i <= NF; i++) printf "%s%s", $i, (i < NF ? "," : "\n") }'

RudiC · September 15, 2012, 1:10pm

date "+%S %M %k %d %m %Y" | awk '{ $1 = $1 } 1' OFS=,

Corona688 · September 15, 2012, 4:10pm

Use $0 to get the entire string.

It has nothing to do with END. END runs once after all input has been read, i.e. at the end of the program, not at the end of the string.
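
A small sketch of the difference, using the date string from the question as fixed input: $0 is the whole line, $NF is its last field, and the END block runs once after the last line has been read.

```shell
#!/usr/bin/env bash
DATE_VARS="09 02 17 14 09 2012"

result=$(echo "$DATE_VARS" | awk '
    { print "whole line: " $0; print "last field: " $NF }
    END { print "lines read: " NR }
')
echo "$result"
```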

Corona688 · September 15, 2012, 4:12pm

You do not need awk to solve your problem. Simply make the date the way you want it in the first place, and you will not have to translate it.

DATE_VARS=$(date "+%S,%M,%k,%d,%m,%Y")

See RudiC's example, though. Also, the one I wrote you which uses OFS too.
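
For comparison, here is a sketch of the tr and OFS approaches run on the same fixed string, so the results can be checked side by side. Assigning $1 to itself makes awk rebuild the line using OFS; the block form { $1 = $1 } 1 is used so the line is still printed when the first field happens to be 00.

```shell
#!/usr/bin/env bash
DATE_VARS="09 02 17 14 09 2012"

# Translate every space into a comma.
with_tr=$(echo "$DATE_VARS" | tr ' ' ',')

# Rebuild the record with OFS, then print it (the trailing 1 is always true).
with_awk=$(echo "$DATE_VARS" | awk -v OFS=',' '{ $1 = $1 } 1')

echo "tr:  $with_tr"
echo "awk: $with_awk"
```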

mrm5102 · September 17, 2012, 9:36am

Hey guys, thanks all for your replies!

Cool I'll give these a shot and post back...

EDIT:
What I'm doing is this: when I run the 'ps' command looking for a particular process, sometimes there
are multiple instances of the same process running. If, say, 2 of the processes started a day or 2 ago,
all 'ps' tells you is the date each one started (i.e. the "STIME" column). So I run "ps -eo pid,etime,cmd" instead to get the
elapsed time of each process.
The elapsed column looks like "2-21:44:05", meaning the process started 2 days, 21 hours, 44 minutes, and 5 seconds ago. I then take
the time fields from this ETIME column, calculate the other variables I need, and feed them into Perl's timelocal
function as ($sec, $min, $hour, $mday, $month, $year) in order to get a UNIX timestamp for a past date and time.
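
The elapsed-time arithmetic can also be sketched in the shell alone. This is not the Perl timelocal approach described above: it converts ETIME to seconds and subtracts that from the current epoch time. The function name etime_to_seconds is made up for illustration, the input is assumed to have no surrounding spaces, and the [[dd-]hh:]mm:ss layout is the one ps documents for etime.

```shell
#!/usr/bin/env bash
# Convert a ps ETIME value ([[dd-]hh:]mm:ss) into a number of seconds.
etime_to_seconds() {
    local etime=$1 days=0 a b c h=0 m s
    if [[ $etime == *-* ]]; then
        days=${etime%%-*}       # part before the dash
        etime=${etime#*-}       # part after the dash
    fi
    IFS=: read -r a b c <<< "$etime"
    if [ -n "$c" ]; then
        h=$a; m=$b; s=$c        # hh:mm:ss
    else
        m=$a; s=$b              # mm:ss
    fi
    # 10# forces base 10 so values like 08 are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

elapsed=$(etime_to_seconds "2-21:44:05")
start_epoch=$(( $(date +%s) - elapsed ))
echo "elapsed=$elapsed start_epoch=$start_epoch"
```

For "2-21:44:05" this gives 2*86400 + 21*3600 + 44*60 + 5 = 251045 seconds.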

Thanks Again,
Matt

mrm5102 · September 20, 2012, 12:48pm

Hey All,
hopefully someone is still watching this thread, I just had one simple question...

I was just wondering what putting this --> $(...) in double quotes does...?

I searched through the "Bash Cookbook" I have, but I couldn't find any explanation on this...
Also, Googled like crazy but I wasn't able to find anything specifically on this..?

What is the difference between these 2 in the example below?

### What's the difference between this:
PIDS="$(ps auxww)"

### And this:
PIDS=$(ps auxww)

I tried testing both, and both outputs look the exact same...?
Anyone know the difference?

Thanks in Advance,
Matt

Corona688 · September 20, 2012, 1:35pm

In that case there is no difference, since the result of $( ) is not split on the right-hand side of an assignment (and neither is a variable expansion like $VAR). A literal string containing spaces would get split without quotes, however.

This is because of a feature the Bourne shell has. You can set a variable for a single command:

HTTP_PROXY="localhost:3128" wget http://whatever.com/

HTTP_PROXY="localhost:3128" is the variable assignment and wget http://whatever.com/ is the command. wget is a command to download webpages, and checks the HTTP_PROXY variable for what proxy to use.

So if you do

STRING=a b

it doesn't try to set the string to "a b"; it sets STRING=a and tries to run the command b.
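
Both points can be checked with a short sketch. GREETING is a made-up variable name, and sh -c stands in for any command that reads its environment:

```shell
#!/usr/bin/env bash
# Command substitution on the right-hand side of an assignment is not split,
# so the inner spaces survive with or without the double quotes.
unquoted=$(printf 'a   b')
quoted="$(printf 'a   b')"

# A variable set in front of a command exists only for that command.
unset GREETING
child_saw=$(GREETING="hello" sh -c 'echo "$GREETING"')
parent_has=${GREETING:-unset}

echo "unquoted=[$unquoted] quoted=[$quoted]"
echo "child_saw=$child_saw parent_has=$parent_has"
```

The quotes start to matter when the value is used later, e.g. echo $PIDS versus echo "$PIDS", because that is where word splitting happens.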

mrm5102 · September 20, 2012, 2:41pm

Hey Corona688, thanks for the reply.

Ah-ha.... Cool thanks for the explanation, that makes sense!

Thanks Again,
Matt