AWK sub function curious problem under bash

I need to detect the number of pages in a print job, when that is available, so I can warn users when they try to print a report much larger than they expected. Sometimes they are trying to print a 1000-page report when they thought they were getting a 10-page report.

Under Linux I am scanning the print job before it goes to the printer, and I want the user to be able to cancel the print job if it is not what they expected.

This current function works, but is not exactly what I wanted:

 
function get-last-page() 
{ 
awk '/Page/{ f=$NF }; END{ print f }' $1 | awk '{ sub(/Page/,""); print }'; 
}

In my test case the page numbers can be found on the top of each page after the word "Page". -but- sometimes the page count abuts the "Page" label so the above function could return "Page2308" when there is no white space between the regex "Page" and the desired number. There is no problem when there is white space.
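
A small sketch of the two cases, using made-up header lines rather than real report output (the "Sales Report" text and both page numbers are assumptions for illustration only):

```shell
#!/bin/bash
# Hypothetical header lines; a real report will look different.
# With white space, the last field is just the number.
spaced=$(printf 'Sales Report   Page 12\n' | awk '/Page/{ f=$NF }; END{ print f }')

# Without white space, the label and the number are one field.
abutted=$(printf 'Sales Report   Page2308\n' | awk '/Page/{ f=$NF }; END{ print f }')

echo "$spaced"    # 12
echo "$abutted"   # Page2308
```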

Simple answer, just sub out the /Page/ unconditionally and I always get the last page number of the report. -but- I originally tried it like this (using only one awk call) and I cannot for the life of me make it work properly:

 
function get-last-page() 
{
awk '/Page/{ f=$NF }; END{ print sub(/Page/,"",f); }' $1; 
}

The above function does not return the desired 2308 (or whatever). I cannot make out what is wrong. If I change it to the following I get what I expect, including the "Page2308":

 
function get-last-page() 
{
awk '/Page/{ f=$NF }; END{ print f }' $1; 
}

The above behaves as I expect, but has the error I am trying to fix with the second example. I don't understand why I can't do the substitution in the END{} clause but can do it after the pipe (as in the first example).

I really was hoping to get this to work as a sort-of one-liner.

Can anyone help me either fix this or at least understand why it does not work?

I am running under a bash shell on openSuSE 10.3 linux.

TIA

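Regarding function 2, sub is not a function that produces a string that you can print. It changes its target (here f) in place and returns the number of substitutions it made (0 or 1), so "print sub(...)" prints that count rather than the page number. This will perhaps work better: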

function get-last-page() 
{
# Call sub() first, then print the modified variable; quote $1 so file names with spaces work.
awk '/Page/{ f=$NF }; END{ sub(/Page/,"",f); print f }' "$1"; 
}
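
To see the difference in isolation, here is a minimal sketch that feeds awk one made-up line (the sample text is an assumption, not taken from a real report) and compares printing the return value of sub with printing the variable it modified:

```shell
#!/bin/bash
# Hypothetical sample line where the number abuts the "Page" label.
sample='Monthly Report Page2308'

# Printing sub() directly gives the substitution count, not the edited text.
count=$(printf '%s\n' "$sample" | awk '{ f=$NF; print sub(/Page/,"",f) }')

# Calling sub() first and printing the variable afterwards gives the edited text.
page=$(printf '%s\n' "$sample" | awk '{ f=$NF; sub(/Page/,"",f); print f }')

echo "count=$count page=$page"   # count=1 page=2308
```

The same applies to gsub, which also returns a count and edits its target in place.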

Scrutinizer! You da man!

OK I see it now. I had a version that was close but instead of "print f" I just did a print.

I love awk but only brush up against it when I need it and learn as I go. Your comment on how to view the sub function return value was exactly the point I was missing.

Thanks I am a happy camper with my new, working one-line solution! :smiley:

(Are you familiar with Frank Zappa's "Central Scrutinizer" from "Joe's Garage I, II, and III"? Just wondering.)

Great that it works and good to hear some enthusiastic feedback...

Very! What gave it away? :D:D:D . Big fan of one of the most intelligent persons ever on planet Earth, and of his music of course (though obviously not as famous as Max Rebo, 'cause of the smaller audience ;).

Ahh, so you have heard about the little blue elephant (who is not an elephant)? Exxxxcellent.

-OK- Next problem that is perplexing me (in the same large-print-file problem) is another corker... grrr.

#!/bin/bash
##### Detect potential too-large reports
function get-last-page()
{
   if [ $# == 1 ] && [ -s $1 ] && grep -qs "Page " $1 ; then # File is valid
      PAGES=`awk '/Page/{ f=$NF }; END{ sub(/Page/,"",f); print f }' $1;`
#     echo $PAGES
      if [ "$PAGES" -gt "100" ]; then
         get-operator-option $1 $PAGES "Pages"
         return $?
      fi
   fi
}
##### Ask operator how to handle too-large report
function get-operator-option()
{
   echo -e "The system has detected that your report may be larger than you"
   echo -e "expected.  You need to decide what to do with this report.\n"
   echo -e "Your $1 report is $2 $3.\n"
   echo -e "No matter what you decide your report is saved to disk now.\n"
   echo -e "Please select from the following options (Enter 1, 2, 3, or 4):\n"
   select OPTION in \
                    "Print large report anyway."                     \
                    "Don't print large report (cancel print)."       \
                    "E-mail me information on this report (cancel)." \
                    "E-mail IT department on this report (cancel)."
   do
      case \($REPLY\) in
        "1") return 0;;  # Print  report
        "2") return 1;;  # Cancel report
        "3") return 2;;  # Cancel report and email info to user and IT
#           send-too-large-email ${USER} ${PROGRAM} ${SPOOLFILE};&
        "4") return 3;;  # Cancel report and email info to IT
#           send-too-large-email "IT" ${PROGRAM} ${SPOOLFILE};;
      esac
   done
}
 
#### Test for too-large print job
if [ $(get-last-page $1) -gt 0 ]; then
   echo "exit TESTING: $?"
   read x;
fi
echo "normal TESTING: $?"
read x;

I don't know how I get myself in these fixes, but when I test my script using this function I -NEVER- see the message echo'ed at the top of my get-operator-option function. It goes straight to the select OPTION clause.

(-btw- when I type the function into my current shell and run it as an immediate function, not from a script file, I see the echo'ed text! ai!)

Why is echo not working at the top of the get-operator-option function? What silly little thing am I missing?

TIAA

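Sometimes "echo -e" does not work correctly. Better to use printf instead. Also the \($REPLY\) seems odd: the escaped parentheses are taken literally, so the word being tested is "(1)" rather than "1", and none of the patterns can match. I would just use $REPLY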
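
Something the thread does not spell out, and which may well be the main cause: the test script calls get-last-page inside $( ... ), and command substitution captures everything the function writes to standard output, including the echo lines in get-operator-option. bash's select writes its menu and prompt to standard error, which would explain why the menu still shows up. The sketch below uses hypothetical stand-in functions (ask, ask_visible) instead of the real ones, and also shows the case matching issue with a stand-in variable named reply:

```shell
#!/bin/bash
# Hypothetical stand-in for get-operator-option: one line on stdout, one on stderr.
ask() {
   echo "banner on stdout"
   echo "menu on stderr" >&2   # select writes its menu to stderr in the same way
   return 2
}

captured_banner=$(ask)         # stdout lands in the variable, not on the screen
rc=$?                          # the function's return code survives the capture
printf 'captured=[%s] rc=%s\n' "$captured_banner" "$rc"

# Sending the banner to stderr keeps it visible even inside $( ... ).
ask_visible() {
   printf '%s\n' "banner on stderr" >&2
   return 2
}
captured_after=$(ask_visible)  # nothing is captured this time
printf 'captured=[%s]\n' "$captured_after"

# The escaped parentheses are literal, so the word is "(1)" and "1" never matches.
reply=1
case \($reply\) in "1") escaped=match;; *) escaped=nomatch;; esac
case $reply in "1") plain=match;; *) plain=nomatch;; esac
echo "escaped=$escaped plain=$plain"   # escaped=nomatch plain=match
```

If that is what is happening, one option is to call get-last-page directly and test $? afterwards, since [ $(get-last-page $1) -gt 0 ] compares the captured text rather than the return code.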