AWK for a beginner

I am going to learn AWK for pattern searching (extracting strings) and related activities. I think that is what AWK is used for anyway.
What book or similar resource would you suggest for a beginner?

I'd start with a running system, the man pages, and free internet tutorials. The regular expression part of awk is a subject to master in its own right.

  • awk is good for lines of fields, with regular expression matches that can trigger the storing and printing of values.
  • bash/ksh can do similar field handling, but awk is more field-oriented and has advanced options that make it easier to process streams of data.
  • sed is more focused on strings and can consider multiple lines, but its only state is the current position in the script and the contents of the pattern space and hold space buffers. Much of its syntax and regular expression matching is similar to ex, vi, perl, awk, grep, and emacs.
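As a rough illustration of the first point, here is a minimal sketch in which a regular expression match on one field triggers storing a value, and the stored total is printed at the end. The sample data and the variable names (data, sales_total, total) are made up for this example.

```shell
# Hypothetical sample data: name, department, amount
data='alice sales 100
bob eng 200
carol sales 300'

# For every record whose second field matches /^sales$/, add the third
# field to a running total; print the total after all input is read.
sales_total=$(printf '%s\n' "$data" |
    awk '$2 ~ /^sales$/ { total += $3 } END { print total }')

echo "$sales_total"
```

Doing the same in plain bash/ksh would need an explicit read loop and manual field splitting, which is the extra work awk saves you.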

The funny thing about awk is that it comes with its own built-in while loop.

A program that looks like this:

awk '/regex/ { print $1 }' filename

Works like this pseudocode:

FS=" ";   // Field separator (the default splits on runs of blanks)
RS="\n";  // Record separator
NR=1;     // Number of the current record, counted across all input files

for(FNR=1; !end_of_file("filename"); FNR++, NR++)
{
        $0=read_record("filename", RS);  // Read up to the next RS
        tokens[]=split($0, FS);          // Split "a b c" into "a", "b", "c"; tokens[1] is $1

        if(match($0, /regex/)) print(tokens[1]);
}

FS and RS are special variables in awk which you can use to change what awk treats as field and record boundaries.

NR and FNR are special variables which count the total number of records read so far and the record number within the current file, respectively.
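To see these variables in action, here is a small sketch. The temporary files one.txt and two.txt and the shell variable names are invented for the example; FS, RS, NR and FNR are the real awk variables.

```shell
# Create two small, hypothetical input files in a scratch directory
tmpdir=$(mktemp -d)
printf 'a:1\nb:2\n' > "$tmpdir/one.txt"
printf 'c:3\n'      > "$tmpdir/two.txt"

# FS=":" makes awk split fields on colons.
# NR keeps counting across files, FNR restarts at 1 for each file.
counts=$(awk 'BEGIN { FS = ":" } { print NR, FNR, $1 }' \
    "$tmpdir/one.txt" "$tmpdir/two.txt")
echo "$counts"

# RS=";" makes awk treat semicolons, not newlines, as record boundaries.
records=$(printf 'x;y;z' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$records"

rm -r "$tmpdir"
```

The first command prints three lines in which the first column (NR) runs 1, 2, 3 while the second column (FNR) drops back to 1 when awk moves on to the second file.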

The read loop is built into sed, too, but you can write your own read loop with the 'N' command. The '=' command is a weak line counter: it only prints the line number on a line of its own, so you need a second sed afterwards to join each number with its line.
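A minimal sketch of that two-step numbering trick, using made-up input and a made-up variable name (numbered): the first sed emits each line number on its own line, the second uses 'N' to pull the following line into the pattern space and replaces the embedded newline with a space.

```shell
# First sed: '=' prints the line number before each input line.
# Second sed: 'N' appends the next line, then s/\n/ / joins the pair.
numbered=$(printf 'alpha\nbeta\n' | sed = | sed 'N;s/\n/ /')

echo "$numbered"
```

In awk the same result is simply awk '{ print NR, $0 }', which is one reason awk is usually the more comfortable tool when you need counters.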

Only the shells need an explicit loop such as "while read ... do ... done <file", and I fear there is no clean end-of-file test available while the last line is being processed. Shells can capture lines with 'line' and test for EOF before processing, but you have to read a line ahead to know that EOF is waiting, so the loop gets pretty messy.
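For comparison, here is a sketch of a shell read loop. It does not detect EOF ahead of time; it only uses a common idiom (not from the posts above) so that a final line without a trailing newline is still processed: read returns non-zero at EOF but may still have filled the variable, so the loop also continues while the variable is non-empty. The file name and variable names are hypothetical.

```shell
# Hypothetical input file whose last line has no trailing newline
tmpfile=$(mktemp)
printf 'one\ntwo\nthree' > "$tmpfile"

count=0
last=''
# 'read' fails on the unterminated last line but still stores its text,
# so '|| [ -n "$line" ]' lets the loop body run one more time.
while IFS= read -r line || [ -n "$line" ]; do
    count=$((count + 1))
    last=$line
done < "$tmpfile"

echo "$count lines, last one: $last"
rm "$tmpfile"
```

awk and sed handle the unterminated last line without any of this, since the read loop is theirs rather than yours.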

"sed & awk" by Dale Dougherty, published by O'Reilly. A fantastic book, written with a good dose of humor, so it makes for an educational as well as entertaining read.

I thought I knew sed until I got this book ~15 years ago and relearned it, this time for real.

I hope this helps.

bakunin