Help a newbie please with awk if else statements

Hi,

Despite reading the Conditional Statements chapter in the O'Reilly Sed & Awk book several times and looking at numerous examples, I cannot for the life of me get any kind of if ... else statement to work in my awk scripts! My scripts work perfectly (as they are written at least) and do what they should until I add any if statement(s), at which point I just get syntax errors at every 'if' and every 'else'.

Can anyone shed some light on this please? I am using gawk in DOS under Windows XP Pro SP3 (not my choice of OS, but it pays the bills).

BEGIN {FS="[ \t]*,[ \t]*";
  OFS=",";
  print "Finding good records..." 
 }
NR == 1 { new_fname = gensub(/\.csv$/, "", 1, FILENAME) "_good.csv"; }
$36 ~ /N/ { for (i = 1; i<= NF; i++) {
   gsub(/^ /, "", $i);
   gsub(/ $/, "", $i);
   gsub(/"/, "", $i);
   }
  }
output = ""
 
output = sprintf ("%s,%s,%s", $1, $38, $39);
 
if ( $40 == /^ / && $41 == /^ / ) {
  output = sprintf("%s,%s,%s,%s,", output, $42, $43, $44);
  }
 else {
  output = sprintf("%s,%s,%s,%s,%s,%s,", output, $40, $41, $42, $43, $44);
  }
 
if ( $55 ~ /MR|MISS|MS|MRS/ ) { 
  output = sprintf("%s, %s %s %s %s", output, $59, $55, $56, $57, $58);
  }
 else if ( $55 ~ /Applicant/ && !$58 ) {
  output = sprintf("%s, %s %s %s %s", output, $60, $56, $57, $58, $59);
  }
 
 
END {  
  printf("%s\n",output);
  print "Finished!  Output sent to '"new_fname"'"
  }

Basically I am parsing a CSV file which has a few empty fields (they are always in the same place, but are not on every line) that I wish to catch and leave out of the output; this is what the if ... else statements are for. I have tried other methods, but this seems like the best (or only) way to do it, combined with sprintf.

An explicit if/else can appear only inside an action block { ... }.
Could you post sample input and the desired output?
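
To make the point about action blocks concrete, here is a minimal sketch. It assumes a made-up three-field CSV rather than the poster's real file, so the field numbers are hypothetical. Two things differ from the posted script: the if/else sits inside braces, and the blank test uses ~ with a regex. Written as $40 == /^ /, the bare regex is evaluated as ($0 ~ /^ /), which yields 1 or 0, and the field is then compared with that number.

```shell
# Hypothetical three-field CSV; the real file has far more columns.
out=$(printf 'a,,c\na,b,c\n' | awk -F, '
{
    # if/else is legal here because we are inside an action block
    if ($2 ~ /^ *$/)          # field 2 is empty or only spaces
        line = $1 "," $3
    else
        line = $1 "," $2 "," $3
    print line
}')
echo "$out"
```

The same statements placed at the top level, outside any braces, are what produce a syntax error at every 'if' and 'else'.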

See the bolded remarks I put in the snippets of your code.

Looking at this again, I think you're confused about how awk works. Each line is compared to each pattern, in turn, and if there is a match, the "program" part (for each pattern that matches) is then run, in turn. After all input is consumed, THEN the "END" block is executed. To have one of these programs "skip" to the next line, use "next;". Further points:

  1. I'm not sure what you are trying to do with the NR==1 line (you never write to new_fname). You can write to a file with "print expr-list > filename" (or printf with a format string); the filename can be a variable such as new_fname.

  2. You're not using sprintf() in a meaningful way, except to avoid printing out a newline. You can just do "print $1, $38, $50", except that a newline will be appended. But I don't see why you're using it this way.

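  3. Instead of "END", which runs only once after all input, you can put a "default" rule at the end that runs for every record. A bare action block with no pattern does this; so does an always-true pattern such as "1". This way you can prepare your output and then have: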

/pattern/ { program1 }
/pattern2/ { program 2 }
...
1 { print output >newfilename }

You can also put this rule at the beginning of the list, so that it runs for every record regardless of whether a later rule uses "next". A rule placed first sees the output built by the previous record, so here you have to be sure that (1) you don't print when there is nothing to print, and (2) if you don't want something printed, you empty the output variable.

length(output) { print output > newfilename } 

/badfield/ {  output=""; next; }
/goodfield/ { output=output "blah blah"; next; }
/other/ { output="fubar"; }
...
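
A runnable sketch of that "flush first" layout follows. The input lines and the kept: prefix are invented for illustration, and it prints to standard output instead of a file. Because the flush rule runs before the current record has built anything, it prints what the previous record stored, so an END rule is still needed for the final record.

```shell
out=$(printf 'good a\nbad b\ngood c\n' | awk '
length(output) { print output; output = "" }   # flush what the previous record built
/^bad/         { next }                         # stores nothing, so nothing is flushed for it
/^good/        { output = "kept:" $2 }
END            { if (length(output)) print output }   # the last record still needs a flush
')
echo "$out"
```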

Just when I thought I was starting to get the hang of it too! ;)

I am forming the output filename in a variable, changing the ending to "_good.csv". This works and I am happy with it.
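
For completeness, a small sketch of actually sending records to that derived filename, since the posted script builds new_fname but never writes to it. The temporary directory, data.csv and its contents are assumptions made up for the demo. It uses sub() on a copy of FILENAME so that it also runs on awks without gawk's gensub().

```shell
dir=$(mktemp -d)
printf 'x,1\ny,2\n' > "$dir/data.csv"
awk -F, '
FNR == 1 {
    # Build the output name once per input file (sub is POSIX; gensub is gawk-only)
    new_fname = FILENAME
    sub(/\.csv$/, "", new_fname)
    new_fname = new_fname "_good.csv"
}
{ print $1 > new_fname }    # > truncates on first write, then appends for the rest of the run
' "$dir/data.csv"
result=$(cat "$dir/data_good.csv")
rm -rf "$dir"
echo "$result"
```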

I am attempting to form my output line for each record according to some caveats, namely removing certain blank fields if they are present. I could not see any other way to do this except to use sprintf to add what I needed to my existing string, and then print that to the file. Just using "print $1, $38, $50" didn't catch the blank fields (and other special cases) as I needed it to.
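
One possible alternative to chaining sprintf calls, sketched under the assumption that a blank field should simply be left out: build the line by concatenation in a loop. The sample records are invented, and this version drops every blank field; restricting the test to particular field numbers (such as 40 and 41) would be a small change to the condition.

```shell
out=$(printf 'id1, ,street,town\nid2,flat 3,street,town\n' | awk -F, '
{
    output = ""
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^ *$/) continue                      # skip empty or space-only fields
        output = (output == "" ? "" : output ",") $i   # add a comma only between fields
    }
    print output
}')
echo "$out"
```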

I think I understand this part already, but it has helped me grasp the idea of program blocks and the way awk works. Thanks :)
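
For anyone else reading, the record-by-record flow described earlier in the thread can be seen in a tiny sketch like this one (the input lines are made up). Every record is tested against each rule in order, "next" abandons the remaining rules for that record, and END runs once after the input is exhausted.

```shell
out=$(printf 'skip me\nkeep one\nkeep two\n' | awk '
/^skip/ { next }              # stop processing this record, read the next one
/^keep/ { kept++ }            # runs only for records matching the pattern
        { print NR ": " $0 }  # no pattern: runs for every record that reaches it
END     { print "kept " kept " of " NR }   # runs once, after all input
')
echo "$out"
```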