Awk: BEGIN: prints nothing

My code fails to do anything if I have a BEGIN block in it:

Run the awk script as:

awk -f ~/bin/sum_dupli_gene.awk make_gene_probe.txt

#!/usr/bin/awk -f



BEGIN {
    print ARGV[1]
#--loads of stuff
}

END{
#more stuff
}

If I remove BEGIN my code works perfectly. Can someone tell me what I am doing wrong?

You have an awk script. Is it called ~/bin/sum_dupli_gene.awk make_gene_probe.txt ? That looks more like the argument to -f that you'd pass into the script.

Are you sure you don't want something like:

/path-to/my/awk-script ~/bin/sum_dupli_gene.awk make_gene_probe.txt

Thanks for the reply.

If I don't use -f I get an error:

awk  ~/bin/sum_dupli_gene.awk make_gene_probe.txt
awk: 1: unexpected character '.'

I'm unable to fix this error.

If I don't use awk and -f, then nothing gets printed:

~/bin/sum_dupli_gene.awk make_gene_probe.txt

Ah, sorry, I misread your post, thought that was all a path.

You don't need to invoke it with awk as you have a 'shebang' line, but if you do, yes, you will need the -f.
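To illustrate why -f matters, here is a minimal sketch. The temporary file is only a hypothetical stand-in for ~/bin/sum_dupli_gene.awk, and the two-line input is made up. Without -f, awk treats its first non-option argument as the program text itself, which is why passing a file path there produces a syntax error.

```shell
script=$(mktemp)   # hypothetical stand-in for ~/bin/sum_dupli_gene.awk
printf '%s\n' '{ n++ }' 'END { print n " lines" }' > "$script"

# With -f, awk reads the program from the file; input comes from stdin or file arguments.
with_f=$(printf 'a\nb\n' | awk -f "$script")

# Without -f, the first non-option argument is the program text itself.
inline=$(printf 'a\nb\n' | awk 'END { print NR " lines" }')

rm -f "$script"
printf '%s\n%s\n' "$with_f" "$inline"
```

Both invocations count the same two input lines; only the place the program comes from differs.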

You're accessing the script via the full path, but the input file as a relative path. Is the input file in the current directory?

Also, which OS are you using?

I am on linux

which awk

/usr/bin/awk


Yes, the input is in current directory.
Code works fine if I remove BEGIN.

Linux version 4.9.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.51-1 (2017-09-28)

All I can see in the BEGIN section is print ARGV[1]. There's nothing wrong with that. Are you saying it does not work just with that? "#--loads of stuff" doesn't really help us if we can't see that "stuff". Can you be more specific about what you mean by "doesn't work"?

I run script as:

 ~/bin/sum_dupli_gene.awk make_gene_probe.txt

I removed ARGV[1]. The code for loads of stuff looks like:

#!/usr/bin/awk -f

#awk -F'\t' -v OFS='\t' '

BEGIN {
     if(NR==1){

	 header="AffyID"

	 for(j=2;j<=NF;j++){
	     header=header"\t"$j
	 }
     }
     #---fix headers
     
     else {
	 #--start with row second
	 
	 if($2 in gene_arr) {
	     for(k=3;k<=NF;k++){
		 #starting with 3rd column. will keep sum array with 3rd too
		 
		 sum[gene_arr[$2],k]= $k + sum[gene_arr[$2],k]
	     }
	 }
	 else {
	     #--if gene not in array
	     gene_arr[$2]=NR
	     #starting with 3rd column. will keep sum array with 3rd too
	     for (k=3;k<=NF;k++){
		 sum[NR,k] = $k;
	     }
	 }
	 
     }
 }
 #--processing ends for columns and headers
 #-------------------
 END{
     print header
     
     for (key in gene_arr){
	 
	 printf key"\t"key"\t"

	 for(z=3;z<=NF;z++){
	     #starting with 3rd column. will keep sum array with 3rd too
	     if(z==NF){ #if last columns
		 printf sum[gene_arr[key],z]
	     }

	     else{
		 printf sum[gene_arr[key],z]"\t"
	     }
	 }
	 printf "\n"
     }
 }
#--------------------------
 
#  < make_gene_probe.txt
     

By "doesn't work" I mean it prints only blank lines ("\n"). That's it. No header or anything.

First question: why are you doing this in the BEGIN section in the first place? It's executed once at the beginning, not for each record (line) in the file.
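A small sketch of the difference, using a made-up two-line input: BEGIN runs before any record has been read (so NR is still 0 and the fields are empty), a rule with no pattern runs once per line, and END runs after the last line.

```shell
# Hypothetical two-line, tab-separated sample input.
sample=$(printf 'a\t1\nb\t2\n')

out=$(printf '%s\n' "$sample" | awk '
BEGIN { print "BEGIN: NR=" NR }                      # runs once, before any input is read
      { print "main: NR=" NR ", first field=" $1 }   # runs once per input record
END   { print "END: NR=" NR }                        # runs once, after the last record
')
printf '%s\n' "$out"
```

This prints one BEGIN line with NR=0, two main lines (NR=1 and NR=2), and one END line with NR=2. Per-record work such as the NR==1 test and the column sums therefore belongs in a main rule, not in BEGIN.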

Oh!

Sorry, didn't know or understand that.
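For reference, a rough sketch of how the script might look with the per-line logic moved out of BEGIN. This is not the original code: it keys the sums directly by gene name instead of by row number, stores the column count instead of relying on NF in END, and the sample data (probe ID, gene name, two numeric columns) is invented for illustration.

```shell
# Hypothetical sample data: probe ID, gene name, two numeric columns.
data=$(printf 'Probe\tGene\tS1\tS2\np1\tTP53\t1\t2\np2\tTP53\t3\t4\np3\tBRCA1\t5\t6\n')

result=$(printf '%s\n' "$data" | awk '
BEGIN { FS = OFS = "\t" }            # runs once: set separators before any input
NR == 1 {                            # header line: rebuild it and remember the width
    header = "AffyID"
    for (j = 2; j <= NF; j++) header = header OFS $j
    ncol = NF
    next
}
{                                    # every data line: accumulate per-gene sums
    for (k = 3; k <= NF; k++) sum[$2, k] += $k
    seen[$2] = 1
}
END {                                # runs once: print the header and the totals
    print header
    for (gene in seen) {
        line = gene OFS gene
        for (k = 3; k <= ncol; k++) line = line OFS sum[gene, k]
        print line
    }
}')
printf '%s\n' "$result"
```

The order of the gene rows is not guaranteed, because for (gene in seen) visits array keys in an unspecified order; pipe the output through sort if a stable order matters.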

Follow-up query:
How do I mention field separator in my code?

The field separator is FS. You can set it either from the command line when you invoke the script (by adding -v FS=?) or in the BEGIN section (with FS=?).

Remember that if you invoke the script from the command line with awk -v FS=? -f ..., the shebang is not used.

If you choose to invoke the script directly, e.g. with: ~/bin/sum_dupli_gene.awk make_gene_probe.txt , you can add the -v option there:

~/bin/sum_dupli_gene.awk -v FS=? make_gene_probe.txt

Or by adding it to the shebang line:

#!/usr/bin/awk -v FS=? -f

(where ? is the field separator to use)

Thanks

#!/usr/bin/awk -v FS='\t' -f

Error: awk: improper assignment: -v FS='\t' -f

Running as: ~/bin/sum_dupli_gene.awk make_gene_probe.txt

Hmm. I must have just dreamt (or imagined) that would work, but was sure it did!

Use another option, such as:

BEGIN {
  FS="\t"
}

or

~/bin/sum_dupli_gene.awk -v FS="\t" make_gene_probe.txt

(probably the first one)
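A quick sketch comparing the two options on a made-up record whose fields contain spaces. As far as I know, the shebang attempt failed because Linux hands everything after the interpreter path to awk as one single argument, so -v FS='\t' -f arrives as one unparseable option.

```shell
line=$(printf 'id 1\tgene A\t42')   # hypothetical record with spaces inside fields

# Default FS splits on any run of whitespace, so the spaces break the fields apart.
default_f2=$(printf '%s\n' "$line" | awk '{ print $2 }')

# Option 1: set FS in BEGIN, before the first record is read.
begin_f2=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "\t" } { print $2 }')

# Option 2: assign FS with -v; awk processes the \t escape in the value.
v_f2=$(printf '%s\n' "$line" | awk -v FS='\t' '{ print $2 }')

printf '%s\n' "$default_f2" "$begin_f2" "$v_f2"
```

With the default separator the second field is "1"; with either tab setting it is "gene A".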

~/bin/sum_dupli_gene.awk -v FS="\t" make_gene_probe.txt

I tried this before posting and it worked.

Somehow I was trying to minimize the arguments on the awk command line. Thanks for your extensive support. :slight_smile:

BEGIN {
  FS="\t"
}

This worked, but it fails if I use '\t' instead of "\t".
I had used '\t' because it's a character, not a string. I just overthought it. :stuck_out_tongue:

:slight_smile:

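awk has no separate character type: string literals always use double quotes, so '\t' is a syntax error inside a script. On the command line, awk code is usually enclosed in single quotes for the shell, so using them inside it requires a bit of escaping.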
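A short sketch of the quoting rules, with made-up input. The octal escape \047 is just one common way to get a literal single quote into the output without closing the shell's quoting; passing the quote in with -v would be another.

```shell
# Inside the shell's single quotes, awk strings are written with double quotes.
tabbed=$(printf 'a b\tc\n' | awk 'BEGIN { FS = "\t" } { print $1 }')

# \047 is the octal escape for a single quote, so the shell quoting stays intact.
quoted=$(printf 'x\n' | awk '{ print "\047" $1 "\047" }')

printf '%s\n' "$tabbed" "$quoted"
```

The first command prints "a b" (the whole first tab-separated field) and the second prints the input wrapped in single quotes.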

Oh.
Thanks for explaining. :slight_smile: