Need to get the total # of columns for each line - NF not working

script_op2a · December 22, 2010, 7:40pm

Hello,

I just need to print the # of columns for each line of the input file.
The input file uses the ascii 009 tab character.
I specify this character as the FS (field separator) in the BEGIN section, and I know the FS character is correct because I can print it.

When I try to print the # of columns using NF, it always prints 1.
If I print $NF instead, it prints the whole line of the input file.
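
Both symptoms are what awk shows when the separator it was given never occurs in the line: nothing is split, the whole record becomes field 1, so NF is 1 and $NF is the same as $1. A minimal reproduction, assuming a separator ("|" here, chosen only for illustration) that is absent from tab-separated input:

```shell
# FS is "|", an arbitrary character that does not appear in the input,
# so awk never splits the record and treats it as a single field.
result=$(printf 'a\tb\tc\n' | awk 'BEGIN { FS = "|" } { print NF; print $NF }')
printf '%s\n' "$result"
```

The first line printed is the field count and the second is the last field, which here is the entire unsplit line.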

It is a UNIX shell script that takes parameters from the command line.

The input file is in the same dir as the script. It is called i.txt.

The input file contains:

met2    100377    11/18/10    0    20101112    98848533    ?    0    ?            08    XPTPLA    0        0    ?        0    0                        ?    ?    0    0    no    0        source.txt
met2    11018521    11/18/10    0    20101117    98850390    ?    0    ?            08    XPTPLA    0        0
    ?        0    0                        ?    ?    0    0    
no    0        source.txt
met2    100377    11/18/10    0    20101112    98848533    ?    0    ?            08    XPTPLA    0        0    ?        0    0                        ?    ?    0    0    no    0        source.txt
met2    11018521    11/18/10    0    20101117
    98850390    ?    0    ?            08    XPTPLA    0        0
    ?        0    0                        ?    ?    0    0    
no    0        source.txt

The source code is this:

#!/usr/bin/sh

infile="$1/$2"
outfile="$3/$4"
delimiter="\\$5"  
delimiter_conv=`printf $delimiter`;


if [[ ! -r $infile ]]
then
        #echo "file is not readable: $infile"
        exit 1
fi

awk -v out="$outfile" -v delim="$delimiter_conv" '
BEGIN {FS=delim;}
{       

print NF > out;

}' $infile

The name of the script is f.sh.
I execute the script from the command line like this:

./f.sh . i.txt . out.txt 009

The resulting output file contains this:

1
1
1
1
1
1
1
1
1

As you can see, there are 9 lines in the input file and 9 lines in the output file.
This is correct; however, the # 1 is wrong.
Instead of the # 1 it should be the # of columns in that line with the column separator being the ascii 009 tab character.

Ygor · December 22, 2010, 10:32pm

Try something like...

$ printf "a\tb\tc\n" | awk -v var="009" 'BEGIN{FS=sprintf("%c", var)} {print NF}'
3
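
A possible reason the original script fails is that 009 is not a valid octal escape for printf (9 is not an octal digit; tab is decimal 9, octal 011), so `printf "\009"` is unlikely to produce a tab. The one-liner above avoids this by letting awk convert the decimal code itself. Below is a minimal sketch of the same idea wrapped for a file, assuming the code is passed as a decimal number; the function name count_fields is hypothetical and not part of the original script:

```shell
# count_fields FILE CODE
# Print the number of fields on each line of FILE, using the character
# whose decimal ASCII code is CODE (e.g. 009 for tab) as the separator.
# "code + 0" forces a numeric value so that %c yields the character
# with that code rather than the first character of the string.
count_fields() {
    awk -v code="$2" 'BEGIN { FS = sprintf("%c", code + 0) } { print NF }' "$1"
}
```

With this in f.sh, the awk call could become something like `count_fields "$infile" "$5" > "$outfile"`, and the printf conversion step would no longer be needed.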