Problem counting the number of fields in a TAB-delimited file

I'm facing a strange problem, please help me out.
Here we go.

I want to count the number of fields in a particular file.
The filename and delimiter character will be passed as parameters.

At the command prompt, if I type the following I get 27 as output (which is correct):

cat customer.dat | head -1 | awk -F"\t" '{print NF}'

But when I run the script as follows, it gives 32 as output:

./get_col_lengths.sh customer.dat 'TAB'

The script contains:

data_file=$1
delimiter_char=$2
header_line=`cat $data_file | head -1`
if [ $delimiter_char = "TAB" ]
then
  col_cnt=`echo $header_line | awk -F"\t" -f '{print NF}'`
else
  col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi

It is giving 32 because there are timestamp (date and time) fields in the file. So my impression is that it is not setting tab as the delimiter.
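A small illustration of why the count goes up, using a made-up three-field line rather than the real customer.dat: when awk splits on its default whitespace separator, a timestamp such as 2010-09-30 14:12:00 counts as two fields, whereas splitting on tab keeps it as one. The sample line and the variable names are assumptions for this sketch.

```shell
# Hypothetical sample line: three tab-separated fields, the middle one a timestamp
sample=$(printf 'MWCU5203691\t2010-09-30 14:12:00\tEXP')

# Split on tab only: the timestamp stays in one field
tab_fields=$(printf '%s\n' "$sample" | awk -F'\t' '{print NF}')

# Default splitting on any whitespace: date and time become separate fields
ws_fields=$(printf '%s\n' "$sample" | awk '{print NF}')

echo "tab-delimited: $tab_fields, whitespace-delimited: $ws_fields"
```

This should print 3 for the tab split and 4 for the whitespace split, which is the same kind of inflation seen in the script's output.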

Try changing the quotes:

col_cnt=`echo $header_line | awk -F'\t' -f '{print NF}'`

This is not working either :frowning:

Why is there an extra -f?

col_cnt=`echo $header_line | awk -F"\t" -f '{print NF}'`
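To illustrate the question about the extra -f: awk's -f option takes the name of a file that holds the program, so the quoted text after it is treated as a filename rather than as code. A minimal sketch, assuming no file literally named '{print NF}' exists in the current directory (the sample input is made up):

```shell
# -f expects the name of a file containing the awk program,
# so awk tries to open a file literally named '{print NF}' and fails
if printf 'a\tb\n' | awk -F'\t' -f '{print NF}' 2>/dev/null; then
  with_f=ok
else
  with_f=failed
fi

# Without -f the quoted text is taken as the program itself
without_f=$(printf 'a\tb\n' | awk -F'\t' '{print NF}')

echo "with -f: $with_f, without -f: $without_f"
```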

Sorry for the typo.
Here is the correct code, which is not working.

data_file=$1
delimiter_char=$2
header_line=`cat $data_file | head -1`
if [ $delimiter_char = "TAB" ]
then
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi

I guess the if statement is failing.

 
if [ $delimiter_char = "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi

Execute the script as below:

./get_col_lengths.sh customer.dat 'TAB'

and see which echo is printed.

Then change the if condition as below and execute it once again:

if [ "$delimiter_char" eq "TAB" ]

No, the "if" is working; I tried that already.
Still, I'm putting the output here:

./get_col_lengths.sh CUSTOMER.dat 'TAB'
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1

Where is my echo statement in the output?

Which shell are you using?

echo $SHELL
/bin/bash

Code 1:
echo $delimiter_char
if [ $delimiter_char eq "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi

Output 1:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
./get_col_lengths.sh: line 21: [: eq: binary operator expected
Coming here for TAB?
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1

Code 2:
echo $delimiter_char
if [ $delimiter_char = "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi

Output 2:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
Inside the TAB condition
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1

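The operator is spelled -eq (with a leading hyphen), and it compares integers:

if [ "$delimiter_char" -eq "TAB" ]

For your code, which compares strings, use == (string comparison; = is the portable form):

if [ $delimiter_char == "TAB" ]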

Also post the first line (which you are trying to count)
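For reference, a minimal sketch of the difference between the operators discussed above, using a hypothetical value. Inside [ ], = compares strings (bash also accepts ==, which is not POSIX), while -eq compares integers; in bash, giving -eq a non-numeric string makes the test fail with an error. Quoting the variable keeps the test well-formed even if it is empty.

```shell
delimiter_char="TAB"   # hypothetical value, as it would arrive in $2

# String comparison: = is the portable operator
if [ "$delimiter_char" = "TAB" ]; then
  string_match=yes
else
  string_match=no
fi

# -eq is for integers only; with a non-numeric string the test errors out
if [ "$delimiter_char" -eq "TAB" ] 2>/dev/null; then
  integer_match=yes
else
  integer_match=no
fi

echo "string_match=$string_match integer_match=$integer_match"
```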

If I use if [ $delimiter_char == "TAB" ], I get:

./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
Inside the TAB condition
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1

Here is the first line of CUSTOMER.dat:

MWCU5203691 2010-09-30 14:12:00.000000 GATE-IN PHDVOSW GATE-IN 2010-09-30 14:12:00.000000 EXP N 22489 49580 26889 59280 Y N MCC 688888 5RK 1011 MCC033271 5IZUNBEDJVU1O N 2010-09-30 09:13:42.408011 2010-10-25 00:32:21.615382 2010-10-01 08:22:47.268264 2010-12-02 05:56:18.000000 I

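Don't pass the header through an unquoted variable. The shell word-splits the unquoted $header_line and echo joins the words with single spaces, so the tabs are lost.

Read the file directly, like this: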

col_cnt=`nawk -F"\t" '{ if(NR==1) { print NF }}' "$1"`

bash-3.00$ cat /tmp/myfile   # tab-separated file
abcd    edfg    hijk
bash-3.00$ nawk  -F"\t" '{print $1}' /tmp/myfile
abcd
bash-3.00$ my_var=`cat /tmp/myfile | head -1`
bash-3.00$ echo $my_var  # the unquoted expansion loses the tabs
abcd edfg hijk
bash-3.00$ cat /tmp/myfile      # tab-separated file
abcd    edfg    hijk
bash-3.00$ echo $my_var | nawk  -F"\t" '{print $1}'   # nawk finds no tab delimiter, so the whole line is $1
abcd edfg hijk
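The transcript above echoes the variable without quotes, which is where the tabs disappear. As a rough sketch (the sample file is made up and created with mktemp, and the variable names are only illustrative), quoting the expansion keeps the tabs, so the header can still be held in a variable if that is more convenient:

```shell
# Create a hypothetical tab-separated sample file
tmp_file=$(mktemp)
printf 'abcd\tedfg\thijk\n' > "$tmp_file"

header_line=$(head -n 1 "$tmp_file")

# Unquoted: tabs collapse to spaces, so awk sees a single field
unquoted_cnt=$(echo $header_line | awk -F'\t' '{print NF}')

# Quoted: tabs survive, so awk sees three fields
quoted_cnt=$(printf '%s\n' "$header_line" | awk -F'\t' '{print NF}')

# Reading the file directly avoids the variable altogether
direct_cnt=$(awk -F'\t' 'NR==1 {print NF; exit}' "$tmp_file")

echo "unquoted=$unquoted_cnt quoted=$quoted_cnt direct=$direct_cnt"
rm -f "$tmp_file"
```

With the default IFS this should report 1 for the unquoted case and 3 for the other two.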

Thanks a lot.
It's working now. :slight_smile: