I'm facing a strange problem; please help me out.
I want to count the number of fields in a given file.
The filename and the delimiter character are passed as parameters.
If I type the following at the command prompt, I get 27 as output (which is correct):
cat customer.dat | head -1 | awk -F"\t" '{print NF}'
But when I run the script as follows, the output is 32:
./get_col_lengths.sh customer.dat 'TAB'
Here is the script:
data_file=$1
delimiter_char=$2
header_line=`cat $data_file | head -1`
if [ $delimiter_char = "TAB" ]
then
col_cnt=`echo $header_line | awk -F"\t" -f '{print NF}'`
else
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi
It gives 32 because there are timestamp (date and time) fields in the file, each containing a space. So my impression is that tab is not being set as the delimiter.
yazu
June 13, 2011, 8:03am
2
Try changing the quotes:
col_cnt=`echo $header_line | awk -F'\t' -f '{print NF}'`
Why is there an extra -f?
col_cnt=`echo $header_line | awk -F"\t" -f '{print NF}'`
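For context on the question above: in awk, -F sets the field separator, while -f names a file to read the program from, so -f '{print NF}' makes awk look for a file literally named {print NF}. A minimal sketch of the two options, assuming a hypothetical scratch file for the program and a made-up three-field input line:

```shell
# Hypothetical scratch file holding the awk program (name chosen by mktemp).
prog=$(mktemp)
printf '%s\n' '{ print NF }' > "$prog"

# -F sets the field separator; the program is given inline.
inline_cnt=$(printf 'a\tb\tc\n' | awk -F'\t' '{ print NF }')

# -f reads the program from a file; -F still sets the separator.
file_cnt=$(printf 'a\tb\tc\n' | awk -F'\t' -f "$prog")

echo "inline: $inline_cnt, from file: $file_cnt"
rm -f "$prog"
```

Both forms count the same fields; the only difference is where awk takes its program from.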
Sorry for the typo.
Here is the corrected code, which is still not working.
data_file=$1
delimiter_char=$2
header_line=`cat $data_file | head -1`
if [ $delimiter_char = "TAB" ]
then
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi
I guess the if statement is failing.
if [ $delimiter_char = "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi
Execute the script as below:
./get_col_lengths.sh customer.dat 'TAB'
and see which echo is printed.
Then change the if condition as below and run it once again:
if [ "$delimiter_char" eq "TAB" ]
No, the "if" is working; I tried that already.
Still, I'm posting the output here:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1
Where is my echo statement in the output?
Which shell are you using?
echo $SHELL
/bin/bash
Code 1:
echo $delimiter_char
if [ $delimiter_char eq "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi
Output 1:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
./get_col_lengths.sh: line 21: [: eq: binary operator expected
Coming here for TAB?
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1
Code 2:
echo $delimiter_char
if [ $delimiter_char = "TAB" ]
then
echo "Inside the TAB condition"
col_cnt=`echo $header_line | awk -F'/t' '{print NF}'`
else
echo "Coming here for TAB?"
col_cnt=`echo $header_line | awk -F"$delimiter_char" '{print NF}'`
fi
Output 2:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
Inside the TAB condition
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1
eq is missing its dash; the operator is -eq:
if [ "$delimiter_char" -eq "TAB" ]
For your code, though, use == (string comparison):
if [ $delimiter_char == "TAB" ]
Also post the first line (the one whose fields you are trying to count).
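A note on the operators suggested here: inside [ ], = compares strings (bash also accepts == as a synonym), while -eq compares integers. A bare eq is not an operator at all, which is why Output 1 reports "binary operator expected". A minimal sketch, with the variables quoted so that an empty value cannot break the test; the value 27 is just the expected count from the first post:

```shell
delimiter_char="TAB"

# String comparison: use = (portable) and quote the variable.
if [ "$delimiter_char" = "TAB" ]; then
    string_match=yes
else
    string_match=no
fi

# Integer comparison: -eq is only meaningful for numbers.
col_cnt=27
if [ "$col_cnt" -eq 27 ]; then
    number_match=yes
else
    number_match=no
fi

echo "string_match=$string_match number_match=$number_match"
```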
If I use if [ $delimiter_char == "TAB" ], I get:
./get_col_lengths.sh CUSTOMER.dat 'TAB'
TAB
Inside the TAB condition
No. of Records in CUSTOMER.dat : 200
No. of Columns in CUSTOMER.dat : 1
Here is the first line of CUSTOMER.dat:
MWCU5203691 2010-09-30 14:12:00.000000 GATE-IN PHDVOSW GATE-IN 2010-09-30 14:12:00.000000 EXP N 22489 49580 26889 59280 Y N MCC 688888 5RK 1011 MCC033271 5IZUNBEDJVU1O N 2010-09-30 09:13:42.408011 2010-10-25 00:32:21.615382 2010-10-01 08:22:47.268264 2010-12-02 05:56:18.000000 I
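Don't store the header in a variable and then echo it unquoted: the shell splits the unquoted value on whitespace, so every tab turns into a single space.
Read the file directly instead, like this: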
col_cnt=`nawk -F"\t" '{ if(NR==1) { print NF }}' $1`
bash-3.00$ cat /tmp/myfile # tab-separated file
abcd edfg hijk
bash-3.00$ nawk -F"\t" '{print $1}' /tmp/myfile
abcd
bash-3.00$ my_var=`cat /tmp/myfile | head -1`
bash-3.00$ echo $my_var # unquoted expansion loses the tabs
abcd edfg hijk
bash-3.00$ cat /tmp/myfile # tab-separated file
abcd edfg hijk
bash-3.00$ echo $my_var | nawk -F"\t" '{print $1}' # nawk finds no tab, so $1 is the whole line
abcd edfg hijk
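The session above can be contrasted with a quoted expansion. The shell splits an unquoted $my_var on whitespace and echo rejoins the words with single spaces, whereas "$my_var" is passed through untouched. A minimal sketch, assuming the default IFS and using a hypothetical three-field sample line rather than the real file:

```shell
# Hypothetical sample header with two tab characters (three fields).
my_var=$(printf 'abcd\tedfg\thijk\n')

# Unquoted: word splitting replaces each tab with a single space.
unquoted_cnt=$(echo $my_var | awk -F'\t' '{ print NF }')

# Quoted: the tabs survive, so awk can split on them.
quoted_cnt=$(printf '%s\n' "$my_var" | awk -F'\t' '{ print NF }')

echo "unquoted: $unquoted_cnt, quoted: $quoted_cnt"
```

So the original script could probably also have been repaired by quoting "$header_line", although reading the file directly avoids the question entirely.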
Thanks a lot.
It's working now.
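Putting the thread's conclusion together, here is a minimal sketch of the column count that reads the file directly instead of going through a variable. The function name count_columns is an illustrative assumption, not part of the original script, and plain awk is used in place of nawk:

```shell
# count_columns FILE DELIM  (hypothetical helper, not from the original script)
# Prints the number of fields on the first line of FILE.
# DELIM is either the literal word TAB or the delimiter character itself.
count_columns() {
    data_file=$1
    delimiter_char=$2

    if [ "$delimiter_char" = "TAB" ]; then
        delimiter_char='\t'   # awk interprets \t in -F as a tab
    fi

    # awk reads the file itself, so no tabs are lost to word splitting.
    awk -F"$delimiter_char" 'NR == 1 { print NF; exit }' "$data_file"
}
```

It would be called as count_columns customer.dat TAB. Note that awk treats a delimiter of more than one character as a regular expression.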