Capturing column headers in an array

Hello,

I am processing a tab-delimited text file and need to capture all of the column headers in an array.

The input looks like,

num     Name            PCA_A1     PCA_A2       PCA_A3
0       compound_00     -3.5054     -1.1207     -2.4372
1       compound_01     -2.2641     0.4287      -1.6120
2       compound_02     -2.7516     -0.1016     -2.1137
3       compound_03     -1.3053     1.8495      -1.0224
4       compound_04     -1.1845     -0.3377     -2.9453
5       compound_05     -2.9492     -0.8277     -2.7023
6       compound_06     -0.6327     1.8127      -1.1693
7       compound_07     -0.2988     1.3539      -1.6114
8       compound_08     2.6872     -1.3726      -5.9732
9       compound_09     -1.4546     -0.8284     -3.5016

I captured the first line of the input with,
header_row=$(sed -n 1p "$input_file")

then I parsed "header_row" on tab,
IFS='\t' read -a column_headers <<< "$header_row"

If I print the size of the array, I get 1 and not 5.

# derive number of columns
number_of_columns=${#column_headers[@]}

echo "number_of_columns"
echo $number_of_columns

If I print the first array element, I get everything,

echo ${column_headers[0]}
num Name PCA_A1 PCA_A2 PCA_A3

I have tried parsing on space instead of tab in case the output of sed did not preserve the tabs,
IFS=' ' read -a column_headers <<< "$header_row"

but that gives me the same results. I am still ending up with the header row as a single string, not parsed into the elements of an array.

Any suggestions as to what I am doing wrong?

LMHmedchem

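IFS='\t' doesn't do what you think it does. Inside single quotes, bash does not turn \t into a tab, so IFS ends up holding a backslash and the letter t. A real tab would be written $'\t'.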

IFS=' ' won't work if the delimiters are tabs.

Why not leave IFS alone so it can accept any whitespace?

read does not need awk/sed/cut/etc's help to read from a file, either:

read -a column_headers < filename
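
To illustrate the three IFS settings side by side, here is a minimal sketch. The temporary file and the array names literal_split, tab_split, and default_split are made up for the example and only stand in for the real input. It assumes bash, since arrays and $'...' quoting are not available in plain sh.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, created here only so the sketch is self-contained.
input_file=$(mktemp)
printf 'num\tName\tPCA_A1\tPCA_A2\tPCA_A3\n' > "$input_file"
printf '0\tcompound_00\t-3.5054\t-1.1207\t-2.4372\n' >> "$input_file"

# Inside single quotes, \t is a backslash followed by the letter t,
# so this splits on those two characters and not on tab.
IFS='\t' read -r -a literal_split < "$input_file"

# $'\t' is bash's ANSI-C quoting and produces a real tab character.
IFS=$'\t' read -r -a tab_split < "$input_file"

# With IFS left alone, read splits on any run of spaces, tabs, or newlines.
read -r -a default_split < "$input_file"

echo "literal: ${#literal_split[@]}"
echo "tab:     ${#tab_split[@]}"
echo "default: ${#default_split[@]}"

rm -f "$input_file"
```

With this header row the first count is 1 and the other two are 5. Splitting on $'\t' alone is the safer choice if a field could ever contain a space.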

Thanks, that sorted it out.

The reason I added the sed line was I only wanted the first row. Is there a way to use read to just get the first row?

LMHmedchem

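It's already doing that. read consumes exactly one line per call, so a single read on the file only ever sees the first row. It doesn't need sed's help.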
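
As a rough sketch of that one-line-per-call behaviour, the header can be read once and the remaining rows picked up by later read calls on the same redirection. The temporary file and the names fields and row_count are invented for the illustration, and bash is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical three-line sample file standing in for the real input.
input_file=$(mktemp)
printf 'num\tName\n0\tcompound_00\n1\tcompound_01\n' > "$input_file"

# The braces share one open file, so each read continues where the last stopped.
{
    read -r -a column_headers        # first call consumes only the header line
    row_count=0
    while read -r -a fields; do      # later calls start at line 2
        row_count=$((row_count + 1))
    done
} < "$input_file"

echo "${#column_headers[@]} columns, $row_count data rows"

rm -f "$input_file"
```

If only the header is needed, the single read on its own is enough, and the rest of the file is simply never read.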


Alright, that simplifies my script some. What is the reason for read using "<<<" to read from a bash variable but only "<" to read from a file?

LMHmedchem

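They take different things on the right-hand side: < expects a file name, while <<< expects a string and feeds that string to the command's standard input. The > and < file redirections have been around since the first UNIX shells were created. The Bourne shell added << and <<- here-documents. Recent bash and ksh and a few other shells have added <<< here-strings, a shortcut for a one-line here-document.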
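
A small sketch of the three forms feeding the same read command, assuming bash. The temporary file and the array names from_file, from_string, and from_heredoc are placeholders chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical one-line sample file standing in for the real input.
input_file=$(mktemp)
printf 'num\tName\tPCA_A1\n' > "$input_file"
header_row=$(sed -n 1p "$input_file")

# <   : the word after it is a file name to open
read -r -a from_file < "$input_file"

# <<< : the word after it is the text itself (here-string)
read -r -a from_string <<< "$header_row"

# <<  : the text is everything up to the delimiter line (here-document)
read -r -a from_heredoc << EOF
$header_row
EOF

echo "${#from_file[@]} ${#from_string[@]} ${#from_heredoc[@]}"

rm -f "$input_file"
```

All three arrays end up with the same three elements. The only difference is where read's standard input comes from.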
