I am trying to paste thousands of files together into a matrix. Each file has only one column, and all the files have the same number of rows (~27k). I tried paste * > output as well as several for loops,
but the output contains only the columns from the first and last files. The format of the files is as follows. Each has a header that is identical to the file name:
Run the command again with the -x (--xtrace) option set, to see how the shell expands / interprets your command.
Run the command paste 235423.tsv 263428.tsv 291417.tsv
Run file *.tsv and post the output.
Run od -tx1c 235423.tsv (and other files) and post (a reasonable part of) the outputs.
I agree with RudiC that we need to see the first few lines of your input files. From the output you have shown us, the most useful diagnostic would seem to be:
for file in *.tsv
do echo "File: $file:"
head "$file" | od -tx1c
done
Since you haven't told us what operating system you're using: if od complains about unknown options, try od -bc instead of od -tx1c.
Before we see the output from the above commands, would anyone care to guess which of these files have DOS <CR><LF> line separators instead of UNIX line terminators? Unfortunately, even if this is the problem, I'm not seeing the output I would have expected.
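A quick way to check that guess without reading hex dumps is to ask grep for the files that contain a <CR> byte. The sketch below is only an illustration: the temporary directory and the two file names (unix.tsv, dos.tsv) are invented for the demo and are not the poster's files.

```shell
# Hypothetical demo files: one with UNIX line endings, one with DOS line endings.
tmp=$(mktemp -d)
printf 'unix\n1\n' > "$tmp/unix.tsv"
printf 'dos\r\n1\r\n' > "$tmp/dos.tsv"

# Command substitution strips trailing newlines only, so the <CR> survives.
cr=$(printf '\r')

# grep -l lists the names of the files that contain at least one <CR>.
dos_files=$(cd "$tmp" && grep -l "$cr" *.tsv)
printf '%s\n' "$dos_files"

rm -rf "$tmp"
```

On the real data, the same grep -l call run in the directory holding the .tsv files should list every file that needs converting.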
There you are: DOS line terminators (<CR> = \r = ^M = 0x0D). Each <CR> sends the terminal cursor back to the left margin, so the combined line restarts there for every column. The odd output seen in post #3 comes from the <TAB> that follows each <CR>: it moves the cursor to the first <TAB> position, so every later column is printed on top of the previous one, overwriting the columns of files 2 to n-1 and leaving only the column of the final file n visible. For a proof, try longer column elements, or a different paste delimiter (-d option).
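The effect can be reproduced on a small scale. The following is a minimal sketch with three invented two-line files (a.tsv, b.tsv, c.tsv are hypothetical names, not the poster's data); it dumps the pasted bytes with od -c so the <CR> in front of each delimiter is visible, and counts the <CR> bytes that paste passes through unchanged.

```shell
# Hypothetical demo files with DOS line endings (header line plus one value).
tmp=$(mktemp -d)
printf 'a\r\n1\r\n' > "$tmp/a.tsv"
printf 'b\r\n2\r\n' > "$tmp/b.tsv"
printf 'c\r\n3\r\n' > "$tmp/c.tsv"

# Dump the pasted bytes: every \r sits directly before the \t or \n that paste adds.
paste "$tmp/a.tsv" "$tmp/b.tsv" "$tmp/c.tsv" | od -c

# Count the <CR> bytes in the pasted output; $(( )) strips any padding from wc.
cr_count=$(( $(paste "$tmp/a.tsv" "$tmp/b.tsv" "$tmp/c.tsv" | tr -cd '\r' | wc -c) ))
echo "CR bytes in pasted output: $cr_count"

rm -rf "$tmp"
```

The data from all three files is present in the byte dump; only the terminal rendering hides the middle column.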
Remove the <CR> characters with, e.g., the dos2unix command or sed 's/\o015//' (the \o015 octal escape is a GNU sed extension).
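If neither dos2unix nor GNU sed is available, tr -d '\r' does the same job portably. The sketch below swaps in tr for the commands named above and writes cleaned copies to a separate directory instead of editing in place; the file names and the clean directory are assumptions made for the demo.

```shell
# Hypothetical demo files with DOS line endings.
tmp=$(mktemp -d)
printf 'x\r\n1\r\n' > "$tmp/x.tsv"
printf 'y\r\n2\r\n' > "$tmp/y.tsv"
printf 'z\r\n3\r\n' > "$tmp/z.tsv"

# Write <CR>-free copies into a separate directory, leaving the originals untouched.
mkdir "$tmp/clean"
for file in "$tmp"/*.tsv
do
    tr -d '\r' < "$file" > "$tmp/clean/$(basename "$file")"
done

# Paste the cleaned files; the columns are now joined by plain <TAB>s.
matrix=$(paste "$tmp"/clean/*.tsv)
printf '%s\n' "$matrix"

rm -rf "$tmp"
```

Note that tr -d '\r' removes every <CR> in the file, not only those at line ends, which is normally what is wanted for data exported from DOS or Windows tools.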