Make multiple files of equal length

I have 150 files with 4 columns each but a variable number of rows, which I need to combine column-wise. The files have no column in common. I want to use the Unix "paste" command to do it, but before that I have to get all my files to the same length.

Is there a way, using awk or sed, to pad each of the smaller files with rows of zeros (0) up to n rows (where n is the length of the longest file), so that all files are of equal length?
Below are examples of my input files:

01-16A1-325     01-16A1-325     01-16A1-325     01-16A1-325
01-16A1-325     01-16A1-325     01-16A1-325     01-16A1-325
A     T     G     C
11     47     0     1
11     47     0     0
11     48     0     0
12     50     0     0
12     53     0     0
13     56     0     0
13     60     0     0
13     62     0     0
13     63     0     0
13     64     0     0
13     66     0     0
14     68     0     0
14     70     0     0
14     72     0     0

Thanks

No, you don't. You can use paste on files of different lengths and then replace empty fields with the desired default value. In my opinion, that's the easiest approach.

Regards,
Alister

awk 'NF<4 {for (i=NF+1;i<=4;i++) $NF=$NF OFS "0"}1' OFS="\t" infile

If you really want to get a file with 600 columns and with some arbitrary 0s in the missing columns, this pipe may work:

grep . *
one.t:0 9 8 7
three.t:q w e r
three.t:a s d f
three.t:z x y v
two.t:1 2 3 4
two.t:5 6 7 8

find . -type f | while read f; do
  wc -l "$f"
done | sort -nr -k1,1 | cut -d' ' -f2- | 
xargs paste -d' ' | awk '
  NR == 1 { N=NF }
  NR != 1 { for (i=1; i<=N; i++) if (!$i) $i=0 }
  1'
q w e r 1 2 3 4 0 9 8 7
a s d f 5 6 7 8 0 0 0 0
z x y v 0 0 0 0 0 0 0 0

Hi Alister,

Thanks for your reply. I wanted to do exactly what you suggested, but after running paste I realized what it does with two files that have different numbers of rows: % paste file1 file2 > file3
If the content of file1 is:
1
2
3
and file2 is:
a
b
c
d
the resulting file3 would be:
1 a
2 b
3 c
d

That is why I wanted to make the row lengths of my files equal.
Am I right?
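
For what it's worth, paste does appear to keep the column positions here: the fourth output line starts with a delimiter, so the first field is empty rather than missing. A minimal sketch to check this, assuming two scratch files named file1 and file2 as in the example above, and using ':' instead of the default tab purely so that the empty field is visible:

```shell
# Recreate the two example files (names taken from the post above).
printf '%s\n' 1 2 3 > file1
printf '%s\n' a b c d > file2

# Use ':' as the delimiter so that an empty field can be seen.
# The fourth output line is ":d", i.e. an empty first field followed by "d".
paste -d: file1 file2
```

With the default tab delimiter the same line is a tab followed by "d", which is easy to misread as a bare "d" on a terminal.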

---------- Post updated at 06:29 AM ---------- Previous update was at 06:28 AM ----------

Hi Yazu,

Thanks for your reply. I want to make the number of rows the same in every file; the number of columns is already equal.
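
If the files really are to be padded, the target row count n (the length of the longest file) has to be found first. A minimal sketch of one way to do that with awk; max_rows is a hypothetical helper name and file1/file2 are scratch files, none of which come from the thread:

```shell
# max_rows FILE...: print the largest line count among the given files.
# (max_rows is a hypothetical helper name used for illustration.)
max_rows() {
  # FNR restarts at 1 for every input file, so its maximum over all
  # records is the line count of the longest file.
  awk 'FNR > max { max = FNR } END { print max + 0 }' "$@"
}

# Small demonstration with two scratch files of 3 and 4 rows.
printf '%s\n' 1 2 3 > file1
printf '%s\n' a b c d > file2
max_rows file1 file2
```

For the 150 input files this could be called as max_rows followed by the list of file names, assuming they can all be named on one command line.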

You have not stated anything specific about your file format except for the number of columns, so I have made the following assumptions: The 4 column input files are tab-delimited. All whitespace below consists of tabs, not spaces. Also, I'm assuming that the colon character, :, does not appear in any of the input.

If any of those assumptions is invalid, only minor changes are required.

$ cat c1
1       2       3       4
$ cat c2
5       6       7       8
5       6       7       8
$ cat c3
9       10      11      12
9       10      11      12
9       10      11      12
$ paste -d: c1 c2 c3 | awk '{$1=$1; for(i=1; i<=NF; i++) if(!length($i)) $i=0 OFS 0 OFS 0 OFS 0}1' FS=: OFS=\\t
1       2       3       4       5       6       7       8       9       10      11      12
0       0       0       0       5       6       7       8       9       10      11      12
0       0       0       0       0       0       0       0       9       10      11      12

This can be simplified a tiny little bit if there are never any null fields in the input data. If that's the case, then the delimiter used by paste can be the same as the delimiter used by the input data, since there would never be any need to distinguish between a null input data field and paste simulating an empty line in a source file.

Regards,
Alister
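
A possible reading of the simplification described above, as a sketch only: it assumes tab-delimited input with no empty fields and exactly 4 columns per file, and paste_zero_fill is a hypothetical helper name that does not appear in the thread. With the default tab delimiter, a file that has run out of rows contributes one empty field, which is then expanded to four zeros.

```shell
# paste_zero_fill FILE...: paste tab-delimited 4-column files side by side
# and fill the gaps left by shorter files with zeros.
# (Hypothetical helper; assumes the input never contains empty fields.)
paste_zero_fill() {
  paste "$@" |
    awk -F'\t' -v OFS='\t' '
      # An empty field marks a file that has no row on this line;
      # replace it with four tab-separated zeros.
      { for (i = 1; i <= NF; i++) if (!length($i)) $i = "0" OFS "0" OFS "0" OFS "0" }
      1'
}

# Recreate the sample files c1, c2 and c3 from the post above.
printf '1\t2\t3\t4\n' > c1
printf '5\t6\t7\t8\n5\t6\t7\t8\n' > c2
printf '9\t10\t11\t12\n9\t10\t11\t12\n9\t10\t11\t12\n' > c3

paste_zero_fill c1 c2 c3
```

The separate ':' delimiter and the $1=$1 rebuild are no longer needed in this variant because the input and output separators are both tabs.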

Thanks for the code. The only issue is that I am trying to make the number of rows equal, as the number of columns is already equal.
I have tried to modify your code but I still have not reached the result I need.
My file with the longest row length has 3581 rows, so I have done the following:

awk 'NR< 3581 {for (i=NR+1;i<=3581;i++) $NR=$NR OFS "0"}1' OFS="\t" infile
However what I get when I do this is the following:
01-16A1-325 0 0 0 0 0 0 0 0 0 01-16A1-325 01-16A1-325 0 0 0 0 0 0 0 0 A T G 0 0 0 0 0 0 0 11 47 0 1 0 0 0 0 0 0 11 47 0 0
0 0 0 0 0 11 48 0 0

0 0 0 0 12 50 0 0

0 0 0 12 53 0 0

0 0 13 56 0 0

0 13 60 0 0

Can you help me modify the code correctly?

Thanks
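
The attempt above changes field number NR on every line ($NR) instead of adding lines, which would explain the scrambled output. A minimal sketch of padding by rows instead, assuming tab-delimited 4-column input; pad_rows and short.txt are hypothetical names used only for illustration:

```shell
# pad_rows FILE N: print FILE, then append rows of four tab-separated
# zeros until N rows have been printed in total.
# (pad_rows is a hypothetical helper name, not from the thread.)
pad_rows() {
  awk -v n="$2" '
    { print }                                              # copy existing rows unchanged
    END { for (i = NR + 1; i <= n; i++) print "0\t0\t0\t0" }   # add the missing rows
  ' "$1"
}

# Small demonstration: a one-row scratch file padded to three rows.
printf '1\t2\t3\t4\n' > short.txt
pad_rows short.txt 3
```

For the case described above, this might be used as pad_rows infile 3581 > infile.padded, after which the padded files can be combined with paste.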

---------- Post updated at 07:44 AM ---------- Previous update was at 07:31 AM ----------

Hi Alister,

Your assumptions are correct, but I am not sure I am getting the correct format after running your code. I want to make the number of rows equal because I have, for example, 3 files with different numbers of rows: file1 has 3581 rows, file2 has 3578 rows and file3 has 3508 rows.
All I want to do is make the number of rows equal and then use paste to combine the files. All files have 4 columns. So I am wondering if awk '{$1=$1; for(i=1; i<=NF; i++) is appropriate. If I understand correctly, should it be NR instead of NF?
Thanks again

---------- Post updated at 08:07 AM ---------- Previous update was at 07:44 AM ----------

Hi Alister,

Your code worked like a charm for me! Please ignore my other post that mentioned that I was not getting the correct format. Thanks a lot. Also, thanks for this great forum that helps so many techies when they need it the most.

Do you bother to read what is posted because yazu already gave you the solution...