Using GAWK to combine files

Hello All,

I have a folder containing a few files. Each file contains only one column.

I want to combine that single column from all the files using GAWK, separate the values with a delimiter, and store the result in a new file.

So basically, using GAWK, I want to combine '$1' of all files, separate them with a delimiter, and save the result in a file:
i.e. '$1:$1:$1:$1:$1.....'

This should work:

{ tr '\n' ':'  * ; echo ; } >../output
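A note on the command above: tr takes no file operands and only reads standard input, so passing it `*` makes it complain about extra operands. A minimal sketch of the same idea with cat feeding tr, using made-up sample files in a scratch directory (the file names and the `joined` variable are only for illustration):

```shell
# Work in a scratch directory with hypothetical sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '1\n' > file1
printf '2\n' > file2
printf '3\n' > file3

# tr has no file operands, so cat supplies the data on stdin.
# sed strips the trailing ':' produced by the last newline.
joined=$(cat file* | tr '\n' ':' | sed 's/:$//')
echo "$joined"
```

Like the original, this flattens every line of every file into one output line.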

I'd suggest paste, which merges corresponding or subsequent lines of files:

paste -d\: * > newfile

tr is throwing an error:

And paste keeps the newline characters of the original rows, so the rows do not appear on a single line. I would just like to remove the newline characters from all rows of every column except the column from the last file.

awk '{printf("%s%s", NR==1?"":":", $0)}END{print ""}' * > newfile
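For reference, a small sketch of what that one-liner does when the files have more than one row: it prints a ':' before every record except the very first and ends the line once at the end, so everything lands on a single line. The sample files and the `joined` variable are made up for the demonstration.

```shell
# Work in a scratch directory with hypothetical two-row sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '1\n2\n' > file1
printf '3\n4\n' > file2

# NR counts records across all files, so only the first record of the
# first file is printed without a leading ':'.
joined=$(awk '{ printf("%s%s", NR == 1 ? "" : ":", $0) } END { print "" }' file*)
echo "$joined"
```

Unlike plain paste, this does not keep rows apart; it concatenates file after file.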

Hmm, here is my test

# ls
file1   file2   file3   file4   file5   file6   file7   file8   file9
# cat *
1
2
3
4
5
6
7
8
9
# paste -d\: *
1:2:3:4:5:6:7:8:9

Can you post a data sample?

I think I left out one piece of information.

All the files have one column, but that column may have multiple rows. The number of rows is the same in all the files.

It is unclear to me what you are after. Suppose you have a couple of files with just one column and e.g. 4 rows; then danmero's paste command should just work.
example:

$> ls *.txt
a.txt  b.txt  c.txt  d.txt  e.txt  f.txt  g.txt

$> cat a.txt
a1
a2
a3
a4

$> cat g.txt
g1
g2
g3
g4

$> paste -d\: *.txt
a1:b1:c1:d1:e1:f1:g1
a2:b2:c2:d2:e2:f2:g2
a3:b3:c3:d3:e3:f3:g3
a4:b4:c4:d4:e4:f4:g4


-or-

Do you need it transposed, like so:

$> for i in *.txt ; do xargs < "$i" | tr ' ' ':' ; done
a1:a2:a3:a4
b1:b2:b3:b4
c1:c2:c3:c4
d1:d2:d3:d4
e1:e2:e3:e4
f1:f2:f3:f4
g1:g2:g3:g4
$> paste -s -d\: *.txt
a1:a2:a3:a4
b1:b2:b3:b4
c1:c2:c3:c4
d1:d2:d3:d4
e1:e2:e3:e4
f1:f2:f3:f4
g1:g2:g3:g4

OK, fixed the issue.

I was actually combining files in Windows format, which have Windows-specific line-break characters. As a result, the columns did not appear on the same row.

Hence I had to convert the result file to Unix format as shown below:
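The command itself is missing from this post. Going by the reply further down, which mentions a tr -d with octal values 15 and 32, it was presumably something along the lines of the sketch below. The file names and the sample content are assumptions, not taken from the thread.

```shell
# Work in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Hypothetical paste result built from Windows files: every column value
# still carries its carriage return, and a Ctrl-Z sits at the end.
printf 'a1\r:b1\r\na2\r:b2\r\n\032' > Table.windows

# Delete carriage returns (octal 015) and Ctrl-Z / SUB (octal 032).
tr -d '\015\032' < Table.windows > Table.unix

converted=$(cat Table.unix)
echo "$converted"
```

Because tr deletes every carriage return, not only those at line ends, the stray ones in the middle of each joined row disappear as well.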

I have a few more questions:

  1. What is the significance of the backslash (\) in -d\: ?

  2. If I want to use tab as the delimiter, I guess the following should be sufficient, but somehow it's not giving the desired results:

Any pointers?

  3. Is there a way I can use a long single delimiter like '##--##' or maybe '&+&+&'?
    I tried the following, but it doesn't seem to work.

Any idea?

  1. In the paste example it is used as an escape character, but it is not really necessary there:
paste -d: *.txt

should work too. In your tr -d example it means the ASCII character with octal value 15 followed by the ASCII character with octal value 32 (carriage return and the substitute character).

  2. Try paste without the -d option; tab is the default delimiter.
  3. paste can only use single-character separators (or a list of them, which it then cycles through, using one character at a time).
    You could try this:
paste Columns/*.out|sed 's/\t/##--##/g' > Table.windows

See man paste for further details
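The three answers above can be checked with a short sketch. The sample files and variable names are made up; the paste options are the ones discussed in this thread.

```shell
# Work in a scratch directory with hypothetical sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a1\na2\n' > a.txt
printf 'b1\nb2\n' > b.txt
printf 'c1\nc2\n' > c.txt

# 1. The shell strips the backslash, so paste sees the same ':' each time.
escaped=$(paste -d\: a.txt b.txt)
plain=$(paste -d: a.txt b.txt)

# 2. Without -d, paste separates the columns with a tab.
tabbed=$(paste a.txt b.txt)

# 3. A multi-character list is cycled through one character at a time,
#    so ':@' gives ':' after the first column and '@' after the second.
cycled=$(paste -d ':@' a.txt b.txt c.txt)

echo "$escaped"
echo "$cycled"
```

The last case is why a string such as '##--##' passed to -d does not come out as one long delimiter.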

It will also replace the tab characters, if any are present in the data.

I think the following hack will do the trick:

I am changing FS because I am interested in the space characters after the columns; otherwise GAWK will trim the white space.

The advantage of this is that there is no need to use 'tr' to remove the line-break characters; GAWK takes care of it.
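The hack itself did not survive in this post, so the following is only a sketch of one way to do it in awk, not the poster's actual script. It assumes every file has the same number of rows, as stated earlier in the thread. Instead of changing FS it works on $0, which also keeps trailing spaces; the variable name `sep`, the array `row`, and the sample files are made up.

```shell
# Work in a scratch directory with hypothetical Windows-format files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a1 \r\na2\r\n' > a.out     # note the trailing space after a1
printf 'b1\r\nb2\r\n' > b.out

combined=$(awk -v sep='##--##' '
    { sub(/\r$/, "") }                 # drop a trailing carriage return
    FNR == 1 { nfiles++ }              # a new input file has started
    { row[FNR] = (nfiles == 1 ? $0 : row[FNR] sep $0) }
    FNR > rows { rows = FNR }          # remember the longest file
    END { for (i = 1; i <= rows; i++) print row[i] }
' a.out b.out)
echo "$combined"
```

Since the delimiter is an ordinary awk string, it can be as long as needed, and tabs inside the data are left alone.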

OK, then pick a character that is not present in the data and replace it afterwards with the string you want, e.g.

paste -d@ Columns/*.out | sed 's/@/##--##/g' > Table.windows

or some other character.
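If no printable character is safe to reserve, a control character can serve as the placeholder. The sketch below assumes the data never contains SOH (octal 001); the sample files and variable names are made up. It also shows that '&' has to be escaped in a sed replacement, which matters for a delimiter like '&+&+&'.

```shell
# Work in a scratch directory with hypothetical sample files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a1\na2\n' > a.out
printf 'b@1\nb2\n' > b.out         # data that already contains '@'

# Use the control character SOH (octal 001) as the temporary delimiter.
soh=$(printf '\001')
long=$(paste -d "$soh" a.out b.out | sed "s/$soh/##--##/g")

# In a sed replacement '&' stands for the matched text, so escape it.
amp=$(paste -d "$soh" a.out b.out | sed 's/'"$soh"'/\&+\&+\&/g')

echo "$long"
echo "$amp"
```

The '@' inside the data is untouched because it is never used as the placeholder.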