Combine Multiple text or csv files column-wise

Hi All

I am trying to combine columns from multiple text files into a single file using paste command but the record length being unequal in the different files the data is running over to the closest empty cell on the left. Please see below.
What can i do to resolve this ?

File 1		File 2		File3 		File4

A		1		X		7		
B		2		Y		8
C		3				9
D		4				10
E		5
		6

Desired Output

Final File

Column1	Column2	Column3	Column4
    A	            1	            X	            7		
    B	            2	            Y	            8
    C	            3	            9
    D	            4		   10
    E	            5
    6

Current Output

Column1	Column2	Column3	Column4
A	        1	        X	        7		
B	        2	        Y	        8
C	        3	        9
D	        4	        10
E	        5
6

Your current output and desired output look same.
Also you didn't mention the inputs completely. Pls mention sample of each file content, expected out, current output, the command you are tried.

1 Like

Hi Anurag

The tabulation got removed when i put in the post. I attaching it as a text file.

I am using paste File1.txt File2.txt File3.txt File4.txt > Final File.txt

Please see attachment

try using the -d option of paste command to explicitly display another delimiter and check that the right number are paste in the right columns.

(maybe it is just a matter of displaying the tab key)

1 Like

I did try the '-d' option of paste command but to no respite

paste File1.txt File2.txt File3.txt File4.txt

works for me. You may try:

paste -d'\t' File1.txt File2.txt File3.txt File4.txt

Anurag

Not working for me. Did you try with files having different no. of records.

The problem happens when the records in the file on the right are higher than on the file on the left.

When this the case records from right side file spill over to the file on the left like below .

1  A
2  B
   C
 

instead of

1 A
2 B
C
1 Like

Are you sure you didn't inadvertently swapped the desired result with current result? If the result of the paste is not what you want, you can pipe it to

sed 's/^\t\t*//;s/\t\t\t*/\t/g'

Maybe you should add empty lines to your files (especially File3) until your file has the same number of lines than the longer one (MaxLine), before performing the paste command

MaxLine=$(wc -l File* | grep -v total | sort -k 1n | awk '{x=$1}END{print x}' )

Binlib

I did not swap. The formatting got removed. The desired one is to have the output with higher records on the right stay even after pasting unlike now where it is moving over to left column.

---------- Post updated at 04:28 PM ---------- Previous update was at 04:26 PM ----------

ctsgnb

May be adding empty lines is not an option as output is coming from Oracle DB.

Sorry I misread your input file. "pr -m" is your answer:

$ pr -tm <(printf "%s\n" {A..E} 6) <(printf "%s\n" {1..5}) <(printf "%s\n" x y 9 10) <(printf "%s\n" 7 8)
A          1            x              7
B          2            y              8
C          3            9              
D          4            10              
E          5                          
6  

@venky
If these files come from Oracle DB, then you should build the SQL statement that directly generate the output in the wanted format.

binlib

The columns are actually in text files. Can you give mr the exact syntax on how to combine 1.txt and 2.txt and so on to 3.txt

1.txt + 2.txt = 3.txt

---------- Post updated at 02:04 PM ---------- Previous update was at 02:01 PM ----------

ctsgnb

I am redirecting the oracle db output to text files and then want to combine the text files into a csv file. For this i am printing a comma at the end of each record. This works fine until all files have equal no. of records but when the records for file on the left is less than the file on the right then records from right side file spill over to the left.

# cat t1
i
g
h

# cat t2
t
ez
z
z
z
# pr -tm -s   t*
i       t
g       ez
h       z
        z
        z
# pr -tm -s  t1 t2
i       t
g       ez
h       z
        z
        z
#

Note that after the -s option you have the (invisible) following sequence when entering the command line:

<space><cntrl>+<v><tab><space>

you can also go with :

# T=$(echo "\t")
# pr -tm -s $T t1 t2
i       t
g       ez
h       z
        z
        z

Of course you can just replace t1 t2 ... with your filenames

As ctsgnb pointed out, you should use SQL to directly generate the desired result. I am sure this can be done in SQL simpler and more efficiently. It's probably just a simpler joins of four tables. Give us your tables and data, if nobody here can help, you can ask the people in the pl/sql forum of Welcome to The Oracle FAQ | Oracle FAQ.

just give input file and output file .........till now everything goes confusion