How to copy a column of multiple files and paste into new excel file (next to column)?

I have data of an excel files as given below,
file1

org1_1	1	1	2.5	100
org1_2	1	2	5.5	98	
org1_3	1	3	7.2	88

file2

org2_1	1	1	2.5	100
org2_2	1	2	5.5	56	
org2_3	1	3	7.2	70

I have multiple excel files as above shown.

I have to copy column 1, column 4 and paste into a new excel file as below shown format (next to column),

org1_1	100	org2_1	100	org3_1	55 ....... 
org1_2	98	org2_2	56	org3_2	62 .......
org1_3	88	org2_3	70	org3_3	72 .......

I have used the following command, but it paste the selected columns next row of each selected data,

for file in *.xls
do
   awk '{print $1,$4}' *.xls > concate.xls
done

Therefore, please help me to do the same.
Thanks in advance.

Could you please try following, since I am posting it from mobile so couldn't test it.

awk '{a[FNR]=a[FNR]?a[FNR] OFS $1 OFS $4:$1 OFS $4} END{val=length(a); for(i=1;i<=val;i++){print a}}'  *.xls

Thanks,
R. Singh

1 Like

Dear Singh,
I am not getting output.

Your sample output shows you might intend to paste fields 1 and 5, not 4. And, your code snippet will print fields 1 and 4 only, for all .xls files in your current working directory, and repeat that for the .xls file count in your current working directory, ervery time overwriting the target file, so only the last loop will persist.

Try this different approach:

$ cut -f1,5 file1 | paste - file2
org1_1    100   org2_1    100   org3_1    55 ....... 
org1_2    98    org2_2    56    org3_2    62 .......
org1_3    88    org2_3    70    org3_3    72 .......
1 Like

Dear Rudic,
Sorry, I mentioned field 1 and 4 instead of 1 and 5. Your command works perfectly fine for 2 files, but i need to loop it for multiple files. Please let me know, how to do the same.

I don't think that would work with *.xls . You should let us know which files to extract and to which to append.

I have 30 csv/excel file, all the excel file has the data similar to the one I mentioned above. So I need to copy column 1 and 5 from all the files and pasted in a new excel file next to next column. All 1 and 5 columns copied from the multiples files to be pasted in columns wise into a new excel file. I need to have all 1 and 5 columns data in a new excel sheet.

Try

> RES; for FN in *.xls; do cut -f1,5 $FN | paste - RES > TMP; mv TMP RES; done
1 Like

My terminal says, the following commands are not found.

RES FN TMP 

I do not know, how to install it.

Those are variable / file names, not commands; nothing to be installed. Above proposal was tested with your two sample files, and it worked. What is your shell? Show a log of what you are doing.

bash is my shell.
I do not know, why it says RES command is not found.

Did you copy the leading > redirection operator as well? This is a shortcut to create an empty file.

And, please, for the future, if something doesn't work, post execution logs and error messages verbatim!

1 Like

Oh sorry, there was a typo, I took length of array , it would have been a .
Try following it should work.

awk '{a[FNR]=a[FNR]?a[FNR] OFS $1 OFS $4:$1 OFS $4} END{val=length(a); for(i=1;i<=val;i++){print a}}'  *.xls

Thanks,
R. Singh

1 Like

Hi.

Here is a solution that uses a single "master" paste with a created list of cut process substitution commands:

#!/usr/bin/env bash

# @(#) s1       Demonstrate creation and execution of arbitrary number of commands.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C cut paste

pl " Basic structure of data file data1, showing invisible characters:"
cat -A data1

pl " Input data files:" data?
head data?

pl " Results, process substitution:"
paste <(cut -f1,5 data1) <(cut -f1,5 data2)

pl " Create a list of process substitution commands:"
command=""
for i in data*
do
  pe " Working on file $i"
  command="$command <(cut -f1,5 ${i})"
done
pe " command = :$command:"

pl " Use the created command string in a master paste command:"
pe " ( noting that command \"eval\" can be dangerous. )"
eval paste $command

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-7-amd64, x86_64
Distribution        : Debian 8.11 (jessie) 
bash GNU bash 4.3.30
cut (GNU coreutils) 8.23
paste (GNU coreutils) 8.23

-----
 Basic structure of data file data1, showing invisible characters:
org1_1^I1^I1^I2.5^I100$
org1_2^I1^I2^I5.5^I98^I$
org1_3^I1^I3^I7.2^I88$

-----
 Input data files: data1 data2 data3
==> data1 <==
org1_1  1       1       2.5     100
org1_2  1       2       5.5     98
org1_3  1       3       7.2     88

==> data2 <==
org2_1  1       1       2.5     100
org2_2  1       2       5.5     56
org2_3  1       3       7.2     70

==> data3 <==
org3_1  1       1       2.5     100
org3_2  1       2       5.5     35
org3_3  1       3       7.2     44

-----
 Results, process substitution:
org1_1  100     org2_1  100
org1_2  98      org2_2  56
org1_3  88      org2_3  70

-----
 Create a list of process substitution commands:
 Working on file data1
 Working on file data2
 Working on file data3
 command = : <(cut -f1,5 data1) <(cut -f1,5 data2) <(cut -f1,5 data3):

-----
 Use the created command string in a master paste command:
 ( noting that command "eval" can be dangerous. )
org1_1  100     org2_1  100     org3_1  100
org1_2  98      org2_2  56      org3_2  35
org1_3  88      org2_3  70      org3_3  44

Best wishes ... cheers, drl

3 Likes

Extremely sorry rudic, I forgot to copy

>

special character. it works fine.

Dear Singh,
I need to add one more feature in your command. I have excel file names as shown below

org1.xls org2.xls org3.xls org4.xls org5.xls ........... org30.xls

The columns 1 and 5 to be pasted into a new excel file as file existing order. But all the commands suggested here are not printing the columns as per the file existing order.
I need to print org1.xls columns 1, 5 followed by org2.xls, org3.xls ......org30.xls. Is it possible?

Try

 > RES; for FN in $(ls *.xls | sort -t. -k1.4n); do cut -f1,5 $FN | paste - RES > TMP; mv TMP RES; done
1 Like

Its printing exactly reverse order. The last file (0rg30) columns printed first followed by org29, org28 ......org1.

Hello dineshkumarsrk,

Could you please try following(taking ls adoption from RudiC's solution).

ls *.xls | sort -t. -k1.4n |  awk '{a[FNR]=a[FNR]?a[FNR] OFS $1 OFS $4:$1 OFS $4} END{val=length(a); for(i=1;i<=val;i++){print a}}' 

Fair warning I haven't tested it.

Thanks,
R. Singh

1 Like

Sorry singh,
It is only printing the files as given below,

org1.xls
org2.xls
org3.xls
.
.
.
org30.xls