Hi, I want to match a column of one file against many other files, take the average of each one, and put the results into one file (I know it sounds complicated).
The 1st file is just a list of names that I want to match against the 2nd file, which has names along with rows of values.
awk 'NR==FNR{a[$1];next}($1 in a){print}' 1st.txt 2nd.txt
However, I also want to match the 1st file against files 3, 4, 5, 6, etc. How can I write code (or modify this one) to do that? Basically, the 1st file matched against the 2nd, 3rd, 4th, etc.
After I get the matches in an output file for files 2, 3, 4, 5, etc., I want to average them using this code:
awk '{ M=NF; for(N=4; N<=NF; N++) T[N]+=$N } END {printf("%f", T[4]/NR);for(N=5; N<=M; N++) printf("\t%f", T[N]/NR);printf("\n");}'
Again, I want to do this for all the files at once. Afterwards, I want all the averaged values, along with the names of the source files (2, 3, 4, 5, etc.), in one final output file.
Hope I did not confuse anyone.
I am currently doing this one by one and it is taking forever... I just want one file in the end with everything (along with names for each row).
Thanks
I think you could just specify more files after 2nd.txt:
awk 'NR==FNR{a[$1];next}($1 in a){print}' 1st.txt 2nd.txt 3rd.txt 4th.txt
If the filenames allow, you could use wildcards, e.g.:
awk 'NR==FNR{a[$1];next}($1 in a){print}' match_file.txt file*.txt
Hi, thanks for replying. Yes, that definitely works, and I am slowly getting to the final stage.
Here is the new code that I have. There are still problems, but I think a minor tweak from you experts can solve them.
awk 'NR==FNR{a[$1];next}($1 in a){print}' 1st.txt *filmatch.txt | awk '{ M=NF; for(N=4; N<=NF; N++) T[N]+=$N } END {printf("%f", T[4]/NR);for(N=5; N<=M; N++) printf("\t%f", T[N]/NR);printf("\n");}' > output.txt
What the above currently does is match all the files against 1st.txt and put everything into ONE output file. *filmatch.txt expands to numerous files (2.txt, 3.txt, 4.txt etc.), and I want to include their names in the final file.
Right now the output looks like this (basically just the averaged values, with no file name):
0.068808 0.067252 0.068956 0.068141 0.068563 0.069272 0.070322 0.070029 0.069015 0.071708 0.071292 0.069931 0.071829 0.070628 0.069996 0.071036 0.070910 0.071590
But I want it to look like this:
1st.txt 0.068808 0.067252 0.068956 0.068141 0.068563 0.069272 0.070322 0.070029 0.069015 0.071708 0.071292 0.069931 0.071829 0.070628 0.069996 0.071036 0.070910 0.071590
2nd.txt 0.068808 0.067252 0.068956 0.068141 0.068563 0.069272 0.070322 0.070029 0.069015 0.071708 0.071292 0.069931 0.071829 0.070628 0.069996 0.071036 0.070910 0.071590
etc...
If you can get this to work then it would be great.
Thanks
You could try something like this, which would print the filenames:
awk 'NR==FNR{a[$1];next}($1 in a){print FILENAME $0}' 1st.txt *filmatch.txt
and take it from there...
Hi, thanks. That partly worked (1st step). The first column becomes fused to the filename with that code.
Now I need to average the rows that have the same filename.
Thanks
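Both remaining points can probably be handled in a single awk pass. The filename gets fused because print FILENAME $0 concatenates the two strings with nothing in between; print FILENAME, $0 would put the output field separator between them. For the averaging, here is a rough sketch, not tested on your real data, that keeps one running sum and one row count per input file and prints one line per file at the end. The sample files and the names geneA, geneC, etc. are made up for illustration; the assumption that the values start in column 4 is taken from your averaging code.

```shell
# Hypothetical sample data, made up for illustration only.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'geneA\ngeneC\n' > 1st.txt
printf 'geneA x y 1 2\ngeneB x y 9 9\ngeneC x y 3 4\n' > 2filmatch.txt
printf 'geneA x y 10 20\ngeneC x y 30 40\ngeneD x y 5 5\n' > 3filmatch.txt

awk '
NR == FNR  { a[$1]; next }        # first file: remember the names to match
!($1 in a) { next }               # other files: skip rows whose name is not listed
!(FILENAME in cnt) { files[++nfiles] = FILENAME }   # note each file at its first match
{
    cnt[FILENAME]++                                  # matched rows in this file
    if (NF > cols[FILENAME]) cols[FILENAME] = NF     # widest matched row in this file
    for (N = 4; N <= NF; N++) sum[FILENAME, N] += $N # per-file, per-column totals
}
END {
    # One output line per file: name, then the average of columns 4..last.
    for (i = 1; i <= nfiles; i++) {
        f = files[i]
        printf "%s", f
        for (N = 4; N <= cols[f]; N++) printf "\t%f", sum[f, N] / cnt[f]
        printf "\n"
    }
}' 1st.txt *filmatch.txt > output.txt

cat output.txt
```

Dividing by cnt[FILENAME] instead of NR matters here: NR counts every line read so far across all files, so it would not give a per-file average. Files with no matching rows are simply left out of output.txt.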