I have two files, file1 and file2 who have identical number of rows and columns. However, the script is supposed to be used for for different files and I cannot know the format in advance. Also, the number of columns changes within the file, some rows have more and some less columns (they are shaped like an upper triangular matrix).,as below.
file1 :
11 12 13 14 15
21 22 23 24
31 32 33
41 42
51
file2:
61 62 63 64 65
71 72 73 74
81 82 83
91 92
101
I need to add the values of the elements in the same row and column of the two files and print them in the corresponding row/column. So I need to get
but that is neither adding them up properly nor printing them properly. What would be a way to do add these element considering I'm dealing with large files and variable sizes?
I've now tried this but it also doesn't add up the values properly nor print them in the right format. I suppose my approach isn't the way to go, it was simply the only thing I could think of doing. Is there a better approach to go at it?
Does the fact that I am printing it as string (printf and %s) make any problems with precision or format for later use? The entries in file1 and file2 can be either float or decimal and the precision is very important for the task, so I tried replacing %s with %d or %i to keep the output in the same format but that didn't give the result in a triangular form like it did with %s. Why is that?
well... I assumed the sample files were representative. It seems that they cover only a part of the cases.
It's doable. Please provide the more representative samples for both files.
What happens if the same entries have different floating precision from 2 files? The higher precision wins?
I wrote round numbers for simplicity. So the file is actually several thousand lines/columns long and these are example lines. Yes in this case the precision of the more precise entry should be kept when adding a less precise entry to it.
And the separator is actually three whitespaces, not one, should OFS recognize that if they are consistent in both files?
Unfortunately, I won't be able to get to this for a day or so.
So if somebody else look into this...
It's a matter of parsing the format/precision of the corresponding entries in both files, coming up with the more "precise" spec and using it in the printf format string for the math...
$ awk '
{getline X < F2
n = split (X, T)
for (i=1; i<=NF; i++) $i += T
}
1
' F2="maya22.txt" maya11.txt
3.56367e-07 7.22077e-08 -2.38554e-08 8.61545e-09 1.34745e-09
2.21836e-10 2.88507e-07 1.0633e-07 -2.8497e-08
1.46473e-08 3.02148e-09 -3.94646e-10
2.60537e-07 4.29898e-08
-2.36822e-08
On the other hand:
awk '
{getline X < F2
n = split (X, T)
for (i=1; i<=NF; i++) $i += T
}
1
' CONVFMT='%.16E' F2="maya22.txt" maya11.txt
3.5636651200281602E-07 7.2207650159240707E-08 -2.3855385753423701E-08 8.6154500245612697E-09 1.3474513362335099E-09
2.2183583723650399E-10 2.8850663428589998E-07 1.0632952117690300E-07 -2.8497022231591001E-08
1.4647308040391139E-08 3.0214763933556298E-09 -3.9464610062914300E-10
2.6053745987049467E-07 4.2989789243606999E-08
-2.3682162517781680E-08
This has to do with converting numbers when performing math or converting numbers to strings:
CONVFMT The conversion format for numbers, "%.6g", by default.
------ Post updated at 05:13 PM ------
you can probably generalize the solution for ANY number of the similarly formatted files to operate on: myScript.sh file1 file2 file3 where myScript.sh is: