I have to create a new CSV file based on the value listed on the 3rd column from different CSV files. This is what I need:
-
I should substitute the first column from each file, excluding the headers, with the file name
InputXX
. -
Then, I need to look for rows with 0 on the third column in the first file
Input1.csv
, and substitute the entire row with the same corresponding row from the second fileInput2.csv
. This process should be repeated for all rows till the end of the file (96 entries). -
If the value on the third column from the second file is also 0, then, the value from the third file is the one that is used to complete the entry on the first file. If the value on the third file is also 0,then the value from the forth file is used to complete the entry. So on and so on. If the value for that particular row is 0 in all files, then, the original entry is kept.
-
Once the first file is completed, the process is repeated for the second file using data from the remaining files.
-
The rows from the last file that have been used to complete the first files are kept the same on that last file but the value on the third column is changed to 0 since they have been already used.
-
Finally, all data is combined into the one file keeping the headers at the very top.
The original files look like this:
Input1.csv
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,7,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,3,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,0,1,Deadpool
PCR_Plate,10,0,1,Deadpool
Input2.csv
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,0,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,3,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,0,1,Deadpool
Input3.csv
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,0,1,Deadpool
PCR_Plate,5,0,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,0,1,Deadpool
Input4.csv
Source,Well_Source,Volume,Destination_Well,Destination
PCR_Plate,1,5,1,Deadpool
PCR_Plate,2,0,1,Deadpool
PCR_Plate,3,8,1,Deadpool
PCR_Plate,4,5,1,Deadpool
PCR_Plate,5,5,1,Deadpool
PCR_Plate,6,0,1,Deadpool
PCR_Plate,7,4.,1,Deadpool
PCR_Plate,8,7.1,1,Deadpool
PCR_Plate,9,10,1,Deadpool
PCR_Plate,10,10,1,Deadpool
The resulting CSV file should look like this:
Source,Well_Source,Volume,Destination_Well,Destination
Input1,1,5,1,Deadpool
Input1,2,7,1,Deadpool
Input1,3,8,1,Deadpool
Input1,4,3,1,Deadpool
Input1,5,5,1,Deadpool
Input1,6,0,1,Deadpool
Input1,7,4.,1,Deadpool
Input1,8,7.1,1,Deadpool
Input2,9,10,1,Deadpool
Input4,10,10,1,Deadpool
Input3,1,5,1,Deadpool
Input2,2,0,1,Deadpool
Input2,3,8,1,Deadpool
Input2,4,3,1,Deadpool
Input2,5,5,1,Deadpool
Input2,6,0,1,Deadpool
Input2,7,4.,1,Deadpool
Input2,8,7.1,1,Deadpool
Input3,9,10,1,Deadpool
Input2,10,0,1,Deadpool
Input4,1,5,1,Deadpool
Input3,2,0,1,Deadpool
Input3,3,8,1,Deadpool
Input4,4,5,1,Deadpool
Input4,5,5,1,Deadpool
Input3,6,0,1,Deadpool
Input3,7,4.,1,Deadpool
Input3,8,7.1,1,Deadpool
Input4,9,10,1,Deadpool
Input3,10,0,1,Deadpool
Input4,1,0,1,Deadpool
Input4,2,0,1,Deadpool
Input4,3,8,1,Deadpool
Input4,4,0,1,Deadpool
Input4,5,0,1,Deadpool
Input4,6,0,1,Deadpool
Input4,7,4.,1,Deadpool
Input4,8,7.1,1,Deadpool
Input4,9,0,1,Deadpool
Input4,10,0,1,Deadpool
I have been focusing on trying to get it to work using two files with little to no success:
awk -F, 'FNR==NR{a[$0]=$3 FS $3;next}{ print $0, a[$1]}'
Any help will be greatly appreciated