Below is the script. Here I am first trying to join the files : sampleoutput1.txt and sampleoutput3.txt. Then I am trying to join the resultant with sampleoutput2.txt
I'd be surprised if sampleoutput1-3.txt held the correct output as the field separator is NOT ";" and the files' sequence is wrong.
Given the first file does NOT have a line break before the third field, and that spaces are used consistently around the field contents, and that the DOS/windows <CR> line terminators are removed, this might work:
Hello RudiC , the files are not sorted moreover there are some records for hostname which are present in sampleoutput1.txt and not present in sampleoutput2.txt and sampleoutput3.txt. Also delimiter used is ";". I think the above wont work in this scenario. Please help.
After your corrections, I noticed there is a comma where there should be a semicolon, also there are spaces around the semicolons. I do not know if they are there in reality, I adjusted FS to compensate. If they are not there you can just use ";" . Please put more care still into your file sample specifications.
Very Sorry RudiC , I will take more care next time. Thanks for your help.
---------- Post updated at 08:28 AM ---------- Previous update was at 08:24 AM ----------
Thanks a lot Scrutinizer , the code which you provided worked fine. Also I will ensure more care while posting next time. Could you please help me in explaining the script. Also Please provide some sources/books which can help me in learning to master advanced awk and sed.
---------- Post updated at 10:05 AM ---------- Previous update was at 08:28 AM ----------
Hello SriniShoo , The script which you provided went into long running and eventually i stopped it after approx 15 minutes. Thanks a lot for your efforts
awk '
BEGIN { ####### Start begin statement here.
FS=" *; *" ####### mentioning field seprator here as "*; *" means it will take look for all text from starting to till semi colon as mentioned in your input file
OFS=";" ####### mentioning output field seprator as semi colon ;
}
FNR==1{ ####### Whenever FNR is equal to 1, this condition will be TRUE whenever a new file read. As FNR will be reset whenever a new file is being read and on other hand NR will be keep on increasing it's value till all files being read.
c++ ####### incrementing the value of a variable named c by one.
}
c==1 { ####### When variable C's value is 1 this will be TRUE after first file is being read.
A[$1]=$0 ####### Creating an array named A whose index is $1 and value is $0 (complete line).
}
c==2 { ####### When variable named C's value is 2 which will be TRUE when second file is being read because when read second file's turn will come then FNR==1 conditon will increase variable named C's value to 1 more which will be 2 now.
A[$2]=A[$2] OFS $1 ####### Taking same array named A this time taking index as $2 because $1 of sampleouput1.txt and $2 of sampleoutput2.txt are same. Assigning the value to A{$2] OFS $1 where OFS is Output fielfd seprator.
}
c==3 { ####### When variable named C 's valie is 3 this will be TRUE when 3rd file is being read.
$1=A[$1] ####### now making $1's value as A[$1]; because $1 of file named sampleoutput2.txt and $1 of sampleoutput1.txt and $2 of sampleoutput3.txt are same as per input files provided.
print ####### Performing print action now, which will print complete line then.
}
' sampleoutput1.txt sampleoutput3.txt sampleoutput2.txt ####### mentioning filenames now.
FS=" *; *" ####### using the field separator (FS) here as " *; *" means it will consider spaces around the semicolons also part of the field separator (so that these spaces do not becomes part of the fields)