Please help, I am new to shell programming. I have three files, each containing a unique text (key) field (e.g. ABCDEF, XCDUD as shown below), then a line return followed by some data, of which there can be more than one instance. In some cases there may be only a key field and no data. Please see the example below:
File A contains:
ABCDEF ----> Key
DataA-1 ---> Data
DataA-2 ---> Data
DataA-3 ---> Data
XCDUD -----> Key
DataA-1 ------> Data
UUUUA -----> Key
DataA-1 ------> Data
File B contains:
ABCDEF
DataB-1
DataC-1
XCDUD
DataB-1
UUUUA
File C contains:
ABCDEF
DataC-1
XCDUD
UUUUA
DataC-1
I want to merge these files by the unique key; I am only interested in the merged data separated by line return as shown below:
ABCDEF
DataA-1
DataB-1
DataC-1
DataA-2
DataA-3
XCDUD
DataA-1
DataB-1
UUUUA
DataA-1
DataC-1
Is it possible to script this? If so, please indicate how.
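One way to approach the merge described above is a small awk script that collects the data lines under each key across all the files. This is only a sketch under stated assumptions: the function name merge_by_key is made up for illustration, the real files are assumed not to contain the "----> Key" annotations, and every data line is assumed to start with "Data" (anything else non-empty is treated as a key). It prints each key once, in the order first seen, followed by its data grouped file by file, so the data order differs from the interleaved order in the sample output.

```shell
#!/bin/sh
# merge_by_key: hypothetical helper name, not from the thread.
# Assumption: data lines start with "Data"; any other non-empty line is a key.
merge_by_key() {
    awk '
        /^Data/ {                       # data line: append it to the current key
            data[key] = data[key] $0 "\n"
            next
        }
        NF {                            # non-empty, non-data line: treat as a key
            key = $0
            if (!(key in seen)) {       # remember keys in first-seen order
                seen[key] = 1
                order[++n] = key
            }
        }
        END {                           # after the last file: key, then its data
            for (i = 1; i <= n; i++)
                printf "%s\n%s", order[i], data[order[i]]
        }
    ' "$@"
}
```

It could be called as `merge_by_key fileA fileB fileC > merged.txt`, where the file names are placeholders. A key that has no data in any file is still printed on its own.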
Basically, what I am calling the key is the field: <_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>, as the number within is always unique, e.g. 1215781057901, 1215781057902, 1215781057903 and so on. In each file the data is placed after the key. Each file contains one type of data, so I am trying to report on the data by the key.
Originally, I have one file that contains all the data. So I egrep <_05_1:MessageIdentifier> and <Error:Exception> into one file, <_05_1:MessageIdentifier> and <06:Detail> into another, and finally <_05_1:MessageIdentifier> and <DataPosted> into another. The reason I am doing this is that I am going to cut the data to get what we want before I merge the files. If there is a way of egrepping all the fields and cutting each piece of data, that would sort my problem in one go.
<_05_1:MessageIdentifier>ERR:38736086_1215781057901</_05_1:MessageIdentifier>
<Error:Exception> Error was 121238123... </Error:Exception>
<_05_1:MessageIdentifier>ERR:38736086_1215781057903</_05_1:MessageIdentifier>
<Error:Exception> Error was 4554641..... </Error:Exception>
<_05_1:MessageIdentifier>ERR:38736086_1215781057905</_05_1:MessageIdentifier>
<Error:Exception> Error was 1277123.... </Error:Exception>
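The "egrep all the fields and cut each piece in one go" idea can probably be done with a single awk pass over the original file, with no intermediate files. The sketch below is an illustration only: the function name report_by_key is made up, each element is assumed to open and close on one line, and the data elements for a message are assumed to follow its <_05_1:MessageIdentifier> line in the original file.

```shell
#!/bin/sh
# report_by_key: hypothetical helper name, not from the thread.
# Assumption: each element of interest opens and closes on a single line.
report_by_key() {
    awk '
        # Strip the opening tag and the closing tag, then trim blanks.
        function inner(line) {
            sub(/^[^>]*>/, "", line)
            sub(/<\/[^<]*$/, "", line)
            gsub(/^[ \t]+|[ \t]+$/, "", line)
            return line
        }
        /<_05_1:MessageIdentifier>/ {   # key line: keep the number after "_"
            id = inner($0)
            sub(/^.*_/, "", id)
            print id
            next
        }
        /<Error:Exception>|<06:Detail>|<DataPosted>/ {
            print inner($0)             # data line: print the cut-out text
        }
    ' "$@"
}
```

Called as `report_by_key original.log` (a placeholder file name), it prints each key followed by whichever of the three data fields appear after it, and ignores every other line.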
Thank you for persevering with my query, I really appreciate it. As I am new to shell scripting, could you please give me some idea of what each line is doing? I have some idea, but I do not completely understand the code. Where are the files to process specified?
Radoulov, the code below works as expected :). Brilliant, thank you. Could you please explain what each line is doing? I am new and it all seems a little bewildering.
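perl -ne'
# -n reads every line of the files named after the closing quote
# Key line: store the digits after the underscore in $key, go to the next line
$key = $1 and next if /ERR:\d+_(\d+)/;
# Data line: append the text between > and </ to the entry for the current key
$data{$key} = $data{$key} ?
$data{$key} . "\n" . $1 :
$1 if m|>(.+?)</|;
# After the last file: print each key, then its data (sorted for a stable order)
END {
print map { $_ . "\n" . $data{$_} . "\n" } sort keys %data;
}' G01.txt G02.txt G03.txt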