Hi AWK Experts,
file1.txt contains:
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
file2.txt contains:
29b11b820ddcc:-|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|OHad.perWrk|msg_strt=11756153398760|diff=1571|msg_typ=ah.out.res|msg_que=res_oput|17:48:55,074|17:48:55,084|10
Each column in the input record that matches one of the strings below must be selected and printed in the same order as given below:
$1 -> OHad
$2 -> spn_id
$3 -> spn_ordid
$4 -> spn_ordtyp
$5 -> msg_typ
$6 -> spn_nid
$7 -> msg_que
$8 -> diff
$9 -> HH:MI:SS,sss
$10 -> 9999 (i.e., a number of any length)
Hence the targetfile.txt must contain:
OHad.perWrk|spn_id=AH111|spn_ordid=928176|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_nid=3|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|1000
OHad.perWrk|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|msg_typ=ah.out.res||msg_que=res_oput|diff=1571|17:48:55,074|17:48:55,084|990810
Please note that the [msg_strt] column is not required, while [spn_nid] is required but missing from file2.txt, hence a blank field is acceptable there.
Could you please help me with the concept, or a working prototype, using AWK or any better tool that can run in a shell script?
With best regards,
Spring-Buck
Spring_Buck,
Based on what you wrote and the two sample files you gave, I have a
solution, assuming there is only one record per file.
You must run the shell script twice, once for each one-record file.
I also noticed that your "targetfile.txt" does not have the first pipe "|" at the
beginning of each record -- my solution puts it in.
Before you run the shell, create these two files:
1) Create one file "egrep_file" as follows:
OHad
spn_id
spn_ordid
spn_ordtyp
msg_typ
spn_nid
msg_que
diff
..:..:..,
^[0-9][0-9]*$
2) Create one file "sed_file" as follows (each rule adds a leading "|" along with the sort key -- the later sed 's/^|../|/' step relies on it, and it supplies the pipe that "paste" will not add back):
s/\(OHad\)/|01\1/
s/\(spn_id\)/|02\1/
s/\(spn_ordid\)/|03\1/
s/\(spn_ordtyp\)/|04\1/
s/\(msg_typ\)/|05\1/
s/\(spn_nid\)/|06\1/
s/\(msg_que\)/|07\1/
s/\(diff\)/|08\1/
s/\(..:..:..,\)/|09\1/
s/^\([0-9][0-9]*\)$/|10\1/
Then create a shell script with the following commands:
## Create another file with one field per line without pipes "|":
tr '|' '\n' < input_file > $$one_col_file
## Using "egrep_file", create a file with the wanted target output:
egrep -f egrep_file $$one_col_file > $$wanted_target_file
## Using "sed_file", create a file with keys prefixed to be sorted:
sed -f sed_file $$wanted_target_file | sort > $$sort_file
## Remove the sort keys:
sed 's/^|../|/' $$sort_file > $$no_keys_file
## Create the final file with one record:
paste -d'\0' -s $$no_keys_file > FINAL_file
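Put together, the whole pipeline can run as one self-contained script. This is just a sketch under the same one-record-per-file assumption: the file names are examples, the helper files are written inline with here-documents, and sed_file carries the leading "|" so the key-stripping step leaves exactly one pipe per field.

```shell
# Sketch: the egrep/sed/sort/paste pipeline as one script (one record per file).

# Recreate the sample record from file1.txt in the question:
cat > file1.txt <<'EOF'
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
EOF

cat > egrep_file <<'EOF'
OHad
spn_id
spn_ordid
spn_ordtyp
msg_typ
spn_nid
msg_que
diff
..:..:..,
^[0-9][0-9]*$
EOF

# Each rule prefixes "|NN" so the lines sort into the wanted column order.
cat > sed_file <<'EOF'
s/\(OHad\)/|01\1/
s/\(spn_id\)/|02\1/
s/\(spn_ordid\)/|03\1/
s/\(spn_ordtyp\)/|04\1/
s/\(msg_typ\)/|05\1/
s/\(spn_nid\)/|06\1/
s/\(msg_que\)/|07\1/
s/\(diff\)/|08\1/
s/\(..:..:..,\)/|09\1/
s/^\([0-9][0-9]*\)$/|10\1/
EOF

in=file1.txt
tr '|' '\n' < "$in"             > $$one_col    # one field per line, no pipes
egrep -f egrep_file $$one_col   > $$wanted     # keep only the wanted fields
sed -f sed_file $$wanted | sort > $$sorted     # tag with |NN keys and sort
sed 's/^|../|/' $$sorted        > $$no_keys    # drop the keys, keep the "|"
paste -d'\0' -s $$no_keys       > FINAL_file   # glue back into one record
rm -f $$one_col $$wanted $$sorted $$no_keys
cat FINAL_file
```

Running it prints the reassembled record, with the leading "|" kept as noted above.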
Let me know if it does what you want.
something to start with:
nawk -f spring.awk file1.txt file2.txt fileN.txt
spring.awk:
BEGIN {
FS=OFS="|"
FLD_regex="^OHad" FS "^spn_id=" FS "^spn_ordid=" FS "^spn_ordtyp=" FS "^msg_typ=" FS "^spn_nid=" FS "^msg_que=" FS "^diff=" FS "-2" FS "-1" FS "0"
colN=split(FLD_regex, colA, FS)
}
{
for(i=1; i <= NF; i++)
for(cols=1; cols <= colN; cols++) {
if ( colA[cols] ~ "^-*[0-9][0-9]*$" ) {
# numeric entries are offsets from the last field: -2 -> $(NF-2), 0 -> $NF
fld=int(colA[cols])
outputA[cols] = (fld > 0) ? $fld : $(NF + fld)
}
else if ( $i ~ colA[cols] )
outputA[cols] = $i
}
# print the collected columns in order; a column never matched prints blank
for(j=1; j <= colN; j++)
printf("%s%s", (j in outputA) ? outputA[j] : "", (j != colN) ? FS : "\n")
split("", outputA)
}
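For a quick end-to-end check, here is a self-contained sketch: it writes out a copy of the program (with the column list spelled as regexes plus the trailing offsets -2, -1, 0), recreates the two sample records inline, and uses plain awk, which on Linux stands in for Solaris nawk.

```shell
# Sketch: run the field-picking awk program on the two sample records.

cat > spring.awk <<'EOF'
BEGIN {
  FS=OFS="|"
  # one regex (or an offset from NF) per output column
  FLD_regex="^OHad" FS "^spn_id=" FS "^spn_ordid=" FS "^spn_ordtyp=" FS "^msg_typ=" FS "^spn_nid=" FS "^msg_que=" FS "^diff=" FS "-2" FS "-1" FS "0"
  colN=split(FLD_regex, colA, FS)
}
{
  for(i=1; i <= NF; i++)
    for(cols=1; cols <= colN; cols++) {
      if ( colA[cols] ~ "^-*[0-9][0-9]*$" ) {   # numeric: offset from NF
        fld=int(colA[cols])
        outputA[cols] = (fld > 0) ? $fld : $(NF + fld)
      }
      else if ( $i ~ colA[cols] )               # regex: match this field
        outputA[cols] = $i
    }
  for(j=1; j <= colN; j++)                      # unmatched columns print blank
    printf("%s%s", (j in outputA) ? outputA[j] : "", (j != colN) ? FS : "\n")
  split("", outputA)
}
EOF

cat > file1.txt <<'EOF'
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
EOF
cat > file2.txt <<'EOF'
29b11b820ddcc:-|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|OHad.perWrk|msg_strt=11756153398760|diff=1571|msg_typ=ah.out.res|msg_que=res_oput|17:48:55,074|17:48:55,084|10
EOF

awk -f spring.awk file1.txt file2.txt
```

The second output line shows the missing spn_nid of file2.txt as an empty "||" column, as in the wanted targetfile.txt.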