Hi AWK Experts,
file1.txt contains:
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
file2.txt contains:
29b11b820ddcc:-|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|OHad.perWrk|msg_strt=11756153398760|diff=1571|msg_typ=ah.out.res|msg_que=res_oput|17:48:55,074|17:48:55,084|10
Each column in the input record that matches one of the strings below must be selected and printed in the same order as given below:
$1 -> OHad
$2 -> spn_id
$3 -> spn_ordid
$4 -> spn_ordtyp
$5 -> msg_typ
$6 -> spn_nid
$7 -> msg_que
$8 -> diff
$9 -> HH:MI:SS,sss
$10 -> 9999 (i.e., a number of any length)
Hence the targetfile.txt must contain:
OHad.perWrk|spn_id=AH111|spn_ordid=928176|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_nid=3|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|1000
OHad.perWrk|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|msg_typ=ah.out.res||msg_que=res_oput|diff=1571|17:48:55,074|17:48:55,084|990810
Please note that the [msg_strt] column is not required, while [spn_nid] is required but missing from file2.txt, hence a blank field is acceptable there.
Could you please help me with the concept, or a working prototype, using AWK or any better tool that can run in a shell script?
With best regards,
Spring-Buck
Spring_Buck,
Based on what you wrote and the two sample files you gave, I have a
solution, assuming there is only one record per file.
You must run the shell script twice, once for each one-record file.
I also noticed that your "targetfile.txt" does not have the first pipe "|" at the
beginning of each record -- my solution puts it in.
Before you run the shell, create these two files:
1) Create one file "egrep_file" as follows:
OHad
spn_id
spn_ordid
spn_ordtyp
msg_typ
spn_nid
msg_que
diff
..:..:..,
^[0-9][0-9]*$
2) Create one file "sed_file" as follows (each rule adds a leading "|" along with the sort key -- the later sed 's/^|../|/' step relies on it, and it supplies the pipe that "paste" will not add back):
s/\(OHad\)/|01\1/
s/\(spn_id\)/|02\1/
s/\(spn_ordid\)/|03\1/
s/\(spn_ordtyp\)/|04\1/
s/\(msg_typ\)/|05\1/
s/\(spn_nid\)/|06\1/
s/\(msg_que\)/|07\1/
s/\(diff\)/|08\1/
s/\(..:..:..,\)/|09\1/
s/^\([0-9][0-9]*\)$/|10\1/
Then create a shell script with the following commands:
## Create another file with one field per line without pipes "|":
tr '|' '\n' < input_file > $$one_col_file
## Using "egrep_file", create a file with the wanted target output:
egrep -f egrep_file $$one_col_file > $$wanted_target_file
## Using "sed_file", create a file with keys prefixed to be sorted:
sed -f sed_file $$wanted_target_file | sort > $$sort_file
## Remove the sort keys:
sed 's/^|../|/' $$sort_file > $$no_keys_file
## Create the final file with one record:
paste -d'\0' -s $$no_keys_file > FINAL_file
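Put together, the whole pipeline can run as one self-contained script. This is just a sketch under the same one-record-per-file assumption: the file names are examples, the helper files are written inline with here-documents, and sed_file carries the leading "|" so the key-stripping step leaves exactly one pipe per field.

```shell
# Sketch: the egrep/sed/sort/paste pipeline as one script (one record per file).

# Recreate the sample record from file1.txt in the question:
cat > file1.txt <<'EOF'
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
EOF

cat > egrep_file <<'EOF'
OHad
spn_id
spn_ordid
spn_ordtyp
msg_typ
spn_nid
msg_que
diff
..:..:..,
^[0-9][0-9]*$
EOF

# Each rule prefixes "|NN" so the lines sort into the wanted column order.
cat > sed_file <<'EOF'
s/\(OHad\)/|01\1/
s/\(spn_id\)/|02\1/
s/\(spn_ordid\)/|03\1/
s/\(spn_ordtyp\)/|04\1/
s/\(msg_typ\)/|05\1/
s/\(spn_nid\)/|06\1/
s/\(msg_que\)/|07\1/
s/\(diff\)/|08\1/
s/\(..:..:..,\)/|09\1/
s/^\([0-9][0-9]*\)$/|10\1/
EOF

in=file1.txt
tr '|' '\n' < "$in"             > $$one_col    # one field per line, no pipes
egrep -f egrep_file $$one_col   > $$wanted     # keep only the wanted fields
sed -f sed_file $$wanted | sort > $$sorted     # tag with |NN keys and sort
sed 's/^|../|/' $$sorted        > $$no_keys    # drop the keys, keep the "|"
paste -d'\0' -s $$no_keys       > FINAL_file   # glue back into one record
rm -f $$one_col $$wanted $$sorted $$no_keys
cat FINAL_file
```

Running it prints the reassembled record, with the leading "|" kept as noted above.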
Let me know if it does what you want.
something to start with:
nawk -f spring.awk file1.txt file2.txt fileN.txt
spring.awk:
BEGIN {
FS=OFS="|"
FLD_regex="^OHad" FS "^spn_id=" FS "^spn_ordid=" FS "^spn_ordtyp=" FS "^msg_typ=" FS "^spn_nid=" FS "^msg_que=" FS "^diff=" FS "-2" FS "-1" FS "0"
colN=split(FLD_regex, colA, FS)
}
{
for(i=1; i <= NF; i++)
for(cols=1; cols <= colN; cols++) {
if ( colA[cols] ~ "^-*[0-9][0-9]*$" ) {
# numeric entries are offsets from the last field: -2 -> $(NF-2), 0 -> $NF
fld=int(colA[cols])
outputA[cols] = (fld > 0) ? $fld : $(NF + fld)
}
else if ( $i ~ colA[cols] )
outputA[cols] = $i
}
# print the collected columns in order; a column never matched prints blank
for(j=1; j <= colN; j++)
printf("%s%s", (j in outputA) ? outputA[j] : "", (j != colN) ? FS : "\n")
split("", outputA)
}
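For a quick end-to-end check, here is a self-contained sketch: it writes out a copy of the program (with the column list spelled as regexes plus the trailing offsets -2, -1, 0), recreates the two sample records inline, and uses plain awk, which on Linux stands in for Solaris nawk.

```shell
# Sketch: run the field-picking awk program on the two sample records.

cat > spring.awk <<'EOF'
BEGIN {
  FS=OFS="|"
  # one regex (or an offset from NF) per output column
  FLD_regex="^OHad" FS "^spn_id=" FS "^spn_ordid=" FS "^spn_ordtyp=" FS "^msg_typ=" FS "^spn_nid=" FS "^msg_que=" FS "^diff=" FS "-2" FS "-1" FS "0"
  colN=split(FLD_regex, colA, FS)
}
{
  for(i=1; i <= NF; i++)
    for(cols=1; cols <= colN; cols++) {
      if ( colA[cols] ~ "^-*[0-9][0-9]*$" ) {   # numeric: offset from NF
        fld=int(colA[cols])
        outputA[cols] = (fld > 0) ? $fld : $(NF + fld)
      }
      else if ( $i ~ colA[cols] )               # regex: match this field
        outputA[cols] = $i
    }
  for(j=1; j <= colN; j++)                      # unmatched columns print blank
    printf("%s%s", (j in outputA) ? outputA[j] : "", (j != colN) ? FS : "\n")
  split("", outputA)
}
EOF

cat > file1.txt <<'EOF'
29b11b820ddcc:-|OHad.perWrk|spn_id=AH111|spn_ordtyp=MY_REQ|msg_typ=ah.ntf.out|spn_ordid=928176|spn_nid=3|msg_strt=1175615334703|msg_que=oput|diff=371|17:48:55,074|17:48:55,084|10
EOF
cat > file2.txt <<'EOF'
29b11b820ddcc:-|spn_id=QA999|spn_ordid=98976098|spn_ordtyp=MY_RES|OHad.perWrk|msg_strt=11756153398760|diff=1571|msg_typ=ah.out.res|msg_que=res_oput|17:48:55,074|17:48:55,084|10
EOF

awk -f spring.awk file1.txt file2.txt
```

The second output line shows the missing spn_nid of file2.txt as an empty "||" column, as in the wanted targetfile.txt.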