Sorting file with CRLF within field, RS=$

Below is what my sample file looks like. I need to sort by the primary key, i.e. {1:F01SAESVAV0AXXX0466020126} in the first record. The record separator is $.

I tried sort, but it completely messes up the file. I think I will need something like awk, which understands the record separator and would let me take a substring to sort by the primary key. Any assistance is greatly appreciated.

Source File

{1:F01SAESVAV0AXXX0466020126}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020123}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$

Output I want:

{1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020123}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020126}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$

Are these DOS files? UNIX and Linux system utilities generally expect <newline> (not CR/LF) line terminators.
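
If you are not sure, a quick check is easy to sketch. The function names has_crlf and strip_cr below are made up for this post, and stripping every carriage return is only appropriate if nothing downstream needs them inside the message body:

```shell
# Hypothetical helpers; pass the name of your input file as the argument.

# Succeeds if any line of the file contains a carriage return (CR).
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Writes the file to stdout with every CR removed.
strip_cr() {
    tr -d '\r' < "$1"
}
```

For example, `has_crlf file && strip_cr file > file.unix` would leave the original untouched and write a newline-only copy.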

What operating system, shell, and version of awk are you using?

Operating System: GNU/Linux
AWK Version: GNU Awk 3.1.7

I am not yet sure whether the source system is Unix or DOS, but I know that end of message is denoted by $. The file will be processed on Linux.

Got Perl?

perl -ne 'BEGIN{$/="\$\n"}; ($i) = /^({[^}]*})/ and $r{$i} = $_; END{for(sort keys %r){print $r{$_}}}' alfredo123.file
{1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020123}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020126}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$

Would this do?

awk '/^{1:/ {LAST=$1} {print LAST, NR, $0}' FS="}" OFS="}" file | sort -t"}" -k1,1 -k2,2n | cut -d"}" -f3-
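
In case the pipeline looks cryptic, here is a sketch of its three stages on a shortened two-record sample. The sample data and the function names decorate and sort_messages are made up for illustration. The only change to the command itself is that the brace is written as [{], so that every awk reads it as a literal character:

```shell
# Sketch only: shortened sample data, hypothetical function names.
sample=$(mktemp)
printf '%s\n' '{1:B}{2:x}{4:' ':20:1' '-}{5:}$' \
              '{1:A}{2:y}{4:' ':20:2' '-}{5:}$' > "$sample"

# Stage 1 (awk): lines are split on "}", so $1 is the first tag, e.g. "{1:B".
# LAST remembers the tag of the most recent header line, and every line is
# printed as   LAST } line-number } original-line
decorate() {
    awk '/^[{]1:/ {LAST=$1} {print LAST, NR, $0}' FS="}" OFS="}" "$1"
}

# Stage 2 (sort): order by the key, then numerically by line number, so the
# lines of each message stay together and keep their original order.
# Stage 3 (cut): drop the two helper fields again.
sort_messages() {
    decorate "$1" | sort -t"}" -k1,1 -k2,2n | cut -d"}" -f3-
}

decorate "$sample"       # shows the decorated lines
sort_messages "$sample"  # shows the final, sorted messages
```

The second line of the sample leaves stage 1 as {1:B}2}:20:1, which is why sort can treat a multi-line message as a unit: every line carries its message's key in front.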

Sincere appreciation for the responses. Unfortunately I do not have Perl, so the Perl option will not work for me. The awk command worked, but I would like to understand what is going on here:

awk '/^{1:/ {LAST=$1} {print LAST, NR, $0}' FS="}" OFS="}" file | sort -t"}" -k1,1 -k2,2n | cut -d"}" -f3-

Also, it seems the unique key for the sort will be the first two tags instead of just the first one, and the $ will be at the start of every new row instead of at the end of each row. If I understand what is going on here, maybe I can implement the sort based on two tags and identify a new record by the dollar sign.

Unique Key:

{1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}

So my source file is now going to look like this:

{1:F01SAESVAV0AXXX0466020126}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}
${1:F01SAESVAV0AXXX0466020123}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}
${1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$

Sorry, I don't understand what you are saying. When applying my proposal to your sample input, I get

{1:F01SAESVAV0AXXX0466020121}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020123}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$
{1:F01SAESVAV0AXXX0466020126}{2:O1011538070522LRLRXXXX4A0700005910650705221739N}{3:{108:MT101 001 OF 019}}{4:
:20:00028
:28D:1/1
:50H:/VTB.2003.02
19Apr2002
:30:020419
:21:x
:32B:USD1,
:50L:x
:59:/x
x
:71A:OUR
-}{5:{MAC:00000000}{CHK:24857F4599E7}{TNG:}}{S:{SAC:}{COP:P}}$

which is in the correct sort order, and the $ signs are in the correct spot.
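
If the layout really does change to a leading $ and a two-tag key, one possible approach is to let awk read whole messages by setting RS to $ and to sort them in memory. This is only a sketch: sort_by_two_tags is a made-up name, it assumes that $ never occurs inside a message, and it writes the line break at each message boundary as a plain newline.

```shell
# Sketch only: hypothetical function, assumes "$" occurs only as delimiter.
sort_by_two_tags() {
    awk -v 'RS=$' '
    {
        # Trim the line breaks left over around the "$" delimiter.
        sub(/^[\r\n]+/, ""); sub(/[\r\n]+$/, "")
        if ($0 == "") next
        # The key is everything up to and including the second "}".
        split($0, tag, "}")
        key[++n] = tag[1] "}" tag[2] "}"
        msg[n] = $0
    }
    END {
        # Insertion sort on the keys, moving each message with its key.
        for (i = 2; i <= n; i++) {
            k = key[i]; m = msg[i]
            for (j = i - 1; j >= 1 && key[j] > k; j--) {
                key[j + 1] = key[j]; msg[j + 1] = msg[j]
            }
            key[j + 1] = k; msg[j + 1] = m
        }
        # "$" leads every message after the first and closes the last one.
        for (i = 1; i <= n; i++)
            printf "%s%s%s\n", (i > 1 ? "$" : ""), msg[i], (i == n ? "$" : "")
    }' "$1"
}
```

It would be called as `sort_by_two_tags file > file.sorted`. Holding every message in memory is fine for files of modest size; for very large files the decorate, sort, and cut pipeline is probably the safer choice.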
