Sorting/Arranging file based on tags using awk

Hi,

I have a file that contains data based on tags. The output should list the values in the order of the tags.

Below are the files :

Tags.txt

f12
f13
f23
f45
f56

Original data is like this :
Data.txt

2017/01/04|09:07:00:021|R|XYZ|38|9|1234|f12=CAT|f23=APPLE|f45=PENCIL|f13=CAR
2017/01/04|09:07:00:021|T|LMN|38|7|1234|f23=ORANGE|f12=DOG|f45=BOOK|f56=ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457|f56=CUSTARD|f13=RAILWAY
2017/01/04|09:02:00:976|L|PQR|38|9|5644|f56=CHOCOLATE|f12=SNAKE|f13=AUTO|f23=BANANA

And, Output should be like this :
Expected Result -

2017/01/04|09:07:00:021|R|XYZ|38|9|1234|CAT|CAR|APPLE|PENCIL|
2017/01/04|09:07:00:021|T|LMN|38|7|1234|DOG||ORANGE|BOOK|ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457||RAILWAY|||CUSTARD
2017/01/04|09:02:00:976|L|PQR|38|9|5644|SNAKE|AUTO||BANANA|CHOCOLATE

I was thinking of using an associative array in AWK, but I am not able to get it working properly. Can someone please help?

Hello Prathmesh,

Can I just confirm if this sorting is just to be within each record and that the output lines should be in the same order, i.e. it's horizontal sorting, so this:-

a,4,3,2,1
c,5,4,3,2
b,1,5,4,2

...would deliver:-

a,1,2,3,4
c,2,3,4,5
b,1,2,4,5

If so, I have a few questions to pose in response first:-

  • Is this homework/assignment? There are specific forums for these.
  • What have you tried so far?
  • What output/errors do you get?
  • What OS and version are you using?
  • What are your preferred tools? (C, shell, perl, awk, etc.)
  • What logical process have you considered? (to help steer us to follow what you are trying to achieve)

Most importantly, What have you tried so far?

There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.

We're all here to learn and getting the relevant information will help us all.

Kind regards,
Robin

Thanks Robin for your reply.

This is not an assignment problem. My OS is GNU/Linux and I prefer to use shell script/AWK.

I am thinking of listing all possible tags in one file, Tags.txt, then matching each tag against one line at a time, taking the value after the = sign for that tag and presenting it as output. However, I am still not able to come up with the correct AWK statement for this.

And yes, it is horizontal sorting based on the order of the tags.

Looks like the pipe character is the field separator.
Are the tags always in field 8 and higher?

Try

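awk -F\| '
NR==FNR         {F[NR] = $1                     # first file: remember the tags in order
                 MX = NR
                 next
                }
                {for (i=8; i<=NF; i++)  {split ($i, T, "=")     # tag=value pairs
                                         R[T[1]] = T[2]
                                        }
                 for (i=1; i<=MX; i++)  $(7+i)=R[F[i]]  # value of the i-th tag, empty if absent
                 delete R
                }
1
' OFS=\| Tags.txt Data.txt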
2017/01/04|09:07:00:021|R|XYZ|38|9|1234|CAT|CAR|APPLE|PENCIL|
2017/01/04|09:07:00:021|T|LMN|38|7|1234|DOG||ORANGE|BOOK|ICE-CREAM
2017/01/04|09:08:00:768|R|XYZ|42|9|3457||RAILWAY|||CUSTARD
2017/01/04|09:02:00:976|L|PQR|38|9|5644|SNAKE|AUTO|BANANA||CHOCOLATE

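You seem to have a small error in your desired output sample: in the last line, BANANA belongs to f23, so it should come third, with the empty f45 slot after it.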

Yes, pipe is the delimiter. And the tags may or may not be in field 8 or higher.

Thanks. I will try it and let you know.
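
If the tags really can start before field 8, one possible variation is to stop relying on the field position and instead treat any field of the form tag=value as a tag field whenever its tag is listed in Tags.txt. The sketch below assumes that reading of the requirement, and also assumes that every other field should be kept as-is, in its original order, ahead of the tag values. The function name reorder_tags and the second sample record are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reorder_tags TAGFILE DATAFILE
# Assumption: a field counts as a tag field only if it looks like tag=value
# and the tag is listed in TAGFILE; all other fields are passed through.
reorder_tags() {
    awk -F'|' '
    NR == FNR {                        # first file: tags, one per line
        if ($1 != "") { tag[++n] = $1; known[$1] = 1 }
        next
    }
    {
        out = ""
        split("", val)                 # clear the per-record values
        for (i = 1; i <= NF; i++) {
            eq  = index($i, "=")
            key = substr($i, 1, eq - 1)
            if (eq > 1 && (key in known))
                val[key] = substr($i, eq + 1)   # text after the first "="
            else
                out = out $i "|"                # ordinary field, keep it
        }
        for (i = 1; i <= n; i++)       # append the values in tag order
            out = out val[tag[i]] (i < n ? "|" : "")
        print out
    }' "$1" "$2"
}

# Small demo with throwaway files.
tmp=$(mktemp -d)
printf '%s\n' f12 f13 f23 f45 f56 > "$tmp/Tags.txt"
printf '%s\n' \
    '2017/01/04|09:08:00:768|R|XYZ|42|9|3457|f56=CUSTARD|f13=RAILWAY' \
    'f23=APPLE|2017/01/04|R|f12=CAT' > "$tmp/Data.txt"
result=$(reorder_tags "$tmp/Tags.txt" "$tmp/Data.txt")
rm -rf "$tmp"
printf '%s\n' "$result"
```

For records where the tags do start at field 8, this should give the same lines as the script above. The difference is only in how tag fields are recognised, so a field such as f99=X whose tag is not in Tags.txt would be passed through unchanged rather than dropped.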