Hi Gurus,
I have a file (a web log) like the one below:
abc|xyz|123|agentcode=sample code abcdeeess,agentcode=sample code abcdeeess,agentcode=sample code abcdeeess|agentadd=abcd stereet 23343,agentadd=abcd stereet 23343
sss|wwq|999|agentcode=sample1 code wqwdeeess,agentcode=sample1 code wqwdeeess,agentcode=sample1 code wqwdeeess|agentadd=ssss stereet sssss,agentadd=ssss stereet sssss
awe|rez|777|agentcode=sample2 code dfsdfeess,agentcode=sample2 code dfsdfeess,agentcode=sample2 code dfsdfeess|agentadd=tttt stereet ttttt,agentadd=tttt stereet ttttt
twe|tez|555|agentcode=sample3 code ddddddddd,dddddd,agentcode=sample3 code ddddddddd,dddddd|agentadd=tttt stereet ttttt,agentadd=tttt stereet ttttt
I want to remove the duplicate values from columns 4 and 5. The same value may be repeated within a column, separated by commas. A comma can also appear inside the data itself.
My algorithm is to keep columns 1, 2 and 3 (which make the record unique), split columns 4 and 5 on commas, remove the duplicates, and join the remaining pieces back together with commas (so that a comma inside the data won't be lost).
Is there a command for this with awk or perl?
The output should look like this:
abc|xyz|123|agentcode=sample code abcdeeess|agentadd=abcd stereet 23343
sss|wwq|999|agentcode=sample1 code wqwdeeess|agentadd=ssss stereet sssss
awe|rez|777|agentcode=sample2 code dfsdfeess|agentadd=tttt stereet ttttt
twe|tez|555|agentcode=sample3 code ddddddddd,dddddd|agentadd=tttt stereet ttttt
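
For reference, the split, de-duplicate and rejoin algorithm described above could be sketched in awk roughly like this. It is only a sketch under a few assumptions: the wrapper name dedupe_fields is made up for illustration, columns 4 and 5 are the only ones to clean, and the first occurrence of each comma-separated piece is the one to keep. Because it drops every repeated piece, it would also drop a piece of real data that happens to repeat after a comma.

```shell
# Hypothetical helper: de-duplicate the comma-separated pieces in columns 4 and 5.
# Reads the named files, or standard input when no file is given.
dedupe_fields() {
  awk -F'|' -v OFS='|' '
  {
    for (col = 4; col <= 5 && col <= NF; col++) {
      n = split($col, parts, ",")   # break the column into pieces on commas
      split("", seen)               # portable way to empty the "seen" array
      out = ""
      kept = 0
      for (i = 1; i <= n; i++) {
        if (!(parts[i] in seen)) {  # keep only the first occurrence of a piece
          seen[parts[i]] = 1
          out = (kept++ ? out "," : "") parts[i]
        }
      }
      $col = out                    # rebuild the column, preserving piece order
    }
    print                           # the record is re-joined with OFS ("|")
  }' "$@"
}
```

It could be called as `dedupe_fields weblog.txt` (the file name is just a placeholder). On the last sample row above (the one starting with twe), the two pieces "agentcode=sample3 code ddddddddd" and "dddddd" are each kept once and rejoined with a comma, so the comma inside the data survives.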