Help needed in formatting the Output file

Hi All,

Need your help in resolving the below issue.

I have a file called "data.txt" with the below lines:

TT: <tell://me/sreenivas> 
<tell://me/100>

TT: <tell://me/sudheer> 
<tell://me/300>

TT: <tell://me/sreenivas> 
<tell://me/200>

TT: <tell://me/sudheer> 
<tell://me/400>  

I want an output in the below format. Please help me.

TT: <tell://me/sreenivas>
<tell://me/100>
<tell://me/200>

TT: <tell://me/sudheer> 
<tell://me/300>
<tell://me/400>  

Explanation of the above output:
If the text between "<tell://me/" and ">" is the same on several of the lines that contain "TT", keep only one of those "TT" lines.
That line should then be followed by all the lines that originally came after any of the "TT" lines with that same text.

Looking forward to your help as soon as possible. Let me know if you have any questions.

With Regards,
SRK

Is your input always guaranteed to consist of groups of 2 lines, one starting with "TT:" and the other with a "<tell..." clause?

If so, create a sort of table with each 2-line group brought onto one line, like this:

TT: <tell://me/sreenivas> <tell://me/100>
TT: <tell://me/sudheer> <tell://me/300>

You can do this easily with a single sed command. Sorting the result will put all equal keys next to each other. The last step is a "control break", which is a basic algorithm in programming: walk through the sorted lines and print the key only when it differs from the key of the previous line.
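
The three steps described above (join, sort, control break) could be sketched roughly like this. It is only a sketch under the assumption stated earlier, namely that every group is exactly one "TT:" line followed by one "<tell...>" line. The function name group_tt is made up for illustration, and "sort -s" (stable sort) is an extension found in GNU and BSD sort rather than something POSIX guarantees.

```shell
# Hypothetical helper (name is an assumption): group_tt FILE
# Assumes strict 2-line groups: one "TT:" line, then one "<tell...>" line.
group_tt() {
    # Step 1: delete blank lines, then join each remaining line with the one
    #         after it, replacing trailing blanks plus the newline by a space.
    sed -e '/^[[:space:]]*$/d' -e 'N' -e 's/[[:blank:]]*\n/ /' "$1" |
    # Step 2: stable sort on fields 1-2 so equal keys end up adjacent while
    #         their values keep the order they had in the input.
    LC_ALL=C sort -s -k1,2 |
    # Step 3: control break. Print the key only when it changes, with a
    #         blank line between groups, then print the value (field 3).
    awk '
        { key = $1 " " $2 }
        key != prev { if (NR > 1) print ""; print key; prev = key }
        { print $3 }
    '
}
```

It would be called as "group_tt data.txt". Note that the groups come out in sorted key order, not in the order the keys first appear in the file; for the sample data the two happen to coincide.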

I hope this helps.

bakunin

This is also pretty easy with awk:

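awk '
# Skip blank lines.
NF == 0 { next }
# Header line: the key is "TT: <tell://me/NAME>"; store it the first time it is seen.
/^TT/ {	if(!((key = $1 FS $2) in out))
		out[key] = $0
	next
}
# Any other line is appended to the group of the most recent key.
{	out[key] = out[key] "\n" $0 }
# Print each group followed by a blank line (for-in order is unspecified in awk).
END {	for(key in out)
		printf("%s\n\n", out[key])
}' data.txt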

This will work even if there are multiple non-blank lines between the lines starting with "TT:".

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
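
Since awk does not define the order in which "for (key in out)" visits the keys, the groups may not come out in the order they appear in the file. If that order matters, a variant along the following lines might help. This is only a sketch: the function name group_tt_ordered and the helper array "order" are made up for illustration, and it also strips trailing blanks from the data lines, which the script above does not do.

```shell
# Hypothetical helper (name is an assumption): group_tt_ordered FILE
# Prints groups in the order their "TT:" key first appears in FILE.
group_tt_ordered() {
    awk '
        NF == 0 { next }                 # skip blank lines
        /^TT:/ {
            key = $1 FS $2               # e.g. "TT: <tell://me/sreenivas>"
            if (!(key in out)) {
                order[++n] = key         # remember first-seen order
                out[key] = key
            }
            next
        }
        {
            sub(/[ \t]+$/, "")           # drop trailing blanks
            out[key] = out[key] "\n" $0  # append to the current group
        }
        END {
            # Blank line between groups, none after the last one.
            for (i = 1; i <= n; i++)
                printf("%s%s\n", (i > 1 ? "\n" : ""), out[order[i]])
        }
    ' "$1"
}
```

It would be called as "group_tt_ordered data.txt". Like the script above, it accepts any number of non-blank lines after each "TT:" line.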

Thanks Don. It helped me a lot!