I have a file (myfile.txt) with contents like this:
1.txt apple is
3.txt apple is
5.txt apple is
2.txt apple is a
7.txt apple is a
8.txt apple is a fruit
4.txt orange not a fruit
6.txt zero is
The above file is already sorted using this command:
sort -k2 myfile.txt
My objective is to get this:
1.txt_3.txt_5.txt apple is
2.txt_7.txt apple is a
8.txt apple is a fruit
4.txt orange not a fruit
6.txt zero is
Notice that as long as the text after the first column stays the same from one line to the next, the values from the first column are concatenated with an underscore.
This is what I have tried, but it does not work quite right:
awk '
{sub (" ", FS)
$0=$0
T[$2]=(T[$2]?T[$2]"_":"") $1
}
END {for (t in T) print T[t], t
}
' FS="\001" myfile.txt
It produces the following output, in which the groups no longer appear in the sorted order:
8.txt apple is a fruit
1.txt_3.txt_5.txt apple is
2.txt_7.txt apple is a
4.txt orange not a fruit
6.txt zero is
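The order is lost because awk does not specify the order in which for (t in T) visits array elements. Since the input is already sorted, one way around this is to avoid the array and print each group as soon as the text changes. Below is a minimal sketch of that idea, not a definitive solution. The function name merge_groups and the variables name, text, names and prev are made up for illustration, and it assumes there is no leading whitespace and that a single space separates the first column from the rest of the line.

```shell
# Sketch: merge first-column values of consecutive lines that share the same text.
# Assumes the input is already sorted, e.g. with: sort -k2 myfile.txt
merge_groups() {
  awk '
  {
    name = $1                                # first column, e.g. "1.txt"
    text = substr($0, index($0, " ") + 1)    # everything after the first space
    if (NR > 1 && text == prev) {
      names = names "_" name                 # same text as previous line: extend the group
    } else {
      if (NR > 1) print names, prev          # text changed: print the finished group
      names = name                           # start a new group
      prev = text
    }
  }
  END { if (NR > 0) print names, prev }      # print the last group
  ' "$@"
}
```

It can be called as merge_groups myfile.txt, or at the end of a pipeline such as sort -k2 myfile.txt | merge_groups. Because only the current group is held in memory, lines are printed in the same order as the sorted input.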