Hi,
I have updated the code snippet, and I hope the desired output is clearer now. @RavinderSingh13, your code gives me the following error:
awk: invalid -v option
@RudiC and @Scrutinizer, I hope the updated desired output answers some of your questions. @Scrutinizer, I tried your code too, but the output it gives me is not correct. Here is an example. As you can see, there are "-" where the blank lines should be, and the <s> and </s> tags envelop the entire text rather than each individual sentence.
<s>
Hi PP -
my VBD -
name DT -
is NN -
. SENT -
-
Her PP -
name VBD -
is DT -
the NN -
same WRT -
. SENT .
</s>
Again, here is an example of the desired output:
<s>
Hi PP -
my VBD -
name DT -
is NN -
. SENT -
</s>
<s>
Her PP -
name VBD -
is DT -
the NN -
same WRT -
. SENT .
</s>
There is still some ambiguity: in the first half there is a trailing dot, while in the second half there is a trailing dash.
Also, your samples appear not to be TAB-delimited, contrary to what you say in the description.
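Leaving the third column aside, the per-sentence wrapping itself could be sketched as follows. This is only a minimal sketch under assumptions that are not confirmed in the thread: the input already holds one token per line, sentences are separated by blank lines, and the columns are passed through unchanged. The function name wrap_sentences is made up for illustration. It opens <s> before the first line of each sentence and prints </s> when it reaches a blank line or the end of the input, and it does not need the -v option.

```shell
# Hypothetical helper: wrap each blank-line-separated block in <s> ... </s>.
# Assumes one token per line and blank lines between sentences.
wrap_sentences() {
  awk '
    # Blank line: close the current sentence (if any) and drop the blank line.
    NF == 0 {
      if (open) {
        print "</s>"
        open = 0
      }
      next
    }
    # First non-blank line of a sentence: open the tag.
    !open {
      print "<s>"
      open = 1
    }
    # Pass every token line through unchanged.
    { print }
    # Close the last sentence if the input does not end with a blank line.
    END {
      if (open) print "</s>"
    }
  ' "$@"
}
```

It can be called with a file name (wrap_sentences input.txt, where input.txt is a placeholder) or fed from a pipe. If the sentences in the real input are not separated by blank lines, the closing condition would have to be changed, for example to test for the SENT tag in the second field.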