Help with reformatting a data structure

Input file:

bv|111259484|pir||T49736_real_data
bv|159484|pir||T9736_data_figure
bv|113584|prf|T4736|truth
bv|113584|pir||T4736_truth

Desired output:

bv|111259484|pir|T49736|real_data
bv|159484|pir|T9736|data_figure
bv|113584|prf|T4736|truth
bv|113584|pir|T4736|truth

Whenever a line contains "pir||",
I want to replace "pir||" with "pir|" and then replace the next "_" that follows it with "|".
Command I tried:

awk '{gsub(/pir||/,"pir|",$1);print}' input_file.txt

I was only able to replace "pir||" with "pir|", but I don't know how to replace the following "_" with "|" :frowning:
Thanks for any advice.

Try this...

sed 's/pir||/pir|/g' file

sed 's/||/|/g' file

Hi pamu,

Thanks for your reply.
My main problem is that I'm not sure how to replace the "_" with "|" after I find "pir||" in the data set :frowning:
I want to replace "pir||" with "pir|" and, at the same time, replace the next "_" that follows it with "|".

Thanks for any advice.

You want to replace _ with | right..?

Then try this..

sed -e 's/pir||/pir|/g' -e 's/_/|/g' file

I only want to replace the first "_" that appears when the line contains "pir||"; otherwise, just print the original line.
If I use the sed command you mentioned, it will replace every "_" with "|", which differs from my desired output :frowning:

I wasn't able to do it with a single awk...:frowning:

but this works..:slight_smile:

awk -F "|" '{for(i=1;i<=NF;i++) { if ($i == "" && $(i-1) == "pir") {sub("_","|",$(i+1))} }}1' OFS=\| file | sed 's/pir||/pir|/g'
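For comparison, here is a rough sketch of how the same result might be had from a single awk call, without the trailing sed. It assumes the rule is "replace `pir||` with `pir|` and turn the first `_` after it into `|`"; a line where `pir||` is not followed by any `_` is left untouched, which may or may not be what you want. The helper name `reformat_pir` and the scratch file name are made up for illustration.

```shell
# Sample input from the question, written to a scratch file (name is arbitrary).
cat > input_file.txt <<'EOF'
bv|111259484|pir||T49736_real_data
bv|159484|pir||T9736_data_figure
bv|113584|prf|T4736|truth
bv|113584|pir||T4736_truth
EOF

# reformat_pir is a hypothetical helper name, not something from the thread.
reformat_pir() {
  awk '{
    # Look for "pir||", then any run of non-underscores, then the first "_".
    if (match($0, /pir\|\|[^_]*_/)) {
      head = substr($0, 1, RSTART - 1)            # text before "pir||"
      mid  = substr($0, RSTART + 5, RLENGTH - 6)  # text between "pir||" and "_"
      tail = substr($0, RSTART + RLENGTH)         # text after that "_"
      $0 = head "pir|" mid "|" tail
    }
    print                                          # other lines pass through as-is
  }' "$1"
}

reformat_pir input_file.txt
```

The `\|` inside the awk regex matters: awk uses extended regular expressions, where a bare `|` means alternation rather than a literal pipe.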

Hi pamu,

Many thanks.
It really works and gives me what I wanted :slight_smile:
Really appreciated :smiley:

With a single sed:

sed 's/pir||/pir|/
t secrepl
b
:secrepl
s/\(pir|[^_]*\)_/\1|/' file

I am assuming that if both pir| and pir|| occur in a line, then pir|| comes first, and that pir|| occurs only once in a line. If these assumptions don't hold for your data, then you might need to change the sed command a bit. In that case, let me know.
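If the first assumption might not hold, one possible variation is to anchor a single substitution on `pir||` itself, so that an earlier `pir|` in the line cannot capture the wrong underscore. This is only a sketch: the helper name `reformat_anchored` and the test line are invented for illustration, and a `pir||` with no `_` after it is left unchanged here.

```shell
# Hypothetical helper: one substitution anchored on "pir||" itself.
# In a basic regular expression "|" is a literal character, so no escaping is needed.
reformat_anchored() {
  sed 's/pir||\([^_]*\)_/pir|\1|/' "$@"
}

# Made-up line (not from the thread) where "pir|" appears before "pir||".
printf '%s\n' 'bv|1|pir|A_B|pir||C_D' | reformat_anchored
# prints: bv|1|pir|A_B|pir|C|D
```

On the same made-up line, the two-step script would replace the underscore in `A_B` instead, because its second substitution starts from the first `pir|` it finds.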

That's cool... piece of sed...:slight_smile:

But not more than your signature "Gotham Knight" :wink:
