Replace everything but pattern in a line using sed

I have a file with multiple lines like this:

<junk><PATTERN><junk><PATTERN><junk>
<junk><PATTERN><junk><PATTERN><junk><PATTERN><junk>

Note that

  1. There might be a variable number of occurrences of PATTERN in a line.
  2. <> are just placeholders; they do not form part of the pattern.

I need to replace all the <junk> with a nice separator, like a comma.

So I am left with, in the above case

,PATTERN,PATTERN,
,PATTERN,PATTERN,PATTERN,

In my case, PATTERN has the pattern "AB:" followed by 9 digits.

I am able to replace the pattern with comma, using the following command -

sed 's/\(AB\:[0-9]\{9\}\)/,/g' <input_file>

So I thought replacing everything BUT the pattern should be just one step further - but I am not able to find a solution. Isn't there a way to do this using sed, using the ! operator? I am not able to put my finger on the solution. Any help would be greatly appreciated!

Thanks

Sometimes you can add marker characters to break the patterns out, for example a control character (it needs to be a character that does not occur in the data). Consider:

sed -e 's/PATTERN/^BPATTERN^B/g' <test2.txt | tr '\012\002' '\002\012' | grep -v '^PATTERN$' | tr -d '\012' | tr '\002' '\012'

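Here your data is in test2.txt, and each ^B in the sed command is a literal Ctrl-B character. The sed step wraps every PATTERN in Ctrl-B, and the first tr swaps newlines and Ctrl-B, so each PATTERN ends up on a line of its own while the original line breaks are carried along as Ctrl-B. The last two tr calls delete the temporary newlines and turn Ctrl-B back into newlines, stitching things back together afterwards. Note that grep -v, as written, drops the lines that consist solely of PATTERN; adjust that stage to whatever you want to keep or replace.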
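The same marker idea can also stay inside a single sed call, using a newline as the marker, since a line read by sed never contains one. What follows is only a minimal sketch: it assumes GNU sed (for \n in the replacement and inside [^\n]), and junk_to_commas is a name made up for this example.

```shell
# Hypothetical helper: reads lines on stdin, writes the result to stdout.
# Assumes GNU sed ("\n" in the replacement and in "[^\n]" are GNU extensions).
junk_to_commas() {
  # 1. Put a newline marker on each side of every AB:<9 digits> match.
  # 2. Replace whatever follows the last marker (or a whole line that
  #    has no match) with a comma.
  # 3. Replace each "junk, marker, match, marker" group with ",match".
  sed -e 's/AB:[0-9]\{9\}/\n&\n/g' \
      -e 's/[^\n]*$/,/' \
      -e 's/[^\n]*\n\([^\n]*\)\n/,\1/g'
}

# Made-up sample lines, just to show the call.
printf '%s\n' 'xxAB:123456789yyAB:987654321zz' 'qqAB:111111111' | junk_to_commas
```

With those two sample lines this prints ,AB:123456789,AB:987654321, and ,AB:111111111, with each result on its original line. A line with no match at all comes out as a single comma.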

Are you perhaps only interested in the patterns themselves, or do they need to remain on their original lines? If only the patterns matter and your system has grep -o, you can do this:

grep -Eo 'AB:[0-9]{9}' infile

and if your system does not have grep -o, you could do this:

sed 's/AB:[0-9]\{9\}/\n&\n/g' infile | grep -E 'AB:[0-9]{9}'

or for older sed:

sed 's/AB:[0-9]\{9\}/\
&\
/g' infile | grep -E 'AB:[0-9]{9}'

awk 'gsub("<junk>",",")' Your_File

Thanks for all the replies!

Scrutinizer, I need all the patterns that occurred in a line, and they need to remain on the original lines, with the junk filtered out. I did try out the grep -o option, but then I couldn't tell which line a given pattern corresponds to.
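If grep -o is otherwise good enough, adding -n may help with the line question: it prefixes every match with the number of the input line it came from. This is a small sketch, assuming a grep that supports -o (GNU grep does); the function name and the sample lines are made up for illustration.

```shell
# Hypothetical helper: prints each match as <line number>:<match>.
# Assumes a grep with -o (not in POSIX, but available in GNU grep).
matches_with_line_numbers() {
  grep -Eno 'AB:[0-9]{9}'
}

# Made-up sample lines, just to show the call.
printf '%s\n' 'xxAB:123456789yyAB:987654321zz' 'qqAB:111111111' | matches_with_line_numbers
```

The matches still come out one per line, but each one now carries its source line number, so they can be regrouped afterwards if needed.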

codecaine, the junk follows no particular pattern, so I can't delete the junk using sed.

cjcox, I see what you mean - and how it would work. I was wondering if there would be a more straightforward solution. If I can replace all occurrences of a pattern with a string of my choice, surely I'd expect to be able to replace everything BUT the pattern with a small tweak of the command? Or is it not that straightforward?
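On the ! question: in sed, ! negates an address (it selects which lines a command runs on), not a regular expression, so there is no direct "substitute everything except the match". One way to get the same effect is to build the output from the matches instead. Below is a minimal awk sketch, assuming every stretch of junk, including an empty one, should become a single comma. keep_patterns is a hypothetical name, and the nine digits are spelled out because not every awk supports {9}.

```shell
# Hypothetical helper: reads lines on stdin, writes the result to stdout.
keep_patterns() {
  awk '{
    out = ","                 # stands in for the junk before the first match
    rest = $0
    # Repeatedly find the next AB:<9 digits>, append it plus a comma,
    # then continue with the text that follows the match.
    while (match(rest, /AB:[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/)) {
      out = out substr(rest, RSTART, RLENGTH) ","
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out
  }'
}

# Made-up sample lines, just to show the call.
printf '%s\n' 'xxAB:123456789yyAB:987654321zz' 'qqAB:111111111' | keep_patterns
```

For the two sample lines this prints ,AB:123456789,AB:987654321, and ,AB:111111111, so every match stays on its original line. A line without any match is reduced to a single comma.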

Thanks again for all the help, really appreciate it!