Multiple long field separators

locoroco · January 5, 2013, 10:28pm

How do I use multiple field separators in awk?

I know that if I use

awk -F"[ab]"

both a and b will be field separators. But what if I need two field separators that are each longer than one character?

If I want the field separators to be "ab" and "cd", I cannot use

awk -F"[abcd]"

because that would make each of a, b, c and d a field separator on its own.

jim_mcnamara · January 5, 2013, 11:13pm

Suppose you want 'foo' and 'ooo' as words to be record separators:

awk -F'(foo)|(ooo)'  '{print $2}' somefile

The general form of the regex (this needs a modern awk; on Solaris that means nawk) is:

-F '(word1)|(word2)|(wordn)'

Each word goes in parentheses, and the words are separated by a pipe.
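
As a quick way to try this without a file, here is a minimal sketch that feeds one line to the same awk command. The helper name second_field and the sample line are made up for illustration; they stand in for "somefile" from the command above.

```shell
# second_field LINE: print the second field of LINE, splitting on "foo" or "ooo".
# (Hypothetical helper; the sample input below is invented for this example.)
second_field() {
  printf '%s\n' "$1" | awk -F'(foo)|(ooo)' '{ print $2 }'
}

second_field 'alphafoobetaooogamma'
```

With that sample line the fields are alpha, beta and gamma, so the command prints beta.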

Don_Cragun · January 6, 2013, 12:15am

Although what you suggested will work just fine, it can be even simpler than that. Either of the following will also set foo and ooo as FIELD (not RECORD) separators:

awk -F 'foo|ooo' ...
awk -F '(f|o)oo' ...

and for the original request:

awk -F 'ab|cd' ...

is all that is needed.
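
To see the difference on real input, here is a small sketch that prints the field count and the fields for a given separator. The helper name show_fields and the sample lines are assumptions made up for this example; only the -F values come from the thread.

```shell
# show_fields SEP LINE: split LINE using SEP as the awk field separator and
# print the number of fields followed by each field in square brackets.
# (Hypothetical helper; sample inputs are invented for illustration.)
show_fields() {
  printf '%s\n' "$2" |
    awk -F "$1" '{
      out = NF ":"
      for (i = 1; i <= NF; i++) out = out " [" $i "]"
      print out
    }'
}

show_fields 'ab|cd'   'oneabtwocdthree'        # "ab" and "cd" as whole separators
show_fields '[abcd]'  'oneabtwocdthree'        # any single a, b, c or d separates
show_fields 'foo|ooo' 'alphafoobetaooogamma'   # plain alternation
show_fields '(f|o)oo' 'alphafoobetaooogamma'   # same separators, factored
```

The first call prints "3: [one] [two] [three]". The bracket expression prints "5: [one] [] [two] [] [three]", because the empty strings between a and b and between c and d count as fields. The last two calls both print "3: [alpha] [beta] [gamma]".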