[solved] awk: placement of user-defined functions

Hi folks,

Is there any recommendation, especially from a performance point of view, about where to place a user-defined function in awk, such as in BEGIN{}, or in END{} if it is only needed once at the end? Or doesn't it matter at all, since awk is clever enough to interpret it only once, wherever it is placed?

The usual approach is to put them either BEFORE the BEGIN rule, or AFTER the END rule. Performance-wise, it doesn't matter. You might want to have a look at User-defined - The GNU Awk User's Guide


Looked there already, but sometimes it's better to read than just to look :smiley: Didn't see the 1st sentences in "9.2.1 Function Definition Syntax". Thanks a lot :slight_smile:

With regard to performance of user-defined functions, placement does not matter to the only implementation with whose internals I'm a bit familiar, nawk.

The only way to improve the performance of a given user-defined function is to switch to a different awk implementation. Typically, gawk is by far the slowest of all. nawk is noticeably faster. mawk is much faster still. Typically. You'd have to benchmark your program to confirm.

With regard to placement of the function definition, it's not allowed to be inside BEGIN or END. It should be at the same top level as the pattern-action rules. If your implementation allows you to embed it, be aware that others may not.
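
To illustrate the top-level placement, here is a minimal sketch. The function names twice() and label() and the sample input are made up for this example. One function is defined before BEGIN and one after END; both are callable from any rule, because awk reads the whole program before it starts running it.

```shell
# Hypothetical example: twice(), label() and the input are invented for illustration.
out=$(printf '1\n2\n3\n' | awk '
function twice(x) { return 2 * x }      # top level, before BEGIN
BEGIN { total = 0 }
{ total += twice($1) }                  # called once per input line
END { print label(total) }              # label() is defined further down
function label(n) { return "sum=" n }   # top level, after END
')
echo "$out"   # prints: sum=12
```

Moving either definition inside the braces of BEGIN or END is the non-portable form to avoid.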

Regards,
Alister
