Function pointer to inline function?

Hi.

Problem: I have to parse the payload of a packet. The payload could be in big-endian format (network byte order) or little-endian. Which one depends on a flag in the packet header.

Solution: A horrible solution would be to check that flag every time I have to read a field in the payload, that is

if (header->most_significant_byte)
    var = ntohl(payload->field32);
else
    var = payload->field32;

and that would be repeated throughout the body of the function responsible for parsing the payload. Quite awful indeed.

I was thinking of using a function pointer, actually two, and initializing them to ntohl and ntohs if the payload is in network order, or to an empty (identity) function otherwise. Something like

uint32_t (*_ntohl)(uint32_t hostlong);
uint16_t (*_ntohs)(uint16_t hostshort);
...

uint32_t empty_ntohl(uint32_t hostlong)
{
     return hostlong;
}

uint16_t empty_ntohs(uint16_t hostshort)
{
     return hostshort;
}

...
if (header->most_significant_byte) {
   _ntohl = ntohl;
   _ntohs = ntohs;
} else {
   _ntohl = empty_ntohl;
   _ntohs = empty_ntohs;
}
...

var = _ntohl(payload->field32);

Now, before going ahead and messing with my already-working-code-without-byte-ordering-support, I'd like to know if this solution is
1 - correct
2 - efficient

or if any of you guys have already faced this problem and have a better and more elegant solution.

In the subject I wrote "function pointer to inline function" because I was wondering if the empty function could be inline, to improve performance and avoid the function call overhead.

Would the call through the function pointer be replaced with the actual inline function body when it points to the inline one?
The answer is of course no, because inlining is done at compile time while the pointer is only set at run time. Yeah... I'm basically answering myself while writing. :)

Anyway, any comment and/or suggestion?

Thanks in advance.
S.

Is there some reason that the application generating the payload sometimes does it in network byte order and sometimes LE order? If you have control over this application, I suggest you fix it up to always generate network byte order packets.

Yes.

Too easy! :)
No, I don't have any control over it. Partly because the packet is not always sent over an actual IP network; it could also be sent over USB.

It's an embedded context if you are wondering.

If header->most_significant_byte has a determinate value (0, 1 or whatever), try creating an array of function pointers indexed by it.

The Function Pointer Tutorials - Syntax
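
For illustration, here is a minimal sketch of what such a table could look like, assuming the flag can be reduced to 0 or 1. All names here (same32, net32, conv32, read32 and the 16-bit counterparts) are made up for the example and are not from the original code. The small wrappers exist because POSIX allows ntohl and ntohs to be implemented as macros, in which case taking their address directly is not guaranteed to work.

```c
#include <stdint.h>
#include <arpa/inet.h>  /* ntohl, ntohs (POSIX) */

/* Identity conversions for payloads that need no swapping. */
static uint32_t same32(uint32_t v) { return v; }
static uint16_t same16(uint16_t v) { return v; }

/* Wrappers, since ntohl/ntohs may be macros on some systems. */
static uint32_t net32(uint32_t v) { return ntohl(v); }
static uint16_t net16(uint16_t v) { return ntohs(v); }

/* Index 0: leave the value as is; index 1: network order to host order. */
static uint32_t (*const conv32[2])(uint32_t) = { same32, net32 };
static uint16_t (*const conv16[2])(uint16_t) = { same16, net16 };

/* Hypothetical helpers: normalise the flag to 0 or 1, then dispatch. */
static uint32_t read32(uint32_t raw, int msb_first)
{
    return conv32[msb_first != 0](raw);
}

static uint16_t read16(uint16_t raw, int msb_first)
{
    return conv16[msb_first != 0](raw);
}
```

A call would then look like var = read32(payload->field32, header->most_significant_byte). Note that calls made through the table are indirect, so the compiler will generally not inline them.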

Are you typing up these responses in another program then copy-pasting? You don't need to do that, it adds pointless extra linebreaks. The text will wrap by itself when it needs to.

The performance difference between a function pointer and if/else is not going to be significant here; use whatever is clearest. If it were a choice between three or more functions, a function pointer might be more efficient and elegant.

Nice tip! Didn't think of that.
Thanks.

Although it's four functions, not just two. That means two arrays of pointers.

Interesting solution anyway.

S.

---------- Post updated at 01:48 AM ---------- Previous update was at 01:46 AM ----------

I'm not copying-and-pasting anything.

The point is that the function is called _VERY_ often, so the more optimized it is, the better.

Thanks for the advice.
S.

Corona is correct - you should leave optimization to the compiler when doing it by hand makes the code more complex and harder to read and maintain. And the speed gain is minimal.

General tips:

If you have not already done so - try profiling your code.

Algorithm changes usually provide far better optimization than tweaks like function pointers. This is the reason for seemingly odd constructs like Duff's device. And threading.

Turn on optimization for your compiler - things like loop unrolling may provide a lot of speed increase.

There's almost nothing there to optimize. At best the difference will be minimal, at worst you'll make it slower. If the compiler can inline htons(), it'll become perhaps one branch-instruction optionally skipping one to three instructions that do the byte swap. That's likely fewer instructions than it takes to call a function, let alone run the code inside it. (and if it's not inlining, that's something you could do...) Function calls can be expensive.
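
As a rough sketch of the "one branch inside an inlinable helper" idea, assuming the fields can be read from a byte pointer into the payload: get32 and get16 are hypothetical names, and assembling the value from individual bytes is a substitute for ntohl()/ntohs() here, not what the original code does. It has the side effect of not depending on the host's own byte order or on field alignment, whereas the plain `var = payload->field32` branch is only right for a little-endian payload when the host is little-endian too.

```c
#include <stdint.h>

/* Read a 32-bit field at p. msb_first != 0 means big-endian (network order),
 * otherwise little-endian. The result is the same on any host byte order. */
static inline uint32_t get32(const void *p, int msb_first)
{
    const uint8_t *b = p;

    if (msb_first)
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}

/* Same idea for a 16-bit field. */
static inline uint16_t get16(const void *p, int msb_first)
{
    const uint8_t *b = p;

    return msb_first ? (uint16_t)((b[0] << 8) | b[1])
                     : (uint16_t)((b[1] << 8) | b[0]);
}
```

Usage would be something like var = get32(&payload->field32, header->most_significant_byte). Whether this actually beats the function pointer version is something only a profiler can tell you.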

Do you have a _VERY_ good tutorial about profiling with gprof? I'm actually having some trouble finding one. I found many, but none that satisfies me. I'm particularly interested in profiling shared library code.

Hmm... optimization sometimes leads to very hard-to-find bugs, at least in my limited experience. So I'd rather leave that as my last option.

Thanks again!
S.

---------- Post updated at 09:29 AM ---------- Previous update was at 09:28 AM ----------

Thanks for the explanation!

Regards,
S.

Optimization can cause errors in programs with subtle bugs. Places where accidental writing to stack or incidental return values hadn't mattered before, suddenly may. Proper programming practices will circumvent this, and the benefits of optimization can be very good.

Try a static link - that is one easy way to profile shared routines.
However, try these as base assumptions:

most of the problems you have are yours, not library code
algorithms and proper program design give the best return on optimization by far
code tricks result in very minor gains

Do you know about aio (asynchronous I/O)? Try reading up on that. It is often used in high-demand, high-volume I/O designs (disk). I think it is still not fully supported for sockets in Linux, but I don't know for sure.

gprof - this is as good as there is:
GNU gprof

Indeed. That's why I said I'll leave optimization as my last option.

First I'll make it fast and correct (hopefully bug free), then switch to optimization to make it even faster (even if only by a small percentage). At that stage, if a bug comes up, it is easier to find. At least, that's what I've learned.

---------- Post updated at 02:59 AM ---------- Previous update was at 02:51 AM ----------

Hmm... interesting.

In fact, the library is mine! :D I developed it.

Indeed! I'm not trying to achieve optimization with this kind of trick. As I said, I still haven't profiled my code, and my post was only about how best to solve that particular problem (decoding a payload whose byte order depends on a header flag, which can only be known at runtime).

Yes. I've already read about aio and I did consider it. I might use it for dumping the data I receive from the net.

Thanks for the whole reply!