C programming: string definition question

technossomy · October 23, 2021, 11:00pm

Here is heavily abbreviated (i.e. not validating user input) C code for generating weights in a smoothing function:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

#define PERIODS 40

/* function to generate weights */
float *weights (char wf[9], float k, float lambda) {
  static float w[PERIODS];

  for (int x = 0; x < PERIODS; ++x) {
    if (strcmp(wf, "weibull") == 0) { w[x] = pow(x/lambda, k-1) * exp(-pow(x/lambda, k)); }
    else if (strcmp(wf, "gompertz") == 0) { w[x] = exp(x/lambda) * exp(-exp(x/lambda)/k); }
    else w[x] = pow(x/lambda, k-1) * exp(-x/lambda);  // gamma
  }

  return w;
}

int main (int argc, char *argv[]) {
  float k = 1.5;
  float lambda = 5;
  printf("Argument: %s\n", argv[1]);
  float *w = weights((argv[1] != NULL) ? argv[1] : "gamma", k, lambda);

  for (int i = 0; i < PERIODS; ++i) printf("Weight %d\t%.4f\n", i, w[i]);

  return 0;
}

It works correctly, so calling with ./smooth gamma, ./smooth weibull, ./smooth gompertz or simply ./smooth all produce correct output.
There is something dissatisfying about float *weights (char wf[9], float k, float lambda). How would more seasoned developers define char wf[9] given that strings are usually defined with pointers rather than a character array?

bendingrodriguez · October 24, 2021, 12:19am

Hi @technossomy,

I am a bit surprised that smooth works without an argument, because then argv[1] is not defined. That's what argc is for.

I would also write a separate function per weight method, and pass the function pointer to weights. This avoids the repeated if/else in the loop. Also, define w only in main and pass it to weights, that avoids double definition:

#include <stdio.h>
#include <string.h>
#include <math.h>

#define PERIODS 40

float gamma(int x, float k, float lambda) { return pow(x/lambda, k-1) * exp(-x/lambda); }
float weibull(int x, float k, float lambda) { return pow(x/lambda, k-1) * exp(-pow(x/lambda, k)); }
float gompertz(int x, float k, float lambda) { return exp(x/lambda) * exp(-exp(x/lambda)/k); }

void weights(float w[], float (*fct)(int, float, float), float k, float lambda)
{
    for (int x = 0; x < PERIODS; x++)
        w[x] = fct(x, k, lambda);
}

int main(int argc, char *argv[])
{
    float k = 1.5;
    float lambda = 5;
    float w[PERIODS];

    if (argc == 1 || !strcmp(argv[1], "gamma")) weights(w, gamma, k, lambda);
    else if (!strcmp(argv[1], "weibull")) weights(w, weibull, k, lambda);
    else if (!strcmp(argv[1], "gompertz")) weights(w, gompertz, k, lambda);
    else { puts("unknown function"); return 1; }

    for (int i = 0; i < PERIODS; ++i) printf("Weight %d\t%.4f\n", i, w[i]);

    return 0;
}

munkeHoller · October 24, 2021, 12:25am

Unless there's a specific reason it needs to be 9 bytes in size?

char *wf will do, since an array parameter such as char wf[9] is passed as a pointer to its first element anyway (the compiler ignores the 9).
Use const char *wf if the string must not be changed during the function call; an attempt to modify it would then typically generate a compile-time error.
i.e.:
float *weights(char *wf, float k, float lambda)...
float *weights(const char *wf, float k, float lambda)...
I recommend you try the alternative declarations/definitions.
Also, strictly, // comments are NOT part of K&R C or C89; they come from C++ and only became standard C in C99, which is why most modern compilers accept them.
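To illustrate the const char * declaration suggested above, here is a minimal sketch. The helper name is_known_method() is hypothetical and not part of the original program; it only shows that a const char * parameter accepts string literals and argv entries alike, while promising not to modify them.

```c
#include <string.h>

/* Hypothetical helper (an assumption of this sketch, not in the original
 * post): returns 1 if wf names one of the three methods from the thread,
 * 0 otherwise. */
int is_known_method(const char *wf)
{
    /* A write such as wf[0] = 'x'; would be rejected by the compiler here,
     * because wf points to const char. */
    return strcmp(wf, "gamma") == 0
        || strcmp(wf, "weibull") == 0
        || strcmp(wf, "gompertz") == 0;
}
```

A call such as is_known_method(argv[1]) works unchanged, because a char * converts implicitly to const char *.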

Azhrei · October 24, 2021, 2:45am

+1 to the technique of passing a function pointer into weights().

(The argv[1] reference works because argv[argc] is guaranteed to be NULL, so argv[1] always exists even when no argument is given, and because of the ternary operator used in the function call.)
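One caveat is that passing a NULL pointer to printf's %s is undefined behaviour, even if some C libraries happen to print "(null)". A small sketch of choosing the method name before it is printed or compared follows; method_from_args() is a hypothetical helper name, not something from the original code.

```c
#include <stddef.h>

/* Hypothetical helper: pick the method name from the command line,
 * falling back to "gamma" when no argument was supplied. */
const char *method_from_args(int argc, char *argv[])
{
    /* argv[argc] is a null pointer, so with argc == 1 argv[1] is NULL.
     * Checking argc first makes the intent explicit. */
    return (argc > 1 && argv[1] != NULL) ? argv[1] : "gamma";
}
```

The returned pointer is never NULL, so it can be handed to both printf("%s") and strcmp() safely.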

@technossomy Your original weights() function puts if statements inside the loop, which should be avoided for performance reasons (and is likely why you posted your question in the first place!). It's likely faster to nest the loops inside if statements, meaning you'll have three different loops. The alternative is to use an if statement to determine which processing function to call, as @bendingrodriguez does (the functions gamma(), et al).

As a general programming note, using data to determine which function pointer to pass to weights() is a great practice. Imagine that instead of hard-coded if statements and strcmp(), that there was a data structure (linked list, array, b-tree) that could map the string name directly to a function pointer. This would allow the main code to use argv[1] in the mapping and retrieve the function pointer to use. This makes the main code easier to follow, makes the weights() function easier to debug, and makes the overall application more extensible since you can add a new function (and a corresponding entry to the mapping) without touching any of the existing code, which has positive ramifications for testing and debugging.

I've written way more than I had planned. But I wanted to provide some background on why a particular solution might be chosen. If/when you get a chance, read up on design patterns (Wikipedia has an article). This solution is similar to the Strategy pattern (not identical since C doesn't provide OO support).

technossomy · October 24, 2021, 6:44am

Fair point. There are other smoothing functions, the longest of which is 9 characters in length.

bendingrodriguez's corrections have a lot of information in them that I plan to follow.

Correct, a remnant from JavaScript in my case. And compiling on gcc 9.3.0.

technossomy · October 24, 2021, 7:05am

bendingrodriguez, this is excellent. To summarise your additions:

Use of argc to trap possibility of no user-provided input
Separate function per smoothing function
Introduction of a void wrapper function that takes a function pointer as an argument
Single declaration of weights array w[x]
Introduction of an exit code for trapping erroneous user input

technossomy · October 24, 2021, 8:38pm

By the way, upon compilation of bendingrodriguez's code I am getting the following error message:

smooth.c:9:7: error: conflicting types for ‘gamma’
    9 | float gamma(int x, float k, float lambda) { return pow(x/lambda, k-1) * exp(-x/lambda); }
      |       ^~~~~
In file included from /usr/include/features.h:461,
                 from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33,
                 from /usr/include/stdio.h:27,
                 from smooth.c:2:
/usr/include/x86_64-linux-gnu/bits/mathcalls.h:241:1: note: previous declaration of ‘gamma’ was here
  241 | __MATHCALL (gamma,, (_Mdouble_));
      | ^~~~~~~~~~

Any idea how I can circumvent this message?

Neo · October 24, 2021, 8:47pm

Make sure you are typing consistently.

A quick glance looks like you are mixing float (32 bit) and double (64 bit).

munkeHoller · October 24, 2021, 9:56pm

Presuming you are using <math.h>,
I recommend you read the documentation. As for 'circumventing' the message, it's simple: understand the tools you are using.
Here are a couple of links to help you:
https://www.tutorialspoint.com/c_standard_library/c_function_exp.htm
https://www.tutorialspoint.com/c_standard_library/c_function_pow.htm

bendingrodriguez · October 25, 2021, 5:05am

Hi @technossomy,

since it is not really necessary to give the functions the same names as the arguments (it's only a mapping string -> function), you can simply give the functions (not the arguments) a slightly modified name, e.g. w_gamma() or gamma_w(). By the way, I don't get that error; it could have something to do with the __MATH_DECLARING_FLOATN macro used in mathcalls.h.

Neo · October 25, 2021, 5:23am

Also, @technossomy

It's easy to understand your confusion.

Programming languages can be strongly or weakly typed.

It might help you to review this wiki page on this topic.

C requires all variables to have a declared type and these types must match in logical operations. C supports a number of implicit conversions and C also allows pointer values to be explicitly cast while Java and Pascal, for example, do not.

Writing code with typed programming languages can be frustrating for developers who come from a weakly typed language, like PHP, for example.

For the past year and a half, I have been programming exclusively in Ruby, which is "strongly typed" as defined in the link above.

The error you posted tells you this very generously:

smooth.c:9:7: error: conflicting types for ‘gamma’

and then gives you even more friendly help here:

/usr/include/x86_64-linux-gnu/bits/mathcalls.h:241:1: note: previous declaration of ‘gamma’ was here
  241 | __MATHCALL (gamma,, (_Mdouble_));

So, at least from my seat-by-the-sea, it seems clear what the issue is, since the error message generously talks to you and lets you know you have a variable typing issue to contend with.

There are many ways to deal with these types of issues, a few of which have been described above by our team.

I am addressing your question "Any idea how I can circumvent this message?", which suggested to me (and others) that you were lacking some understanding of the importance of variable typing in C. Often, you can simply cast these errors away.

Below is a simplified tutorial on this topic:

Hope this helps!

technossomy · October 25, 2021, 6:15pm

Agreed on the use of function pointers being cleaner, but performance was not the reason for posting the question. I was already impressed with the performance: an order of magnitude and a half faster than Python! Learning a new language along the way too, which I expect will have lasting benefits.

bendingrodriguez · October 25, 2021, 8:08pm

Of course a compiled binary is faster than a script which has to be interpreted. Maybe you want to take a look at Cython (Wikipedia). There are also various tutorials on the net.

Neo · October 30, 2021, 9:22am

15 posts were split to a new topic: Sidebar Discussion on Programming Languages (C, Python, Fortran, et al.)
