Grep for a word and any text inside parentheses

I am trying to grep for a word and any text inside parentheses. Any tools you can think of would be appreciated: awk, sed, perl, or anything else.

The word I am trying to match is function1 with anything inside the parentheses. Examples of things I would like to match:

int function1(int var1)
int function1(int var1, int var2)
int function1(int var1, int var2, int var3)
double function1(int var1, int var2, int var3)
float function1(int var1, int var2, int var3)

Examples of things I would not like to match:

int function1()
int function1
function1
int function2(int var1)
int function2(int var1, int var2)
int function2(int var1, int var2, int var3)
double function2(int var1, int var2, int var3)
float function2(int var1, int var2, int var3)

@cokedude, what have you tried? Please show your attempts.

If this is only for 'code', then your compiler can probably help; also check the ctags manpage ...

What are you actually trying to do? What is the end use of this effort?

This is the only way I can think of.

: grep -i int function1(*) /export/home/
-bash: syntax error near unexpected token ‘(’
: grep -i int function1(\*) /export/home/
-bash: syntax error near unexpected token ‘(’
: grep -i int function1(\\*) /export/home/
-bash: syntax error near unexpected token ‘(’

We are using a company built proprietary compiler with limited functionality.

something like ....

cat cokedude.input 
int function1(int var1)
int function1(int var1, int var2)
int function1(int var1, int var2, int var3)
double function1(int var1, int var2, int var3)
float function1(int var1, int var2, int var3)
Examples of things i would like to not match.

int function1()
int function1
function1
int function2(int var1)
int function2(int var1, int var2)
int function2(int var1, int var2, int var3)
double function2(int var1, int var2, int var3)
float function2(int var1, int var2, int var3)

grep -E '^(int|float|double) function1\(.+\)' cokedude.input
int function1(int var1)
int function1(int var1, int var2)
int function1(int var1, int var2, int var3)
double function1(int var1, int var2, int var3)
float function1(int var1, int var2, int var3)

Sorry, I forgot to say I am using an ancient SunOS with limited tools.

: uname -a
SunOS ah57ndcub04031 5.11 11.4.71.170.2 sun4v sparc sun4v non-global-zone

It does not work.

: grep -E '^(int|float|double) function1\(.+\)' cookies.txt
grep: illegal option -- E
Usage: grep [-c|-l|-q] -bhinsvw pattern file . . .

Try egrep '....'
or

$ nawk '/^(int|float|double) function1\([^)]+\)/' cokedude.input 
int function1(int var1)
int function1(int var1, int var2)
int function1(int var1, int var2, int var3)
double function1(int var1, int var2, int var3)
float function1(int var1, int var2, int var3)

or /usr/xpg4/bin/awk instead of nawk (on Solaris).

... or /usr/xpg4/bin/grep -E
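If the same command has to run on more than one box, the choice of grep can be wrapped up. This is only a sketch: pick_grep is a made-up helper name, and the fallback branch assumes the default grep on the other systems accepts -E (as GNU grep does).

```shell
# pick_grep is a hypothetical helper: print the name of a grep that
# understands -E, preferring the XPG4 one mentioned above when it exists.
pick_grep() {
    if [ -x /usr/xpg4/bin/grep ]; then
        echo /usr/xpg4/bin/grep
    else
        echo grep    # assumption: the default grep accepts -E here
    fi
}

GREP=$(pick_grep)

# Two made-up sample lines, piped in instead of read from a file.
printf '%s\n' 'int function1(int var1)' 'int function1()' |
    "$GREP" -E '^(int|float|double) function1\([^)]+\)'
```

Only the first sample line should come back, because [^)]+ needs at least one character between the parentheses.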

Yes, you must quote the *, but you must also quote the embedded space and the ( and ):
grep -i int\ function1\(\*\) /export/home/
But another quoting type looks better:
grep -i "int function1(*)" /export/home/
The quoting protects the pattern against expansion in the shell. The shell removes the quotes and then invokes grep with the resulting string.
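A quick way to convince yourself that both quoting styles hand grep the same string. This is just a sketch, and show_arg is a made-up helper, not a standard command.

```shell
# show_arg is a hypothetical helper: it prints its first argument exactly
# as the shell passes it to a command, i.e. after quote removal.
show_arg() {
    printf '%s\n' "$1"
}

show_arg int\ function1\(\*\)      # backslash-escaped
show_arg "int function1(*)"        # double-quoted
```

Both calls print int function1(*), which is the single argument grep would receive as its pattern.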

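Your next mistake is that in grep (regular expression) a * means zero or more occurrences of the preceding character. In your pattern the preceding character is the (, so (*) means "zero or more (, followed by a )", which is why only the empty parentheses matched. You most likely mean .* , "any character, any number of times". In a regular expression the dot means "any character".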
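The difference can be tried out without a file. A small sketch, where sample is a made-up helper that prints two test lines:

```shell
# sample is a hypothetical helper that prints two test lines.
sample() {
    printf '%s\n' 'int function1()' 'int function1(int var1)'
}

# "(*)" : zero or more "(" characters, followed by a ")".
# Only the line with the empty parentheses matches.
sample | grep -i "int function1(*)"

# "(.*)" : a "(", then any characters (possibly none), then a ")".
# Both lines match, including the empty parentheses.
sample | grep -i "int function1(.*)"
```

The first grep prints only int function1(), the second prints both lines.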

Last a comment regarding the grep options. If your argument is a directory then grep -r (recursive) makes sense, but might require GNU grep i.e. /usr/gnu/bin/grep in Solaris 11.
grep -i is case-insensitive.

I forgot to say case insensitive. I also added void. This does not seem to be working. Noticed we are not using consistent cases.

nawk 'IGNORECASE = 1 /^(int|float|double|void) funCTion1\([^)]+\)/' cookies.txt
nawk 'IGNORECASE = 1; /^(int|float|double|void) funCTion1\([^)]+\)/' cookies.txt

I tried variations of this without any luck.

This was the only way I could get anything in the parentheses, but it also matched blank parentheses.

: grep -i "int function1(.*)" /export/home/cookies.txt
int function1(void)
int function1();

This only matched blank parentheses.

: grep -i "int function1(+*)" /export/home/cookies.txt
int function1()
int function1();

This only matched blank parentheses.

: grep -i "int function1(*)" /export/home/cookies.txt
int function1()
int function1();

Couple of points:

  1. The IGNORECASE variable is specific to gawk; it is not available in other versions of awk (including Solaris' nawk or /usr/xpg4/bin/awk).

  2. If you were to use gawk, IGNORECASE has to be set as an awk variable (not as a shell environment variable), for example:
    gawk -v IGNORECASE=1 'whateverGAWKcodeComesHere' myInputFile

  3. To do string matching ignoring case with other (non-GNU) awks, you could use:

awk 'tolower($0) ~ tolower("^(int|float|double) function1\\([^)]+\\)")' cokedude.input

You'll need to work out the specifics for your REGEX matching the strings.
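As a rough sketch of the tolower() approach with void added: lowercase the line, then match it against an all-lowercase regex. match_ci is a made-up function name and the input lines are invented for the test; on Solaris you would put nawk or /usr/xpg4/bin/awk in place of awk.

```shell
# match_ci is a hypothetical helper: print the lines on stdin that look
# like "type function1(something)", ignoring upper/lower case.
match_ci() {
    awk 'tolower($0) ~ /^(int|float|double|void) function1\([^)]+\)/'
}

# Invented mixed-case sample lines, piped in instead of read from a file.
printf '%s\n' \
    'INT Function1(int var1)' \
    'int function1()' \
    'Void FUNCTION1(int a, int b)' \
    'int function2(int var1)' | match_ci
```

This prints the INT Function1(int var1) and Void FUNCTION1(int a, int b) lines in their original spelling, because only the comparison is done on the lowercased copy.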

what about ...

perl -ne 'print if /^(int|float|double) function1\(.+\)/' < cokedude.input 
int function1(int var1)
int function1(int var1, int var2)
int function1(int var1, int var2, int var3)
double function1(int var1, int var2, int var3)
float function1(int var1, int var2, int var3)

Because .* is "any character, zero or more times", you must require another "any character":

grep -i "int function1(..*)" /export/home/cookies.txt

or

grep -i "int function1(.*.)" /export/home/cookies.txt

But this would also match
int function1() function2()
because the .*. would reach from the first ( to the last ). So instead of "any character" you should have "a character that is not a )".
In a regular expression this is [^)]

grep -i "int function1([^)]*[^)])" /export/home/cookies.txt
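To compare the two patterns on that problem line, here is a small sketch. check is a made-up helper, not a standard command.

```shell
# check PATTERN LINE: hypothetical helper that reports whether grep
# finds PATTERN in LINE.
check() {
    if printf '%s\n' "$2" | grep -i "$1" >/dev/null; then
        echo "match"
    else
        echo "no match"
    fi
}

line='int function1() function2()'

# ".*." can run across the first ")" up to the last one.
check 'int function1(.*.)' "$line"

# "[^)]*[^)]" cannot cross a ")", so the empty "()" is rejected.
check 'int function1([^)]*[^)])' "$line"
```

The first call reports match and the second reports no match, while a line such as int function1(int var1) is still accepted by the second pattern.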

grep knows the \{m,n\} quantifier, which we can use here: the preceding character must occur between m and n times.
Instead of listing int, float, double, ... it may be sufficient to require a character between a and z: [a-z]

grep -i "[a-z*] function1([^)]\{1,\})" /export/home/cookies.txt

I also put a * in the [] character set, in order to allow
int * function1(x)
This * is not a quantifier because it is inside the character set.
\{1,\} means the preceding character (here: a character from a character set) must occur 1 or more times, with no upper limit.
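A short test run of that last pattern, as a sketch. sample is a made-up helper and its lines are taken from the examples in this thread.

```shell
# sample is a hypothetical helper that prints a few of the test lines
# used in this thread.
sample() {
    printf '%s\n' \
        'int function1(int var1)' \
        'double function1(int var1, int var2, int var3)' \
        'int * function1(x)' \
        'int function1()' \
        'int function1' \
        'int function2(int var1)'
}

# One character from [a-z*], a space, function1, then "(" followed by
# one or more characters that are not ")", then ")".
sample | grep -i "[a-z*] function1([^)]\{1,\})"
```

Only the first three lines are printed. The empty parentheses, the bare name and function2 are rejected.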