Indexing Variable Names

Hi All

I think I might have bitten off more than I can chew here, and I'm hoping some of you with advanced pattern-matching skills can help me.

What I want to do is index the occurrences of variable names within a library of scripts that I have. Don't ask why, I'm just sad like that...

I have code of this form:

    EXPECTED_FILE=$(m_find_file ${CSV})
    if ! m_check_file -f "${EXPECTED_FILE}" s
    then
        m_junit_xml_log_failure ${TT} "${FQN%/*}" "${SCRIPT_NAME}" "Muse Runtime" "Expected file not found" "${XML_REPORT}"
        m_fail 1 "Error: Failed to find expected result file (${CSV}) (${SCRIPT_NAME})"
    fi

As you can see, the variable names are all uppercase and contain only the characters [_A-Z0-9].

First of all I want to isolate all of the lines containing a variable and record the line number.

Then within each line I want to pull out each variable name in isolation, without any surrounding braces.

I can then use this to build an index of every instance of each variable name, and feed that into some tools that show the developer wherever a variable is used or defined.

Sort of an "Intellisense" for Bash, if you will.

Thanks in advance for any help with this.

What is your desired output?
Something like this?

$ awk -F'[$][{]' '{for(i=2; i<=NF; i++){sub(/\}.*/,x,$i); print NR, $i}}' file
1 CSV
2 EXPECTED_FILE
4 TT
4 FQN%/*
4 SCRIPT_NAME
4 XML_REPORT
5 CSV
5 SCRIPT_NAME

or like this:

$ awk -F'[$][{]' '{for(i=2; i<=NF; i++){sub(/[^_A-Z0-9].*/,x,$i); print NR, $i}}' file
1 CSV
2 EXPECTED_FILE
4 TT
4 FQN
4 SCRIPT_NAME
4 XML_REPORT
5 CSV
5 SCRIPT_NAME
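
Note that both commands only report `${NAME}` references, so the assignment to EXPECTED_FILE on line 1 of the sample is not listed. If definitions matter too, something along these lines might be a starting point. It is only a sketch: it assumes an assignment looks like NAME= at the start of a (possibly indented) line, and the function name list_definitions is made up for illustration.

```shell
# list_definitions: hypothetical helper, not from the thread.
# Prints "LINE NAME" for each line that starts with an uppercase NAME= assignment.
list_definitions() {
    awk '
    # Assumption: a definition is NAME= at the start of the line, optionally
    # indented. Forms such as "export NAME=" or "local NAME=" are not handled.
    match($0, /^[ \t]*[_A-Z][_A-Z0-9]*=/) {
        name = substr($0, RSTART, RLENGTH - 1)   # drop the trailing "="
        gsub(/[ \t]/, "", name)                  # drop the leading indentation
        print NR, name
    }' "$1"
}

# Small made-up input for demonstration.
demo=$(mktemp)
printf '%s\n' '    EXPECTED_FILE=$(m_find_file ${CSV})' \
              'if ! m_check_file -f "${EXPECTED_FILE}"' > "$demo"
list_definitions "$demo"     # prints: 1 EXPECTED_FILE
rm -f "$demo"
```

The output has the same "line name" shape as the awk commands above, so the two lists could be merged or tagged separately as needed.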

---
Or with GNU grep (if it was built with Perl regex support, i.e. the -P option):

$ grep -nPo '(?<=\$\{).*?(?=\})' file
1:CSV
2:EXPECTED_FILE
4:TT
4:FQN%/*
4:SCRIPT_NAME
4:XML_REPORT
5:CSV
5:SCRIPT_NAME

or

$ grep -nPo '(?<=\$\{).*?(?=[^_A-Z0-9])' file
1:CSV
2:EXPECTED_FILE
4:TT
4:FQN
4:SCRIPT_NAME
4:XML_REPORT
5:CSV
5:SCRIPT_NAME
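
If your grep has no -P support, roughly the same output can probably be had with -E plus a sed step to strip the leading `${`. This is a sketch rather than a tested drop-in: it assumes a grep that supports -o (GNU and BSD grep do), and the function name list_braced_vars is made up for illustration.

```shell
# list_braced_vars: hypothetical helper, not from the thread.
# Prints LINE:NAME for every ${NAME...} reference in the given file.
list_braced_vars() {
    # -n: prefix each match with its line number
    # -o: print only the matched text, one match per output line
    # -E: extended regex, so PCRE (-P) is not required
    # sed then removes the "${" that grep had to include in the match.
    grep -noE '[$][{][_A-Z0-9]+' "$1" | sed 's/[$][{]//'
}

# Small made-up input for demonstration.
demo=$(mktemp)
printf '%s\n' 'OUT=$(m_find_file ${CSV})' \
              'echo "${FQN%/*}" "${SCRIPT_NAME}"' > "$demo"
list_braced_vars "$demo"
# prints:
# 1:CSV
# 2:FQN
# 2:SCRIPT_NAME
rm -f "$demo"
```

Because the pattern only accepts [_A-Z0-9] after the `${`, expansions such as `${FQN%/*}` come out as the bare name, like the second grep variant above.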

Is something like this what you are after, or have I got the wrong end of the stick?

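awk 'NR==1 {print "Line numbers containing variables reported below:"}
{
    a = 0                                   # set when this line has a variable
    for (i = 1; i <= NF; i++) {
        if ($i ~ /[$][{].+[}]/) {           # field contains ${...}
            a = gsub(/[^_A-Z0-9]/, "", $i)  # strip everything but name characters
            v[$i]                           # remember the name (array keys are unique)
        }
    }
    if (a) {
        print NR
    }
}
END {
    print "\n\nUnique variables reported below:"
    for (x in v) {
        print x
    }
}' inputfile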

Thank you both for jumping on this so quickly :)

This is perfect for what I need:

awk -F'[$][{]' '{for(i=2; i<=NF; i++){sub(/[^_A-Z0-9].*/,x,$i); print NR, $i}}'

Cheers
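
For anyone who wants to go one step further and build the per-variable index described in the question, the same field-splitting trick can be extended to group line numbers by name. This is only a rough sketch under the same assumptions as the command above (uppercase names, `${NAME}` style references); the function name index_vars and the "NAME: lines" output format are made up for illustration.

```shell
# index_vars: hypothetical helper, not from the thread.
# Prints one line per variable: "NAME: l1 l2 ...", in order of first appearance.
index_vars() {
    awk -F'[$][{]' '
    {
        # Every field after the first starts right behind a "${".
        for (i = 2; i <= NF; i++) {
            name = $i
            sub(/[^_A-Z0-9].*/, "", name)    # keep only the leading name characters
            if (name == "") continue         # e.g. ${#...} or ${!...}: skip
            if (!(name in lines)) order[++n] = name   # remember first-seen order
            lines[name] = lines[name] " " NR          # one entry per instance
        }
    }
    END {
        for (k = 1; k <= n; k++) print order[k] ":" lines[order[k]]
    }' "$1"
}

# Small made-up input for demonstration.
demo=$(mktemp)
printf '%s\n' 'EXPECTED_FILE=$(m_find_file ${CSV})' \
              'm_fail 1 "(${CSV}) (${SCRIPT_NAME})"' > "$demo"
index_vars "$demo"
# prints:
# CSV: 1 2
# SCRIPT_NAME: 2
rm -f "$demo"
```

A line number is repeated if a variable appears more than once on that line, since each instance is recorded separately.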