awk print in script

ocbit · June 22, 2016, 1:53pm

Hi, thanks to RudiC I have a functioning awk portion of a script that reads a text file and replaces all matching values in an XML file.

I need help with placing print statements in the script to see line by line what is being replaced.

Script:

#!/bin/sh

if [ $# -eq 2 ]
  then
        echo
else
    echo "usage: $0 <Properties> <DeploymentXML>"
    exit
fi

TXT=$1
XML=$2

awk '
FNR == NR       {sub (/\./, "/", $TXT)
                 T[$TXT] = $XML
                 next
                }
                {for (t in T) if ($0 ~ t) TF = t
                }
TF && /<value/  {sub (/>[^<]*</, ">" T[TF] "<")
                 TF = ""
                }
TF && /<machine/{sub (/>[^<]*</, ">" T[TF] "<")
                 TF = ""
                }
1
' FS="~" $TXT $XML > temp &&  mv temp $XML

For example:
Text file contains multiple entries in the format of -

Application.Env~DEV 
Application.ID~99999

Pre-script XML contains multiple entries in the format of -

<name>Application/Env</name> 
<value>XXX</value> 
<name>Application/ID</name> 
<value>00000</value>

Post-script XML:

<name>Application/Env</name> 
<value>DEV</value> 
<name>Application/ID</name> 
<value>99999</value>

Thanks!

Don_Cragun · June 22, 2016, 9:41pm

I readily believe that RudiC helped you create an awk program that could do something similar to what you are describing, but I can't believe that the code you have shown us does anything like what you are describing. Shell variables are not expanded inside single-quoted strings. You are also using code that changes data found in <value> tags and data found in <machine> tags, but your description doesn't say anything about changing data in <machine> tags.
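
To illustrate the quoting point: the shell leaves single-quoted text alone, so awk only ever receives the literal characters of the program. One common way to hand a shell value to awk is the -v option. This is a minimal sketch; the awk variable name file, the helper name show_file and the sample input line are made up for illustration.

```shell
#!/bin/sh
TXT=flatfile

# The awk program is single-quoted, so the shell expands nothing inside it.
# -v copies the shell value of TXT into an awk variable (here called "file").
show_file() {
    echo 'Application.Env~DEV' |
    awk -F'~' -v file="$TXT" '{ print file ": " $1 " -> " $2 }'
}
show_file
```

This prints "flatfile: Application.Env -> DEV". Inside the program, $1 and $2 are still awk field references and have nothing to do with the script's own $1 and $2.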

And, since the code shown copies entire input files to corresponding output files (possibly after updating some text), I don't understand what additional text you are hoping to produce or where you want that additional text to be written.

Please give us a much clearer description of what you are trying to do and show us small representative samples of an input file and the corresponding output file (or files) you hope to produce from that input. (And, be sure that the code that you have shown us does produce the output that you have indicated it currently produces.)

ocbit · June 23, 2016, 9:23am

The code I provided works perfectly fine as I posted it. Why would I post it and say it works if it doesn't? Replacing <machine> data is my addition to it, but that is not relevant to this post.

The original code is:

awk '
FNR == NR       {sub (/\./, "/", $1)
                 T[$1] = $2
                 next
                }
                {for (t in T) if ($0 ~ t) TF = t
                }
TF && /<value/  {sub (/>[^<]*</, ">" T[TF] "<")
                 TF = ""
                }
1
' FS="~" flatfile xmlfile

Currently, the code takes a text input file with format

Application.Env~DEV 
Application.ID~99999
Application.Name~appname

along with an XML input file with format

<name>Application/Env</name> 
<value>TST</value> 
<name>Application/ID</name> 
<value>00000</value>
<name>Application/Name</name> 
<value>Name</value>

and replaces <value> data with the data found after the ~ in the text file.
So afterwards it looks like

<name>Application/Env</name> 
<value>DEV</value> 
<name>Application/ID</name> 
<value>99999</value>
<name>Application/Name</name> 
<value>appname</value>

What I'm now asking is - Where would I put print commands in this script to show on the command line what is being replaced line by line?

RudiC · June 23, 2016, 12:32pm

Sorry, impossible. Inside awk, TXT and XML are uninitialized variables, thus empty, and can be used neither as an index into an array nor as a reasonable assignment value. Plus, with the leading $ sign, their contents will be interpreted as a field number and must be numeric. Again, leading nowhere.
Outside awk, they are just file names, aren't they? So there is nothing in them to replace anything with.
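
A quick way to see this, as a sketch with a made-up sample line and a made-up helper name (unset_demo): an awk variable that was never assigned evaluates to 0 in numeric context, so $TXT inside the program is simply $0, the whole record.

```shell
#!/bin/sh
# TXT is never assigned inside awk, so $TXT is $0 (the entire line),
# not the first field and not the value of the shell variable TXT.
unset_demo() {
    echo 'Application.Env~DEV' |
    awk -F'~' '{
        print "$TXT = [" $TXT "]"
        print "$1   = [" $1 "]"
    }'
}
unset_demo
```

The first line of output shows the complete input line in the brackets, the second only Application.Env.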

Not sure...?

For your original question, place a print just before and just after the replacement statements to see the difference.
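
One way to apply that to the original program, as a sketch rather than a tested drop-in: save the line before the sub call, then report both versions. Sending the report to /dev/stderr is an assumption on my part, chosen so the messages stay out of the redirected XML output; gawk handles that name itself and most systems provide it as a file, but it is not guaranteed everywhere. The helper name replace_verbose and the temporary sample files are made up; flatfile and xmlfile follow the names used in this thread.

```shell
#!/bin/sh
# Build small sample inputs in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Application.Env~DEV' 'Application.ID~99999' > flatfile
printf '%s\n' '<name>Application/Env</name>' '<value>XXX</value>' \
              '<name>Application/ID</name>'  '<value>00000</value>' > xmlfile

replace_verbose() {
    awk '
    FNR == NR       {sub (/\./, "/", $1)      # first file: build the lookup table
                     T[$1] = $2
                     next
                    }
                    {for (t in T) if ($0 ~ t) TF = t
                    }
    TF && /<value/  {old = $0                 # remember the line before the change
                     sub (/>[^<]*</, ">" T[TF] "<")
                     print "line " FNR ": " old " => " $0 > "/dev/stderr"
                     TF = ""
                    }
    1
    ' FS="~" "$1" "$2"
}

# The XML goes to new.xml; the before/after messages stay on the terminal.
replace_verbose flatfile xmlfile > new.xml
```

Each replaced line is reported in the form "line 2: <value>XXX</value> => <value>DEV</value>", while new.xml receives only the updated XML.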

Yoda · June 23, 2016, 1:29pm

To see what is getting replaced, you can use the match function and print, and write the output to another file:

awk -F'[~><]' '
        FNR == NR {
                sub ( /\./, "/", $1 )
                T[$1] = $2
                next
        }
        /<name>/ {
                name = $3
        }
        /<value>/ {
                if ( name in T )
                {
                        match ( $0, />[^<]*</ )
                        print "Replacing: ", substr ( $0, RSTART, RLENGTH ) " with >" T[name] "<"
                        sub ( />[^<]*</, ">" T[name] "<" )
                }
        }
        {
                print $0 > "new.xml"
        }
' flatfile xmlfile

ocbit · June 29, 2016, 12:30pm

Yoda, thank you. Exactly what I was after.

---------- Post updated 06-29-16 at 12:30 PM ---------- Previous update was 06-28-16 at 04:15 PM ----------

Yoda, a follow up question on the code:

I put print statements throughout the code and am seeing that strings with 2 tokens, such as Filesystem.FileLoc, are being replaced, whereas strings with more than 2 tokens, like Connections.JDBC.XX.USER.NAME, are not. I realize the sample I posted previously had 2 tokens, but there may be more.

What should be done to handle more than 2 tokens? Also where does the $3 variable come from?

This is the code with print in it:

awk -F'[~><]' '
        FNR == NR {
                sub ( /\./, "/", $1 )
                T[$1] = $2
                print "T array: " T[$1]
                next
        }
        /<name>/ {
                name = $3
                print "Name: " name
        }
        /<value>/ {
                if ( name in T )
                {
                        print "name: " name " t: " T
                        match ( $0, />[^<]*</ )
                        print "Replacing: ", substr ( $0, RSTART, RLENGTH ) " with >" T[name] "<"
                        sub ( />[^<]*</, ">" T[name] "<" )
                }
        }
        {
                print $0 > "new.xml"
        }
' flatfile xmlfile

Cmd Line Output:

T array: Local
T array: UName
T array: PWord
T array: /home/THEfile/
T array: /home/THEtrigger/
Name: Connections/JDBC/Install
Name: Connections/JDBC/XX/USER/NAME
Name: Connections/JDBC/XX/PASSWORD
Name: Filesystem/FileLoc
name: Filesystem/FileLoc t:
Replacing:  >/home/file/< with >/home/THEfile/<
Name: Filesystem/TriggerFile
name: Filesystem/TriggerFile t:
Replacing:  >/home/trigger/< with >/home/THEtrigger/<

new.xml

<root>
<name>Connections/JDBC/Install</name>
<value>Remote</value>
<name>Connections/JDBC/XX/USER/NAME</name>
<value>name</value>
<name>Connections/JDBC/XX/PASSWORD</name>
<value>pwd</value>
<name>Filesystem/FileLoc</name>
<value>/home/THEfile/</value>
<name>Filesystem/TriggerFile</name>
<value>/home/THEtrigger/</value>
</root>

flat file:

Connections.JDBC.Install~Local
Connections.JDBC.XX.USER.NAME~UName
Connections.JDBC.XX.PASSWORD~PWord
Filesystem.FileLoc~/home/THEfile/
Filesystem.TriggerFile~/home/THEtrigger/

xml file:

<root>
<name>Connections/JDBC/Install</name>
<value>Remote</value>
<name>Connections/JDBC/XX/USER/NAME</name>
<value>name</value>
<name>Connections/JDBC/XX/PASSWORD</name>
<value>pwd</value>
<name>Filesystem/FileLoc</name>
<value>/home/fileloc/</value>
<name>Filesystem/TriggerFile</name>
<value>/home/triggerfile/</value>
</root>

Yoda · June 29, 2016, 12:48pm

Replace the first sub call in the code with gsub, so that every dot in the key is converted rather than only the first one:

   FNR == NR {
               gsub ( /\./, "/", $1 )
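
To check the difference in isolation, here is a small sketch using one key from the flat file above. The helper name key_demo and the scratch variables a and b are made up for illustration.

```shell
#!/bin/sh
key_demo() {
    echo 'Connections.JDBC.XX.USER.NAME~UName' |
    awk -F'~' '{
        a = $1; sub  (/\./, "/", a)    # replaces the first dot only
        b = $1; gsub (/\./, "/", b)    # replaces every dot
        print "sub:  " a
        print "gsub: " b
    }'
}
key_demo
```

With sub the key becomes Connections/JDBC.XX.USER.NAME, which never equals the XML name; with gsub it becomes Connections/JDBC/XX/USER/NAME, which does.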

Aia · June 29, 2016, 12:59pm

With -F'[~><]', a line like <name>Connections/JDBC/Install</name> is split into fields at every ~, > and <:

echo "<name>Connections/JDBC/Install</name>" | awk -F'[~><]' '{for(i=1;i<=NF;i++){print i ": " $i}}'

Output:

1:
2: name
3: Connections/JDBC/Install
4: /name
5:

Now you can see the fields $1 through $5. The name is in $3, because the line starts with a separator (<), which leaves $1 empty.