Parsing a file and setting to variables.

Greetings all. I have a task right now that is somewhat stumping me, and I am not sure what the best approach is.

I have a text file that will contain something similar to the following:

 
 first1, other1
 first2, other2
 first3, other3
 first4, other4
 

I have to generate an XML file that takes the values of first and other from each line and places them in mapping elements, as below, so I can run my program against it:

 
 <application="testApp">
                <task="modify" name="document">
                <mapping first="first1" other="other1" />
                <mapping first="first2" other="other2" />
                <mapping first="first3" other="other3" />
                <mapping first="first4" other="other4" />
                 </task>
 </application>
 

Scratching my head over this one. What's the best approach to this?

Thanks in advance.

Something along these lines: awk -f jeff.awk myTextFile.txt, where jeff.awk is:

BEGIN {
  FS=OFS=","
  qq="\""
  printf("<application=%stestApp%s>\n\t<task=%smodify%s name=%sdocument%s>\n", qq, qq,qq,qq,qq,qq)
}
{printf("\t<mapping first=%s%s%s other=%s%s%s />\n", qq, $1, qq, qq, $2, qq)}

END {
  printf("\t</task>\n</application>\n")
}
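One caveat with the script above: the sample input has a space after each comma (and possibly blank lines), so with FS="," the second field would keep its leading space. A minimal sketch of one way to handle that, assuming the input layout shown in the question; the temporary file and its contents are made up for illustration, and only the mapping lines are generated here:

```shell
# Hypothetical sample input matching the layout in the question.
input=$(mktemp)
printf ' first1, other1\n first2, other2\n\n' > "$input"

mappings=$(awk '
  BEGIN { FS = ","; qq = "\"" }
  NF < 2 { next }                      # skip blank or incomplete lines
  {
    # strip leading/trailing blanks from both fields
    for (i = 1; i <= 2; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
    printf("\t<mapping first=%s%s%s other=%s%s%s />\n", qq, $1, qq, qq, $2, qq)
  }
' "$input")

printf '%s\n' "$mappings"
rm -f "$input"
```

The BEGIN and END blocks from jeff.awk can stay as they are; only the per-line rule changes.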

This looks like it works. It is a bit more advanced than I had thought. One last question: how do I output this to a file? I am trying a basic redirect of

>> xml.out

and it does not like that.

awk -f jeff.awk myTextFile.txt > xml.out

If you're expecting to append to the file, XML doesn't really work that way: a well-formed document has a single root element, so a new file has to be generated each time.
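To illustrate the difference between > and >> here, a small sketch; the gen function is a hypothetical stand-in for the awk command, and the file is a throwaway temp file:

```shell
out=$(mktemp)
# Hypothetical stand-in for: awk -f jeff.awk myTextFile.txt
gen() { printf '<application>\n</application>\n'; }

gen >> "$out"; gen >> "$out"                        # append twice
appended_roots=$(grep -c '^<application>' "$out")   # two root elements now

gen > "$out"                                        # overwrite instead
fresh_roots=$(grep -c '^<application>' "$out")      # a single root element

echo "appended: $appended_roots, overwritten: $fresh_roots"
rm -f "$out"
```

With >>, every run adds another root element to the same file, which an XML parser will most likely reject.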

wow, that was a newb error!!!! you folks rock!

Hello all,

I ran into an issue with this in forming the XML.

I am calling the awk script as shown:

awk -f config/jeff.awk ./input/input.txt >> output.xml

I am getting this error:

 syntax error The source line is 1.
 The error context is
                 >>> . <<<  /location/to/properties/file/config.file
 awk: Quitting
 The source line is 1.

This is because I made some changes within the config.file, and I am not sure whether there is a more efficient way to do this:

. /location/to/properties/file/config.file
BEGIN {
  FS=OFS=","
  qq="\""
  printf("<application name=%s$FIELD1%s>\n\t<field task=%supdate%s name=%s$FIELD2%s>\n", qq, qq,qq,qq,qq,qq)
}
{printf("\t<mapping db=%s%s%s displayed=%s%s%s />\n", qq, $1, qq, qq, $2, qq)}
END {
  printf("\t</field>\n</application>\n")
}

Am I doing this incorrectly?

Yes, this is incorrect.
What's in . /location/to/properties/file/config.file ?
What are you trying to do?

variables..

VARIABLE1="something1"
VARIABLE2="something2"

huh? you lost me there.... What variables?
And what's inside ./input/input.txt ?

Now I'm confused about what your input looks like.

Post #7 is quite misleading. After rereading and rereading, the last code-tagged block seems to be the config/jeff.awk script to be run by awk. The first line looks like it is meant to "source" /location/to/properties/file/config.file, and sourcing is a bash (or, generally, shell) builtin. Of course, awk doesn't understand this and errors out accordingly.
And, not sure why you think you need those variables - they're not used anywhere.
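If the goal is to keep the settings in a shell-style file, one common pattern is to source it in a shell wrapper and hand the values to awk with -v. A minimal sketch, assuming the config file defines FIELD1 and FIELD2 (the names the awk script refers to); the temp file and the awk variable names app and fld are made up for illustration:

```shell
# Hypothetical config file in shell syntax.
cfg=$(mktemp)
printf 'FIELD1="text1"\nFIELD2="text2"\n' > "$cfg"

. "$cfg"     # sourcing happens in the shell, not in awk

# Pass the shell values into awk as awk variables (app and fld are illustrative names).
header=$(awk -v app="$FIELD1" -v fld="$FIELD2" '
  BEGIN {
    qq = "\""
    printf("<application name=%s%s%s>\n\t<field task=%supdate%s name=%s%s%s>\n",
           qq, app, qq, qq, qq, qq, fld, qq)
  }')

printf '%s\n' "$header"
rm -f "$cfg"
```

Note that awk interprets backslash escapes in -v values, so values containing backslashes may need extra care.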


Apologies for this being misleading; allow me to try to clarify.

You answered my question in this post. I am trying to source the config file just like I would do in a normal shell script. Obviously, that is one thing that is not working.

BEGIN {
  FS=OFS=","
  qq="\""
  printf("<application name=%s$FIELD1%s>\n\t<field task=%supdate%s name=%s$FIELD2%s>\n", qq, qq,qq,qq,qq,qq)
}
{printf("\t<mapping db=%s%s%s displayed=%s%s%s />\n", qq, $1, qq, qq, $2, qq)}
END {
  printf("\t</field>\n</application>\n")
}

When I remove the "sourcing" at the top of the file, this generates the XML like I expect it to, with the value of application name set as FIELD1, and the value of field name set as FIELD2.

It might have been clearer if I could write a config file to be read by this awk script, much like the one my shell script reads from. I hope this makes sense. Thank you for the clarification and any further assistance.

Sourcing configurations is something awk is not designed for. You can cheat like

awk -f config.awk  -f jeff.awk ./input/input.txt >> output.xml

with config.awk containing

BEGIN {
VARIABLE1="something1"
VARIABLE2="something2"
}

or other methods presented in these forums like assigning variables outside the script, or using a data file to read the variables upfront (but losing variables' names).
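A small sketch of the two-file approach, assuming the variables are needed in the main script's BEGIN block. In that case the assignments should sit in a BEGIN block of their own, since a bare { ... } action only runs once per input record, after every BEGIN block has finished. The file names and contents below are illustrative:

```shell
dir=$(mktemp -d)

# Hypothetical config.awk: assignments inside BEGIN so they run before any input.
cat > "$dir/config.awk" <<'EOF'
BEGIN {
  VARIABLE1 = "something1"
  VARIABLE2 = "something2"
}
EOF

# Hypothetical main script that uses a value in its own BEGIN block.
cat > "$dir/main.awk" <<'EOF'
BEGIN { printf("<application name=\"%s\">\n", VARIABLE1) }
END   { printf("</application>\n") }
EOF

# BEGIN blocks run in the order the -f files are given.
result=$(awk -f "$dir/config.awk" -f "$dir/main.awk" /dev/null)
printf '%s\n' "$result"
rm -rf "$dir"
```

The config file here has to be valid awk, so it cannot be shared as-is with a shell script that sources it.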

awk is not shell and does not understand shell syntax, awk is awk.

awk -f /location/to/properties/file/config.file -f config/jeff.awk ./input/input.txt >> output.txt

Sourcing a file is not the only shell feature that is not supported in awk. If we assume that you have sourced settings for the following variables:

FIELD1="text1"
FIELD2="text2"

then the awk code:

  printf("<application name=%s$FIELD1%s>\n\t<field task=%supdate%s name=%s$FIELD2%s>\n", qq, qq,qq,qq,qq,qq)

will print:

<application name="$FIELD1">
	<field task="update" name="$FIELD2">

not:

<application name="text1">
	<field task="update" name="text2">

If the above is the output you want, you would need a printf call more like:

  printf("<application name=%s%s%s>\n\t<field task=%supdate%s name=%s%s%s>\n",
    qq, FIELD1, qq, qq, qq, qq, FIELD2, qq)

because variables, with or without a leading $, are not expanded inside double-quoted strings in awk. Moreover, in awk, $var expands to the contents of the field whose number is the value assigned to the variable var, not to the string assigned to var. FIELD1 and FIELD2 also have to exist as awk variables; values set in the shell are not visible inside awk unless they are passed in, for example with -v.
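The three cases can be seen side by side in a short sketch; the input line and the variable name var are chosen purely for illustration:

```shell
demo=$(printf 'alpha,beta\n' | awk -F, -v var=2 '{
  print "$var"   # string literal: nothing is expanded
  print var      # the variable itself
  print $var     # field number 2 of the current record
}')
printf '%s\n' "$demo"
```

This prints the literal text $var, then 2, then beta.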